Skip to content
This repository has been archived by the owner on Feb 6, 2023. It is now read-only.

HTMLClickTracker - avoid tracking #anchors urls #51

Closed
wants to merge 1 commit into from

Conversation

sluc23
Copy link

@sluc23 sluc23 commented Dec 30, 2019

fixes issue #43

@mattwire
Copy link
Contributor

@totten this is good to merge

return preg_replace_callback(
';(\<[^>]*href *= *\')([^\'>]+)(\');', $callback, $tmp);
';(\<[^>]*href *= *\')([^#\'>]+)(\');', $callback, $tmp);
Copy link
Member

@totten totten Feb 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I reading correctly that this ignores links that have a # anywhere in the URL?

For example, suppose one composes a mailblast on behalf of civicrm.org which includes some discussion of a new feature and its documentation:

<p>
CiviCRM v9 adds the "Relationship" tab!  Check out the
<a href="https://docs.civicrm.org/user/en/latest/organising-your-data/contacts/#relationships-tab">new docs</a>.
</p>

(This is by no means exclusive to civicrm.org docs - anchors are common enough on the web at large. It's just an example off the top of my head.)

If I wrote+sent a message like that, then it would be quite surprising (to me) for the link-tracking to be disabled just because my link included a #. After skimming some docs and trying some throw-away PHP scripts, it seems entirely legal for a PHP script to send Location: https://docs.civicrm.org/user/en/latest/organising-your-data/contacts/#relationships-tab. So I don't see a technical reason to skip those URLs.

Aside: It might be good update HtmlClickTrackerTest to include some examples with anchors -- e.g. assert that (1) internal anchors are not tracked and (2) external links with anchors are tracked.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you right @totten , the idea of this PR was to ignore from tracking only local anchors urls that start with # , no an absolute url with an anchor at the end..

I'll review the PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any update on this Luciano?

Copy link
Member

@totten totten May 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sluc23 @MikeyMJCO - After re-reading this, I think it might be better to put the instruction somewhere else -- after the $url has been extracted. Adjusting this regex (to distinguish internal-vs-external anchors) feels scary.

If you trace what happens after the regex, these two spots look promising:

  • Inside of the $replace(string $url) function (code). It has a similar skip rule for {token}-based URLs.
  • Inside of the getTrackerURL(string $url,...) function (code). It has a similar skip rule for images+CSS URLs

Skipping internal anchors might be as simple as if ($url[0] === '#') { return $url; } ? (But I haven't tried it.)

The only difference between $replace and getTrackerURL(): the first is Flexmailer-only; the second is Flexmailer+MailingBAO.

I'm gonna close this PR but hope you get a chance to try out one of the other spots.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants