-
Notifications
You must be signed in to change notification settings - Fork 22
HTMLClickTracker - avoid tracking #anchors urls #51
Conversation
@totten this is good to merge |
return preg_replace_callback( | ||
';(\<[^>]*href *= *\')([^\'>]+)(\');', $callback, $tmp); | ||
';(\<[^>]*href *= *\')([^#\'>]+)(\');', $callback, $tmp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I reading correctly that this ignores links that have a #
anywhere in the URL?
For example, suppose one composes a mailblast on behalf of civicrm.org
which includes some discussion of a new feature and its documentation:
<p>
CiviCRM v9 adds the "Relationship" tab! Check out the
<a href="https://docs.civicrm.org/user/en/latest/organising-your-data/contacts/#relationships-tab">new docs</a>.
</p>
(This is by no means exclusive to civicrm.org
docs - anchors are common enough on the web at large. It's just an example off the top of my head.)
If I wrote+sent a message like that, then it would be quite surprising (to me) for the link-tracking to be disabled just because my link included a #
. After skimming some docs and trying some throw-away PHP scripts, it seems entirely legal for a PHP script to send Location: https://docs.civicrm.org/user/en/latest/organising-your-data/contacts/#relationships-tab
. So I don't see a technical reason to skip those URLs.
Aside: It might be good update HtmlClickTrackerTest
to include some examples with anchors -- e.g. assert that (1) internal anchors are not tracked and (2) external links with anchors are tracked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you right @totten , the idea of this PR was to ignore from tracking only local anchors urls that start with #
, no an absolute url with an anchor at the end..
I'll review the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any update on this Luciano?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sluc23 @MikeyMJCO - After re-reading this, I think it might be better to put the instruction somewhere else -- after the $url
has been extracted. Adjusting this regex (to distinguish internal-vs-external anchors) feels scary.
If you trace what happens after the regex, these two spots look promising:
- Inside of the
$replace(string $url)
function (code). It has a similar skip rule for{token}
-based URLs. - Inside of the
getTrackerURL(string $url,...)
function (code). It has a similar skip rule for images+CSS URLs
Skipping internal anchors might be as simple as if ($url[0] === '#') { return $url; }
? (But I haven't tried it.)
The only difference between $replace
and getTrackerURL()
: the first is Flexmailer-only; the second is Flexmailer+MailingBAO.
I'm gonna close this PR but hope you get a chance to try out one of the other spots.
fixes issue #43