Opt-in to hide text fragment directive from scripts #234

Open
zcorpan opened this issue Oct 3, 2023 · 15 comments

Comments

@zcorpan

zcorpan commented Oct 3, 2023

See eligrey/fragment-directives#1

Currently, the fragment directive is hidden from most APIs, but in Chromium it is (unintentionally) exposed via performance.getEntries().
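
To make the exposure concrete, here is a minimal sketch of how a script could pull the directive out of a URL string such as the one a navigation timing entry reports. The helper names `extractFragmentDirective` and `textDirectiveValues` are hypothetical, and the sketch only decodes the raw `text=` values; it does not parse the prefix/start/end/suffix syntax.

```javascript
// Delimiter that separates the ordinary fragment from the fragment directive.
const DIRECTIVE_DELIMITER = ":~:";

// Hypothetical helper: returns the text after ":~:" in the fragment of a
// URL string, or null if the URL carries no fragment directive.
function extractFragmentDirective(urlString) {
  const hashIndex = urlString.indexOf("#");
  if (hashIndex === -1) return null;
  const fragment = urlString.slice(hashIndex + 1);
  const delimiterIndex = fragment.indexOf(DIRECTIVE_DELIMITER);
  if (delimiterIndex === -1) return null;
  return fragment.slice(delimiterIndex + DIRECTIVE_DELIMITER.length);
}

// Hypothetical helper: percent-decodes the raw value of every text= directive.
// decodeURIComponent throws on malformed escapes; real code would guard that.
function textDirectiveValues(directive) {
  return directive
    .split("&")
    .filter((part) => part.startsWith("text="))
    .map((part) => decodeURIComponent(part.slice("text=".length)));
}

// In a browser, assuming the entry name still carries the directive
// (the Chromium behaviour described above):
// const [nav] = performance.getEntriesByType("navigation");
// const directive = nav ? extractFragmentDirective(nav.name) : null;
```

If the directive is reachable this way, the decoded values give the page a close approximation of what the referring search engine matched.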

Most search engines don't expose the search query to the linked page, but if they use text fragments to highlight matching text, and if the text fragment directive is directly available via an API, the page has a rough idea what the search query was.

I assume that search engines generally have an interest in using text fragments (for a better user experience) and also want to hide the search query from the pages they link to in their search results (for privacy).

But there's evidently also interest in accessing the text directive in other situations, e.g. custom scrolling to a text fragment.

I suggest adding a fragment directive to hide the text directive from the page's script:

#:~:text=Something&hide-text-fragment-from-script

(naming TBD)

A search engine (or other linking site) can then feature-detect both text fragment support and support for the hiding opt-in:

if (document.fragmentDirective?.supportsHideTextFragmentFromScript) {
  // ok to use text fragments in links
}
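
As a rough sketch of how a linking site might act on that check: the helper names `encodeTextDirectiveValue` and `buildResultLink` are hypothetical, `supportsHideTextFragmentFromScript` is only the property proposed above, and the directive name is the placeholder from this comment (naming TBD). The sketch assumes the target URL does not already contain a fragment directive.

```javascript
// Percent-encode a value for use inside a text directive. "-", "&" and ","
// are syntax in text directives; encodeURIComponent covers "&" and ",",
// so "-" is encoded separately.
function encodeTextDirectiveValue(text) {
  return encodeURIComponent(text).replace(/-/g, "%2D");
}

// Hypothetical helper for a linking site such as a search engine.
// supportsHiding would come from the feature detection proposed above.
function buildResultLink(targetUrl, matchedText, supportsHiding) {
  // Without the hiding opt-in, fall back to a plain link so the matched
  // text is not revealed to the destination.
  if (!supportsHiding) return targetUrl;
  // Append to an existing fragment, or start a new one.
  const separator = targetUrl.includes("#") ? "" : "#";
  const value = encodeTextDirectiveValue(matchedText);
  return `${targetUrl}${separator}:~:text=${value}&hide-text-fragment-from-script`;
}

// Browser usage, assuming the proposed property existed:
// const ok = Boolean(document.fragmentDirective?.supportsHideTextFragmentFromScript);
// link.href = buildResultLink(result.url, result.snippet, ok);
```
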
@annevk
Collaborator

annevk commented Oct 4, 2023

I don't understand what this is proposing. Are you suggesting text fragments would be exposed by default unless folks opt-out?

@zcorpan
Author

zcorpan commented Oct 4, 2023

Yes.

@annevk
Collaborator

annevk commented Oct 4, 2023

That does not seem good from a privacy perspective.

@zcorpan
Author

zcorpan commented Oct 4, 2023

Do you think the text fragment should never be exposed to script?

@annevk
Collaborator

annevk commented Oct 4, 2023

I'm not sure I'd go that far. When the navigation was made by script, exposing it to script seems reasonable.

@bokand
Collaborator

bokand commented Oct 4, 2023

An opt-in seems more appropriate in this case. A referrer that decides there's no privacy risk in sharing the text fragment with the destination can opt in to allow the destination to see it. By default, it's hidden. This seems similar to referrer-policy to me.

@annevk
Collaborator

annevk commented Oct 4, 2023

Note that most browsers clamped down on Referer beyond referrer policy. E.g., unsafe-url in practice only works same-site in a number of cases. So we want to tread with care.

@simon-friedberger

simon-friedberger commented Oct 5, 2023

This is a bit of an aside, but as far as I can tell the spec also doesn't include any justification for hiding the fragments from the site. @annevk could you elaborate a bit on what the privacy concern is?

And, as mentioned in the issue linked above, since the site can already detect its scroll position it will probably have a good idea what the search text may have been.
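
A minimal sketch of that inference, assuming the browser scrolls the matched text to roughly the middle of the viewport. The helper name `guessTargetBlock` is hypothetical, and the result is only a guess at which block was targeted, not the exact text.

```javascript
// Hypothetical helper: given the document-relative top offsets of a page's
// blocks (e.g. paragraphs), the scroll offset and the viewport height,
// return the index of the block closest to the vertical centre of the
// viewport, or -1 if there are no blocks.
function guessTargetBlock(blockTops, scrollY, viewportHeight) {
  const centre = scrollY + viewportHeight / 2;
  let bestIndex = -1;
  let bestDistance = Infinity;
  blockTops.forEach((top, index) => {
    const distance = Math.abs(top - centre);
    if (distance < bestDistance) {
      bestDistance = distance;
      bestIndex = index;
    }
  });
  return bestIndex;
}

// Browser usage, run shortly after load:
// const blocks = [...document.querySelectorAll("p")];
// const tops = blocks.map((el) => el.getBoundingClientRect().top + window.scrollY);
// const likely = blocks[guessTargetBlock(tops, window.scrollY, window.innerHeight)];
```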

@bokand
Collaborator

bokand commented Oct 5, 2023

It's true that the initial motivation was to prevent the page from breaking due to seeing unexpected content in the fragment portion of the URL. However, there are privacy considerations as well. They're not discussed in the spec (as we assumed they were obviated by the URL stripping mechanism), though I probably should have at least mentioned them somewhere. For example, if a user is sent to a page via a search engine, they don't expect the page to be able to infer their search query.

While the page author could probably guess roughly, even without scroll-to-text, based on their page's overall content, and now via the scroll position, having the exact text directive used does increase the granularity of that signal.

@simon-friedberger

Thanks David! I certainly agree that the leakage is slightly worse.

On the other hand…

  1. There is some desire to have the data accessible from scripts for custom searches (not sure how that is supposed to interact with the built-in browser search!) or marginalia.
  2. I am not sure the additional complexity of distinguishing hidden and non-hidden directives as discussed in https://github.com/eligrey/fragment-directives is a good trade-off.
  3. It was common practice to include search terms in the referrer in the past. And this is still possible. And the new fragments don't require including search terms.
  4. It's not clear that users would expect a site to not be aware of what is being searched for either.

At this point, I am not convinced that the fragment directives should be hidden from scripts. I am even less convinced that it should be configurable.

To provide a different point of view: URL has always been available to the server, and URL including fragment has always been available to scripts. Do we really want to introduce a new concept by having parts of the URL hidden from server and scripts and only available to the UA? (Not a rhetorical question.)

@annevk
Collaborator

annevk commented Oct 6, 2023

> It was common practice to include search terms in the referrer in the past. And this is still possible.

It's not really, as I pointed out above. Unless you somehow make them part of the target URL, potentially modifying the target response as a result.

> Do we really want to introduce a new concept by having parts of the URL hidden from server and scripts and only available to the UA?

I think that is a decision we made when we added text fragments, i.e., yes.

@simon-friedberger

Thanks for clarifying!

I agree with @bokand that it's worth spelling out in the spec what is hidden and why. It's important for the follow-up discussion on making that hiding optional as proposed in https://github.com/eligrey/fragment-directives.

@noamr

noamr commented Jul 1, 2024

Coming back to this... given that URLs from search engines can be shared across browsers, I don't think we can assume that the browser where the user consumes the URL is going to hide a text fragment produced by a browser that does support fragment hiding. If it's in the URL, it's going to potentially be accessible at the receiving end, and also create breakage if the website expects a particular fragment format.

Given that, due to the navigation timing quirk, the fragment directive was never really hidden, I suggest unhiding it completely from script, exposing it in document.URL etc., and encouraging browsers to perform some soft hiding, such as omitting it when sharing links via browser UI.
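
The soft hiding could look roughly like the following sketch, where a share or copy-link action removes the directive before handing the URL on. The helper name `stripFragmentDirective` is hypothetical and not part of any browser API.

```javascript
// Hypothetical helper: remove the fragment directive (":~:" and everything
// after it) from a URL string, keeping any ordinary fragment before it.
function stripFragmentDirective(urlString) {
  const hashIndex = urlString.indexOf("#");
  if (hashIndex === -1) return urlString;
  // Only look for the delimiter inside the fragment, not the path or query.
  const delimiterIndex = urlString.indexOf(":~:", hashIndex);
  if (delimiterIndex === -1) return urlString;
  const stripped = urlString.slice(0, delimiterIndex);
  // Drop the "#" left behind when the fragment held only a directive.
  return stripped.endsWith("#") ? stripped.slice(0, -1) : stripped;
}
```

For example, a share action could pass `stripFragmentDirective(currentUrl)` to the clipboard instead of the full URL.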

@annevk
Collaborator

annevk commented Jul 1, 2024

Wouldn't that rather badly regress on the privacy properties of this feature? Presumably Chromium is planning on addressing the bug that inadvertently exposed this information?

@noamr

noamr commented Jul 1, 2024

> Wouldn't that rather badly regress on the privacy properties of this feature? Presumably Chromium is planning on addressing the bug that inadvertently exposed this information?

My point is that the privacy properties of this feature are not a web platform concern, but rather a concern for the linking site (e.g. search engine) or browser UI (e.g. hiding this info when copying/pasting the link). Anything else would give a false sense of privacy as hiding the text fragment doesn't work like a progressive enhancement.
