Consider removing url test containing utf16 surrogates #46941

Open
Wuelle opened this issue Jun 29, 2024 · 0 comments

urltestdata.json contains the following test:

{
    "input": "http://example.com/\uD800\uD801\uDFFE\uDFFF\uFDD0\uFDCF\uFDEF\uFDF0\uFFFE\uFFFF?\uD800\uD801\uDFFE\uDFFF\uFDD0\uFDCF\uFDEF\uFDF0\uFFFE\uFFFF",
    "base": null,
    "href": "http://example.com/%EF%BF%BD%F0%90%9F%BE%EF%BF%BD%EF%B7%90%EF%B7%8F%EF%B7%AF%EF%B7%B0%EF%BF%BE%EF%BF%BF?%EF%BF%BD%F0%90%9F%BE%EF%BF%BD%EF%B7%90%EF%B7%8F%EF%B7%AF%EF%B7%B0%EF%BF%BE%EF%BF%BF",
    "origin": "http://example.com",
    "protocol": "http:",
    "username": "",
    "password": "",
    "host": "example.com",
    "hostname": "example.com",
    "port": "",
    "pathname": "/%EF%BF%BD%F0%90%9F%BE%EF%BF%BD%EF%B7%90%EF%B7%8F%EF%B7%AF%EF%B7%B0%EF%BF%BE%EF%BF%BF",
    "search": "?%EF%BF%BD%F0%90%9F%BE%EF%BF%BD%EF%B7%90%EF%B7%8F%EF%B7%AF%EF%B7%B0%EF%BF%BE%EF%BF%BF",
    "hash": ""
}

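The interesting part here is the surrogate code units. \uD801\uDFFE is a valid surrogate pair (it encodes U+107FE), but the \uD800 before it and the \uDFFF after it are unpaired UTF-16 surrogates.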

The behaviour in this case is undefined as per the URL specification [1], where the input to the URL parsing algorithm is a scalar value string [2] (meaning a string that contains no surrogate code points, i.e. no unpaired leading or trailing surrogates).
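To make the point concrete, here is a minimal Python sketch of what the test seems to rely on. It assumes that the caller replaces every unpaired surrogate with U+FFFD before the parser runs (this is what the Web IDL USVString conversion does) and then percent-encodes the result as UTF-8. The helper names `is_surrogate` and `to_scalar_value_string` are made up for illustration, and only the path portion of the input is checked.

```python
import json
from urllib.parse import quote

# Path portion of the test input, kept as a JSON string literal so that the
# \u escapes are decoded by the JSON parser, as they would be when loading
# urltestdata.json.
RAW = r'"/\uD800\uD801\uDFFE\uDFFF\uFDD0\uFDCF\uFDEF\uFDF0\uFFFE\uFFFF"'

# "pathname" value copied from the test above.
EXPECTED_PATHNAME = (
    "/%EF%BF%BD%F0%90%9F%BE%EF%BF%BD%EF%B7%90%EF%B7%8F"
    "%EF%B7%AF%EF%B7%B0%EF%BF%BE%EF%BF%BF"
)


def is_surrogate(ch: str) -> bool:
    """True if ch is a surrogate code point (U+D800 to U+DFFF)."""
    return 0xD800 <= ord(ch) <= 0xDFFF


def to_scalar_value_string(s: str) -> str:
    """Hypothetical helper: replace each surrogate code point with U+FFFD."""
    return "".join("\uFFFD" if is_surrogate(ch) else ch for ch in s)


# Python's JSON decoder joins the valid pair \uD801\uDFFE into U+107FE and
# leaves the two unpaired surrogates in the string as they are.
path = json.loads(RAW)
lone = [f"U+{ord(ch):04X}" for ch in path if is_surrogate(ch)]

# Percent-encode the sanitised path as UTF-8, keeping "/" literal.
encoded = quote(to_scalar_value_string(path), safe="/")

print(lone)
print(encoded == EXPECTED_PATHNAME)
```

Under that assumption the sketch reproduces the expected pathname, which suggests the test exercises the string conversion done by the caller rather than anything the URL parsing algorithm itself defines.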

url/README.md states:

resources/urltestdata.json contains URL parsing tests suitable for any URL parser implementation.

Therefore, the suite should only test behaviour defined in the URL specification [3].

I would like to hear the thoughts of more qualified people on this before I make a PR for it (:

Footnotes

  1. https://url.spec.whatwg.org/#url-parsing

  2. https://infra.spec.whatwg.org/#scalar-value-string

  3. https://url.spec.whatwg.org
