parse_unicode_url function #882

Open
alandefreitas opened this issue Nov 11, 2024 · 1 comment


@alandefreitas
Member

We should have a function to handle Unicode URLs:

result<url> parse_unicode_url(...)

To produce a valid URL, the host would be converted with Punycode, and the other components would be percent-encoded where possible. Errors are still possible, hence the result<url> return type.
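As a rough illustration of the percent-encoding half of that conversion, here is a minimal sketch using only the standard library. The name `pct_encode_utf8` is hypothetical and not part of Boost.URL, and the choice to leave `/` unescaped assumes the input is a path. It encodes byte by byte, so each byte of a multi-byte UTF-8 sequence becomes its own `%XX` triplet.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, not part of Boost.URL: percent-encode every byte
// that is not an RFC 3986 unreserved character or '/'.
std::string pct_encode_utf8(std::string_view in)
{
    static constexpr char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in)
    {
        bool const unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved || c == '/')
        {
            // Safe to copy through unchanged.
            out.push_back(static_cast<char>(c));
        }
        else
        {
            // Emit the byte as '%' followed by two uppercase hex digits.
            out.push_back('%');
            out.push_back(hex[c >> 4]);
            out.push_back(hex[c & 0x0F]);
        }
    }
    return out;
}
```

The host cannot be handled this way: it would need a Punycode (IDNA) conversion instead, which this sketch does not attempt.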

We can either identify the components by locating candidate delimiters (as urls::format does) or provide functions that build a URL from its components:

result<url>
make_url(
  string_view scheme,
  utf8_string_view authority,
  utf8_string_view path,
  utf8_string_view query,
  utf8_string_view fragment)
{
    // Assemble the URL from its components using the regular setters.
    url u;
    u.reserve(...);
    u.set_scheme(scheme);
    u.set_authority(authority);
    u.set_path(path);
    // ...
    return u;
}

result<url>
make_iri(
  string_view scheme,
  utf8_string_view authority,
  utf8_string_view path,
  utf8_string_view query,
  utf8_string_view fragment)
{
    // Same as make_url, but convert each Unicode component first:
    // Punycode for the authority, percent-encoding for the path.
    url u;
    u.reserve(...);
    u.set_scheme(scheme);
    u.set_authority(detail::parse_punycode(authority));
    u.set_path(detail::pct_encode(path));
    // ...
    return u;
}
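
For the delimiter-based alternative, the split could look roughly like the sketch below. It is not taken from Boost.URL: `url_parts` and `split_components` are hypothetical names, and the code only locates `#`, `?`, `:` and `//` in the generic-syntax order without validating anything. Because all of those delimiters are ASCII, bytes belonging to multi-byte UTF-8 sequences never match them.

```cpp
#include <string_view>

// Hypothetical aggregate; the field names mirror the make_url parameters.
struct url_parts
{
    std::string_view scheme, authority, path, query, fragment;
};

// Split on the generic-syntax delimiters without validating any component.
url_parts split_components(std::string_view s)
{
    url_parts p;

    // The fragment starts after the first '#'.
    if (auto pos = s.find('#'); pos != std::string_view::npos)
    {
        p.fragment = s.substr(pos + 1);
        s = s.substr(0, pos);
    }
    // The query starts after the first '?' that precedes the fragment.
    if (auto pos = s.find('?'); pos != std::string_view::npos)
    {
        p.query = s.substr(pos + 1);
        s = s.substr(0, pos);
    }
    // A scheme ends at the first ':' provided no '/' comes before it.
    if (auto pos = s.find_first_of(":/");
        pos != std::string_view::npos && s[pos] == ':' && pos > 0)
    {
        p.scheme = s.substr(0, pos);
        s = s.substr(pos + 1);
    }
    // An authority follows "//" and runs up to the next '/'.
    if (s.substr(0, 2) == "//")
    {
        s.remove_prefix(2);
        auto pos = s.find('/');
        p.authority = s.substr(0, pos);
        s = (pos == std::string_view::npos) ? std::string_view{} : s.substr(pos);
    }
    // Whatever remains is the path.
    p.path = s;
    return p;
}
```

The resulting pieces could then be passed to something like make_iri above. One limitation of this approach is that a `?` or `#` intended as data inside a component cannot be told apart from a delimiter, which is an argument for also offering the component-wise functions.
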
@luz-arreola

It is probably a very good idea to use ICU for this. I currently use ICU to create and parse Unicode URLs, and it works perfectly with all types of Unicode characters. Adding ICU support to Boost.URL would be excellent! Alternatively, there are some very lightweight Unicode libraries that I think would work for this, but ICU is the only fully featured Unicode library. I believe people who work with Unicode very likely already use many of the features that only ICU provides.
