Non ASCII characters are not allowed in the path #40
Comments
cf. ruby/webrick#110, especially this comment: ruby/webrick#110 (comment). This is because URI doesn't support RFC 3987 (Internationalized Resource Identifiers, IRIs).
No, a URI path is not allowed to contain arbitrary UTF-8 characters. Non-ASCII UTF-8 characters must be percent-encoded, and even some ASCII characters must be percent-encoded. It's true that the URI library doesn't support IRIs. That's not a bug; there should probably be a separate library for IRIs.
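To illustrate the point about percent-encoding, here is a minimal sketch, assuming the path is a UTF-8 string and that each segment should be escaped on its own. `encode_path` is a hypothetical helper written for this example, not part of the URI library, and `example.com` and the path are made up. It uses `ERB::Util.url_encode` from the standard library, which escapes everything outside the unreserved set.

```ruby
require "uri"
require "erb"

# Hypothetical helper (not part of the URI library): percent-encode each
# path segment separately so the "/" separators are left untouched.
def encode_path(path)
  path.split("/", -1).map { |segment| ERB::Util.url_encode(segment) }.join("/")
end

# Made-up path containing non-ASCII characters ("données", "café").
raw_path = "/donn\u00E9es/caf\u00E9.html"

# Parsing the raw string fails because the URI library only accepts ASCII.
parse_error = nil
begin
  URI.parse("http://example.com#{raw_path}")
rescue URI::InvalidURIError => e
  parse_error = e
end

# After percent-encoding, the same path parses without error.
encoded_path = encode_path(raw_path)
uri = URI.parse("http://example.com#{encoded_path}")

puts parse_error.class
puts uri.path
```

Encoding segment by segment keeps the path structure intact. Encoding the whole string in one pass would also escape the slashes.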
IRIs have not been integrated into URIs in order to preserve backward compatibility. But IRIs extend URIs.
Ruby has extensive Unicode support (in strings, regexps, etc.), so the lack of Unicode support in the uri module is an exception. If one does not want to change the behavior of the default ... As IRIs extend URIs and are deeply linked to them, I would rather see IRI support integrated as new methods in the URI module than have a separate module only for URI. But that's just my POV, and I may not be the best suited or the most experienced here.
I agree, that's more of a feature request to support modern usage, where Unicode is widespread and broadly adopted.
Just ran into this today... noraj's comments above seem spot-on to me.
Hi,
I'm getting this error:
I thought that the path component is allowed to contain any UTF-8 character.