Support uppercase characters in host #1441

Closed
4 of 5 tasks
xZise opened this issue Nov 30, 2024 · 6 comments · Fixed by #1442

Comments


xZise commented Nov 30, 2024

Please confirm the following

  • I understand this is open source software provided for free and that I might not receive a timely response.
  • I am positive I am NOT reporting a (potential) security vulnerability, to the best of my knowledge. (These must be shared by submitting this report form instead, if any hesitation exists.)
  • I am willing to submit a pull request with reproducers as xfailing test cases or even an entire fix. (Assign this issue to me.)

Describe the bug

When using uppercase characters in the hostname, they get reported as invalid:

>>> URL.build(scheme="http", host="A", port=port, path="/")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...\yarl\_url.py", line 386, in build
    _host = _encode_host(host, validate_host=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\yarl\_url.py", line 1496, in _encode_host
    raise ValueError(
ValueError: Host 'A' cannot contain 'A' (at position 0)
>>> URL.build(scheme="http", host="a", port=port, path="/")
URL('http://a/')

It appears to me that the host needs to be converted to lowercase first. The documentation of the host property says the value gets converted to lowercase (which is the case when using __init__()), and the comment at NOT_REG_NAME mentions that the pattern is only used on lower-cased ASCII values:

# this pattern matches anything that is *not* in those classes. and is only used
# on lower-cased ASCII values.

When using a non-ASCII string with host, it gets encoded, so it seems weird that ASCII uppercase strings aren't "encoded" into lowercase. There is #386, but that issue uses __init__(), which works correctly.

To Reproduce

  1. Install yarl
  2. Call yarl.URL.build(scheme="http", host="A", port=port, path="/")

Expected behavior

A valid URL that is identical to the URL built using the lowercase host.

Logs/tracebacks

>>> URL.build(scheme="http", host="A", port=port, path="/")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "...\yarl\_url.py", line 386, in build
    _host = _encode_host(host, validate_host=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...\yarl\_url.py", line 1496, in _encode_host
    raise ValueError(
ValueError: Host 'A' cannot contain 'A' (at position 0)
>>> URL.build(scheme="http", host="a", port=port, path="/")
URL('http://a/')

Python Version

$ python --version
Python 3.12.2

multidict Version

$ python -m pip show multidict
Name: multidict
Version: 6.1.0
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache 2
Location: ..\Lib\site-packages
Requires:
Required-by: yarl

propcache Version

$ python -m pip show propcache
Name: propcache
Version: 0.2.0
Summary: Accelerated property cache
Home-page: https://github.com/aio-libs/propcache
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache-2.0
Location: ..\Lib\site-packages
Requires:
Required-by: yarl

yarl Version

$ python -m pip show yarl
Name: yarl
Version: 1.18.0
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache-2.0
Location: ..\Lib\site-packages
Requires: idna, multidict, propcache
Required-by:

OS

Windows 10

Additional context

No response

xZise added the bug label Nov 30, 2024
Member

bdraco commented Nov 30, 2024

We do lowercase the host, but only after the validation, so it's an ordering problem.
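The ordering problem can be illustrated with a minimal, standard-library-only sketch. This is not yarl's actual code: the pattern below is a simplified, hypothetical stand-in for NOT_REG_NAME, and the helper names `_validate`, `encode_host_validate_first` and `encode_host_lower_first` are invented for illustration. It only shows why a pattern written for lower-cased input rejects "A" when validation runs before lowercasing.

```python
import re

# Hypothetical stand-in for a "not a valid reg-name character" pattern.
# It lists only lower-case ASCII letters, so it is only correct when
# applied to input that has already been lower-cased.
NOT_REG_NAME = re.compile(r"[^a-z0-9\-._~!$&'()*+,;=%]")


def _validate(host: str) -> None:
    """Raise ValueError on the first character the pattern rejects."""
    invalid = NOT_REG_NAME.search(host)
    if invalid is not None:
        raise ValueError(
            f"Host {host!r} cannot contain {invalid.group()!r} "
            f"(at position {invalid.start()})"
        )


def encode_host_validate_first(host: str) -> str:
    """Problematic order: validate the raw host, then lower-case it."""
    _validate(host)
    return host.lower()


def encode_host_lower_first(host: str) -> str:
    """Corrected order: lower-case the host, then validate the result."""
    lowered = host.lower()
    _validate(lowered)
    return lowered


if __name__ == "__main__":
    print(encode_host_lower_first("A"))
    try:
        encode_host_validate_first("A")
    except ValueError as exc:
        print(exc)
```

With validation first, an uppercase letter never reaches the lowercasing step, which matches the error shown in the report. Swapping the two steps lets "A" and "a" produce the same host.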

Member

bdraco commented Nov 30, 2024

This is a regression introduced in #954.

Member

bdraco commented Nov 30, 2024

#1442 will fix this.

Unfortunately, we cannot do a release right now due to pypa/gh-action-pypi-publish#307.

Member

bdraco commented Dec 1, 2024

It looks like whatever problem was slowing down the uploads enough to make them fail has been resolved, as propcache just released successfully. I'll try to get a .3 published with this fix today.

Member

bdraco commented Dec 1, 2024

1.18.3 publishing now. 🤞 that the upload problem doesn't bite us

https://github.com/aio-libs/yarl/actions/runs/12108501591

Member

bdraco commented Dec 1, 2024

1.18.3 now available with the fix
