Can't install torch due to 503 during download. #15799

Closed
MrJoy opened this issue Apr 16, 2024 · 4 comments
Labels
bug 🐛 requires triaging maintainers need to do initial inspection of issue

Comments

@MrJoy

MrJoy commented Apr 16, 2024

Describe the bug

My deploy process runs a pip install during image build time on AWS. This process has now failed a couple times in a row with the following error:

ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    torch==2.1.1 from https://files.pythonhosted.org/packages/72/d0/8e7157fe416f657e38736a42d9b0b82ef7f7af00398516200b59ebb5995e/torch-2.1.1-cp311-cp311-manylinux2014_aarch64.whl (from -r requirements.txt (line 84)):
        Expected sha256 61b51b33c61737c287058b0c3061e6a9d3c363863e4a094f804bc486888a188a
             Got        db7f567ef4ee64ffdb28fe1cc71206584bdddc70e1e4a92e26b3671de6f9e32b

I tried fetching the file locally and got the correct SHA. So, I spun up an instance in EC2, connected to it, and fetched the file. That fetch failed after precisely 39MiB (a bit under half the expected size) with a 503 error. The resulting file had the hash db7f567ef4ee64ffdb28fe1cc71206584bdddc70e1e4a92e26b3671de6f9e32b.
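The comparison described above (hash a fetched copy and check it against the value pip expected) can be sketched in a few lines of Python. This is only an illustration, not what pip does internally; the function names are made up for the example, and the expected digest is copied from the error message.

```python
import hashlib
from pathlib import Path

# Expected digest copied from the pip error message in this report.
EXPECTED_SHA256 = "61b51b33c61737c287058b0c3061e6a9d3c363863e4a094f804bc486888a188a"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large wheel never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_download(path, expected_sha256=EXPECTED_SHA256):
    """Report the file's size, its digest, and whether the digest matches."""
    actual = sha256_of_file(path)
    return {
        "size_bytes": Path(path).stat().st_size,
        "sha256": actual,
        "matches": actual == expected_sha256.lower(),
    }
```

Calling describe_download on a fetched copy of the wheel shows at a glance whether the file is truncated (size well below the expected size) as well as whether the hash is wrong.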

All subsequent attempts to fetch the file from that instance succeeded.

I don't have enough samples to be conclusive about it, but it appears to be an oddly deterministic failure in which the first attempt to fetch this file from an EC2 node results in the same incomplete file being produced consistently.

I did get a successful download on the third image build attempt, so it's not perfectly consistent (thankfully). Honestly, I wouldn't bother reporting it except that it happened 3 times and produced the same incorrect hash every time.

Expected behavior

I would expect the server to not be sending me just under half the expected file on the first attempt. Alternatively, I'd expect pip to properly retry (help screen says it should make 5 attempts by default, and I'm not overriding that).
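To make the second expectation concrete, here is a minimal sketch of "retry until the bytes match the expected hash". It is not pip's retry logic; `fetch` is a hypothetical zero-argument callable standing in for whatever performs the download, and the default of 5 attempts only mirrors the number mentioned above.

```python
import hashlib
import time


def fetch_verified(fetch, expected_sha256, attempts=5, delay_seconds=0.0):
    """Call `fetch()` until the returned bytes hash to `expected_sha256`.

    `fetch` is a hypothetical callable that returns the downloaded bytes,
    or raises OSError on a network failure.
    """
    last_problem = None
    for attempt in range(1, attempts + 1):
        try:
            payload = fetch()
        except OSError as exc:
            last_problem = f"attempt {attempt}: {exc}"
        else:
            actual = hashlib.sha256(payload).hexdigest()
            if actual == expected_sha256:
                return payload
            # A truncated body lands here: the transfer "worked" but the hash is wrong.
            last_problem = f"attempt {attempt}: got sha256 {actual}"
        if attempt < attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(
        f"no verified download after {attempts} attempts ({last_problem})"
    )
```

The point of the sketch is that a hash mismatch is treated like any other transient failure, so a single truncated response on the first attempt would not fail the whole build.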

To Reproduce

pip3.11 install --requirement requirements.txt

Where requirements.txt contains the line torch==2.1.1, and the command is run from a Linux instance on EC2 in Amazon's us-east-1 region.

My Platform

We're using Debian 11, with Python 3.11.4 (SHA 85c37a265e5c9dd9f75b35f954e31fbfc10383162417285e30ad25cc073a0d63) built from source.

Additional context

@MrJoy MrJoy added bug 🐛 requires triaging maintainers need to do initial inspection of issue labels Apr 16, 2024
@miketheman
Member

This appears to be highly similar to pypa/pip#11153, which discusses the same problem surfacing as an interrupted download. Your observation supports that: you saw about half of the expected download size before the error surfaced.

There's also some discussion there as to retry logic.

Without more details on the actual HTTP connections, such as the output of python -m pip install -vvv ..., it's hard to determine why this may have happened.

Do you have those kinds of logs or context?
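For a build script, one way to capture that output is to add pip's --log option next to -vvv so the verbose log is kept in a file even when the build fails. A rough sketch, assuming the build step can call Python; the function names and the default log path are placeholders, not anything from this thread.

```python
import subprocess
import sys


def verbose_pip_command(requirements_path, log_path):
    """Build a pip invocation with maximum verbosity and a log file."""
    return [
        sys.executable, "-m", "pip", "install",
        "-vvv",                # most verbose console output
        "--log", log_path,     # also append the verbose log to this file
        "--requirement", requirements_path,
    ]


def run_verbose_install(requirements_path="requirements.txt",
                        log_path="pip-install.log"):
    """Run the install and return pip's exit code; the log file is kept either way."""
    command = verbose_pip_command(requirements_path, log_path)
    completed = subprocess.run(command, check=False)
    return completed.returncode
```

Using check=False leaves it to the caller to decide what to do with a non-zero exit code, for example archiving the log before failing the image build.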

@MrJoy
Author

MrJoy commented Apr 18, 2024

I'm afraid I do not. I will add -vvv to our image build process, and if there's another failure will provide the relevant data.

@di
Member

di commented Apr 18, 2024

It looks like our object storage provider was having a small outage around this time, which might explain this, and is now resolved. If you're still able to reproduce this, let us know!

@MrJoy
Author

MrJoy commented Apr 19, 2024

I have not seen it come up again, but will keep an eye out! TY for the update!

@MrJoy MrJoy closed this as completed Apr 19, 2024