Introduce resumable downloads with --resume-retries #12991

gmargaritis · 2024-10-04T20:26:53Z

Resolves #4796

Introduced the --resume-retries option to allow resuming incomplete downloads in case of dropped or timed-out connections.

This option additionally uses the values specified for --retries and --timeout for each resume attempt, since they are passed in the session.

Used 0 as the default in order to keep backwards compatibility.

This PR is based on #11180

The downloader will make new requests and attempt to resume downloading using a Range header. If the initial response includes an ETag (preferred) or Date header, the downloader will ask the server to resume downloading only when it is safe (i.e., the file hasn't changed since the initial request) using an If-Range header.

If the server responds with a 200 (e.g. if the server doesn't support partial content or can't check if the file has changed), the downloader will restart the download (i.e. start from the very first byte); if the server responds with a 206 Partial Content, the downloader will resume the download from the partially downloaded file.
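The flow described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `build_resume_headers` and `next_action` are hypothetical names, and the initial response headers are assumed to be available as a plain dict with exactly the key casing shown.

```python
from typing import Dict, Mapping


def build_resume_headers(
    bytes_received: int, initial_headers: Mapping[str, str]
) -> Dict[str, str]:
    """Build request headers for a resume attempt (hypothetical helper).

    `initial_headers` is assumed to be a plain mapping holding the headers
    of the initial response; real HTTP header names are case-insensitive,
    which this sketch does not handle.
    """
    # Ask only for the bytes that have not been downloaded yet.
    headers = {"Range": f"bytes={bytes_received}-"}
    # Prefer the ETag as validator and fall back to the Date header.
    validator = initial_headers.get("ETag") or initial_headers.get("Date")
    if validator:
        # With If-Range, the server should send the requested range only
        # if the file is unchanged; otherwise it sends the full file.
        headers["If-Range"] = validator
    return headers


def next_action(status_code: int) -> str:
    """Map the status of a resume response to what the downloader does next."""
    if status_code == 206:
        return "resume"  # Partial Content: append to the partial file
    if status_code == 200:
        return "restart"  # full body: start again from the first byte
    return "error"  # anything else is outside the scope of this sketch


headers = build_resume_headers(1024, {"ETag": '"abc123"'})
print(headers)
print(next_action(206), next_action(200))
```

If the initial response carries neither validator, the sketch sends a bare Range request; whether that is the right trade-off for a given server is left open here.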

- Added --resume-retries option to allow resuming incomplete downloads
- Setting --resume-retries=N allows pip to make N attempts to resume downloading, in case of dropped or timed out connections
- Each resume attempt uses the values specified for --retries and --timeout internally

Signed-off-by: gmargaritis <[email protected]>

gmargaritis · 2024-10-04T20:49:02Z

I'm guessing the CI fails because of the new linter rules introduced in 102d818

thk686 · 2024-10-04T21:01:04Z

Does this do rsync-style checksums? That would increase reliability.

notatallshaw · 2024-10-04T23:42:21Z

> I'm guessing the CI fails because of the new linter rules introduced in 102d818

This is the CI fix; CI will keep failing until it's merged: #12964


gmargaritis · 2024-11-18T20:04:56Z

Hey @notatallshaw 👋

Is there anything that I can do to move this one forward?

notatallshaw · 2024-12-11T18:49:49Z

> Is there anything that I can do to move this one forward?

A pip maintainer needs to take up the task of reviewing it; as we're all volunteers, it's a matter of finding time.

I think my main concern would be the behavior when interacting with index servers that behave badly, e.g. give the wrong content length (usually 0). Your description looks good to me, but I haven't had time to look over the code yet.

gmargaritis · 2024-12-11T21:45:47Z

> A pip maintainer needs to take up the task of reviewing it, as we're all volunteers it's a matter of finding time.

Yeah, I know how it goes, so no worries!

If you need any clarifications or would like me to make changes, I'd be happy to help!

art-ignatev · 2025-01-13T08:42:30Z

Any chance that it'll be merged soon?

notatallshaw · 2025-02-01T05:59:02Z

I've had an initial cursory glance at this PR and it appears to be of sufficiently high quality.

I've also run the functionality locally (selected a large wheel to download and then disconnected my WiFi midway through the download) and it has a good UX.

My main concern, although this is a ship that has probably sailed, is it would be nice for pip not to have to directly handle HTTP intricacies and leave that to a separate library.

I can’t promise a full review or other maintainers will agree, but I am adding it to the 25.1 milestone for it to be tracked.

pfmoore · 2025-02-01T10:39:07Z

The PR looks good, although I’m not a http expert so I can’t comment on details like status and header handling. Like @notatallshaw I wish we could leave this sort of detail to a 3rd party library, but that would be a major refactoring. Add this PR (along with cert handling, parallel downloads, etc) to the list of reasons we should consider such a refactoring, but in the meantime I’m in favour of adding this.

pfmoore · 2025-02-01T10:41:31Z

There isn’t an “approve with conditions” button, but I approve this change on the basis that someone who understands http should check the header and status handling.

ichard26 · 2025-02-01T19:23:07Z

I'll tack this onto my to-do list. Not sure if I can call myself an HTTP expert, but I've done a fair bit of webdev as a hobby so I'm decently familiar with HTTP statuses and header handling.

Sorry for taking so long to review. Large PRs like these are appreciated since they do often implement major improvements, but they're also tedious to review and pretty daunting. Not really a good excuse, but that's how it feels. Thanks @notatallshaw for the initial pass and confirming this is worth the look.

gmargaritis · 2025-02-01T21:17:48Z

Awesome! Thank you for all your efforts!

Don’t worry about it, I know how it feels! Let me know if you need anything ✌️

yichi-yang and others added 3 commits September 26, 2024 21:26

Add support to resume incomplete download

0617d7c

Better incomplete download error message

a091ca1

gmargaritis force-pushed the introduce-resuming-downloads branch from 16fb735 to dbc6a64 Compare October 4, 2024 20:29

psf-chronographer bot added the bot:chronographer:provided label Oct 4, 2024

gmargaritis mentioned this pull request Oct 4, 2024

[Improvement] Pip could resume download package at halfway the connection is poor #4796

Open

gmargaritis added 5 commits October 9, 2024 15:51

Merge branch 'main' into introduce-resuming-downloads

7e9ea50

Merge branch 'main' into introduce-resuming-downloads

889ac6b

Merge branch 'main' into introduce-resuming-downloads

0b86d14

Add initial_progress to _raw_progress_bar

2cfd8fe

Signed-off-by: gmargaritis <[email protected]>

Merge branch 'main' into introduce-resuming-downloads

1a9c23b

Merge branch 'main' into introduce-resuming-downloads

64bd385

gmargaritis added 5 commits December 18, 2024 23:32

Merge branch 'main' into introduce-resuming-downloads

d265d53

Merge branch 'main' into introduce-resuming-downloads

d4e2da2

Merge branch 'main' into introduce-resuming-downloads

0dbb4bd

Merge branch 'main' into introduce-resuming-downloads

9b0bb5d

Merge branch 'main' into introduce-resuming-downloads

68a7b05

gmargaritis added 2 commits January 26, 2025 10:32

Merge branch 'main' into introduce-resuming-downloads

a6576b3

Merge branch 'main' into introduce-resuming-downloads

eb6a8db

notatallshaw added this to the 25.1 milestone Feb 1, 2025

ichard26 self-requested a review February 1, 2025 19:23
