
HTTP/S buildfetch streams in chunks #3598

Merged: 1 commit merged into coreos:main from buildfetch-streamed-http on Sep 7, 2023

Conversation

mtalexan (Contributor) commented Sep 5, 2023

Keep RAM usage down when using buildfetch. The existing code downloaded the entire file into RAM before writing it to disk; now that ociarchive files, which are usually 4-8 GiB, are included, that's no longer a good idea. Lower-RAM systems choke when too little RAM is left to continue operating normally.

This instead sets a somewhat arbitrary 30 MiB chunk limit. If the server is already using chunked transfer encoding, whatever chunk sizes it declares are used instead, but in many (most?) cases the server probably isn't.
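
For illustration, the shape of the change is roughly the following (a simplified sketch, not the verbatim PR code; the URL and destination path are placeholders):

    import requests

    CHUNK_SIZE = 30 * 1024 * 1024  # cap on how much of the body is held in RAM at once

    url = "https://example.com/builds/build.ociarchive"  # placeholder
    dest = "build.ociarchive"  # placeholder

    # stream=True defers downloading the body; iter_content() then yields it
    # piecewise instead of buffering the entire file in memory first.
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(dest, "wb") as f:
            for chunk in r.iter_content(chunk_size=CHUNK_SIZE):
                f.write(chunk)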


It's confirmed to fix a problem downloading 4 GiB ociarchive files as part of a buildfetch on a desktop system with only 8 GB of RAM. The system was crashing every time during the download because the OOM killer terminated it for consuming too much of the available RAM. Ideally we want the system to be able to do builds with ociarchives of up to 8 GiB.


Effect on download performance for existing uses should be negligible. Download time might suffer slightly if you have a very large network pipe, lots of RAM, slow disk I/O, and a small disk I/O cache, but otherwise it won't change much.

In one timing test on a system with a 1 Gbps network pipe, an NVMe disk, and 64 GB of RAM, downloading a 4 GB ociarchive file, the timing results were:

chunked:
    0.58s user 0.47s system 0% cpu 4:05.30 total
unchunked/original:
    0.57s user 0.41s system 0% cpu 4:24.63 total

So the chunking pretty clearly has no meaningful impact: the unchunked (original) code actually took longer in the case where it should have been optimal, which indicates the difference is within the margin of testing error.

Closes #3597


openshift-ci bot commented Sep 5, 2023

Hi @mtalexan. Thanks for your PR.

I'm waiting for a coreos member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

dustymabe (Member) left a comment


Seems reasonable to me. @jlebon @ravanelli - WDYT?

In src/cmd-buildfetch:

# If the HTTP headers have encoded the file transfer as chunks already, respect those instead
# of our hardcoded max size.
if 'chunked' in r.headers.get('transfer-encoding', list()):
Member commented:

Is this something you copied from elsewhere? Is it best practice?

Does any of this really matter for us versus just calling iter_content(chunk_size=None)?

mtalexan (Contributor, Author) replied Sep 6, 2023

The code was inspired by this StackOverflow answer and its comments: https://stackoverflow.com/a/16696317/388082. Reading some of the requests documentation, though, the code block on StackOverflow is all but an example straight from the documentation.


According to the documentation, setting stream=True in the get request allows you to use iter_content(). The chunk_size parameter is technically optional, but setting it to None means using whatever the server said for the chunk sizes. If the server didn't do chunked encoding, that gives the same result as not using iter_content() at all (the very problem I'm trying to solve).

So the logic here sets a variable to a sane default chunk size, and then, if the reply from the server indicates chunked encoding, resets it to None so the server-side chunk sizes are used instead.
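
Concretely, that amounts to something like this (a sketch, not the verbatim PR code; it continues from a requests.get(..., stream=True) response r and an open output file f):

    CHUNK_SIZE = 30 * 1024 * 1024  # sane default: 30 MiB

    chunk_size = CHUNK_SIZE
    # If the server already declared chunked transfer encoding, pass None so
    # iter_content() yields the server-declared chunks instead of our default.
    if 'chunked' in r.headers.get('transfer-encoding', list()):
        chunk_size = None
    for chunk in r.iter_content(chunk_size=chunk_size):
        f.write(chunk)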


"Best practice" would seem to be to use urllib instead of the lower level requests module. While I can't personally confirm one way or another, the urllib.urlretrieve() function is designed to do exactly what is being done here in an optimized way. I don't know if it has an equivalent of the exists_impl functionality though, which might be why requests was used here instead.

jlebon (Member) left a comment

This makes a lot of sense to me. We've actually had OOM crashes from cosa buildfetch in the pipeline too in the past (though there we use the S3 APIs, which I think did do chunking when I last dug into it, but I'll have to recheck). We also use the HTTPFetcher in some places, which might be helped by this. Thanks for the contribution!

Minor nit: can you squash the flake8 fixup into the main commit?

mtalexan added a commit: Stream GET request for HttpFetcher download, and write in 30 MiB chunks, or declared chunk encoding to reduce RAM usage
mtalexan (Contributor, Author) commented Sep 6, 2023

Minor nit: can you squash the flake8 fixup into the main commit?

Done.

jlebon enabled auto-merge (rebase) September 6, 2023 20:26

cgwalters (Member) commented:
/ok-to-test

dustymabe (Member) left a comment

LGTM


openshift-ci bot commented Sep 7, 2023

@mtalexan: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/rhcos
Commit: 2d7e8c5
Required: true
Rerun command: /test rhcos


dustymabe (Member) commented:

I'm pretty sure rhcos CI doesn't cover this case. Merging...

dustymabe merged commit 52774c1 into coreos:main on Sep 7, 2023 (2 checks passed)
cgwalters (Member) commented:

A minor procedural note that I didn't notice before the merge: you have text content in the PR description, but not in the commit message. In most of our repositories we try hard to follow modern Git conventions.

Instead of:

Stream GET request for HttpFetcher download, and write in 30 MiB chunks, or declared chunk encoding to reduce RAM usage

I would have written e.g.:

buildfetch: Stream downloads instead of buffering entirely in memory

This means that cosa buildfetch which can download a large .ociarchive
can be more reliable on low memory systems.

Notice the "topic prefix". I also dropped "in 30 MiB chunks" from the title, as it's an unimportant implementation detail.

And I repeat the term buildfetch in the body, too, because it's really the focus.

mtalexan deleted the buildfetch-streamed-http branch September 14, 2023 18:42

Successfully merging this pull request may close these issues:

Support lower-memory systems with buildfetch (#3597)