Add support for repodata.json.zst #675
base: main
With the tests added in #677, I could easily add a test in this MR.
Doing some tests locally, I noticed that this change makes downloading repodata.json from a proxy channel very slow when the file is big. The initial download from the remote repo is not actually the biggest issue, as I thought in #660. The problem is the compression, which is quite slow for big files. Compressing the conda-forge/linux-64/repodata.json file to gz, then bz2 and now zst takes several seconds. The download of the file is blocked during that time, which explains the time-out I saw on the client side. I will look into whether the compression can be done in the background, and maybe add options to disable that compression (when using quetz as an internal conda server, the network between clients and server is usually fast).
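One way the "compression in the background" idea could look is sketched below. This is only an illustration under stated assumptions, not what quetz does: `COMPRESSORS` and `compress_in_background` are hypothetical names, the payload is a small stand-in for a real repodata.json, and only the stdlib gz and bz2 codecs are shown (zst would need an extra package).

```python
import bz2
import gzip
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical mapping of file suffix to a stdlib compressor.
COMPRESSORS = {
    ".gz": gzip.compress,
    ".bz2": bz2.compress,
}


def compress_in_background(executor, raw):
    """Submit one compression job per format and return the futures."""
    return {
        suffix: executor.submit(compress, raw)
        for suffix, compress in COMPRESSORS.items()
    }


# Stand-in payload; a real repodata.json is far larger.
raw = json.dumps(
    {"packages": {f"pkg-{i}": {"build_number": i} for i in range(1000)}}
).encode()

with ThreadPoolExecutor(max_workers=2) as executor:
    futures = compress_in_background(executor, raw)
    # The uncompressed bytes could already be served here while the jobs run.
    compressed = {suffix: future.result() for suffix, future in futures.items()}
```

With this shape, the request for the uncompressed file no longer has to wait for every compressed variant to be written; whether that fits the server's request handling is a separate question.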
You might want to look at how to use the asynchronous capabilities of the package store. Some time ago I fixed (#626) an oversight in how packages were uploaded: it was using synchronous "filesystem" calls, which blocked for too long on big files.
I don't think async will help in this case. The compression is done in In the meantime, I added a new compression section in the config to enable or disable the bz2, gz and zst compressions. Note that for the tests, I reused what I implemented in #677, so I rebased this PR on that other branch. Hoping #677 can be merged soon.
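A minimal sketch of how per-format switches like these could gate which compressed files get produced. The option names (`gz_enabled`, `bz2_enabled`, `zst_enabled`), the `CompressionConfig` class and the `compressed_variants` helper are assumptions made for illustration; the actual config keys in this PR may differ.

```python
import bz2
import gzip
from dataclasses import dataclass


@dataclass
class CompressionConfig:
    # Hypothetical option names; the real config keys may differ.
    gz_enabled: bool = False
    bz2_enabled: bool = False
    zst_enabled: bool = False


def compressed_variants(raw, config):
    """Return {suffix: compressed bytes} for each enabled format."""
    variants = {}
    if config.gz_enabled:
        variants[".gz"] = gzip.compress(raw)
    if config.bz2_enabled:
        variants[".bz2"] = bz2.compress(raw)
    if config.zst_enabled:
        # Assumes the third-party `zstandard` package is installed.
        import zstandard

        variants[".zst"] = zstandard.ZstdCompressor().compress(raw)
    return variants


# Only gzip is enabled here, so only a ".gz" variant is produced.
config = CompressionConfig(gz_enabled=True)
variants = compressed_variants(b'{"packages": {}}', config)
```

Disabling every format skips compression entirely, which is the case described above for an internal server on a fast network.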
Codecov Report
@@            Coverage Diff             @@
##             main     #675      +/-   ##
==========================================
+ Coverage   83.61%   83.90%   +0.28%
==========================================
  Files          79       79
  Lines        6233     6324      +91
==========================================
+ Hits         5212     5306      +94
+ Misses       1021     1018       -3
rattler is very efficient at downloading repodata
serve_repo_data fixture copied from rattler
dummy_remote_session_object wasn't cleaning up after itself (using return instead of yield)
Test with migration failed with: Error: The action 'Testing server' has timed out after 5 minutes.
New versions of mamba request repodata.json.zst first. The compressed files are created locally when the non-compressed version is downloaded. Quetz should always check whether the repodata.json file needs to be re-downloaded so that all files stay consistent.
Add options to enable bzip2, gzip and zstandard compression
Adding support for repodata.json.zst (fix #573).
repodata.json.zst is now created as well as the .bz2 and .gz versions.
repodata.json
When repodata.json.zst was requested and existed locally, it was never updated. Even with the support added in the first commit, it would only be updated when someone requested the repodata.json file (which triggers the creation of all compressed files). Note that I didn't find an easy way to write a test for that, which would be nice.
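The consistency rule described here (check the base file first, whichever variant was requested, then regenerate every compressed file together) can be sketched with a toy in-memory cache. Everything below is hypothetical and only illustrates the idea: `RepodataCache`, `fetch_remote` and `is_stale` are not quetz names, and gz/bz2 stand in for zst so the sketch stays within the standard library.

```python
import bz2
import gzip

# Stdlib stand-ins for the compressed variants (zst omitted in this sketch).
SUFFIXES = {".gz": gzip.compress, ".bz2": bz2.compress}


class RepodataCache:
    """Toy in-memory cache; `fetch_remote` and `is_stale` are hypothetical hooks."""

    def __init__(self, fetch_remote, is_stale):
        self.fetch_remote = fetch_remote
        self.is_stale = is_stale
        self.files = {}

    def refresh_if_needed(self):
        # Re-download the base file and rebuild all variants in one step,
        # so the compressed files never lag behind repodata.json.
        if "repodata.json" not in self.files or self.is_stale():
            raw = self.fetch_remote()
            self.files["repodata.json"] = raw
            for suffix, compress in SUFFIXES.items():
                self.files["repodata.json" + suffix] = compress(raw)

    def get(self, name):
        # Always check the base file first, whichever variant was requested.
        self.refresh_if_needed()
        return self.files[name]


# Simulated remote whose content changes between two requests.
remote = {"data": b'{"v": 1}'}
cache = RepodataCache(lambda: remote["data"], lambda: True)
first = gzip.decompress(cache.get("repodata.json.gz"))
remote["data"] = b'{"v": 2}'
second = gzip.decompress(cache.get("repodata.json.gz"))
```

A cache shaped like this is also straightforward to exercise in a test, since the remote and the staleness check are plain callables that can be swapped for fakes.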