AGASC supplement HDF5 is bloated #127
Comments
Question: How does the "time machine" idea relate to the bloating?
I had a thought that we might mitigate problems with reproducibility by storing the actual AGASC supplement used in load generation (by the FOT) in the backstop load products. But the current file is far too large for that, so I started thinking about another solution to the reproducibility problem => time machine / snapshots. Anyway, the time machine idea got moved to a new issue.
Getting back to this again in connection with #128: I read in each of the tables using astropy and wrote them back out with no compression.
This gives a file that is 1.8 MB instead of the current 19 MB. I'd suggest that the code that finally writes out the new supplement h5 file be modified to write the tables the same way.
For some reason, maybe related to the PyTables update process, the `agasc_supplement.h5` file size is a factor of 10 larger than I would expect. Most of the size is from the `mags` table, and that should be about 1.7 MB uncompressed. The other two tables are less than 100 kB. I confirmed this size estimate by writing an H5 file of `mags` using astropy. With compression on it is close to 1.0 MB.
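One way to check where the space goes is to compare each table's logical size (rows times bytes per row) with the bytes HDF5 has allocated for it and with the total file size. The sketch below assumes `h5py` is available; `storage_report` is a hypothetical helper, not part of the agasc package. If the file turns out to be much larger than the sum of the stored sizes, unreclaimed free space left behind by in-place updates would be one possible explanation, though that is a guess and is not confirmed here.

```python
import os

import h5py


def storage_report(path):
    """Return (file_bytes, {dataset_name: (logical_bytes, stored_bytes)})."""
    report = {}

    def visit(name, obj):
        if isinstance(obj, h5py.Dataset):
            logical = obj.size * obj.dtype.itemsize  # rows x bytes per row
            stored = obj.id.get_storage_size()       # bytes allocated on disk
            report[name] = (logical, stored)

    with h5py.File(path, "r") as h5:
        h5.visititems(visit)
    return os.path.getsize(path), report
```

For example, `storage_report("agasc_supplement.h5")` would give the total file size alongside the per-table numbers, which can then be compared against the estimate above.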
I was just wondering about tossing a version of the supplement in the backstop tarball, but 2 MB might be a bit much. We should get back to the idea of maintaining a git-based time machine of previous versions (#128).