Use pooch for downloading the model: #36

Closed
GenevieveBuckley opened this issue Jun 3, 2023 · 4 comments

@GenevieveBuckley
Collaborator

We could also think about using pooch for downloading the model: https://github.com/computational-cell-analytics/micro-sam/blob/master/micro_sam/util.py#L39-L63

Originally posted by @constantinpape in #35 (comment)

@hmaarrfk
Contributor

FYI, I've found that SHA256 is quite slow at computing hashes for large files (e.g. 2 GB models).

You might want to experiment with non-cryptographic hashes. I've used XXH128 with great success.
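To check the speed difference on your own files, a minimal sketch like the one below may help. It hashes a file in chunks so that a multi-gigabyte model is never loaded into memory at once. The helper names `file_digest` and `time_digest` are hypothetical, and the `xxh128` branch assumes the third-party `xxhash` package is installed.

```python
import hashlib
import time


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in 1 MiB chunks."""
    if algorithm == "xxh128":
        # Assumption: python-xxhash is installed (pip install xxhash).
        import xxhash
        hasher = xxhash.xxh128()
    else:
        hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def time_digest(path, algorithm):
    """Return (digest, elapsed_seconds) for hashing one file."""
    start = time.perf_counter()
    digest = file_digest(path, algorithm)
    return digest, time.perf_counter() - start
```

Calling `time_digest(path, "sha256")` and `time_digest(path, "xxh128")` on the same model file gives a rough comparison; actual timings depend on the disk and CPU.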

@constantinpape
Contributor

Hi @hmaarrfk,

I think that would make sense. I don't think we require the hashing to be secure since the goal here is not to avoid malicious attacks, but just to make sure that files do not change.

In case you want to look into this: contributions are very welcome. (If not: no worries, but this would be on my list of lower priorities, so I can't really promise when we'd have time to look into this).
For details on how to contribute see https://github.com/computational-cell-analytics/micro-sam/blob/master/doc/contributing.md.

@GenevieveBuckley
Collaborator Author

That should be straightforward to do, since pooch supports the python-xxhash bindings.

The docs even have a nice example of how to use it: you just prepend xxh128: to the hash.
https://www.fatiando.org/pooch/latest/hashes.html#other-supported-hashes

import datetime
import pooch

# Get the current date to store the files in separate folders
CURRENT_DATE = datetime.datetime.now().date()

GOODBOY = pooch.create(
    [...],
    registry={
        "store.zip": "xxh128:6a71973c93eac6c8839ce751ce10ae48",
    },
)
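For a single model file, the same prefixed-hash convention can also be passed to `pooch.retrieve`. The sketch below is only an illustration: `split_known_hash` and `fetch_model` are hypothetical helper names, the URL and cache directory are supplied by the caller, and it assumes `pooch` (plus `xxhash` for xxh128 hashes) is installed.

```python
def split_known_hash(known_hash):
    """Split 'alg:digest' into (alg, digest).

    A bare digest without a prefix is treated as SHA256,
    which matches pooch's documented default.
    """
    if ":" in known_hash:
        algorithm, digest = known_hash.split(":", 1)
    else:
        algorithm, digest = "sha256", known_hash
    return algorithm.lower(), digest.lower()


def fetch_model(url, known_hash, cache_dir):
    """Download a file once into cache_dir and verify it against known_hash."""
    # Assumption: pooch is installed; imported lazily so the helper above
    # stays usable without it.
    import pooch

    return pooch.retrieve(url=url, known_hash=known_hash, path=cache_dir)
```

`fetch_model` returns the local path of the cached file; pooch raises an error if the downloaded file does not match the given hash.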

@constantinpape
Contributor

Implemented in #276
