Support xxHash algorithm #409
Comments
Still a very much needed feature!
We have someone working on it now. So far the performance gain is smaller than we expected. Please stay tuned for updates.
That's weird... Hopefully it will be optimized! 👍
How is your project going? I am CPU-bottlenecked using hashdeep and would greatly love an "xxhashdeep" or similar. Even small improvements would be helpful.
Hi there,
at Juelich Supercomputing Centre, we've recently been researching convenient tools to generate and verify hash sums of large collections of data. The amounts we're talking about typically range from several TB to PB. We've found hashdeep to be convenient and to provide a good interface, including parallelisation options that may be important for checksumming and verifying many small files.
We've also come across the xxHash algorithm, which has been specifically designed to create checksums over extremely large amounts of data.
We have found that the command-line tools provided for xxHash lack some functionality offered by hashdeep. Therefore, we propose integrating xxHash into hashdeep to improve support for use cases dealing with extremely large volumes of data. We also support the idea of integrating Blake3, as mentioned in #397.
In the spirit of Open Source, we offer to do the integration ourselves, but would like to know whether you are willing to include the code in the main branch afterwards. Additionally, if there are good reasons to omit algorithms such as xxHash or Blake3, please let us know about them.
To back our request with numbers, here's a comparison of various algorithms supported in hashdeep and xxHash on a 155 GB data set consisting of two files.
As you can see, xxHash is at least 5 times faster than the fastest algorithm supported by hashdeep.
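For anyone who wants to run a similar comparison on their own data, here is a minimal sketch in Python. It is not the setup used for the numbers above: it assumes Python's `hashlib` as a stand-in for hashdeep's algorithms and uses the third-party `xxhash` package only if it happens to be installed. The helper names `hash_file` and `compare` are made up for illustration.

```python
import hashlib
import time

try:
    import xxhash  # optional third-party package (assumption: installed via pip)
except ImportError:
    xxhash = None

CHUNK_SIZE = 1 << 20  # read files in 1 MiB chunks


def hash_file(path, new_hasher, chunk_size=CHUNK_SIZE):
    """Stream a file through a hasher and return (hex digest, elapsed seconds)."""
    hasher = new_hasher()
    start = time.perf_counter()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest(), time.perf_counter() - start


def compare(path):
    """Hash the same file with each available algorithm (hypothetical helper)."""
    algorithms = {
        "md5": hashlib.md5,
        "sha1": hashlib.sha1,
        "sha256": hashlib.sha256,
    }
    if xxhash is not None:
        algorithms["xxh64"] = xxhash.xxh64
    return {name: hash_file(path, factory) for name, factory in algorithms.items()}
```

Calling `compare("/path/to/large/file")` returns a dict mapping each algorithm name to its digest and wall-clock time. Timings include disk reads, so repeated runs on a file that fits in the page cache will mostly reflect hashing cost, while first runs on large files may be I/O-bound.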