Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sha2): implement intrinsic based version for sha256 #90

Closed

Conversation

dignifiedquire
Copy link
Member

This is not optimal yet when processing multiple blocks, but it does give a considerable speedup over the non intrinsic version already

Measurements were taken on an AMD Ryzen 7 3700X 8-Core.

Baseline

test bench1_10    ... bench:          36 ns/iter (+/- 0) = 277 MB/s
test bench2_100   ... bench:         314 ns/iter (+/- 4) = 318 MB/s
test bench3_1000  ... bench:       3,096 ns/iter (+/- 16) = 322 MB/s
test bench4_10000 ... bench:      30,935 ns/iter (+/- 297) = 323 MB/s

With asm-hashes

test bench1_10    ... bench:          33 ns/iter (+/- 0) = 303 MB/s
test bench2_100   ... bench:         291 ns/iter (+/- 1) = 343 MB/s
test bench3_1000  ... bench:       2,863 ns/iter (+/- 14) = 349 MB/s
test bench4_10000 ... bench:      28,597 ns/iter (+/- 191) = 349 MB/s

With Intrinsics

test bench1_10    ... bench:           8 ns/iter (+/- 0) = 1250 MB/s
test bench2_100   ... bench:          54 ns/iter (+/- 0) = 1851 MB/s
test bench3_1000  ... bench:         511 ns/iter (+/- 8) = 1956 MB/s
test bench4_10000 ... bench:       5,079 ns/iter (+/- 29) = 1968 MB/s

@dignifiedquire
Copy link
Member Author

I have also added feature detection at runtime, based on how blake2_simd implements it (https://github.com/oconnor663/blake2_simd/blob/master/blake2s/src/guts.rs)

As otherwise it gets tricky to deploy this without creating many different builds

@tarcieri
Copy link
Member

tarcieri commented Nov 5, 2019

Impressive numbers

@tarcieri
Copy link
Member

tarcieri commented Nov 5, 2019

re: build failure, looks like target_feature is unsupported on 1.21. Personally I think it makes sense to increase the MSRV to be able to take advantage of it /cc @newpavlov

@newpavlov
Copy link
Member

This is why I wanted to add those intrinsics only in v0.9. For now I think we can use a disabled by default feature or wait until I will get time to work on RustCrypto/traits#43 (probably only in January).

@tarcieri
Copy link
Member

@dignifiedquire can you rebase? we've bumped MSRV to 1.41+

@Fleshgrinder
Copy link

@dignifiedquire are you planning to implement the same for the other SHA algos?

@newpavlov
Copy link
Member

@Fleshgrinder
See #166. x86 SHA extension only covers SHA-1 and SHA-256.

@dignifiedquire
Copy link
Member Author

I have rebased the code and pulled in the changes I had in https://github.com/filecoin-project/rust-sha2ni

Fyi rust-sha2ni just got a security audit through NCC (the sha2-256 part), will publishe the report soon.

@newpavlov
Copy link
Member

@dignifiedquire
I've drafted the alternative PR partially using yours as a reference. I think it has turned out quite nice and compact. Also it should have the same performance as code in this PR. Can you please check it out?

@newpavlov
Copy link
Member

#167 is merged, so I will close this PR.

@newpavlov newpavlov closed this Jun 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants