Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: switch from num-bigint-dig to crypto-bigint #394

Open
wants to merge 71 commits into
base: master
Choose a base branch
from

Conversation

dignifiedquire
Copy link
Member

@dignifiedquire dignifiedquire commented Nov 29, 2023

Very, very WIP

Not anymore, this is ready for review.
Replaces all usage of num-bigint-dig based BigInt usage with the new crypto-bigint crate, using BoxedUint

Current known issue is that we do have a performance regression, which will be able to get rid of over time:

# crypto-bigint

# macbook m1
test bench_rsa_2048_pkcsv1_decrypt      ... bench:   7,184,387.50 ns/iter (+/- 425,598.69)
test bench_rsa_2048_pkcsv1_sign_blinded ... bench:  13,453,579.10 ns/iter (+/- 686,276.31)

# AMD
test bench_rsa_2048_pkcsv1_decrypt      ... bench:   9,260,832.80 ns/iter (+/- 30,013.38)
test bench_rsa_2048_pkcsv1_sign_blinded ... bench:  16,610,079.40 ns/iter (+/- 251,292.53)

# master

# macbook m1
test bench_rsa_2048_pkcsv1_decrypt      ... bench:   1,117,479.15 ns/iter (+/- 31,334.30)
test bench_rsa_2048_pkcsv1_sign_blinded ... bench:   1,337,437.55 ns/iter (+/- 88,624.39)

# AMD
test bench_rsa_2048_pkcsv1_decrypt      ... bench:   1,414,348.80 ns/iter (+/- 12,585.71)
test bench_rsa_2048_pkcsv1_sign_blinded ... bench:   1,685,650.00 ns/iter (+/- 11,105.71)

TODOs

  • switch internal storage for RsaPrivateKey
  • switch internal storage for RsaPublicKey
  • switch all code to use the new decrypt implementation
  • update public traits using BigUint to return owned versions
  • fix blinding implementation
  • switch decryption algorithm with precompute to use crypto-bigint ops
  • go through other algorithms and update what can be done without having primality checks implemented
  • review & update code for constant time operation
  • review & update code for performance
  • benchmarks

src/algorithms/rsa.rs Outdated Show resolved Hide resolved
src/algorithms/rsa.rs Outdated Show resolved Hide resolved
@dignifiedquire
Copy link
Member Author

CI is finally green again, time for lots of review and cleanup

@dignifiedquire
Copy link
Member Author

@zeerooth nostd builds should be working again

@dignifiedquire dignifiedquire marked this pull request as ready for review December 2, 2024 09:16
@dignifiedquire dignifiedquire changed the title [WIP]: switch to crypto-bigint for decryption [WIP]: feat: switch from num-bigint-dig to crypto-bigint Dec 2, 2024
@dignifiedquire dignifiedquire changed the title [WIP]: feat: switch from num-bigint-dig to crypto-bigint feat: switch from num-bigint-dig to crypto-bigint Dec 2, 2024
let mut pi = prime_limit / (prime_limit.ln() - 1f64);
#[cfg(not(feature = "std"))]
let mut pi = prime_limit / (libm::logf(prime_limit as f32) as f64 - 1f64);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not entirely correct or optimal, but I am not sure what other way there is

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fjarri is there anything in crypto-primes for this?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, nothing like that currently

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dignifiedquire so if I understand correctly, the purpose of this calculation is ultimately for a check that keygen terminates in a reasonable amount of time?

@tarcieri
Copy link
Member

@dignifiedquire seems like you can remove "Very, very WIP" now

@dignifiedquire
Copy link
Member Author

@dignifiedquire seems like you can remove "Very, very WIP" now

🤣 yeah, finally

@Snowiiii
Copy link

Snowiiii commented Jan 2, 2025

@dignifiedquire how it's going ?

@tarcieri
Copy link
Member

tarcieri commented Jan 2, 2025

He's waiting on review, which I'll try to do soon

@tarcieri tarcieri self-requested a review January 2, 2025 14:39
digest = { version = "=0.11.0-pre.9", default-features = false, features = ["alloc", "oid"] }
pkcs1 = { version = "0.8.0-rc.0", default-features = false, features = ["alloc", "pkcs8"] }
pkcs8 = { version = "0.11.0-rc.0", default-features = false, features = ["alloc"] }
signature = { version = "=2.3.0-pre.4", default-features = false, features = ["alloc", "digest", "rand_core"] }
spki = { version = "0.8.0-rc.1", default-features = false, features = ["alloc"] }
zeroize = { version = "1.5", features = ["alloc"] }
crypto-bigint = { version = "0.6.0-rc.6", default-features = false, features = ["zeroize", "alloc"] }
crypto-primes = { version = "0.6.0-pre.2", default-features = false }
libm = "0.2"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it would be nice to avoid this, especially if its only used for no_std prime counting

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If all you need from there is logf(x), you can probably just approximate it with x.bits() / log2(e), pre-calculating log2(e).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even take log2(e) == 1 if the approximation allows it :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bits() seems to not be around either..

README.md Outdated
@@ -81,7 +81,7 @@ You can follow our work on mitigating this issue in [#390].

## Minimum Supported Rust Version (MSRV)

This crate supports Rust 1.72 or higher.
This crate supports Rust 1.73 or higher.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is crypto-bigint's latest MSRV:

Suggested change
This crate supports Rust 1.73 or higher.
This crate supports Rust 1.83 or higher.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed all around

@@ -222,7 +223,6 @@ extern crate alloc;
#[cfg(feature = "std")]
extern crate std;

pub use num_bigint::BigUint;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps consider a BoxedUint re-export here, possibly using a shorter name:

pub use crypto_bigint::BoxedUint as Uint;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Comment on lines +166 to +167
// TODO: reenable, currently slow
// key_generation!(key_generation_multi_16_1024, 16, 1024);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😢

Comment on lines 67 to 71
let dp = priv_key.dp().expect("precomputed");
let dq = priv_key.dq().expect("precomputed");
let qinv = priv_key.qinv().expect("precomputed");
let p_params = priv_key.p_params().expect("precomputed");
let q_params = priv_key.q_params().expect("precomputed");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous pattern-based approach seemed a little more elegant than this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines +276 to +278
let two = NonZero::new(BoxedUint::from(2u64))
.expect("2 is non zero")
.widen(bits);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something which makes this pattern better would be nice. We have various NonZero::new_unwrap constructors which allow you to infallibly construct a NonZero.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

278 |     let two = NonZero::new_unwrap(2).widen(bits);
    |                        ^^^^^^^^^^ multiple `new_unwrap` found
    |
    = note: candidate #1 is defined in an impl for the type `crypto_bigint::NonZero<Limb>`
    = note: candidate #2 is defined in an impl for the type `crypto_bigint::NonZero<crypto_bigint::Uint<LIMBS>>`

seems to not exist for theBoxedUint version

@@ -93,14 +95,14 @@ pub(crate) fn generate_multi_prime_key_with_exp<R: CryptoRngCore + ?Sized>(

let n = compute_modulus(&primes);

if n.bits() != bit_size {
if n.bits() as usize != bit_size {
// This should never happen for nprimes == 2 because
// gen_prime should set the top two bits in each prime.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment should be updated for the new name of the function.

Also I don't quite understand "gen_prime should set the top two bits in each prime" - what was that behavior for? generate_prime_with_rng() only sets one top bit, to ensure that the returned primes are of the requested bit size.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit curious about that myself. I saw @FiloSottile mention the same thing in this post:

https://words.filippo.io/dispatches/rsa-keygen-bench/

There is almost nothing special to selecting prime candidates. You draw an appropriately sized random number from a CSPRNG, and to avoid wasting time, you set the least significant bit and the two most significant bits: large even numbers are not prime, and setting the top two guarantees N won’t come out too small.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you set only the top bit of a n/2 bits number, you get a value >= 2^(n/2-1). If you multiply two of them, you get a value >= 2^(n-2), which is at a minimum n-1 bits long, one too short for a n-bit modulus.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, that makes sense. Do you think it should be the default behavior in crypto-primes?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mainly need the primes for RSA, yes, you are only wasting CPU rejecting small moduli after you found the primes. If you're using the primes for something else, it might depend on what they are for.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fjarri are you planning on changing the behaviour?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it, and I don't think it should be the default - too specialized. It can be achieved with custom sieves, but I need to make a release first

src/encoding.rs Outdated
Comment on lines 54 to 77
let n = BigUint::from_bytes_be(pkcs1_key.modulus.as_bytes());
let e = BigUint::from_bytes_be(pkcs1_key.public_exponent.as_bytes());
let d = BigUint::from_bytes_be(pkcs1_key.private_exponent.as_bytes());
let prime1 = BigUint::from_bytes_be(pkcs1_key.prime1.as_bytes());
let prime2 = BigUint::from_bytes_be(pkcs1_key.prime2.as_bytes());
let key_malformed = pkcs8::Error::KeyMalformed;

let bits =
u32::try_from(pkcs1_key.modulus.as_bytes().len()).map_err(|_| key_malformed)? * 8;

let n = BoxedUint::from_be_slice(pkcs1_key.modulus.as_bytes(), bits)
.map_err(|_| key_malformed)?;
let n = Option::from(Odd::new(n)).ok_or(key_malformed)?;

let bits_e = u32::try_from(pkcs1_key.public_exponent.as_bytes().len())
.map_err(|_| key_malformed)?
* 8;
let e = BoxedUint::from_be_slice(pkcs1_key.public_exponent.as_bytes(), bits_e)
.map_err(|_| key_malformed)?;
let e = Option::from(e).ok_or(key_malformed)?;

let d = BoxedUint::from_be_slice(pkcs1_key.private_exponent.as_bytes(), bits)
.map_err(|_| key_malformed)?;

let prime1 = BoxedUint::from_be_slice(pkcs1_key.prime1.as_bytes(), bits)
.map_err(|_| key_malformed)?;
let prime2 = BoxedUint::from_be_slice(pkcs1_key.prime2.as_bytes(), bits)
.map_err(|_| key_malformed)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I quote understand why these conversions got more convoluted but perhaps there are upstream APIs which could be added to assist?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this to a method would clean it up a bunch. Call out to it and map_err from the call

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleaned up a bit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.