Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use seed as private key format for ML-KEM #1994

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bifurcation
Copy link

This PR is a WIP demonstrating an approach to storing ML-KEM decapsulation keys in the form of a 64-byte seed, as opposed to the expanded form currently. Currently, the proposed scheme is implemented (a) for ML-KEM-512 only, and (b) in-place, as opposed to defining a new scheme. I am posting it in this state to get feedback on whether the structure of the change looks good, because the remainder of the PR will basically just be copy/pasting to cover the other parameter sets and maybe make a new KEM scheme.

If maintainers could answer the following questions, I can implement the rest of the PR:

  • Does the approach shown here look correct from an OQS structure / coding standards point of view?
  • Should this be implemented in the current ML-KEM provider, or as a new provider?
    • If the latter, it would be nice to key them both off of the same compile-time switches
    • For example, this could avoid having to build two copies of the PQCrystals library, which could cause symbol conflicts
    • ... since you could have one provider own the back-end library, and the other just link against it.

Fixes #1985

Tests are currently failing because they are providing / looking for the expanded private key. The ACVP test vectors do provide the seed values (d and z), so it should be possible to fix the tests simply by using those values instead of the expanded values (dk).

  • Does this PR change the input/output behaviour of a cryptographic algorithm (i.e., does it change known answer test values)? (If so, a version bump will be required from x.y.z to x.(y+1).0.)
  • Does this PR change the list of algorithms available -- either adding, removing, or renaming? Does this PR otherwise change an API? (If so, PRs in fully supported downstream projects dependent on these, i.e., oqs-provider will also need to be ready for review and merge by the time this is merged.)
    • Not currently, since the changes are made in-place.
    • If we go with the backward-compatible approach, it will define a new algorithm.

@kvanals
Copy link

kvanals commented Nov 15, 2024

I'm tracking this closely, as we are heavily invested in ML-KEM at SoftIron. I'm not really sure what we could do here that would be backwards compatible with existing expanded keys.

As a side note, I'm suspecting changes will be needed in oqs-provider as well?

# openssl genpkey -algorithm mlkem1024 -outform pem -out mlkem.key malloc(): unaligned tcache chunk detected Aborted (core dumped)

@bifurcation
Copy link
Author

@kvanals - The idea of doing it with backward compatibility would be to define a separate scheme using the same upstream library. And yes, there would probably be changes needed in oqs-provider as well, to enable callers to reference the additional scheme. You would have something like openssl genpkey -algorithm mlkem1024seed ....

@kvanals
Copy link

kvanals commented Nov 15, 2024

@kvanals - The idea of doing it with backward compatibility would be to define a separate scheme using the same upstream library. And yes, there would probably be changes needed in oqs-provider as well, to enable callers to reference the additional scheme. You would have something like openssl genpkey -algorithm mlkem1024seed ....

Completely sensible! Sorry I wasn't tracking that.

@baentsch
Copy link
Member

Does the approach shown here look correct from an OQS structure / coding standards point of view?

Yes. A full implementation of course would have to use the code generation concepts of "copy_from_upstream" and pass all NIST tests.

Should this be implemented in the current ML-KEM provider, or as a new provider?

The overarching goal should be to create as little new original code (as opposed to generated code) as possible / be orthogonal to the current logic.

I'm tracking this closely, as we are heavily invested in ML-KEM at SoftIron.

In that case, allow me to invite "SoftIron" to contribute to the work in this project. Right now this project's mission is to support research and prototyping (only) as we don't have the resources to be "everything for everyone", e.g., maintain backwards compatibility (or other "product level" qualities). The proposal @bifurcation kindly works on here is the result of the discussion at #1985 with the goal to work with these limitations. Either way, currently, there is absolutely no guarantee that OQS will keep supporting both types of keys: For example, if the upstream ceases to support both modes, so will OQS. If the general consensus is to only use the seed-based approach, support for expanded keys most likely will vanish.

As a side note, I'm suspecting changes will be needed in oqs-provider as well?

This seems to indicate you want to use this code via OpenSSL. If so, you may want to follow the discussion (or --again-- even better, contribute to the project) at openssl/openssl#25885 where we're using an entirely different code base and aim to only implement the relevant standards for exactly the same reason: Provide the best possible code with the smallest amount of development and maintenance resources possible.

@kvanals
Copy link

kvanals commented Nov 16, 2024

I'm tracking this closely, as we are heavily invested in ML-KEM at SoftIron.

In that case, allow me to invite "SoftIron" to contribute to the work in this project. Right now this project's mission is to support research and prototyping (only) as we don't have the resources to be "everything for everyone", e.g., maintain backwards compatibility (or other "product level" qualities). The proposal @bifurcation kindly works on here is the result of the discussion at #1985 with the goal to work with these limitations. Either way, currently, there is absolutely no guarantee that OQS will keep supporting both types of keys: For example, if the upstream ceases to support both modes, so will OQS. If the general consensus is to only use the seed-based approach, support for expanded keys most likely will vanish.

Cheers @baentsch, happy to upstream anything where applicable. Presently we're using oqs-provider as-is and only maintain one out-of-tree patch to liboqs which I don't believe would meet your copy_from_upstream requirements to build for some of our ARM build targets that are lacking some of the requisite ARM NEON intrinsics. Let me know if there is anything you'd like us to sign or review in this regard.

As a side note, I'm suspecting changes will be needed in oqs-provider as well?

This seems to indicate you want to use this code via OpenSSL. If so, you may want to follow the discussion (or --again-- even better, contribute to the project) at openssl/openssl#25885 where we're using an entirely different code base and aim to only implement the relevant standards for exactly the same reason: Provide the best possible code with the smallest amount of development and maintenance resources possible.

I absolutely am tracking that. And correct, we do very little work today with liboqs directly, but rather abstract its use behind high level OpenSSL EVP functions.

@baentsch
Copy link
Member

Let me know if there is anything you'd like us to sign or review in this regard.

Regarding OQS contributions, all there is is (acceptance of) the license (by way of the DCO "signoff"; #1760). If you'd like to see your code/contributions also become used in OpenSSL, see here.

rather abstract its use behind high level OpenSSL EVP functions.

Good approach. Will allow use of the most well-maintained PQC code base -- whichever that is. FWIW, I like the approach of supporting both key formats via OSSL_PARAM. Which one(s) gets an encoder/decoder will probably be a matter of which format ultimately gets an official/standardized OID assigned. If both do, the approach taken in this PR (creating a "twin" alg set that each could get its own OIDs) is probably the most simple to integrate&use.

@bifurcation
Copy link
Author

A full implementation of course would have to use the code generation concepts of "copy_from_upstream" and pass all NIST tests.
The overarching goal should be to create as little new original code (as opposed to generated code) as possible / be orthogonal to the current logic.

@baentsch - Happy to work along these lines, but I'm not seeing documentation of how to do this. I see the copy_from_upstream script, but not what the concepts are or how to use it. Likewise, unclear to me what in this code base is generated vs. original code. Any pointers you could provide would be helpful. I'm also in the libOQS Discord if that would be a more convenient place to chat.

@kvanals
Copy link

kvanals commented Nov 22, 2024

It looks like @baentsch has been working on getting ML-KEM into OpenSSL from BoringSSL's implementation in https://github.com/openssl/openssl/tree/feature/ml-kem

If I'm reading this right, this supports using "OSSL_PKEY_PARAM_MLKEM_SEED" in the manner described above via an OSSL_PARAM.

@baentsch
Copy link
Member

It looks like @baentsch has been working on getting ML-KEM into OpenSSL from BoringSSL's implementation in https://github.com/openssl/openssl/tree/feature/ml-kem

If I'm reading this right, this supports using "OSSL_PKEY_PARAM_MLKEM_SEED" in the manner described above via an OSSL_PARAM.

This is a community effort, but yes, that's what we're doing there. As always, feedback welcome to the approach taken (documentation proposal up at openssl/openssl#26037).

@kvanals
Copy link

kvanals commented Dec 3, 2024

@bifurcation FWIW, I ended up basing at least one implementation on the work in this PR as a bit of a one-off. If useful for anyone else, I fixed a few issues with the work done in this PR and applied the logic to all three ML-KEM variants and attached an updated patch to this comment.

For my use this works, since the ASN.1 data will have the ML-KEM seed (as private key) and the ML-KEM public key as well, albeit concatenated in the private key since liboqs/oqs-provider do not yet support key derivation from the private key. If/when this becomes more standardized, I can always rewrite the ASN.1 structure to match with the raw data needed and I avoid the issue of losing the seeds when persistent keys needs to exist.

I appreciate the initial work on this, but I also realize this isn't anything that can actually be adopted upstream.

Attached patch: liboqs-mlkem-coins-as-secret.patch

@baentsch
Copy link
Member

In a year-end "clean up" effort, is this PR still active/intended to be completed @bifurcation ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support 64-byte seed form of ML-KEM private keys
3 participants