fix: Apply Nomos library code review recommendations #898

romanzac · 2024-10-31T06:31:51Z

1. What does this PR implement?

This PR proposes implementation changes for KZG-related functionality based on a code review of the Nomos library.

Check for division by zero polynomial
Check for minimal domain size of 64
Check for hiding bound of 128 versus length of powers_of_gamma_g
Randomness added to:
commit_polynomial, generate_element_proof, verify_element_proof

2. Does the code have enough context to be clearly understood?

Code review document

3. Who are the specification authors and who is accountable for this PR?

Specification author: @ramsesfv
PR accountable: @romanzac

4. Is the specification accurate and complete?

Code review itself provides basis for changes in the codebase. Final changes to be made depend on review from project developers.

5. Does the implementation introduce changes in the specification?

Specification - code review is a static document. Unless another round of code review is requested, no changes to the spec are expected.

Checklist

1. Description added.
2. Context and links to Specification document(s) added.
3. Main contact(s) (developers and specification authors) added
4. Implementation and Specification are 100% in sync including changes. This is critical.
5. Link PR to a specific milestone.

danielSanchezQ · 2024-10-31T19:59:30Z

nomos-da/kzgrs/src/kzg.rs

+    // Set the hiding bound to the security parameter
+    let hiding_bound = 128;
+    let mut rng = OsRng;
+
+    // Ensure that powers_of_gamma_g has enough powers for the hiding bound
+    if global_parameters.powers_of_gamma_g.len() < hiding_bound + 1 {
+        return Err(KzgRsError::HidingBoundTooLarge);
+    }
+
+    // Convert from B-tree map
+    let powers_of_gamma_g: Vec<_> = global_parameters
+        .powers_of_gamma_g
+        .values()
+        .cloned()
+        .collect();
+
+    // Initialize Powers with sufficient powers_of_gamma_g


I;m not sure this is gonna work, as later is not used for proofs. I would really not do this. But pinging @megonen here to double check.

I am attempting to bring conversation up about randomness and proof quality. I got basically 2 takeaways from @ramsesfv review:

Add randomness to anything which is supposed to prevent disclosure of original data - if there is a plan for inequality in hiding validator's stake amount for operators vs content of transaction data for commons I would like to know reasoning behind this.

Make domain parameters large enough so the needle (valuable information) could effectively hide in the haystack (I am still working on this one).

I actually noted this in the document as well. On our DA side, we don’t expect zero-knowledge or hiding properties. In fact, hiding would prevent verification. Here, we’re only using KZG’s binding property. The validator's stake amount and the content of transaction data will already be hidden by other methods. Our approach here is simply to ensure the integrity of these values, regardless of whether they’re private or not. It’s sufficient for DA that the generated commitments and proofs cannot be altered.

About domain parameter size: I understand that the domain parameter size was chosen this way for testing purposes only; it will be sufficiently large in the main protocol.

I actually noted this in the document as well. On our DA side, we don’t expect zero-knowledge or hiding properties. In fact, hiding would prevent verification. Here, we’re only using KZG’s binding property. The validator's stake amount and the content of transaction data will already be hidden by other methods. Our approach here is simply to ensure the integrity of these values, regardless of whether they’re private or not. It’s sufficient for DA that the generated commitments and proofs cannot be altered.

About domain parameter size: I understand that the domain parameter size was chosen this way for testing purposes only; it will be sufficiently large in the main protocol.

Could you please list methods which would be used to hide "validator's stake amount and the content of transaction data" ?

Domain parameter size is calculated from "other values". I think we need to have in place a check if this dynamically calculated size is secure and potentially change this calculation if not ? I regard only to the main protocol. I don't think testing needs large domain size, unless domain size calculation is subject to test.

Let's move towards clarity and adjust or discuss.

IMO Domain size should not be a problem. It is in the interest of the encoder to use proper sizes. Unless we refactor everything to use constant sized and we catch that in compilation time it may not really be worth it. If it is a general purpose library it should be up to the caller to check that. Otherwise you are free to use whatever you need, it being 0 or usize::MAX

This part will be ensured through the CL side’s designed PoV (proof of validator) and private transaction mechanisms. Full integration hasn’t been completed yet.

As for the domain size, as known, we store DA data in a matrix representation, and the rows will serve at least 2048 subnets (I assumed this but @danielSanchezQ , could you confirm the current status?). This means that for row commitments, our minimum domain size is already set at this level and it is secure. For column commitments, however, it will vary depending on the data length. Suppose the domain size for column commitments turns out to be small. If an attacker makes a change to the column commitment polynomial, this will also affect the row commitment polynomial, thus failing row commitment verification and detect this attack. But still, if we want to establish a lower bound without impacting performance, a minimum of 64 or 128 could be set, no problem.

I would be in favor to have minimum of 64. I think it would be great, if we leave freedom in the library to the upside, but protect the software user from downside risk. Machines would ever get only faster, so that lower bound would eventually need to get higher. It would be great to have this "noted" directly in the code. Not in documentation, which is otherwise an alternative. @danielSanchezQ WDYT ?

Hi @danielSanchezQ , I have added domain size check and hiding bound check only for non test code. We can take even softer approach and only log warning, when we detect "weak" security settings. Which choice would you pick ?

I've got a bit stuck with adding randomness to fk20_batch_generate_elements_proofs. It seems some bechmarking and tuning would need to done around this problem to pick the right solution. My current solution is not compatible with generate_element_proof yet. If you don't want to add randomness to proofs, I can revert these parts. If you want to keep it in this PR for later reference, and take over in another PR, it is also an option. We don't need to merge this PR.

I would like to finish this PR for completeness (and personal interest), but I need to go back to writing tests.

I don't think we should add any check though. Its up to the caller of the library to chose what to encode. Adding a synthetic lower bound doesn't really makes much sense to me, sincerely.
I do not think we need the randomness neither. The fk20 doesn't account for it iirc, render it incompatible. Also I wouldn't have the bandwidth to deal with it if possible. So we are a bit stuck on this.

There is no need to deal with changes for security immediately. I would say however, without sound security there is no privacy. Lower bounds, or recommendations are very common. Nobody would use MD5, sha1 or RSA keys shorter than 1024b for example. We have 7. Security and Privacy Testing planed later. Perhaps to happen next year, so we can reopen this topic at more calmer time for you?

danielSanchezQ · 2024-10-31T19:59:47Z

nomos-da/kzgrs/src/kzg.rs

+    if u.is_zero() {
+        return Err(KzgRsError::DivisionByZeroPolynomial);
+    };


Yeah, this is fine.

fix: zero polynomial check

7d97a4e

github-actions bot assigned romanzac Oct 31, 2024

romanzac added 4 commits October 31, 2024 14:34

fix: formatting

03eeb20

fix: check for division by zero polynomial earlier in the process

6609e83

fix: introduce hiding bound and random generator

a1766c3

fix: clippy happy

57bac6b

danielSanchezQ reviewed Oct 31, 2024

View reviewed changes

romanzac added 11 commits November 1, 2024 15:04

fix: add randomness to generate_element_proof

3f37b1a

fix: clippy happy

4597a7e

Merge branch 'master' into fix-kzg-review-proposal

1d87ccf

fix: include random_v in proof validation

7774134

fix: add domain size check for production mode

db2d76b

fix: formatting

7c1dbef

fix: skip IDING_BOUND check for tests

a184135

fix: exclude HIDING_BOUND in commit params for tests

6c22d99

test: add randomness to batch generate proofs

312ba2a

Merge branch 'master' into fix-kzg-review-proposal

0f9f9ad

Merge branch 'master' into fix-kzg-review-proposal

8e92376

romanzac marked this pull request as ready for review November 5, 2024 14:55

romanzac requested review from megonen and danielSanchezQ November 5, 2024 14:55

Merge branch 'master' into fix-kzg-review-proposal

6413096

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Apply Nomos library code review recommendations #898

fix: Apply Nomos library code review recommendations #898

romanzac commented Oct 31, 2024 •

edited

Loading

danielSanchezQ Oct 31, 2024

romanzac Nov 1, 2024 •

edited

Loading

megonen Nov 1, 2024

romanzac Nov 1, 2024

danielSanchezQ Nov 1, 2024

megonen Nov 1, 2024

romanzac Nov 4, 2024

romanzac Nov 5, 2024 •

edited

Loading

danielSanchezQ Nov 7, 2024

romanzac Nov 8, 2024 •

edited

Loading

danielSanchezQ Oct 31, 2024

fix: Apply Nomos library code review recommendations #898

Are you sure you want to change the base?

fix: Apply Nomos library code review recommendations #898

Conversation

romanzac commented Oct 31, 2024 • edited Loading

1. What does this PR implement?

2. Does the code have enough context to be clearly understood?

3. Who are the specification authors and who is accountable for this PR?

4. Is the specification accurate and complete?

5. Does the implementation introduce changes in the specification?

Checklist

Choose a reason for hiding this comment

romanzac Nov 1, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

romanzac Nov 5, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

romanzac Nov 8, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

romanzac commented Oct 31, 2024 •

edited

Loading

romanzac Nov 1, 2024 •

edited

Loading

romanzac Nov 5, 2024 •

edited

Loading

romanzac Nov 8, 2024 •

edited

Loading