Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Note objects and structure refactoring #350

Closed
bobbinth opened this issue Dec 8, 2023 · 20 comments
Closed

Note objects and structure refactoring #350

bobbinth opened this issue Dec 8, 2023 · 20 comments
Assignees
Labels
refactoring Code clean-ups, improvements, and refactoring
Milestone

Comments

@bobbinth
Copy link
Contributor

bobbinth commented Dec 8, 2023

We have quite a few structs which describe different aspects of notes. Specifically: Note, NoteEnvelope, NoteOrigin, RecordedNote, NoteInclusionProof, NoteStub, CreatedNotes, ConsumedNotes, ConsumedNoteInfo, and maybe others. I think naming and relations between these structs have becomes somewhat unintuitive and also we have some obsolete concepts which we need to remove and some changes we'd like to make. So, in this issue I'll try to describe which changes I think we could make to the overall note structure. Any feedback is welcome.

Note struct

Currently this struct looks like so:

pub struct Note {
    script: NoteScript,
    inputs: NoteInputs,
    vault: NoteVault,
    serial_num: Word,
    metadata: NoteMetadata,
}

I think the structure is fine, but what bothers me a little is how we treat the metadata field. Specifically, here metadata is considered to be a part of the note, but when we insert notes into the note tree, metadata goes into a different leaf. I think we should come up with a name for entity defined by hash(note_hash, metadata) and then this entity should be inserted into a single leaf in the notes tree.

Open question: how should we name this entity?

NoteStub struct

The Note struct described above contains all information about a note. However, sometimes we may have minimal info about a note. I would call a struct which contains such info a NoteStub. This struct could look like so:

pub struct NoteStub {
    hash: Digest,
    metadata: NoteMetadata,
}

NoteLocation struct

I would rename the current NoteOrigin struct into NoteLocation struct as I think it better describes the data contained in this struct (i.e., the location of the note in the chain). The struct would look like this:

pub struct NoteLocation {
    block_num: u32,
    note_index: NodeIndex,
}

InputNote struct

I would replace RecordedNote and NoteInclusionProof structs with a single InputNote struct which would look like so:

pub struct InputNote {
    note: Note,
    location: NoteLocation,
    proof: MerklePath, // in the future we can replace this with `SparseMerklePath`
}

The purpose of this struct would be to provide note info for notes consumed in a transaction.

This is slightly different from the combination of the two mentioned structs in that it doesn't include sub_hash and note_root fields, but I think these should not be a part of the note info anyway.

OutputNote struct

This struct would replace the current NoteStub struct, and it would look as follows:

pub struct OutputNote {
    recipient: Digest,
    vault: NoteVault,
    metadata: NoteMetadata,
}

InputNotes and OutputNotes structs

These structs could replace ConsumedNotes and CreatedNotes structs. They could look like so:

pub struct InputNotes {
    notes: Vec<InputNote>,
    commitment: OnceCell<Digest>,
}

pub struct OutputNotes {
    notes: Vec<OutputNote>,
    commitment: OnceCell<Digest>,
}

Nullifier struct

Lastly, I would replace ConsumedNoteInfo with Nullifier struct which would look simply like:

pub struct Nullifier(Digest);

Since we've migrated to dynamic note script invocation, I don't believe ConsumedNoteInfo is needed, and we actually should update the transaction kernel to work with just Nullifiers for consumed notes.

@plafer
Copy link
Contributor

plafer commented Dec 14, 2023

I'm still not 100% familiar with how exactly all of these are used, when and why, so my input is limited. A few things:

  • recipient: where does this name come from? My initial assumption is "the account that receives the note; the destination"
  • NoteStub: I would avoid using "stub", as it is already widely used in software, but with a different meaning, and it confuses me every time (i.e. I'm expecting it to be some kind of test data structure). Also, is this meant to be the same as NoteEnvelope?
  • Could we define Note as
pub struct Note {
    script: NoteScript,
    inputs: NoteInputs,
    vault: NoteVault,
    serial_num: Word,
}

Methods who also need the metadata can receive it in another argument. What I like about this is that note_hash == Note::hash() (given the intuitive definition for Note::hash() where all fields are included in the hash).

Applying the same logic, you wouldn't need NoteStub either; you'd just take note_hash: Digest and NoteMetadata as input to methods.

  • NoteLocation: this data structure could apply to any Merkle tree in the chain; maybe we name it something like ChainTreeLocation? With the goal of reducing the number of Note* types.
  • InputNote and OutputNote: maybe rename to TransactionInputNote and TransactionOutputNote? Otherwise the semantics of "input" and "output" aren't very clear to me.

Finally, I would try to use newtypes for hashes instead of Digest (this applies more generally across our codebases). For example, I'm not 100% sure if your suggested NoteStub is the same as NoteEnvelope, what is being hashed in note_hash could be different. If we had different types, I would know right away, and I could look up e.g. the NoteHash type and read the documentation about exactly what it hashes, why, etc.

@Dominik1999
Copy link
Collaborator

Dominik1999 commented Dec 15, 2023

Nice issue, and cool that we are constantly maintaining.

I think we should come up with a name for entity defined by hash(note_hash, metadata) and then this entity should be inserted into a single leaf in the notes tree.

Isn't hash(note_hash, metadata) the same as NoteStub? However, a Note together with its metadata we can call CompositeNote or simply NoteWithMetadata or more abstract MetaNote. To me, the more descriptive the name is, the better.

I would rename the current NoteOrigin struct into NoteLocation struct as I think it better describes the data contained in this struct (i.e., the location of the note in the chain).

That makes a lot of sense to me.

I would replace RecordedNote and NoteInclusionProof structs with a single InputNote struct which would look like so:

Makes sense to me as well. Renaming this will make it easier to understand.

InputNotes and OutputNotes that replace ConsumedNotes and CreatedNotes structs.

That makes sense. However, maybe it is even clearer when we say SetOfInputNotes or VecOfInputNotes. But I am not sure, maybe InputNotes is good enough. Just the difference from InputNote to InputNotes is marginal, so I would describe it rather as a set than only putting it in plural form.

Lastly, I would replace ConsumedNoteInfo with Nullifier struct which would look simply like:

This will make it clearer!

@bobbinth
Copy link
Contributor Author

I would try to use newtypes for hashes instead of Digest (this applies more generally across our codebases).

Agreed - I've already started doing it here and here.

recipient: where does this name come from? My initial assumption is "the account that receives the note; the destination"

It is not a perfect name, but I couldn't come up with a better one. The way it is computed is defined here. The basic rational is that a combination of script_root and note_inputs defines who can consume a note and how. But it is much more flexible than a single account - it could be a set of account IDs, a set of public keys, or even just some condition which can be fulfilled by anyone (e.g., in a case of atomic swap).

NoteStub: I would avoid using "stub", as it is already widely used in software, but with a different meaning, and it confuses me every time (i.e. I'm expecting it to be some kind of test data structure). Also, is this meant to be the same as NoteEnvelope?

This is also not a perfect name, and similarly, I wasn't able to come up with a better name yet. The idea here is that this is the minimal amount of data which is known publicly about a note (more concretely, this is what is stored in the leaves of note trees). It is a direct replacement for NoteEnvelope - which I like even less than NoteStub.

Could we define Note as

pub struct Note {
    script: NoteScript,
    inputs: NoteInputs,
    vault: NoteVault,
    serial_num: Word,
}

I'm not sure we can do it this way. In many cases metadata goes together with the note, and we'll just end up passing tuples of (Note, NoteMetadata) around. We could come up with a dedicated struct for this combination, but I don't see a lot of value of having it instead of just putting metadata into the Note.

One thing I was thinking about which could make definition of things like note_hash more intuitive is to define it as follows:

  • Rename note hash into NoteId - this would still be defined as hash(note_script_root || inputs_hash || vault_hash || serial_num).
  • Define note hash as hash(note_id, metadata).

What do you think about this approach?

NoteLocation: this data structure could apply to any Merkle tree in the chain; maybe we name it something like ChainTreeLocation? With the goal of reducing the number of Note* types.

This combines two things: (1) the block in which the note was recorded in (i.e., its place in Chain MMR), and (2) which index in the block's note tree the note is at. i'm not sure ChainTreeLocation is an improvement here. We could do something like NoteChainLocation - but not sure if it is much better either.

InputNote and OutputNote: maybe rename to TransactionInputNote and TransactionOutputNote? Otherwise the semantics of "input" and "output" aren't very clear to me.

I was thinking of moving these into transaction module - this way, it will be clear what they refer to.

Isn't hash(note_hash, metadata) the same as NoteStub?

NoteStub is currently (note_hash, metadata) (not hash of these).

However, a Note together with its metadata we can call CompositeNote or simply NoteWithMetadata or more abstract MetaNote. To me, the more descriptive the name is, the better.

We could, but then I'm not really sure we'd use Note by itself in many places.

That makes sense. However, maybe it is even clearer when we say SetOfInputNotes or VecOfInputNotes. But I am not sure, maybe InputNotes is good enough. Just the difference from InputNote to InputNotes is marginal, so I would describe it rather as a set than only putting it in plural form.

I think this should be OK as missing a letter would be caught by the compiler.

@plafer
Copy link
Contributor

plafer commented Dec 18, 2023

The basic rational is that a combination of script_root and note_inputs defines who can consume a note and how. But it is much more flexible than a single account - it could be a set of account IDs, a set of public keys, or even just some condition which can be fulfilled by anyone (e.g., in a case of atomic swap).

Right, I understand better why recipient was chosen, and I also can't think of anything better. Adding a comment with a description similar to this of "what" a recipient is (as opposed to just how it is computed) would be very helpful.

[NoteStub] is also not a perfect name, and similarly, I wasn't able to come up with a better name yet.

Few ideas that come to mind:

  • NoteMinimal, MinimalNote,
  • NoteCondensed, CondensedNote,
  • NotePrivate, PrivateNote (probably my favorite)

If we feel a bit silly: SmallNote, TinyNote

What do you think about this approach? [note id vs note hash]

Much much better! I think this removes all confusion.

i'm not sure ChainTreeLocation is an improvement here. We could do something like NoteChainLocation - but not sure if it is much better either.

I was thinking more along the lines of: if we have another object that fits the same structure of needing a block index & NodeIndex to fully identify its location, then we would be able to reuse the NoteLocation logic. But we don't have any object like that, so actually indeed NoteLocation is more specific, and better.

I was thinking of moving these into transaction module - this way, it will be clear what they refer to.

👍

@bobbinth
Copy link
Contributor Author

bobbinth commented Jan 4, 2024

With the recent refactorings, a big part of what is described in this issue is already done. Specifically:

  • We now have Nullifier and NoteId structs which are newtypes over Digest.
  • RecordedNote was renamed into InputNote and moved into transaction module.
  • NoteStub was renamed into OutputNote and moved into transaction module.
  • ConsumedNotes and CreatedNotes structs have been replaced by InputNotes and OutputNotes structs.

Out of things that remain from the original post we have:

  • Refactor NoteOrigin into NoteLocation (should be pretty easy).
  • Introduce NoteStub (though maybe under a different name).
  • Update MASM files to reflect new naming and also to put notes into a single leaf rather than two leaves in the note tree.

However, one other thing came up in #380 (comment). Specifically, we may want to create a Note object before we know the note's sender (which is a part of note metadata). So, we may need to change the definition of the note to be similar to what @plafer suggested above. Specifically:

pub struct Note {
    script: NoteScript,
    inputs: NoteInputs,
    assets: NoteAssets,
    serial_num: Word,
}

We can then update the InputNote to look something like this:

pub struct InputNote {
    note: Note,
    metadata: NoteMetadata,
    location: NoteLocation,
    proof: MerklePath,
}

We still probably want to have a struct which encapsulates the note together with its metadata (but without the location/proof) - though, I don't love the NoteWithMetadata or CompositeNote options.

@Dominik1999
Copy link
Collaborator

Let's break it down with the remaining tasks,m then ppl can pick it up Monday

@hackaugusto
Copy link
Contributor

Here are some ideas

Add explicit structure to hold hashed data

For example, Nullifier(Digest) contains only the digest, and not the data that was used to create the digest. IMO we need both, so in addition to the existing Nullifier, there would be another structure with the original data:

struct NoteNullifier {
    serial_number: SerialNumber,
    script: NoteScript,
    inputs: NoteInputs,
    assets: NoteAssets
}

The wrapper type represents the blinded version, which is used often in the protocol, so we need a way to express it. The full version represents the values when they are being created, and sometimes consumed. We need all the data to compute the hash in the first place, so it also makes sense to have these structures. Another example here is the Recipient, which unlike Nullifier has the data extracted, but is missing a wrapper for the Digest.

If we have this split, we can implement traits to compute the digest see. And the code follows more close the protocol.

The crypto structures should have generic containers

Types like InputNote should probably be defined in crypto. For example, InputNote may be implementable as ValuePath<Note> ref. This would have the benefit that we can actually implement and test its behavior in the crypto crate, increase code reuse, and simplifies naming. (For now, it seems it is not a one-to-one match, the point here is not to say this is a perfect replacement, but to share the idea of using generic containers)

@bobbinth
Copy link
Contributor Author

Add explicit structure to hold hashed data

For example, Nullifier(Digest) contains only the digest, and not the data that was used to create the digest. IMO we need both, so in addition to the existing Nullifier, there would be another structure with the original data:

struct NoteNullifier {
    serial_number: SerialNumber,
    script: NoteScript,
    inputs: NoteInputs,
    assets: NoteAssets
}

The wrapper type represents the blinded version, which is used often in the protocol, so we need a way to express it. The full version represents the values when they are being created, and sometimes consumed. We need all the data to compute the hash in the first place, so it also makes sense to have these structures. Another example here is the Recipient, which unlike Nullifier has the data extracted, but is missing a wrapper for the Digest.

Wouldn't NoteNullifier be basically the same as NoteDetails described in #645? The way I think about note nullifier is that it is just another way to commit to the note. In this way, it is similar to NoteId (which is also just NoteId(Digest)).

The situation with NoteRecipient is different - i.e., it is a subset of the note details and so, I think it makes sense to have it as a stand-alone struct.

@hackaugusto
Copy link
Contributor

Wouldn't NoteNullifier be basically the same as NoteDetails described in #645?

With the approach I proposed above, the struct:

pub struct NoteDetails {
    assets: NoteAssets,
    recipient: NoteRecipient,
}

Corresponds to:

/// Note ID is computed as:
///
/// > hash(recipient, asset_hash),

So not quite the NoteNullifier.

@bobbinth
Copy link
Contributor Author

bobbinth commented May 1, 2024

NoteRecipient consists of script, inputs, and serial number - so, while they are grouped differently, both, NoteId and Nullifier are based on exactly the same underlying components. Conceptually, both NoteId and Nullifier are the same - i.e., a commitment to the note just computed differently. I'm not sure it makes sense to expand each commitment into a separate structs.

@daedlock
Copy link

daedlock commented Jun 5, 2024

There are use cases where users don't know the serial number to construct a NoteRecipient. They would however know the hash(serial, note_script), allowing them to construct the recipient without knowing the serial by hashing their value with inputs_commitment. However, The current Note constructor expects a NoteRecipient which is impossible to construct given the described use case as the NoteRecipient is only instantiatable via a serial_number

Proposal

Change the Note::new constructor to accept Digest instead of NoteRecipient and implement Into<Digest> for NoteRecipient to allow anyone with a Digest to construct a Note.

This change would be obviously a breaking one since all Note::new() usages will need to be refactored to pass recipient.into(). I am not sure if there is a smarter or less breaking way to do it. Happy to hear your thoughts

This change would be obviously a breaking one since all Note::new() usages will need to be refactored to pass recipient.into(). I am not sure if there is a smarter or less breaking way to do it. Happy to hear your thoughts

note: discussion on discord here https://discord.com/channels/893151281848406016/1114419272165376030/1247871953767895120

@bobbinth
Copy link
Contributor Author

bobbinth commented Jun 6, 2024

I will look into this in more detail tomorrow, but I'm curious if the PartialNote is the more appropriate object to use here.

@daedlock
Copy link

daedlock commented Jun 6, 2024

I will look into this in more detail tomorrow, but I'm curious if the PartialNote is the more appropriate object to use here.

Thank you! I didn't know about PartialNote. I guess if I can use a PartialNote instead of Note in expected_output_notes param in TransactionRequest::new() that would suffice!

@bobbinth
Copy link
Contributor Author

bobbinth commented Jun 6, 2024

if I can use a PartialNote instead of Note in expected_output_notes param in TransactionRequest::new() that would suffice!

Yes, I think this should be possible. Let me think through what changes this would imply.

Also, cc @igamigo as this will probably result in more changes in the client rather than in miden-base.

@bobbinth
Copy link
Contributor Author

bobbinth commented Jun 6, 2024

Looking into this more, I think we might already have everything that we need on the miden-base side, and what we need to do is just update miden-client. Specifically, we could change the type of expected_output_notes to something like Vec<ExpectedNote> where:

pub enum ExpectedNote {
    Full(Note),
    Partial(PartialNote),
}

@igamigo - what do you think?

@igamigo
Copy link
Collaborator

igamigo commented Jun 7, 2024

Yeah, sounds right. I'll be implementing this soon.

EDIT: Implemented some of this but found out that we are missing some mechanism to extend the transaction arguments accordingly. Additionally, we might want to add a PartialNote::authentication_hash() as well.

@bobbinth
Copy link
Contributor Author

bobbinth commented Jun 8, 2024

Yeah - it seems like some changes in miden-base will be needed. Specifically, we'll need to modify TransactionArgs::add_expected_output_note to accept partial note data as well.

One way to do it is to define ExpectedNote enum in miden-base. It could look something like so:

pub enum ExpectedNote {
    Full(NoteDetails),
    Partial(Digest, NoteAssets),
}

But there could be other ways to do this as well.

@daedlock - one question: this feature seems to be useful only for private notes. For public notes, the users would always be able to get full note details from the chain. Or are you thinking of using this with public notes as well somehow?

@daedlock
Copy link

daedlock commented Jun 8, 2024

@daedlock - one question: this feature seems to be useful only for private notes. For public notes, the users would always be able to get full note details from the chain. Or are you thinking of using this with public notes as well somehow?

I was planning to use this with public notes but it turns out it's not entirely needed as I ended up exposing the note serial through the backend (since I learned it's already public for public notes). However, we are also planning to have private limit orders in the near future where the ExpectedNote::Partial() will become necessary.

@bobbinth
Copy link
Contributor Author

Most of the refactorings described in this issue are now done. What remains is:

  • Refactor NoteOrigin into NoteLocation (should be pretty easy).
  • Change the structure of the note tree to be a tree of depth 20 having leaves be note hashes (rather than depth 21 with every other leaf being note ID).
  • Update MASM files to reflect new naming and also to put notes into a single leaf rather than two leaves in the note tree.

@bobbinth
Copy link
Contributor Author

bobbinth commented Jul 1, 2024

Created #778, #779, and #781 for the remaining issues - so, closing this one.

@bobbinth bobbinth closed this as completed Jul 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactoring Code clean-ups, improvements, and refactoring
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

6 participants