Add support for mapping old fields to new ones in TLV read macros #3378

Open
wants to merge 4 commits into
base: main

TheBlueMatt
Collaborator

As we've grown, we regularly face the question of whether to "break out" of our nice TLV-based struct/enum reading/writing macros in order to map legacy fields to new ones, or to keep the legacy fields and handle at runtime what should be handled at (de-)serialization time.

This attempts to address this tradeoff by adding support for a "legacy" TLV read. This read style allows us to read a TLV which is not directly mapped to any fields in the struct/enum but which can be computed from the struct/enum's layout at write-time and which is incorporated into the read data at read-time.

It takes a type, a $read expression (which is executed after all TLVs are read but before the struct/enum is built) and a $write expression (which is executed to calculate the value to write in the TLV).

They are always read as options to retain a future ability to remove the legacy fields.
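
To make the read/write flow concrete, here is a rough hand-written sketch of what a legacy TLV amounts to. It is not the macro syntax from this PR and not LDK's serialization code: the `Config` struct, the TLV type numbers 1 and 3, the helper names, and the one-byte type/length framing are all hypothetical, chosen only to keep the example small.

```rust
/// Hypothetical struct: older versions stored `timeout_secs: u32` in TLV type 1,
/// the current version only keeps `timeout_ms: u64` (TLV type 3) in memory.
#[derive(Debug, PartialEq)]
struct Config {
    timeout_ms: u64,
}

/// Toy framing: one type byte, one length byte, then the big-endian value.
fn write_tlv(out: &mut Vec<u8>, ty: u8, value: &[u8]) {
    out.push(ty);
    out.push(value.len() as u8);
    out.extend_from_slice(value);
}

/// Parses a big-endian integer of exactly `expected_len` bytes.
fn read_be(value: &[u8], expected_len: usize) -> Option<u64> {
    if value.len() != expected_len {
        return None;
    }
    Some(value.iter().fold(0u64, |acc, b| (acc << 8) | u64::from(*b)))
}

impl Config {
    fn write(&self) -> Vec<u8> {
        let mut out = Vec::new();
        // The "$write" step: the legacy value is computed from the current struct
        // so that older readers still find the TLV they expect.
        let legacy_secs = (self.timeout_ms / 1000) as u32;
        write_tlv(&mut out, 1, &legacy_secs.to_be_bytes());
        write_tlv(&mut out, 3, &self.timeout_ms.to_be_bytes());
        out
    }

    fn read(mut data: &[u8]) -> Option<Config> {
        // Both values are read as options; the legacy one never becomes a field.
        let mut legacy_secs: Option<u64> = None;
        let mut timeout_ms: Option<u64> = None;
        while !data.is_empty() {
            if data.len() < 2 {
                return None;
            }
            let (ty, len) = (data[0], data[1] as usize);
            if data.len() < 2 + len {
                return None;
            }
            let value = &data[2..2 + len];
            match ty {
                1 => legacy_secs = Some(read_be(value, 4)?),
                3 => timeout_ms = Some(read_be(value, 8)?),
                _ => {} // unknown TLVs are simply ignored in this toy
            }
            data = &data[2 + len..];
        }
        // The "$read" step: runs after all TLVs are read, before the struct is built.
        if timeout_ms.is_none() {
            timeout_ms = legacy_secs.map(|secs| secs * 1000);
        }
        Some(Config { timeout_ms: timeout_ms? })
    }
}
```

Data written by an old version (only TLV 1 present) comes back with `timeout_ms` filled in from the legacy value, while the struct itself never carries a `timeout_secs` field.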

Sadly, there are two issues with doing this trivially which force us into proc-macro land:

(a) when matching the original struct we want to list the fields
    in the match arm so that we have them available to write.
    Sadly, we can't call a macro to have it write out the field
    name based on the field type, so instead need to pass the whole
    match to a proc-macro and have it walk through to find the
    types and skip fields that are `legacy`.
(b) when building a final struct/enum after reading, we need to
    list a few `$field: $expr`s and cannot decide whether to
    include a field based on a regular macro.
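
As a loose model of the transformation described in (a), the sketch below filters a field list and emits the text of a match pattern that binds only the fields actually present in the struct. It works on plain strings rather than on a `TokenStream`, so it is only an analogy for the proc-macro in this PR; `FieldKind` and `match_pattern` are hypothetical names.

```rust
/// How a field is declared in the (hypothetical) TLV field list.
#[derive(Clone, Copy, Debug, PartialEq)]
enum FieldKind {
    Required,
    Optional,
    Legacy,
}

/// Builds the text of a match pattern for `name`, binding every field except
/// those marked `Legacy`, since legacy fields do not exist in the struct.
fn match_pattern(name: &str, fields: &[(&str, FieldKind)]) -> String {
    let bound: Vec<String> = fields
        .iter()
        .filter(|(_, kind)| *kind != FieldKind::Legacy)
        .map(|(field, _)| format!("ref {}", field))
        .collect();
    format!("{} {{ {} }}", name, bound.join(", "))
}
```

A regular `macro_rules!` macro cannot make this decision from inside the pattern, which is why the whole `match` is handed to a proc-macro instead.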

The proc-macros to do so aren't trivial, but they aren't that bad either. We could instead try to rewrite our TLV stream processing macros to handle a new set of TLVs which are passed via a separate argument, but as TLVs are required to be ordered by type, this requires a good chunk of additional generated code in each TLV write. It would also result in a somewhat less ergonomic callsite, as it would no longer fit into our existing list of TLVs.
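
To illustrate the ordering constraint behind that alternative: if legacy TLVs lived in a separate list, every write would have to interleave the two lists by type. The function below is a runtime sketch of that interleaving, under the assumption that each list is already sorted by type; `merge_by_type` and the `(u64, Vec<u8>)` representation are hypothetical and not part of LDK.

```rust
/// Merges two TLV lists, each already sorted by type, into one list sorted by type.
/// Returns `None` if the same type appears in both lists.
fn merge_by_type(
    a: &[(u64, Vec<u8>)],
    b: &[(u64, Vec<u8>)],
) -> Option<Vec<(u64, Vec<u8>)>> {
    let mut out = Vec::with_capacity(a.len() + b.len());
    let (mut i, mut j) = (0, 0);
    while i < a.len() && j < b.len() {
        if a[i].0 < b[j].0 {
            out.push(a[i].clone());
            i += 1;
        } else if a[i].0 > b[j].0 {
            out.push(b[j].clone());
            j += 1;
        } else {
            // A type present in both lists would produce a duplicate TLV.
            return None;
        }
    }
    // At most one of the two lists still has entries left.
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
    Some(out)
}
```

At runtime this is cheap, but expressing the same interleaving inside macro-generated write code is where the extra generated code would come from.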

Contributor

@shaavan shaavan left a comment

Looks good to me after going through the code!

Can we craft some tests for it?

lightning-macros/src/lib.rs (outdated)
/// Wraps a `match self {..}` statement and scans the fields in the match patterns (in the form
/// `ref $field_name: $field_ty`) for types marked `legacy`, skipping those fields.
#[proc_macro]
pub fn skip_legacy_fields(expr: TokenStream) -> TokenStream {
Contributor

Maybe we can break the function into modular parts. Something like process_match_pattern for handling the Enum::Variant part and a process_field for the internal fields. It might help keep things organized and easier to follow!

Collaborator Author

Sadly, this stuff is just kinda inscrutable :(. I'll add more comments and split it but I'm not sure how much it'll help...

Comment on lines +236 to +282
let is_init = macro_name == "_init_tlv_based_struct_field";
let ty_tokens = mac.tokens.clone().into_iter().skip(2).next();
if let Some(proc_macro2::TokenTree::Group(group)) = ty_tokens {
Contributor

From what I understand, the input field for this would look something like:

field: _init_tlv_based_struct_field!(field_name, (legacy, ...))

since the second element needs to be a group. I’m a bit unsure, though, about what exactly should go in place of ...—I’d love any insights on that!
Also, maybe it would be helpful to expand the docs a bit to clearly outline the expected input and behavior of the macro for future reference.

Collaborator Author

That is inherently a group: anything wrapped in () or {} is a group, even if it's just one token. That said, the code doesn't require the second element to be a group; it will accept anything that isn't.
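
A small way to see this without a proc-macro: the `tt` fragment in `macro_rules!` matches one token tree, and a delimited group counts as a single token tree no matter what it contains. The `count_tts!` macro below is a hypothetical helper written only for this illustration.

```rust
/// Counts the top-level token trees passed to the macro.
/// A group wrapped in (), [] or {} counts as exactly one token tree.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}
```

For the arguments `field_name, (legacy, u64, {}, {})` the count is three: the identifier, the comma, and the parenthesized group. That matches the `.skip(2).next()` call above, which lands on the third token tree.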

lightning-macros/src/lib.rs (outdated)
res
}

/// Scans an enum definition for fields initialized to `LDK_DROP_LEGACY_FIELD_DEFINITION` and drops
Contributor

Reading through the code, I wasn’t quite able to figure out how LDK_DROP_LEGACY_FIELD_DEFINITION will be used in the end. I’d love to get some insights on that! Thanks!

Collaborator Author

It's not, that's stale.

lightning/src/util/ser_macros.rs (outdated)
Comment on lines 112 to 202
let self_ident = stream.next().unwrap();
expect_ident(&self_ident, Some("self"));
res.extend(proc_macro::TokenStream::from(self_ident));

let token_to_stream = |tok| proc_macro::TokenStream::from(tok);

let arms = stream.next().unwrap();
if let TokenTree::Group(group) = arms {
    let mut new_arms = TokenStream::new();

    let mut arm_stream = group.stream().into_iter().peekable();
    while arm_stream.peek().is_some() {
        let enum_ident = arm_stream.next().unwrap();
        let co1 = arm_stream.next().unwrap();
        expect_punct(&co1, ':');
        let co2 = arm_stream.next().unwrap();
        expect_punct(&co2, ':');
Contributor

I'm just putting this here for reference: if you are looking for a no-dependencies parser for the proc macro, I developed a PoC for the Rust compiler a while back, https://github.com/rsmicro/kproc-macros. IMHO it simplifies the parsing a little bit, but I'm not sure it is worth it for just a single proc macro.

Collaborator Author

Eh, I think it's survivable for now, will see what others think.

@TheBlueMatt TheBlueMatt force-pushed the 2024-10-legacy-tlv-type branch from 10ce004 to 5113f3c Compare December 8, 2024 00:59
@TheBlueMatt TheBlueMatt added this to the 0.2 milestone Dec 8, 2024
@TheBlueMatt
Collaborator Author

Oops, fixed CI, but I realize we kinda forgot about this one, and #3342 (might) depend on it. Tagging 0.2 for that reason.

Contributor

@shaavan shaavan left a comment

LGTM mod testing! 🚀

/// ```ignore
/// drop_legacy_field_definition!(Self {
/// field1: _init_tlv_based_struct_field!(field1, option),
/// field2: _init_tlv_based_struct_field!(field2, (legacy, u64, {}, {})),
Contributor

nit ✨

Suggested change
/// field2: _init_tlv_based_struct_field!(field2, (legacy, u64, {}, {})),
/// field2: _init_tlv_based_struct_field!(field2, (legacy, u64, {}, {})),
