Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve support for oneof fields #337

Closed
NfNitLoop opened this issue Dec 7, 2022 · 10 comments
Closed

Improve support for oneof fields #337

NfNitLoop opened this issue Dec 7, 2022 · 10 comments

Comments

@NfNitLoop
Copy link

NfNitLoop commented Dec 7, 2022

After using protobuf-es-generated code a bit more, I'm going to expand on my comment in another issue and make it an issue of its own.

Currently, accessing oneof fields in code generated by protobuf-es is a bit more cumbersome than in other protobuf code generators. In my protobuf definition I've got a oneof that might be a post, comment or profile. Both protoc-gen-ts in TypeScript and rust-protobuf provide ways to quickly access or set a .post. So in TypeScript I can do something like:

let post = item.post
if (post) { 
   // ... 
}

or

item.post = new Post(...)

But in code generated by protobuf-es, I have to do:

let post: Post|undefined = undefined
let it = item.itemType
if (it.case == "post") { post = it.value }
if (post) {
   // ...
}

which ends up in my code so often that I made a helper function for myself.

// Helper function for getting inner Item types.
export function getInner(item: pb.Item, field: "post"): pb.Post | undefined;
export function getInner(item: pb.Item, field: "profile"): pb.Profile | undefined;
export function getInner(item: pb.Item, field: "comment"): pb.Comment | undefined;
export function getInner(item: pb.Item, field: "post"|"profile"|"comment"): pb.Post | pb.Profile | pb.Comment | undefined {
    let it = item.itemType
    if (it.case == field) {
        return it.value
    }
}

And I just discovered that when I set a field, I have to do:

// Nope: item.comment = comment
item.itemType = {case: "comment", value: comment}

It would be nice if generated code would just expose oneof fields as top-level fields (or properties) like other protobuf generators.

All that said, really enjoying protobuf-es so far. Native ESMsupport works so much more nicely than trying to get Google's protobuf implementation to compile to all my targets. 😄

@NfNitLoop
Copy link
Author

In case I have time this weekend (or over the holidays, but no promises! 😅), would you be open to a pull request that implements this?

I'd probably update the generator to just make getters/setters for oneof members. They'd all delegate to the oneof {case, value} and just serve to reduce boilerplate for accessing those values.

@timostamm
Copy link
Member

Hey Cody,

protoc-gen-es gives you easy access to oneof members, but at the cost of type safety.

Let's say you have the following protobuf source:

syntax="proto3";
message Post {}
message Profile {}
message Item {
  oneof item_type {
  Post post = 1;
  Profile profile = 2;
  }
}

If you forget an if statement, you can easily run into a runtime error:

const i = new Item();
i.post.serialize(); // TypeError: Cannot read properties of undefined

This problem could be solved by making oneof members optional. A very simple version would look like this:

class Item2 {
  private _post: Post | undefined;
  private _item: Item | undefined;

  set post(value: Post | undefined) {
    this._post = value;
    this._item = undefined;
  }
  get post(): Post | undefined {
    return this._post;
  }

  set item(value: Item | undefined) {
    this.post = undefined;
    this._item = value;
  }
  get item(): Item | undefined {
    return this._item;
  }
}

With this class, we can no longer accidentally access an undefined property:

const i2 = new Item2();
i2.post.serialize(); // TS18048: 'i2.post' is possibly 'undefined'

if (i2.post) {
  i2.post.serialize(); // this is fine, we just removed `undefined` from the type union with the if statement
}

But properties that unset other properties throw off TypeScript's type narrowing:

if (i2.post) {
  i2.item = new Item();
  i2.post.serialize(); // TypeError: Cannot read properties of undefined
}

We want to avoid issues like this, and the algebraic data type we use in protobuf-es is a solid solution for the problem. It is a trade-off we made - we gain a proper model of oneof semantics in the type system, and lose some ergonomics around setting oneof values.

Getting oneof values seems to be rather straight-forward though? The getInner() example you're giving still requires you to check for the undefined case. I'd argue that if you have to make checks anyway, you could just as well go for an exhaustive switch statement:

const {itemType} = new Item();

switch (itemType.case) {
  case "post":
    itemType.value; // handle Post
    break;
  case "profile":
    itemType.value; // handle Profile
    break;
}

We are constantly looking to improve protobuf-es, but the trade-off we made for the representation of oneof groups was a design decision. I hope that the ergenomics aren't too inconvenient for you and that it helps avoiding runtime errors for you as well as it did for us.

@lwhiteley
Copy link

lwhiteley commented Dec 12, 2022

@timostamm is there a way to get an enum or const map of the oneof properties.

I also miss this from other protobuf generators.

From protobufjs, you would get [oneof]Case enum that you could use in switches without the need to use plain strings.

Would this be possible for protobuf-es without compromising design decisions mentioned.

From the example above i would have to create myself the snippet below to do a migration from protobuf-js to protobuf-es. However, this wasn't mentioned in the migration guide so I was wondering if I missed something

const ItemTypeCase = {
 PROFILE: 'profile',
 POST: 'post'
} as const 

@timostamm
Copy link
Member

Layton, thanks for the feedback. Perhaps we should should go into more detail in the migration docs.

I'd like to note that the strings are constrained, though. You cannot set or get a case that does not exist:

Screen Shot 2022-12-12 at 11 38 25

I see that the case object could be very handy if for migrating. Here is a small plugin that generates the snippet you're handwriting: https://gist.github.com/timostamm/d36a6f15b010cbd8b0bf91e734df8cf3

I'm not sure that the case object provides any benefit over the union type besides the migration use case, TBH.

@lwhiteley
Copy link

Hey @timostamm
That's quite awesome how it was easy to whip up a plugin that quickly.

Context:
Im using this in a code base that is not yet fully migrated to typescript so we dont really get this benefit yet. One could easily make mistakes with strings in a plain js environment

@NfNitLoop
Copy link
Author

NfNitLoop commented Dec 12, 2022

The getInner() example you're giving still requires you to check for the undefined case. I'd argue that if you have to make checks anyway, you could just as well go for an exhaustive switch statement

I realize that was the intent with the design, and that would be fine if every time I needed to access my oneof field I needed to cover every case. But because I end up using the top-level Item as a data type throughout my codebase, there are many code paths that should only be reached for one of the cases, so I end up only checking one case, and throwing an error if that case isn't present.

But properties that unset other properties throw off TypeScript's type narrowing

Aha! I've been writing so much Kotlin recently (where this is handled in a type-safe way) that I didn't realize this was a shortcoming in TypeScript's handling of properties. (side note, I think you mixed up Profile and Item in your example code. I wrote what I think you intended here.)

This seems like a shortcoming in TypeScript, though, so I'd still love to be able to opt in to those less type-safe helpers if possible.


Alternatively: What if oneof discrimination were done like this?:

interface Item {
  oneof: {
    profile: Profile,
    post?: undefined,
  } | {
    post: Post
    profile?: undefined
  } | {
    profile?: undefined,
    post?: undefined,
  }
}

That way lets us both use .oneof?.post directly, AND lets TypeScript give compile-time type errors:

function example(item: Item, someProfile: Profile) {
  if (item.oneof?.post) {
    item.oneof = {profile: someProfile} 

    // TypeScript now gives us an error: 🎉
    item.oneof.post.serialize()
  }
}

@jcready
Copy link

jcready commented Dec 14, 2022

I believe that setup would make it more painful to generically handle the oneof as I don't thinks there's a way to use a switch at all. Object.keys(item.oneof)[0] is just string and even if cast to keyof typeof item.oneof that still won't narrow the value. This means the only way to generically handle a oneof would be via if-statements checking that the property value wasn't undefined.

So let's imagine that instead of just Post or Profile we had a dozen different messages that were part of the oneof. And let's also imagine each message had a common field, a string id, that we wished to return from a function.

// Before
function getItemId(item: Item): string | undefined {
  switch (item.itemType?.case) {
    case "post":
    case "profile":
    //...more cases
    case "whatever":
      return item.itemType.value.id;
  }
}

// After
function getItemId(item: Item): string | undefined {
  if (item.itemType?.post) {
    return item.itemType.post.id;
  }
  if (item.itemType?.profile) {
    return item.itemType.profile.id;
  }
  //...more cases
  if (item.itemType?.whatever) {
    return item.itemType.whatever.id;
  }
}

Another thing to consider is how the number of fields inside the oneof would drastically increase the size of the type in the generated code:

// The number of lines is effectively:
// Before: N * 2
// After:  N ^ 2

// Before
type ItemType = {
  value: Post;
  case: "post";
} | {
  value: Profile;
  case: "profile";
} | {
  //...nine more cases with only two properties: value and case
} | {
  value: Whatever;
  case: "whatever";
} | { case: undefined; value?: undefined }



// After
type ItemType = {
  profile: Profile;
  post?: undefined;
  //...9 more properties which are optionally undefined
  whatever?: undefined;
} | {
  profile?: undefined;
  post: Post;
  //...9 more properties which are optionally undefined
  whatever?: undefined;
} | {
  //...12 properties where the defined property is field 3
} | {
  //...12 properties where the defined property is field 4
} | {
  //...12 properties where the defined property is field 5
} | {
  //...12 properties where the defined property is field 6
} | {
  //...12 properties where the defined property is field 7
} | {
  //...12 properties where the defined property is field 8
} | {
  //...12 properties where the defined property is field 9
} | {
  //...12 properties where the defined property is field 10
} | {
  //...12 properties where the defined property is field 11
} | {
  profile?: undefined;
  post?: undefined;
  //...9 more properties which are optionally undefined
  whatever?: Whatever;
} | {
  profile?: undefined;
  post?: undefined;
  //...9 more properties which are optionally undefined
  whatever?: undefined;
}

@lwhiteley
Copy link

lwhiteley commented Dec 15, 2022

https://gist.github.com/timostamm/d36a6f15b010cbd8b0bf91e734df8cf3

@timostamm Thanks again for the script. worked perfectly but two things i had to do

  1. had to run chmod +x protoc-gen-oneofhelper.ts
  2. had to modify the code a bit due to some conflicts with similar named oneof properties
const titleCase = (value: string) => {
  return value.replace(/^[a-z]/, (v) => v.toUpperCase());
};

// prettier-ignore
function generateMessage(f: GeneratedFile, message: DescMessage) {
  for (const oo of message.oneofs) {
    // Name of the enum we are about to generate
    const name = titleCase(message.name) + '_' + titleCase(localName(oo)) + "Case";
    f.print`export const ${name} = {`;
    for (const field of oo.fields) {
      f.print`  ${field.name.toUpperCase()}: '${localName(field)}',`;
    }
    f.print`} as const;
`;
  }
  for (const nestedMessage of message.nestedMessages) {
    generateMessage(f, nestedMessage);
  }
}

Not the exact migration 1:1 but works fine as I can just do an import as to rename it to the desired name 👍🏾

@smaye81
Copy link
Member

smaye81 commented Jan 10, 2023

Going to close this issue as it seems there's a workable solution. If another issue arises, feel free to reopen. Thanks!

@smaye81 smaye81 closed this as completed Jan 10, 2023
@boan-anbo
Copy link

Layton, thanks for the feedback. Perhaps we should should go into more detail in the migration docs.

I'd like to note that the strings are constrained, though. You cannot set or get a case that does not exist:

Screen Shot 2022-12-12 at 11 38 25

I see that the case object could be very handy if for migrating. Here is a small plugin that generates the snippet you're handwriting: https://gist.github.com/timostamm/d36a6f15b010cbd8b0bf91e734df8cf3

I'm not sure that the case object provides any benefit over the union type besides the migration use case, TBH.

Was looking for a type-safe way to switch on oneof cases, and this solves my problem. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants