Signatures as a special case of enums #434

Open
KOLANICH opened this issue May 13, 2018 · 2 comments

@KOLANICH

While working on #383, I found that that language has no distinct tag for a signature. Instead, it has 2 types of enums: mustmatch="yes" and mustmatch="no", which also allows matching against multiple signatures. We currently don't have this distinction; the behavior is defined by the runtime instead. For example, the JS runtime accepts a property whose value is not a valid enum value, but the Python runtime raises an error in this case.
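To make the inconsistency concrete, here is a minimal sketch in plain Python (not actual Kaitai Struct runtime code). `FrameType`, `strict_lookup`, and `lenient_lookup` are hypothetical names; the only fact relied on is that calling a standard-library `IntEnum` with an unknown value raises `ValueError`. The lenient variant imitates the pass-through behavior described for the JS runtime.

```python
from enum import IntEnum


class FrameType(IntEnum):
    """Hypothetical enum, standing in for one declared in a ksy file."""
    HEADER = 1
    DATA = 2


def strict_lookup(raw: int) -> FrameType:
    # Python-style behavior: an unknown value raises ValueError.
    return FrameType(raw)


def lenient_lookup(raw: int):
    # JS-style behavior (as described above): keep the raw value
    # when it does not correspond to any enum member.
    try:
        return FrameType(raw)
    except ValueError:
        return raw
```

With the same input, `strict_lookup(99)` raises while `lenient_lookup(99)` returns `99`, which is the kind of divergence between runtimes this issue is about.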

So the proposal:
1. Allow a value of any type to be an enum value, for example strings.
2. Widen the set of possible values for an enum. If the value is a string, it's an enum name. If the value is an array, it's an array of values of the same type as the property is marked with (behavior somewhat consistent with contents).
3. Introduce a way to check whether the matched value is a valid enum value. Maybe as _enum_valid?
4. With #81 implemented, we will then be able to throw on incorrect signatures and/or enum values.
5. Once this is implemented, deprecate contents, automatically transform it into the new equivalent, and then remove it.
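The proposal above could look roughly like this at runtime. This is only a sketch under assumptions: `EnumMatch`, `match_enum`, and `SIGNATURES` are hypothetical names, and `enum_valid` stands in for the suggested `_enum_valid`. It shows an enum keyed by byte strings rather than integers, with validity exposed as a flag instead of being decided by the runtime.

```python
from dataclasses import dataclass
from typing import Hashable, Mapping, Optional


@dataclass(frozen=True)
class EnumMatch:
    """Result of matching a raw value against an enum table (hypothetical)."""
    raw: Hashable
    name: Optional[str]

    @property
    def enum_valid(self) -> bool:
        # Stand-in for the proposed `_enum_valid` check.
        return self.name is not None


def match_enum(raw: Hashable, table: Mapping[Hashable, str]) -> EnumMatch:
    # Keys may be of any hashable type: ints, strings, byte strings.
    return EnumMatch(raw, table.get(raw))


# Hypothetical signature table keyed by 4-byte tags.
SIGNATURES = {b"chck": "check", b"data": "data"}
```

A caller can then decide per field whether `match_enum(tag, SIGNATURES).enum_valid` being false should raise (signature semantics) or be tolerated (open enum semantics).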

@KOLANICH changed the title from "Enums as signatures" to "Signatures as a special case of enums" on May 13, 2018
@GreyCat
Member

GreyCat commented May 15, 2018

I don't think I understand this proposal. For starters, I don't understand what "allowing a value of any type to be enum" means. The vast majority of our target languages treat enums as an integer-based type with a set of predefined, namespaced integer constants. There are no other "enums", and I don't understand why we should invent & emulate them, and even if we will, why we should name these things "enums", as it would only confuse end-users.

As far as I understand, what you imply is:

  1. You don't like current contents implementation
  2. You want #81 (Checksumming, other constraints and asserts validation and exceptions) to be implemented
  3. You want some way to check a value against a set of possible "legal" values. We can actually do that in #81, as a certain type of constraint, akin to simple equality, "less than", "greater than", etc.

@KOLANICH
Author

> For starters, I don't understand what "allowing a value of any type to be enum" means.

Let's look at what enums are. They are often not quite numbers, but just unique identifiers whose exact numeric value doesn't matter; what matters are the relations "the same as" and "contains" (usually achieved with bit operations). I mean a programmer usually uses enums in the following cases:
a) he needs a compile-time constant that doesn't consume memory and doesn't want to use the preprocessor; this is out of scope for KS;
b) he needs to pass some binary flags consuming as little space as possible, and it's convenient to use enums here; KS allows doing this directly with the b1 type;
c) he needs an identifier. This is the case enums were designed for in most programming languages.

So sometimes we don't need their underlying types.

> The vast majority of our target languages treat enums as an integer-based type with a set of predefined, namespaced integer constants.

Yes, but since we think of them as chunks of bits, we have no reason to distinguish an integer from a string. For example, some formats use 4-byte strings, and ksy developers resort to contortions like packing the bytes into integers and using those integers as enum values. I guess it's better to write directly

enum:
  a:
    "'chck'": check

than

enum:
  a:
    0x6b636863: check #chck
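For reference, the integer in the second form is just the tag `chck` read as a 4-byte little-endian integer. A quick check in Python, using only the standard library (the variable names are illustrative):

```python
import struct

tag = b"chck"

# Interpret the 4 ASCII bytes as a little-endian unsigned 32-bit integer.
packed_le = int.from_bytes(tag, "little")

# The same bytes read big-endian give a different constant.
packed_be = int.from_bytes(tag, "big")

# Going the other way: recover the tag from the enum's integer key.
unpacked = struct.pack("<I", 0x6B636863)

print(hex(packed_le), hex(packed_be), unpacked)
```

The readable tag survives only in the trailing comment of the integer form, and the constant changes with the chosen endianness, which is part of why the string-keyed form is easier to maintain.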

> There are no other "enums", and I don't understand why we should invent & emulate them

I guess these generic enums would exist only in Kaitai Struct; in each underlying language, something more suitable for that language (maybe an enum, but not necessarily) would be used.

> and even if we will, why we should name these things "enums", as it would only confuse end-users.

Because they are compared like enums and we think of them as enums.

> You don't like current contents implementation

I have not said that. I just ran into something that may cover its use case and some others. What I don't like is the current enum implementation, since it is not consistent across languages: a parser generated from the same ksy will parse a file successfully when compiled to JS, but will raise an error when compiled to Python.

> You want #81 to be implemented

Of course I do. Don't you?

> You want some way to check a value against a set of possible "legal" values. We can actually do that in #81, as a certain type of constraint, akin to simple equality, "less than", "greater than", etc.

Not quite. I mean that there are different formats with different specs. In some formats, if you encounter an unknown frame type ID, you cannot parse the file any further because you don't know the size of the frame; in others, you can just ignore frames whose type identifiers are unknown to you. So a ksy developer needs control over what to do in the case of an unexpected value. Right now we don't have that; it depends entirely on the target language. So we need to fix this. But once it is implemented, contents turns out to be unneeded: we would be able to do the same with enums that throw an error.
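The two kinds of formats can be illustrated with a small sketch. Everything here is an assumption made for illustration: the frame layout (1-byte type ID, 2-byte little-endian payload size, payload), the `KNOWN_FRAME_TYPES` table, and the `must_match` flag, which borrows its name from the mustmatch attribute mentioned at the top of this issue.

```python
import struct

KNOWN_FRAME_TYPES = {1: "header", 2: "data"}  # hypothetical type IDs


def parse_frames(buf: bytes, must_match: bool) -> list:
    """Parse frames laid out as: u1 type, u2le size, payload (assumed layout)."""
    frames = []
    pos = 0
    while pos < len(buf):
        type_id, size = struct.unpack_from("<BH", buf, pos)
        payload = buf[pos + 3 : pos + 3 + size]
        pos += 3 + size
        name = KNOWN_FRAME_TYPES.get(type_id)
        if name is None:
            if must_match:
                # Strict format: an unknown type makes the rest unparseable.
                raise ValueError(f"unknown frame type {type_id}")
            continue  # Lenient format: skip the frame and keep going.
        frames.append((name, payload))
    return frames


# Three frames: known, unknown (type 9), known with empty payload.
SAMPLE = bytes([1, 2, 0]) + b"ab" + bytes([9, 1, 0]) + b"x" + bytes([2, 0, 0])
```

Here `parse_frames(SAMPLE, must_match=False)` skips the unknown frame, while `must_match=True` raises on it. The point is that the choice is made by the format author, not by whichever target language the parser happens to be compiled to.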

In the synalysis grammar language there are 2 types of enums. But it is a bit restrictive; I guess we can do better. A more flexible system would let us check whether a value has been matched to an enum not only from that field, but also from other ones.

I thought about it a little more, and now an explicit check doesn't look like such a good idea:
we want the generated code to use the features of each language as much as possible. Some languages have built-in (or stdlib) enums that throw an error if the value is invalid. So we have to detect that case in order to use built-in throwing enums for it. But with an explicit check, this will require invoking an SMT solver: if we have a check

throw:
  if: F(a._enum_valid, x)

where x is a vector of bools and F is a boolean function, then to detect that case we have to check the satisfiability of F(a._enum_valid, x) xor (!a._enum_valid | F(True, x)) == 0, and if it solves to true, we generate the code using the checking enums and use F(True, x) instead of F(a._enum_valid, x) in the check.
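For a small number of boolean inputs, the condition above can be checked by brute-force enumeration instead of an SMT solver. The sketch below swaps in that simpler technique; it is exponential in the length of x, so it only illustrates the idea. The function name and the two sample conditions are hypothetical.

```python
from itertools import product
from typing import Callable, Sequence

Check = Callable[[bool, Sequence[bool]], bool]


def throws_whenever_invalid(F: Check, n: int) -> bool:
    """True if F(v, x) == (not v) or F(True, x) for every v and every x of length n.

    When this holds, the check always fires on an invalid enum value, so a
    built-in throwing enum could be used and the check reduced to F(True, x).
    """
    for valid in (False, True):
        for x in product((False, True), repeat=n):
            if F(valid, x) != ((not valid) or F(True, x)):
                return False
    return True


# Throws if the enum is invalid, or if some other flag is set.
cond_a: Check = lambda v, x: (not v) or x[0]

# Throws only if the enum is invalid AND another flag is set.
cond_b: Check = lambda v, x: (not v) and x[0]
```

`cond_a` passes the test, so it could be compiled to a throwing enum plus the residual check `x[0]`; `cond_b` fails it, because an invalid value is sometimes tolerated.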
