Rework TOML token parsing #100

Merged: 27 commits, Apr 30, 2024


lkirkwood
Contributor

This PR closes #81 and also allows a number of other valid identifiers to be parsed by reworking the logic for parsing TOML tokens from characters.

For example, every key in the following document is valid according to the TOML spec's ABNF grammar, but none of them can currently be parsed:

[foo.bar.baz]
1key = "myval"
-inf = 0
2024-04-30 = 100
½ = 0.5

(Even GitHub's Markdown syntax highlighting doesn't handle that last one.)
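
To illustrate why keys such as `1key`, `-inf`, and `2024-04-30` qualify, here is a minimal sketch of a bare-key check. It assumes the ASCII-only rule from TOML 1.0.0, where a bare key is a non-empty run of ASCII letters, digits, `_`, and `-`. `is_bare_key` is a hypothetical helper, not part of this crate, and it deliberately does not cover any wider non-ASCII ranges that a key like `½` would need.

```rust
/// Hypothetical helper (not this crate's API): returns true if `key` is a
/// valid bare key under the ASCII-only rule assumed from TOML 1.0.0.
fn is_bare_key(key: &str) -> bool {
    !key.is_empty()
        && key
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

fn main() {
    // The first three keys come from the example document above.
    for key in ["1key", "-inf", "2024-04-30", "½", "has space"] {
        println!("{key:?} -> {}", is_bare_key(key));
    }
}
```

Under this rule a key may start with a digit or a hyphen, so a parser that decides "number" or "date" from the first character alone will reject valid keys.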

@lkirkwood lkirkwood marked this pull request as ready for review April 29, 2024 16:56
Collaborator

@knickish knickish left a comment


Generally looks pretty nice, thanks for the PR. If we can find a more readable way to handle the matching (without trashing codegen), I would prefer that.

src/toml.rs Outdated
#[allow(unreachable_patterns)]
match self.cur as u32 {
// ,
0x2C => {
Collaborator


I would really prefer if these stayed as chars instead of converting to u32 first. It might make the codegen a tiny bit worse, but I really doubt it.
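
As a rough sketch of what matching on `char` directly could look like, the example below uses character literals and inclusive ranges in place of `u32` code points such as `0x2C`. The `Kind` enum, the `classify` function, and the chosen ranges are all hypothetical and for illustration only; they are not the code from this PR.

```rust
/// Hypothetical token classes for illustration; not the crate's real types.
#[derive(Debug, PartialEq)]
enum Kind {
    Comma,
    Equals,
    Ident,
    Other,
}

/// Classify a character by matching on `char` directly, with no cast to `u32`.
fn classify(c: char) -> Kind {
    match c {
        ',' => Kind::Comma,
        '=' => Kind::Equals,
        'A'..='Z' | 'a'..='z' | '0'..='9' | '_' | '-' => Kind::Ident,
        // Non-ASCII ranges can be written as escapes when the glyphs do not
        // render in a given font. U+00BC..=U+00BE are the fractions ¼ ½ ¾;
        // treating them as identifier characters is an assumption here.
        '\u{BC}'..='\u{BE}' => Kind::Ident,
        _ => Kind::Other,
    }
}

fn main() {
    for c in [',', '=', 'k', '1', '-', '½', ' '] {
        println!("{c:?} -> {:?}", classify(c));
    }
}
```

Range patterns over `char` are ordinary Rust patterns, so the match keeps the same shape as the `u32` version while letting ASCII arms read as the characters they stand for.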

Contributor Author


Absolutely, will do.

Sorry, but I'm a bit confused: by codegen, are you referring to generating the match statement from the ABNF?

@lkirkwood
Contributor Author

Is this better? I would add comments with the characters next to each range in the macro, but many don't display in my system font. I can still do so if you feel it would help.

@knickish
Collaborator

No, I think that looks great. I'll give it a day or so to see if @not-fl3 has any input; if not, I'll merge soon. Thanks for fixing this up!

@lkirkwood
Contributor Author

No problem, thanks for the great project!

@not-fl3
Owner

not-fl3 commented Apr 30, 2024

LGTM! Honestly I just don't have much of an opinion on TOML, feel free to merge!

@knickish
Collaborator

Merging it is, then. Thanks, @lkirkwood!

@knickish knickish merged commit 2f538fa into not-fl3:master Apr 30, 2024
8 checks passed
Development

Successfully merging this pull request may close these issues.

TomlParser doesn't allow keys with 0-9 in them
3 participants