More parser examples #465
Comments
I agree that such a cookbook would be nice, but I assume that at this point nobody will have the capacity to really put something like that together. One real-life example of a larger pb2 parser construct is akka-http's header parsers, here: The UriParser, for example... But these won't really help you with your problem. In general, "real" parsers shine when you have languages (or problems) that allow for recursion, because these are inherently difficult or even impossible to parse with simpler approaches like regular expressions. Also, if you know your input is always correctly formatted and will not contain syntax errors that need to be properly reported to the user, things become much easier. (As in your case.)
Thanks for the write-up! Indeed, the akka-http parsers are good examples. Perhaps not for my specific problem, but possibly for somebody else's. Regarding the effort of writing a cookbook: I share your concerns, but I think a little can already go a long way. Simple things like "how would I write this RegEx in Parboiled2?". Imagine a table, e.g.,
Or a simple parser showing how to translate this scala code: val s = """abc match
|def
|ghi match
|jkl match
|mno""".stripMargin.split("\n").filter(_.contains("match")).map(l => l.split(" ").head) into an equivalent Parboiled2 parser. Little things from which people can gather a more hands-on understanding of how a PEG parser, and parboiled2 in particular, can be used. If this has a chance, then it could live in the wiki or the discussions tab: some place where people can collaborate. If you still think it's a lost cause, then I shall curb my optimism and not speak of it any further ;) Otherwise: I do have one or two small parsers to contribute.
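For readers who land here with the same question, here is a minimal sketch of how that snippet might translate. It is not taken from the parboiled2 documentation: the class and rule names (`MatchHeadParser`, `Lines`, `Line`, `ContainsMatch`, ...) are made up for illustration, and it assumes parboiled2 2.x is on the classpath. A lookahead checks whether the line contains "match" anywhere, mirroring `_.contains("match")`, and the first space-delimited token is captured, mirroring `split(" ").head`.

```scala
import org.parboiled2._
import scala.collection.immutable.Seq

// Hypothetical parser: for every line containing "match", keep its first token.
class MatchHeadParser(val input: ParserInput) extends Parser {

  // Lines are separated by '\n'; lines without "match" yield None and are dropped.
  def Lines: Rule1[Seq[String]] = rule {
    oneOrMore(Line).separatedBy(ch('\n')) ~ EOI ~>
      ((heads: Seq[Option[String]]) => heads.flatten)
  }

  // First alternative: the line contains "match" (positive lookahead), so capture its head.
  // Second alternative: consume the line and push an empty Option.
  def Line: Rule1[Option[String]] = rule {
    (&(ContainsMatch) ~ capture(FirstToken) ~ RestOfLine ~> ((head: String) => Option(head))) |
      (RestOfLine ~ push(Option.empty[String]))
  }

  // Scan forward within the current line until "match" starts.
  def ContainsMatch: Rule0 = rule { zeroOrMore(!str("match") ~ NotNewline) ~ str("match") }

  // Everything up to the first space, like split(" ").head.
  def FirstToken: Rule0 = rule { zeroOrMore(noneOf(" \n")) }

  def RestOfLine: Rule0 = rule { zeroOrMore(NotNewline) }

  def NotNewline: Rule0 = rule { noneOf("\n") }
}
```

Calling `new MatchHeadParser(s).Lines.run()` returns a `Try`; for the five-line sample above it should succeed with the heads `abc`, `ghi` and `jkl`. The `Option` per line is one design choice among several; skipping non-matching lines with a `Rule0` would work too.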
Ok, I see. I for my part find PEG parsing much easier to understand than other approaches like LL and definitely LR.
Ok, I wasn't sure how interchangeable PEG parsers would be. Do you know of any such cookbook? I know that Fastparse has a few examples, and I assume there is quite a bit of material out there for Scala parser combinators. But those support things that Parboiled can't do (for instance, PCREs), so they are probably not as helpful.
I'd appreciate a larger list of example parsers. There is the CSV parser and JSON parser, as well as the two calculator parsers and the ABC parser, of course. However, given sufficiently low caffeine levels, these and the documentation are not enough. I expect that my parser-writing endeavors would be helped by more examples to draw inspiration from.
Is there some place where people contribute their Parboiled2 parsers? GitHub's discussion tab or the wiki could be a great place for this. ANTLR has its own repository with lots of examples.
Compare:
To provide some context as to what kind of problem I am trying to solve, imagine sbt's "inspect tree" output:
Say you are interested in the tasks and settings, e.g., some-project/*:packModuleEntries::streams, perhaps their indentation depth, and don't care about the rest. There are multiple lines, not all of which contain mention of a task or setting. Is it better to treat it as one big string and write a parser such as?
Or is it better to treat it as individual lines, remove the [info] bit in the front separately, maybe even apply another rule on top of that now cleaned-up line using a subparser or the ~> operator? When dealing with separate lines, how does one best filter out lines that do not contain a task or setting, such as the first few lines and the fourth-to-last line that only contains a | ? I think there are multiple valid approaches to this, and I write parsers infrequently enough that it feels more like an uphill battle to figure this out vs. just using RegEx, .filter, etc., again.
In summary, a cookbook where users could contribute small examples sounds (to me) like a great idea. Especially for slightly messy inputs. What are your thoughts?
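To make the line-by-line option concrete, here is a minimal sketch using only the Scala standard library, not parboiled2. The names `InspectTreeLines`, `Entry` and `parseLine` are made up. The line shape is also an assumption: an optional `[info] ` prefix, then tree-drawing characters (spaces, `|`, `+`, `-`), then a key that contains a `:` and ends at the first space.

```scala
object InspectTreeLines {
  // A task/setting key together with the column (after the prefix) at which it starts.
  final case class Entry(indent: Int, key: String)

  // Optional "[info] " prefix, then tree-drawing characters, then a run of non-space
  // characters as the candidate key; the rest of the line is ignored.
  private val Line = """^(?:\[info\] ?)?([ |+\-]*)(\S+).*$""".r

  // Lines whose candidate key has no ':' (log messages, lone '|') are dropped.
  def parseLine(line: String): Option[Entry] = line match {
    case Line(tree, key) if key.contains(":") => Some(Entry(tree.length, key))
    case _                                    => None
  }

  def parse(output: String): List[Entry] =
    output.split("\n").toList.flatMap(parseLine)
}
```

Under these assumptions, `InspectTreeLines.parseLine("[info]   +-some-project/*:packModuleEntries::streams")` gives `Some(Entry(4, "some-project/*:packModuleEntries::streams"))`, while a line holding only `[info]   | ` gives `None`. The same split could be expressed as a per-line parboiled2 rule; this version only shows the filtering idea.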