-
This looks really cool. Have you made a Python parser yet? Are there any whitespace-sensitive languages that use tabs over spaces? Aren't tabs usually implementation-dependent?
-
As a proponent of tabs (not religiously though, purely rational), I hope it will be possible to use tabs for indentation.
-
@joshmarinacci Any thoughts on the question about whether
-
Is this example still available? I can't really figure out how the example in this post works, so having an example to experiment with will help a lot :)
-
Initial thoughts from the write up (I'll get to trying it out in a couple days)
-
See the excellent talk "The Post JavaScript Apocalypse" by Douglas Crockford (creator of json.org and author of "JavaScript: The Good Parts"): https://www.youtube.com/watch?v=NPB34lDZj3E His main point is that in a well-designed system it should be obvious how to do the right thing, and that the things that "don't spark joy" are the ones that lead to conflict.
I think that he misses the mark. So my vote is for TABS for code nesting. And also I recommend watching the talk.
-
As part of the upcoming Ohm v17 release, I've been working on a new feature — support for indentation-sensitive grammars. I'd love to get some feedback on what I've got so far.
You can try this out by installing the latest prerelease (`npm install [email protected]`). Below is some documentation to get you started. Any and all feedback is welcome!
Parsing indentation-sensitive languages
As of v17, Ohm supports indentation-sensitive languages. This means that it's possible to write Ohm grammars for languages like Python and YAML.
Background
The Ohm language is based on parsing expression grammars (PEGs), and pure PEGs can't express indentation sensitivity. The usual trick is to pre-process the input and insert explicit `indent` and `dedent` tokens, then parse the modified output. While this works, it has a few downsides.
For these reasons, we decided to add built-in support for indentation-sensitive languages.
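To make the pre-processing trick concrete, here is a minimal sketch in plain JavaScript. It is not part of Ohm: the function name `insertIndentTokens` and the `<INDENT>`/`<DEDENT>` token spellings are made up for illustration, and it assumes every leading whitespace character (space or tab) counts as one column.

```javascript
// Hypothetical helper, NOT part of Ohm: rewrites leading whitespace into
// explicit tokens so that a plain PEG can parse the result.
// Assumption: each leading whitespace character (space or tab) is one column.
function insertIndentTokens(source, indentToken = '<INDENT>', dedentToken = '<DEDENT>') {
  const stack = [0]; // indentation widths of the currently open blocks
  const out = [];
  for (const line of source.split('\n')) {
    if (line.trim() === '') {
      out.push(line); // blank lines never change the nesting level
      continue;
    }
    const width = line.length - line.trimStart().length;
    let prefix = '';
    if (width > stack[stack.length - 1]) {
      // Deeper than the enclosing block: open a new block.
      stack.push(width);
      prefix = indentToken;
    } else {
      // Shallower: close blocks until we are back at a known level.
      while (width < stack[stack.length - 1]) {
        stack.pop();
        prefix += dedentToken;
      }
      if (width !== stack[stack.length - 1]) {
        throw new Error(`Inconsistent dedent at: ${JSON.stringify(line)}`);
      }
    }
    out.push(prefix + line.trimStart());
  }
  // Close any blocks that are still open at the end of the input.
  let tail = '';
  while (stack.length > 1) {
    stack.pop();
    tail += dedentToken;
  }
  return out.join('\n') + tail;
}

console.log(insertIndentTokens('- a\n  - b\n- c'));
```

Note that the tokens occupy real characters in the rewritten text, so offsets in the modified text no longer match offsets in the original input.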
Making indentation-sensitive grammars
To define an indentation-sensitive language, create a grammar that inherits from `IndentationSensitive`. For example, here is a grammar for a language that supports nested lists of bullet points:
Implementation details
The `indent` and `dedent` rules are primitive rules defined by `IndentationSensitive`. You can think of them as special characters that are automatically inserted at the appropriate points, except that they take up no width in the input stream. They are inserted immediately after the associated indentation characters at the beginning of the line. For example, here is some Python code, with comments indicating where the indents and dedents are inserted:
There is also a final dedent at the end of the input.
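As a rough illustration of the placement rule described above, the following sketch computes where zero-width indent and dedent markers would fall in an input string. It is not Ohm's actual implementation: `findIndentationPoints` is a hypothetical name, each leading whitespace character is assumed to count as one column, and inconsistent dedents are not checked.

```javascript
// Hypothetical illustration, NOT Ohm's implementation: reports the offsets at
// which zero-width indent/dedent markers would sit, without changing the input.
function findIndentationPoints(source) {
  const stack = [0]; // indentation widths of the currently open blocks
  const points = []; // entries of the form {offset, kind}
  let offset = 0;    // offset of the start of the current line
  for (const line of source.split('\n')) {
    if (line.trim() !== '') {
      const width = line.length - line.trimStart().length;
      const pos = offset + width; // immediately after the leading whitespace
      if (width > stack[stack.length - 1]) {
        stack.push(width);
        points.push({offset: pos, kind: 'indent'});
      } else {
        while (width < stack[stack.length - 1]) {
          stack.pop();
          points.push({offset: pos, kind: 'dedent'});
        }
      }
    }
    offset += line.length + 1; // +1 for the '\n' removed by split
  }
  // Any blocks still open are closed by a final dedent at the end of the input.
  while (stack.length > 1) {
    stack.pop();
    points.push({offset: source.length, kind: 'dedent'});
  }
  return points;
}

const python = 'if x:\n    y = 1\nz = 2';
console.log(findIndentationPoints(python));
```

For the Python snippet above, the indent lands just before `y` and the dedent just before `z`. Because the markers are reported as positions rather than inserted as text, every offset still refers to the original input.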
Examples
See examples/indentation-sensitive for an example you can experiment with.
Notes and open questions
- Should `any` be able to consume an indent or dedent?
  - [...] `any*` to consume all of the remaining input. This wouldn't be true anymore if `any` failed on indent/dedent.
  - [...] `any` succeeds, but consumes no actual input characters?
  - [...] `any` to consume indent/dedent; you should probably consume those explicitly.