JSON encoding of Loyc trees #104

Open
qwertie opened this issue Apr 22, 2020 · 1 comment

qwertie commented Apr 22, 2020

@vladimir-vg mentioned he wanted to store syntax as "JSON in text file... file tree in Git or tree in IPFS". I'm not sure about those last two, but it would make sense to standardize a JSON representation of Loyc trees, so I'm sketching out a proposal. The proposal mainly takes the form of a series of examples showing how a given bit of LES3 code is represented in JSON.

Identifiers

| LES3 | JSON | Comment |
| --- | --- | --- |
| `Hello` | `"Hello"` | Strings = identifiers in JSON |
| ``` `` ``` | `""` | The empty identifier is the empty string |
| `` `\t\0\n\u1234` `` | `"\t\0\n\u1234"` | Escape sequences are largely the same |
| `` `\u01F4A9.\u10FFFF` `` | `"\uD83D\uDCA9.\uDBFF\uDFFF"` | Astral characters are surrogate pairs in JSON |
| `` `\xFF.\uD800` `` | `"\xFF.\uD800"` | Invalid UTF-8 bytes are transliterated to 0xDCxx characters. High surrogates (0xD800..0xDBFF) are left alone. |
| `#` | `"+#"` | Single-character strings with an ASCII code of 64 or less are reserved for special purposes. Use a `+` prefix (#98) to define a single-character identifier with one of these values. |
| `#if` | `"#if"` | This rule does not affect multi-char identifiers |
| `` `'+` `` | `"'+"` | This rule does not affect normal operators |
| `_` | `"_"` | This rule does not affect normal identifiers such as `_` (ASCII 95) |
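
The reserved-character rule above can be sketched in a few lines of Python. This is only an illustration: the function names `encode_identifier` and `decode_identifier` are hypothetical, and the decoder assumes that any two-character string made of `+` followed by a reserved character denotes the one-character identifier (the proposal does not say how an identifier literally named `+#` would be written).

```python
import json


def encode_identifier(name: str) -> str:
    """Return the JSON string value for a Loyc identifier name."""
    # Single-character strings with a code of 64 or less are reserved,
    # so such identifiers get a "+" prefix.
    if len(name) == 1 and ord(name) <= 64:
        return "+" + name
    return name


def decode_identifier(text: str) -> str:
    """Return the identifier name for a JSON string value."""
    if len(text) == 1 and ord(text) <= 64:
        raise ValueError(f"{text!r} is reserved and is not an identifier")
    # Assumption: "+" followed by one reserved character is the prefixed form.
    if len(text) == 2 and text[0] == "+" and ord(text[1]) <= 64:
        return text[1]
    return text


print(json.dumps(encode_identifier("#")))    # "+#"
print(json.dumps(encode_identifier("#if")))  # "#if"
print(json.dumps(encode_identifier("_")))    # "_"
```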

Literals

| LES3 | JSON | Comment |
| --- | --- | --- |
| `x"hi!"` | `{"x": "hi!"}` | In general, literals become objects with one prop; the key is a "type marker" |
| `"hi!"` | `{"": "hi!"}` | The empty type marker represents a string |
| `` `@`"hi!" `` | `{"@": "hi!"}` | As usual, unusual type markers are allowed |
| `123` | `123` | JSON number => assume type marker is `"_"` |
| `123.0` | `{"_": "123.0"}` | JSON parsers may ignore the difference between `123` and `123.0`. If a floating-point number is an integer, it should be stored in string form |
| `1234f` | `{"_f":"1234"}` | The type marker starts with `_` for all numbers |
| `1234f` | `{"_f":1234}` | In JSON, the property value can be a number |
| `123` | `{"_":123}` | As usual, it can be stored in object form instead |
| `true` | `true` | True and false are themselves (type marker `bool`) |
| `true` | `{"bool":"true"}` | Same thing in cumbersome form |
| `null` | `null` | Null is itself |
| `null` | `{"null":""}` | Null in cumbersome form |
| `json"{\"x\":123}"` | `{"json":{ "x": 123 }}` | Special case: a JSON string holding an object can be stored as an object |
| `json"[\"x\", 123]"` | `{"json":["x", 123]}` | Special case: a JSON string holding an array can be stored as an array |
| `json"{\"x\":123}"` | `{"json":"{\"x\":123}"}` | Using special cases is optional |
| `json"{x:123}"` | `{"json":"{x:123}"}` | This cannot be stored in object form, because `{x:123}` is not valid JSON |

Note that general JSON objects like { "x":1, "y":2 } have no interpretation above, and serve as an indicator that the JSON file does not represent a Loyc tree.
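
As a rough illustration of the literal rules, the following Python sketch classifies an already-parsed JSON value. The function name `decode_literal` and its return convention (a `(type_marker, value)` tuple, or `None` for strings and arrays, which are not literals) are assumptions made for this example, not part of the proposal. It returns the raw value as stored and does not parse number strings or the `json` special cases.

```python
import json


def decode_literal(node):
    """Return (type_marker, value) if node encodes a literal, else None."""
    if node is None:
        return ("null", None)
    # bool must be tested before int, since bool is a subclass of int in Python.
    if isinstance(node, bool):
        return ("bool", node)
    if isinstance(node, (int, float)):
        return ("_", node)  # bare JSON number: type marker "_"
    if isinstance(node, dict):
        if len(node) != 1:
            # General objects have no interpretation in the proposal.
            raise ValueError("object does not represent a Loyc tree")
        (marker, value), = node.items()
        return (marker, value)
    return None  # strings (identifiers) and arrays are not literals


print(decode_literal(json.loads('{"_f": "1234"}')))  # ('_f', '1234')
print(decode_literal(json.loads('123')))             # ('_', 123)
print(decode_literal(json.loads('"Hello"')))         # None
```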

Calls

| LES3 | JSON | Comment |
| --- | --- | --- |
| `foo()` | `["foo"]` | Calls are arrays |
| `1234(z)` | `[1234, "z"]` | As usual, literals can be called |
| `x"hi!"(z)` | `[{"x":"hi!"}, "z"]` | As usual, literals can be called |
| `foo(x, 2, null)` | `["foo", "x", 2, null]` | Call with 3 arguments |
| `x + 2` | `["'+", "x", 2]` | As usual, operators are identifiers with an apostrophe prefix |
| `{ }` | `["'{}"]` | As usual, braced block is a call to `'{}` |
| `#foo(42)` | `["#foo", 42]` | As usual, there's nothing special about `#` |
| `.foo 42` | `["#foo", 42]` | Remember, LES3's dot-notation means `#` |
| `{ "x": 123 }` | `["'{}", ["':", {"":"x"}, 123]]` | JSON stored in a Loyc tree is ugly when saved in JSON |
| `["x", 123]` | `["'[]", {"":"x"}, 123]` | JSON stored in a Loyc tree is ugly when saved in JSON |
| `foo(x)(y)` | `[["foo", "x"], "y"]` | As usual, complex targets are possible |
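
To make the call encoding concrete, here is a small Python sketch that renders a parsed JSON value in plain prefix notation for debugging. The name `to_prefix` is hypothetical, the output is not meant to be valid LES3 (operators such as `'+` are printed as-is), and the sketch assumes an empty array is an error because a call needs a target. It covers identifiers, literals and calls only; reserved strings such as `"@"` are rejected rather than interpreted.

```python
import json


def to_prefix(node) -> str:
    """Render identifiers, literals and calls as target(arg, ...) text."""
    if isinstance(node, str):
        if len(node) == 1 and ord(node) <= 64:
            raise ValueError(f"reserved string {node!r} is not handled here")
        return node  # identifier
    if isinstance(node, list):
        if not node:
            # Assumption: the first element is the call target, so [] is invalid.
            raise ValueError("a call array needs a target")
        target, *args = node
        rendered = ", ".join(to_prefix(arg) for arg in args)
        return f"{to_prefix(target)}({rendered})"
    return json.dumps(node)  # literal, shown in its JSON form


print(to_prefix(json.loads('["foo", "x", 2, null]')))  # foo(x, 2, null)
print(to_prefix(json.loads('[["foo", "x"], "y"]')))    # foo(x)(y)
print(to_prefix(json.loads('["\'+", "x", 2]')))        # '+(x, 2)
```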

Attributes

| LES3 | JSON | Comment |
| --- | --- | --- |
| `@Foo X` | `["@","Foo","X"]` | In general, attributes are attached via arrays that start with the magic string `"@"` |
| `@x foo()` | `["@","x",["foo"]]` | (which, as mentioned before, is not an identifier) |
| `@x @y(z) foo` | `["@","x",["y", "z"],"foo"]` | There can be multiple attributes. The final item is the tree to which the attributes are attached. |
| `@123 X` | `["@",123,"X"]` | As usual, attributes can be any Loyc tree including literals. |
| `/*comment*/ X` | `["@",["%MLComment","comment"],"X"]` | Trivia are attached in the standard way |
| `foo` | `["@","foo"]` | This is legal, but pointless |
| N/A | `["@"]` | Meaningless and illegal |
| N/A | `"@"` | Meaningless and illegal |

Edited Jan 21, 2021: since no one has reported interest in using the JSON encoding, I've changed parts of the proposal without notice. Most notably, backreferences and attributes now use a more compact encoding. Previously, @a @b foo() would be represented as {"@":["a","b"], "":["foo"]}, but now it's ["@","a","b", ["foo"]].
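
A decoder could separate attributes from their target roughly as follows. This is a minimal Python sketch; the helper name `split_attributes` and its `(attributes, target)` return shape are made up for illustration.

```python
def split_attributes(node):
    """Return (attributes, target) for a parsed JSON node."""
    if isinstance(node, list) and node and node[0] == "@":
        if len(node) < 2:
            raise ValueError('["@"] has no target')
        # Everything between "@" and the final item is an attribute;
        # the final item is the tree the attributes are attached to.
        *attributes, target = node[1:]
        return attributes, target
    if isinstance(node, str) and node == "@":
        raise ValueError('a bare "@" is not an identifier')
    return [], node  # no attribute array: the node is its own target


print(split_attributes(["@", "x", ["y", "z"], "foo"]))  # (['x', ['y', 'z']], 'foo')
print(split_attributes(["@", "foo"]))                   # ([], 'foo')
print(split_attributes(["foo"]))                        # ([], ['foo'])
```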

qwertie commented May 1, 2020

In general, Loyc trees are DAGs (directed acyclic graphs), so I would also propose the following JSON representation for subtree definitions and backreferences.

| LES3 | JSON | Comment |
| --- | --- | --- |
| `@.id tree(a.b.c)` | `["*","id", ["tree",["'.",["'.","a","b"],"c"]]]` | A subtree that also has a name. |
| `@@id` | `["*","id"]` | Backreference to a previously defined subtree. |
| `@.id2 @x tree2()` | `["*","id2", ["@","x", ["tree2"]]]` | Define a subtree with an attribute. |
| `@x @@id2` | `["@","x", ["*","id2"]]` | Refer to a previously defined subtree and attach an attribute. |
| `` `*`(x) `` | `["+*","x"]` | As mentioned above, certain one-character identifiers such as `*` must have `+` prepended to avoid ambiguity. |

Edited Jan 2021 to make the notation more concise. The representation of @.id tree(a.b) changed from {"@@":"id","":["tree",["'.","a","b"]]} to ["*","id", ["tree",["'.","a","b"]]]; the representation of @@id changed from {"@@":"id","":[]} to ["*","id"]. The name * is intended to remind you of pointer notation in C/C#/Rust, since shared subtrees involve duplicate pointers.

Just as in LES3, a subtree definition must appear lexically before any references to it.
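
The definition-before-reference rule means a decoder can resolve backreferences in a single left-to-right pass. The Python sketch below illustrates this; `resolve` is a hypothetical name, and the sketch additionally assumes that ids are strings and that defining the same id twice is an error, neither of which the proposal states. Resolved references point at the same list object as the definition, so sharing is preserved.

```python
def resolve(node, defs=None):
    """Replace ["*", id, subtree] and ["*", id] nodes with the shared subtree."""
    if defs is None:
        defs = {}
    if not isinstance(node, list):
        return node  # identifiers and literals are returned unchanged
    if node and isinstance(node[0], str) and node[0] == "*":
        if len(node) not in (2, 3) or not isinstance(node[1], str):
            raise ValueError(f"malformed '*' node: {node!r}")
        name = node[1]
        if len(node) == 3:  # definition
            if name in defs:
                raise ValueError(f"subtree {name!r} defined twice")  # assumption
            # Register only after resolving the body, so a subtree
            # cannot refer to itself (the graph stays acyclic).
            subtree = resolve(node[2], defs)
            defs[name] = subtree
            return subtree
        if name not in defs:  # backreference
            raise ValueError(f"subtree {name!r} used before its definition")
        return defs[name]
    # Children are visited left to right, matching lexical order.
    return [resolve(child, defs) for child in node]


tree = ["'{}", ["*", "id", ["tree", "a"]], ["f", ["*", "id"]]]
out = resolve(tree)
print(out)                   # ["'{}", ['tree', 'a'], ['f', ['tree', 'a']]]
print(out[1] is out[2][1])   # True
```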
