-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🔄 Redefine schema in terms of @types/mdast
#67
base: main
Are you sure you want to change the base?
Conversation
@@ -4,5 +4,4 @@ A basic Markdown Abstract Syntax Tree, mdast, is defined at [](https://github.co | |||
|
|||
## Deviations commonmark mdast | |||
|
|||
- According to the mdast spec [list items](https://github.com/syntax-tree/mdast#listitem) may only have [flow content](https://github.com/syntax-tree/mdast#flowcontent) children. However, according to the commonmark spec, list items may also be flow or phrasing content, such as [text](https://spec.commonmark.org/0.30/#example-255). This depends on spacing between the list items in the original document. In myst-spec, we choose to follow the commonmark spec and allow `ListItem` children to be `FlowContent` or `PhrasingContent`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is correct w.r.t the CommonMark spec:
- one
two
produces this CommonMark AST:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
<list type="bullet" tight="true">
<item>
<paragraph>
<text>one</text>
</paragraph>
</item>
</list>
<paragraph>
<text>two</text>
</paragraph>
</document>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this saying that we default to having a paragraph instead of text directly in the list item? Which is what is created?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our documentation states that
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
<list type="bullet" tight="true">
<item>
<text>one</text>
</item>
</list>
</document>
is permitted by the CommonMark spec. I don't see any evidence that that is the case. What I did notice is that the HTML renderer doesn't show the <p>
, but I think that might be a rendering difference rather than a spec difference.
I noticed this because whilst the spec page shows . EDIT: oops, off-by-one error :P<li>TEXT</li>
in the inline example, the actual examples file does not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add -- if my conclusion here is incorrect, then typing this proves quite hard because it is not easy/possible to extend an existing type and change the type of children in that manner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow... yeah, wild that the html doesn't show the <p>
despite it being in the AST! Oof, good catch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a good direction. Excited to have proper typescript types.
Docs and a downstream PR in mystmd consuming these changes are the major things to see before merging/releasing.
This reverts commit 815c0fd.
TODO: move to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks great. It sure is nice to define actual typescript types - I think my original reasoning for JSON schema was (a) language agnostic (a bit too lofty a goal so early in the project, I suspect...) and (b) I wasn't super happy with any TS -> JSON options. Looks like you have one that works, though.
I'm approving this, but I agree with Rowan that we should probably consume it in mystmd
before merging/releasing just to iron out any type oddities that may come up (I assume it will substantially improve things and allow us to basically scrub GenericNode
/GenericParent
from the codebase).
This PR redefines the MyST schema in terms of types exposed by
@types/mdast
, and generates the MyST JSON schema from these types.By making this change, we no longer need to vendor the CommonMark schema itself, and we can directly author the types that the rest of the MyST ecosystem consumes (rather than writing indirect JSON schema).
Tasks
Schema
Warning
This PR changes our spec slightly, to rectify breaking changes with MDAST and to make better use of types. These changes are as follows:
Table.align
change fromstring | undefined
tostring[] | undefined | null
(this comes from@types/mdast
crossReference
,mystRole
,mystDirective
change fromNode
toLiteral & Parent
(these attributes already were used, but not at the required-type level)Docs
To build the docs, run
npm run docs:prepare
, and thenmyst start
in thedocs
directory.With this PR, it's now the case that the code (rather than any IDL definition or schema) is the reference spec for working with MyST's AST. Consequently, this PR reworks the docs to focus on people who want to reason about e.g. MyST JSON AST, such as people writing plugins in Python.1
I chose to split each node into its own page:
Complex types (only arrays presently) are moved to footnotes:
This might be a controversial change, but I think on balance it does help readability.
Schema
The downside of this PR is that we have to use https://github.com/YousefED/typescript-json-schema, because of the way that
mdast
types are designed to be extended (via interface declaration merging). This mechanism is not handled by any AST-driven TS -> JSON Schema generators, so we need something that actually invokes the compiler (see above).Footnotes
This is not solely a pragmatic take on who needs the docs -- it's also hard to preserve all syntactic type information when exporting the spec to a non TS format (see also Schema). As we lose types like
Array<BlockContent | DefinitionContent>
(which is simply expanded to an unnamed literal intersection), I opted to focus exclusively on the parts of the AST that end up in JSON, i.e. node types, rather than e.g. named array types. ↩