Skip to content

Allow namespaces in tag names. #123

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

ioquatix
Copy link

@ioquatix ioquatix commented May 18, 2020

Is there any documentation about how to run re2c to minimise the scanners.c diff?

@kivikakk
Copy link
Collaborator

These come from the upstream cmark project, which we track. Its Makefile includes a rule for regenerating scanners.c.

This change should technically be directed to that project, as we copy changes from it via cmark-gfm, but I don't believe any current Commonmarker maintainer has write access to cmark-gfm at present, so there's no guarantee any upstream cmark changes will make it through in a timely fashion. A change to scanners.re and scanners.c in Commonmarker would be lost next time we sync with cmark-gfm.

This should perhaps first be proposed as an update to the spec itself, which defines HTML tags and is responsible for the tagname definition there.

@gjtorikian What do you think of the current situation? As it stands I'm not sure cmark-gfm is maintained, which would mean we lose the ability to track upstream cmark changes.

@ioquatix
Copy link
Author

I'm happy to make a proposal to change the spec but I have no idea where to begin, and maybe it's rude but I just want this one feature to work, it's a blocker for me. I can just make a fork, but that doesn't seem like nice behaviour TBH.

@movermeyer
Copy link

I would like to voice opposition to anything that makes this gem deviate from the Commonmark spec in any way. We're relying on this gem to be a faithful implementation of the spec, with a nice Ruby interface.

It should accept no more, and no less than what the spec says.

@ioquatix
Copy link
Author

Sure, that makes sense, so we should update the spec then?

@kivikakk
Copy link
Collaborator

@ioquatix Yes — it is thankfully not too hard. Fork the commonmark/commonmark-spec repo, and open a PR making the necessary changes to spec.txt — I've linked the Raw HTML section of it here. You'll want to update the paragraph that begins: A [tag name](@) consists of ….

A bit further down, examples of valid and invalid HTML tags follow. Add some examples showing how namespaces should work, i.e. that a tag like <a:b> should be passed through unchanged, not translated to &lt;a:b&gt;. Once accepted and included in a new version, compliant implementations will be out of spec until the new examples pass.

@ioquatix
Copy link
Author

@kivikakk thanks for your help.

@kivikakk
Copy link
Collaborator

@ioquatix You're welcome! Feel free to @mention me in the new PR and I'll do what I can to help shepherd it along.

@gjtorikian
Copy link
Owner

What do you think of the current situation? As it stands I'm not sure cmark-gfm is maintained, which would mean we lose the ability to track upstream cmark changes.

It's a bad situation, to be sure. I doubt that cmark-gfm will be modified in the future; it does what it needs to for GitHub. I keep thinking about moving the backend for this gem to Rust (#107), both for added safety and for more incremental improvements, but I haven't had a chance to look at those implications yet.

Regardless, for this specific change, I'm afraid it will have to adhere first to the Commonmark spec (which takes precedence over GitHub's extensions spec). There's already a discussion for it on commonmark/commonmark-spec#648 which suggests it's not a simple change unfortunately. When/if it gets in there I'd be open to including it here (with tests). Hopefully it's a small enough change whereby if GitHub does update their C code for something else we can apply the regex diff here. (Or maybe by then we'll stop using their repo.)

We're relying on this gem to be a faithful implementation of the spec, with a nice Ruby interface.

PS If Shopify is using this gem, please consider sponsoring it.

@gjtorikian gjtorikian closed this Aug 11, 2020
@ioquatix
Copy link
Author

Just in case anyone stumbles across this issue, I made a fork of commonmarker and reworked a bunch of the C code to be more parser centric.

https://github.com/ioquatix/markly/

It supports tags with XML namespaces and almost passes the cmark spec, except for one failure which is questionable on the part of the spec. I also found that HTML5 definition of how to parse a tag name is at odds with cmark, so I suspect change will eventually happen.

@ioquatix ioquatix deleted the namespaces branch August 12, 2020 01:18
Copy link

@crimscim crimscim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hei

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants