Make attribute value and text content escaping more conforming #304
To match `XMLSerializer` for XML documents and `element.outerHTML` for HTML documents.
I think this is a breaking change … and title doesn't need escaping, otherwise it ends up serialized as an escaped title … If you're after JSDOM compliance, I think you should stop trying to make LinkeDOM JSDOM compliant. But if you need these changes for real-world reasons that aren't testing related, please be careful with the suggested changes, because this library is not just a mock target: it can render in workers and edge cloud scenarios.
To clarify, this is a perfectly valid `<title>1 <and>&</and> 2</title>`; it's a special node that acts like
My general use case for the three PRs i have opened is actually the same one. I'm doing SSR of XHTML and SVG documents, and until now i was indeed using JSDOM for this. However, i'm starting to face performance issues, as JSDOM implements many things i don't need. Basically all i need is parsing, basic DOM manipulation (setting attributes and adding nodes) and serializing the whole thing, and LinkeDOM seems to fit my needs. Thanks for making it! The data i generate is versioned though, so while this is not about testing, i do want to minimize the diffs, and together the PRs i have opened avoid any diff on my data. That said, the particular issues i'm trying to fix with this PR are:
While looking at the spec, i also fixed other escaping cases i encountered.
this is the key and reason this project is fast ... so let's talk about this: why would you need attribute insertion order, for example, or why should I add complexity for things nobody practically cares about? I am afraid I'll push back on anything that isn't really interesting, relevant, or solving something useful ... there is a bug? PR welcome ... there is a not 100% strict behavior nobody cares about or uses in the real world? PR not welcome, as I don't care about maintaining those things and neither should you.
I think there are different things to consider here:
So if we could simplify and split the MR into separate, precise fixes with tests, that'd be awesome ... otherwise it'll go stale
Fixes #216
Supersedes and closes #217
The HTML spec defines which characters to escape when serializing HTML. This got changed recently to also escape `<` and `>` in attribute values. This aligns with the DOM Parsing and Serialization spec, which defines its own rule for escaping attribute values and for escaping text nodes, both requiring these two characters to always be escaped.
HTML serialization also requires non-breaking spaces (U+00A0) to be escaped as `&nbsp;`.
Also, browsers de facto escape tabs, line feeds and carriage returns as character references in attribute values when serializing XML (there is no spec for this, but there is a web platform test). JSDOM does this too.
This PR implements those requirements.
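As a rough illustration of the rules above, here is a minimal sketch of the four escaping functions. The names (`escapeHtmlText`, `escapeHtmlAttribute`, `escapeXmlText`, `escapeXmlAttribute`, `replacer`) are hypothetical and are not taken from the LinkeDOM codebase, and the hexadecimal form of the whitespace character references is an assumption, since a serializer may emit another numeric form.

```javascript
// Hypothetical helpers for illustration; these are not LinkeDOM's actual functions.
const HTML_ENTITIES = {
  '&': '&amp;', '\u00A0': '&nbsp;', '<': '&lt;', '>': '&gt;', '"': '&quot;'
};
const XML_ENTITIES = {
  '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;',
  // Assumption: whitespace is written as hexadecimal character references.
  '\t': '&#x9;', '\n': '&#xA;', '\r': '&#xD;'
};

// Builds a function that replaces every character matched by `pattern`
// with its entry in `table`, in a single pass over the string.
const replacer = (pattern, table) => value =>
  String(value).replace(pattern, ch => table[ch]);

// HTML: `&`, U+00A0, `<`, `>` everywhere, plus `"` in attribute values.
const escapeHtmlText = replacer(/[&\u00A0<>]/g, HTML_ENTITIES);
const escapeHtmlAttribute = replacer(/[&\u00A0<>"]/g, HTML_ENTITIES);

// XML: `&`, `<`, `>` everywhere, plus `"`, tab, LF and CR in attribute values.
const escapeXmlText = replacer(/[&<>]/g, XML_ENTITIES);
const escapeXmlAttribute = replacer(/[&<>"\t\n\r]/g, XML_ENTITIES);
```

With these definitions, `escapeHtmlAttribute('a < b')` returns `a &lt; b`, while `escapeXmlText('a\u00A0b')` leaves the non-breaking space untouched, since only HTML serialization escapes it.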
This PR also brings another fix about `<title>` elements, which used to be considered a special case when serializing in this codebase, while i couldn't find anything in the spec about this (the `<title>` element is not in this list). Instead, we customize its `innerHTML` setter to escape everything, which prevents child elements from being added to it through this property (for HTML documents only).
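The intended `<title>` behavior can be pictured with a toy model. This is only a sketch under simplifying assumptions: `ToyTitleElement` is a hypothetical class, not LinkeDOM's implementation, and it stores the assigned markup verbatim as text (a real HTML parser would also decode character references such as `&amp;`).

```javascript
// Toy model for illustration only; not LinkeDOM's implementation.
const TEXT_ENTITIES = {'&': '&amp;', '\u00A0': '&nbsp;', '<': '&lt;', '>': '&gt;'};

class ToyTitleElement {
  constructor() {
    this.textContent = '';
  }

  // Assigned markup is kept as plain text, so no child element is ever created.
  set innerHTML(markup) {
    this.textContent = String(markup);
  }

  // Serialization escapes the stored text like any other text node.
  get innerHTML() {
    return this.textContent.replace(/[&\u00A0<>]/g, ch => TEXT_ENTITIES[ch]);
  }

  get outerHTML() {
    return `<title>${this.innerHTML}</title>`;
  }
}

const title = new ToyTitleElement();
title.innerHTML = '1 <and>&</and> 2';
console.log(title.outerHTML);
// <title>1 &lt;and&gt;&amp;&lt;/and&gt; 2</title>
```

The point of the sketch is that the special handling lives in the setter, so the serializer can treat `<title>` like any other element.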