This repository has been archived by the owner on Jul 31, 2023. It is now read-only.

Need an "XML Encode" option. #10

Open
garretwilson opened this issue Aug 10, 2016 · 1 comment

Comments

@garretwilson

escape-utils needs an option to only escape those characters that are recognized by an XML parser; specifically:

  • <
  • >
  • &
  • '
  • "

https://www.w3.org/TR/xml/#sec-predefined-ent
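As a rough illustration of the behavior being requested, here is a minimal Python sketch that escapes only the five characters listed above. The names `XML_PREDEFINED` and `xml_encode` are hypothetical and are not part of escape-utils; this is just one way such an option could behave, not the plugin's implementation.

```python
# The five characters that map to entities predefined by XML 1.0.
# (Hypothetical names; not taken from escape-utils.)
XML_PREDEFINED = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    "'": "&apos;",
    '"': "&quot;",
}

# str.translate replaces each character in a single pass, so the "&"
# that is introduced by one replacement is never escaped a second time.
_XML_TABLE = str.maketrans(XML_PREDEFINED)


def xml_encode(text: str) -> str:
    """Escape only the XML-predefined characters; leave everything else as-is."""
    return text.translate(_XML_TABLE)
```

Anything outside those five characters, including accented letters, passes through unchanged.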

A little background: the full spectrum of traditional HTML entities (including e.g. é) is available to an HTML parser in a non-XML HTML document. But if I create an XHTML document (which is essentially a normal HTML document that adheres to the stricter rules of XML), only the XML entities above are recognized, unless the XHTML document itself defines the other entities (or pulls in a DTD that defines them).

So if I have the following text:

<p>touché</p>

escape-utils will HTML-encode that to:

&lt;p&gt;touch&eacute;&lt;/p&gt;

That works for a plain HTML document, but will break an XML parser, as &eacute; would be undefined. An "XML Encode" option would give me this:

&lt;p&gt;touché&lt;/p&gt;
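The difference between the two encodings can be checked with Python's standard-library XML parser. This is only a sketch: `parses_as_xml` is a hypothetical helper, and wrapping the fragment in a `<root>` element with no DTD is an assumption made so that no extra entities are defined.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is well-formed inside a DTD-less root element."""
    try:
        ET.fromstring(f"<root>{fragment}</root>")
        return True
    except ET.ParseError:
        # Raised for, among other things, a reference to an undefined entity.
        return False


# The two encodings of <p>touché</p> discussed above (\u00e9 is é).
html_encoded = "&lt;p&gt;touch&eacute;&lt;/p&gt;"
xml_encoded = "&lt;p&gt;touch\u00e9&lt;/p&gt;"

print(parses_as_xml(html_encoded))  # the &eacute; reference is undefined here
print(parses_as_xml(xml_encoded))   # only predefined entities are used
```

With no DTD in play, the parser rejects the first string because of `&eacute;` and accepts the second.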

Even in plain HTML, I may not want the é encoded; after all, UTF-8 can handle accents just fine now. All those HTML entities for Latin characters were added when we were using plain ASCII to create HTML files, and editors didn't support UTF-8 and Unicode.

So please add an "XML Encode" option. It would work exactly the same as the "HTML Encode" option, except the list of entities would be restricted to those predefined XML entities listed at the beginning of this issue. (Obviously we would need a corresponding "XML Encode Maintain Lines" option as well.) Thanks.

@PiotrCzapla
Contributor

Hi, that makes sense. I don't think I will find time to add this anytime soon, but feel free to provide a patch.
