Replace jquery parseHTML with native alternative #474

harryadel · 2025-01-06T13:43:25Z

Decoupling jQuery and Blaze would take substantial effort. jQuery is used in many places:

HTML Parsing

DOMBackend.parseHTML = function (html) {
  return $jq.parseHTML(html, DOMBackend.getContext()) || [];
};

DOM Selection

findBySelector: function (selector, context) {
  return $jq(selector, context);
}

Element Teardown Detection

$jq.event.special[DOMBackend.Teardown._JQUERY_EVENT_NAME]

Event Delegation/Handling

delegateEvents: function (elem, type, selector, handler) {
  $jq(elem).on(type, selector, handler);
}

They're ranked in terms of ease of replacement and impact. The challenge lies mostly in testing and ensuring cross-browser compatibility, so it's best to merge each change individually, do a minor release, test, then repeat until everything is merged. Then we can do a major release.
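To give a rough idea of what a later step could look like, here is a minimal sketch of event delegation without jQuery. The name delegateEventsNative is hypothetical and not part of Blaze. The sketch only calls the handler for the nearest matching descendant, while jQuery also handles namespaces and multiple matches along the event path, so treat it as an illustration rather than a drop-in replacement.

```javascript
// Hypothetical helper (not Blaze code): a native stand-in for
// $jq(elem).on(type, selector, handler).
function delegateEventsNative(elem, type, selector, handler) {
  function listener(event) {
    const target = event.target;
    // Text nodes have no closest(), so guard before calling it.
    const match =
      target && typeof target.closest === "function"
        ? target.closest(selector)
        : null;
    // Like jQuery, only fire for descendants of elem, never for elem itself.
    if (match && match !== elem && elem.contains(match)) {
      handler.call(match, event);
    }
  }
  elem.addEventListener(type, listener);
  // Return a function that undoes the binding.
  return function undelegate() {
    elem.removeEventListener(type, listener);
  };
}
```

Returning an undelegate function is a design choice of this sketch; it keeps the listener reference private so callers cannot remove the wrong one.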

I chose to start off with parseHTML. It presents a nice challenge: even if we get it wrong, it won't cause major errors, and it can act as an indicator of whether moving away from jQuery is doable.

This implementation and the official tests were used in constructing the new tests to ensure backwards compatibility. If you remove my code and restore the jQuery parseHTML call, you'll find the tests still pass.

harryadel · 2025-01-07T00:06:25Z

cc @jankapunkt @radekmie @leonardoventurini @nachocodoner @StorytellerCZ

jankapunkt

Thank you so much @harryadel for initiating this one!
I have one thing to discuss regarding the es5 code.

If Blaze depends on ecmascript then you're fine using ES6+ because it will be compiled down (and for browser.legacy also in compatibility mode). On top of that, the classic var is function-scoped, while let and const are block-scoped, which allows for more granular scoping and thus fewer errors.
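A small illustration of the scoping difference, independent of Blaze (the function names are made up for this example): var is scoped to the enclosing function, so every closure in the loop sees the same variable, while let creates a fresh binding per iteration.

```javascript
// With var, all callbacks share one function-scoped i.
function collectWithVar() {
  var callbacks = [];
  for (var i = 0; i < 3; i++) {
    callbacks.push(function () { return i; });
  }
  return callbacks.map(function (fn) { return fn(); });
}

// With let, each iteration gets its own block-scoped i.
function collectWithLet() {
  const callbacks = [];
  for (let i = 0; i < 3; i++) {
    callbacks.push(() => i);
  }
  return callbacks.map((fn) => fn());
}

console.log(collectWithVar()); // [ 3, 3, 3 ]
console.log(collectWithLet()); // [ 0, 1, 2 ]
```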

However, this is just a suggestion and not a demand from my end and I'd like to see what others are saying.

harryadel · 2025-01-07T08:18:29Z

I thought we needed to support IE, no? That's why I opted for ES5 code. If that's not a concern, I could revert to ES6.

jankapunkt · 2025-01-07T09:09:51Z

My understanding is that web.browser.legacy builds in a way that IE is supported. However, I'm not 100% sure that's the case.

StorytellerCZ · 2025-01-07T14:05:26Z

Plus IE is already out of official support anyway so I think we should scratch out IE support.
Right now the biggest offender is Safari.

radekmie · 2025-01-09T13:28:28Z

packages/blaze/dombackend.js

+  // Return empty array for empty strings
+  if (html === "") {
+    return [];
+  }


This is covered by !html above.
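To spell out why the extra check is redundant: an empty string is falsy in JavaScript, so any earlier guard of the form !html already catches it. The guard below is an assumption about what the "above" check looks like, not a copy of the PR's code.

```javascript
// Hypothetical guard, assumed to resemble the "!html" check mentioned above.
function isEmptyInput(html) {
  return !html || typeof html !== "string";
}

console.log(isEmptyInput(""));          // true, "" is falsy
console.log(isEmptyInput(undefined));   // true
console.log(isEmptyInput("<p>x</p>"));  // false
console.log(isEmptyInput(" "));         // false, whitespace-only strings are truthy
```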

Is this implementation made by you or based on some existing one? It's not trivial (at least to me, because I'm not sure what "fancy stuff" jQuery is doing and what "IE quirks" we are talking about) and I'm wondering if there's maybe a different small and maintained library we could use instead.

Another difference I can see is that jQuery removes scripts by default (see keepScripts), which is not implemented here (and it's missing in the tests).
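For reference, a minimal sketch of what mirroring the keepScripts = false default could look like on an already parsed node list. removeScripts is a hypothetical name, and the sketch assumes a browser where Element.remove() is available.

```javascript
// Hypothetical helper: drops <script> elements from parsed nodes,
// roughly mirroring jQuery.parseHTML's default of keepScripts = false.
function removeScripts(nodes) {
  const ELEMENT_NODE = 1;
  return Array.from(nodes).filter(function (node) {
    // Text and comment nodes cannot contain scripts, keep them as-is.
    if (node.nodeType !== ELEMENT_NODE) {
      return true;
    }
    // Drop top-level <script> elements entirely.
    if (node.nodeName.toLowerCase() === "script") {
      return false;
    }
    // Detach any nested <script> elements from the kept element.
    Array.from(node.querySelectorAll("script")).forEach(function (script) {
      script.remove();
    });
    return true;
  });
}
```

A test for it would parse markup containing both a top-level and a nested script and assert that neither survives.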

harryadel · 2025-01-09

This is covered by !html above.

Good catch!

Is this implementation made by you or based on some existing one? It's not trivial

It's a mix of everything really. As I've said, this example was a source of inspiration, along with jQuery's implementation.

I'm wondering if there's maybe a different small and maintained library we could use instead.

There are definitely lots of libraries like htmlparser2, but again, they're not a drag-and-drop kind of replacement. They require fine-tuning to be backwards compatible. I'd love to be proven wrong if someone out there knows of a 1:1 alternative.

Another difference I can see is that jQuery removes scripts by default (see keepScripts), which is not implemented here (and it's missing in the tests).

You're right. That can be added.

All in all, this PR can be used as a springboard for further discussion to find our best solution. There are native APIs that can be integrated, like createHTMLDocument and DOMParser. Another option is using NPM libraries along with other modifications.
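As a rough sketch of the native route, assuming a modern browser: the function below parses into a template element created on a separate document from createHTMLDocument, so nothing is attached to the live page during parsing. parseHTMLNative and its keepScripts parameter are hypothetical names chosen to resemble jQuery's signature; this is not the PR's implementation.

```javascript
// Hypothetical sketch of a native parseHTML; not the code from this PR.
function parseHTMLNative(html, keepScripts) {
  // Non-strings and empty strings yield no nodes.
  if (!html || typeof html !== "string") {
    return [];
  }
  // Parse inside a separate document rather than the live one.
  const doc = document.implementation.createHTMLDocument("");
  // A <template> accepts fragments such as <tr> or <td> without a wrapper.
  const template = doc.createElement("template");
  template.innerHTML = html;
  if (!keepScripts) {
    // Drop <script> elements unless the caller asks to keep them.
    Array.from(template.content.querySelectorAll("script")).forEach(
      function (script) {
        script.remove();
      }
    );
  }
  // Return a plain array snapshot of the top-level nodes.
  return Array.from(template.content.childNodes);
}
```

DOMParser would be the other option, via new DOMParser().parseFromString(html, "text/html"), but it builds a full document, so the resulting nodes have to be collected from it afterwards.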

harryadel · 2025-01-14

@radekmie I made new modifications. Please recheck.

The only drawback is how garbage input gets handled now:

<#if><tr><p>Test</p></tr><#/if>
// Garbage input
// jQuery returns a length of 1
// Current solution returns 4

jQuery returns a length of 1 because it tries to maintain a single root element for garbage input, whereas the new implementation returns 4 because it creates a new element for each tag. I feel that's a small price to pay to avoid over-engineering the current solution.

Also, regarding keepScripts: jQuery removes script tags by default, and that behavior is now covered by a test case. https://github.com/apostrophecms/sanitize-html is used to handle other XSS vectors. So in theory, the current implementation is more secure than the previous one.

EDIT: It appears that HTML sanitization causes problems due to event removal, so we might need to stick to removing only the script tag and call it a day 🤷. We'll see.

distalx · 2025-01-27T07:50:21Z

Should we consider using document.createDocumentFragment()? It might help improve performance by reducing repaints when appending multiple child nodes.
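A minimal sketch of the idea, with a hypothetical appendAll helper: the nodes are collected in a DocumentFragment first, so the live tree is touched once instead of once per node. The doc parameter is only there so the sketch can be exercised outside a browser. Whether this measurably helps would need profiling.

```javascript
// Hypothetical helper: append many nodes with a single insertion.
function appendAll(parent, nodes, doc) {
  // Fall back to the parent's own document when none is passed in.
  const ownerDoc = doc || parent.ownerDocument;
  const fragment = ownerDoc.createDocumentFragment();
  // Snapshot first, since appending moves nodes out of live collections.
  Array.from(nodes).forEach(function (node) {
    fragment.appendChild(node);
  });
  // One insertion into the live tree instead of one per node.
  parent.appendChild(fragment);
  return parent;
}
```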

jankapunkt · 2025-02-04T15:46:08Z

@distalx I think we can put this on the list of improvements, as it definitely makes sense for larger lists etc. However, for now I'd like to stay as close as possible to the existing code's behavior, in order not to break things unless really necessary.

harryadel added 9 commits January 5, 2025 21:31

Replace jquery parseHTML with native alternative

f6bbe70

Add extra tests for plain text, self closing tags and nested table elements

396aa3a

Adjust code for the new tests

e5a580f

Pair createHTMLDocument with a fallback

f8cb08b

Properly handle leading white spaces

96d5063

Modify test to preserve white space nodes

df65dc0

Ensure createHTMLDocument is used

bec42cc

Ensure tests follows jQuery standards

4f5dda5

Modify our code to stay consistent with jQuery

287f9ce

harryadel marked this pull request as ready for review January 7, 2025 00:05

Use IE compliant features

447fe73

jankapunkt reviewed Jan 7, 2025

radekmie reviewed Jan 9, 2025

harryadel added 4 commits January 14, 2025 10:18

Use new approach

9731d20

Handle iframes and events

7a1629f

Use sanitize-html

822feba

Allow more tags and attributes

8e111fe

harryadel requested review from radekmie and jankapunkt January 14, 2025 09:37
