Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace jquery parseHTML with native alternative #474

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

harryadel
Copy link
Contributor

@harryadel harryadel commented Jan 6, 2025

Decoupling jQuery and Blaze would take substantial effort. jQuery is used in many places:

  1. HTML Parsing
DOMBackend.parseHTML = function (html) {
  return $jq.parseHTML(html, DOMBackend.getContext()) || [];
};
  1. DOM Selection
findBySelector: function (selector, context) {
  return $jq(selector, context);
}
  1. Element Teardown Detection
$jq.event.special[DOMBackend.Teardown._JQUERY_EVENT_NAME]
  1. Event Delegation/Handling
delegateEvents: function (elem, type, selector, handler) {
  $jq(elem).on(type, selector, handler);
}

They're ranked them in terms of ease of replacement, and impact. The challenge lies mostly in testing and ensuring cross browser compatibility so it's best to merge each change individually, do a minor release, test then repeat until all is merged then we can do a major release.

I chose to start off with parseHTML. It present a nice challenge where even if we got it off won't cause major errors and can act an indicator if the moving out of jQuery would be doable.

This implementation and the official tests were used in constructing the new tests to ensure backwards compatibility. You may remove the code I did and then re-add the jQuery parseHTML function and you'd find the tests still pass.

@harryadel harryadel marked this pull request as ready for review January 7, 2025 00:05
@harryadel
Copy link
Contributor Author

harryadel commented Jan 7, 2025

Copy link
Collaborator

@jankapunkt jankapunkt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much @harryadel for initiating this one!
I have one thing to discuss regarding the es5 code.

If Blaze depends on ecmascript then you're fine using es6+ because it will be compiled down (and for browser.legacy also in compatibility mode). On top of that, the classic var remains unscoped, while let and const are block-scoped, which allows for a more granular scoping and thus less errors.

However, this is just a suggestion and not a demand from my end and I'd like to see what others are saying.

@harryadel
Copy link
Contributor Author

I thought we need to support IE, no? that's why I opted for using es5 code. If that's not a concern I could revert back to es6

@jankapunkt
Copy link
Collaborator

My understanding is, that web.browser.legacy builds in a way that IE is supported. However I'm not 100% if that's the case.

@StorytellerCZ
Copy link
Collaborator

Plus IE is already out of official support anyway so I think we should scratch out IE support.
Right now the biggest offender is Safari.

Comment on lines 42 to 45
// Return empty array for empty strings
if (html === "") {
return [];
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is covered by !html above.

Is this implementation made by you or based on some existing one? It's not trivial (at least to me, because I'm not sure what "fancy stuff" is jQuery doing and what "IE quirks" are we talking about) and I'm wondering if there's maybe a different small and maintained library we could use instead.

Another difference I can see is that jQuery removes scripts by default (see keepScripts), which is not implemented here (and it's missing in the tests).

Copy link
Contributor Author

@harryadel harryadel Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is covered by !html above.

Good catch!

Is this implementation made by you or based on some existing one? It's not trivial

It's a mix of everything really, as I've said this example was source of inspiration along with jQuery's implementation.

I'm wondering if there's maybe a different small and maintained library we could use instead.

There're definitely lots of libraries like htmlparser2 but again they're not drag n' drop kind of replacement. They require fine tuning to be backwards compatible. I'd love to be proven wrong if someone out there knows of a 1:1 alternative.

Another difference I can see is that jQuery removes scripts by default (see keepScripts), which is not implemented here (and it's missing in the tests).

You're right. That can be added.

All in all, this PR can be used as a spring board for further discussion to seek our best solution. There're native APIs out there that can integrated like createHTMLDocument and DOMParser. Or using NPM libraries along with other modifications.

Copy link
Contributor Author

@harryadel harryadel Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@radekmie I made new modifications. Please recheck.

The only drawback is how garbage input gets handled now:

<#if><tr><p>Test</p></tr><#/if>  // Garbage input
// jQuery returns a length of 1
// Current solution returns 4 

jQuery would return a length of 1 as it attempts to maintain a root element for garbage input but in the new implementation it returns 4 as it creates a new element for each tag. I feel it's a small price to pay without trying to over engineer the current solution.

Also regarding the keepScripts part you mentioned which meant jQuery by default removes the script tag is now accounted for by a test case and https://github.com/apostrophecms/sanitize-html is used to handle other XSS. So in theory, when it comes to security the current implementation is better than the previous one.

EDIT: It appears that HTML sanitization causes problems due to event removal, we might need to only stick to script tag and call it a day 🤷. We'll see.

@distalx
Copy link
Contributor

distalx commented Jan 27, 2025

Should we consider use of document.createDocumentFragment()? It might help improve performance by reducing repaints when appending multiple child nodes. ref.

@jankapunkt
Copy link
Collaborator

@distalx I think we can put this on the list for improvements as it definitely makes sense for larger lists etc. However for now I'd like to have a maximum in compliance to the existing code behavior in order to not break things unless really necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants