-
-
Notifications
You must be signed in to change notification settings - Fork 348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sanitizing localization files #1354
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @berhalak!
|
||
function purify(inputString) { | ||
// This removes any html tags from the string | ||
return DOMPurify.sanitize(inputString); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure that DOMPurify.sanitize
in fact removes all html tags from a string? Examples at https://github.com/cure53/DOMPurify?tab=readme-ov-file#some-purification-samples-please contract that, e.g.:
DOMPurify.sanitize('<img src=x onerror=alert(1)//>'); // becomes <img src="x">
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@berhalak, I'm wondering if it's better to revert this change.
This PR care of the XSS attack vector discussed here, but I think style
elements can still be injected (if strings are not escaped). Furthermore, I don't think we explicitly set innerHTML
to a translation string anywhere in our code, do we? Sanitizing here certainly doesn't hurt, but I don't think it fully replaces the need to escape HTML, and it seems like that's all that #1247 needs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure that DOMPurify.sanitize in fact removes all html tags from a string?
@dsagal Indeed, it does not. Currently we don't localize texts with any html tags.
In the future, we might support some of them.
Would it make sense to whitelist html tags through ALLOWED_TAGS
?
https://github.com/cure53/DOMPurify/blob/f41b45df18a9666a50c1ad2662cee259230cfef4/src/config.ts#L57
return DOMPurify.sanitize(inputString); | |
return DOMPurify.sanitize(inputString, { ALLOWED_TAGS: [ /* Place here tags you want to whitelist */ ] });); |
This PR care of the XSS attack vector discussed here, but I think style elements can still be injected (if strings are not escaped).
@georgegevoian I made some tests, it looks like the <style>
tags are removed:
$ node
> const createDOMPurify = require('dompurify');
undefined
> const { JSDOM } = require('jsdom');
undefined
> const window = new JSDOM('').window;
undefined
> const DOMPurify = createDOMPurify(window);
undefined
> DOMPurify.sanitize('<style>hello</style>');
''
I'm wondering if it's better to revert this change. [...] Furthermore, I don't think we explicitly set innerHTML to a translation string anywhere in our code, do we?
I would like to advocate for keeping this work. Actually even if we suppose the current state of the work would actually use textContent
everywhere, preventing the XSS attack, we would have the charge to prove that now and most of all in the future.
Having this tool would remove any doubts and save us time not worrying about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it's good to keep sanitization, but I think it makes sense to actually fully strip HTML tags (i.e. maybe just what you suggested @fflorent , with ALLOWED_TAGS: []
), since keeping them creates the impression that they might be supported, or a temptation for someone to use innerHTML
in the future. In reality, there is no example currently that uses HTML tags, and (hopefully) no use of innerHTML
. We already support markdown, which has been enough (useful for links and emphasis) without actually needing any html tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 That would be even better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Included all comments in the new PR #1367
## Context Latest PR #1354 was breaking docker build and took the wrong approach for sanitizing translation files. More in the comments. ## Proposed solution 1. Docker build was fixed by copying translation files sooner (just for validating them). 2. Build now only fails if some translations keys have values with invalid tags. Previously it was fixing them. ## Related issues #1350 #1354 #1361 ## Has this been tested? <!-- Put an `x` in the box that applies: --> - [x] 👍 yes, I added tests to the test suite - [ ] 💭 no, because this PR is a draft and still needs work - [ ] 🙅 no, because this is not relevant here - [ ] 🙋 no, because I need help <!-- Detail how we can help you -->
Context
Sanitizing localization files.
Proposed solution
Added a script that is executed during build time that will sanitize every key in every translations resource file.
Related issues
#1350
Has this been tested?
Tested manually.