Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HCL with the JavaScript engine causes browser tab freeze due to catastrophic backtracking #871

Closed
4 of 5 tasks
yoavbls opened this issue Dec 20, 2024 · 7 comments
Closed
4 of 5 tasks

Comments

@yoavbls
Copy link

yoavbls commented Dec 20, 2024

Validations

Describe the bug

Since this change on HCL tmGrammer almost anything causes HCL snippets on the JavaScript engine freezing the whole tab due to catastrophic backtracking.

I opened an issue about it + a fix here: hashicorp/syntax#164
but until it's merged it would be great if we could make Shiki handle it instead of freezing.

optional solutions

  1. Pin the grammar JSON file link (here) to the version before the bug.
    I don't see pinned versions of anything there so I'm assuming it's not desirable
  2. Fix the bad regex on the generated file (here)
    I'm not sure if there is an option to prevent it from overridden

If these or other solutions are acceptable I'll open a PR.
Thanks!

Reproduction

https://textmate-grammars-themes.netlify.app/?grammar=hcl&theme=github-dark&code=var+%3D+%5B%22Every%22%2C+%22Item%22%2C+%22Slows%22%2C+%22Down%22%2C+%22Everything%22%5D

Contributes

  • I am willing to submit a PR to fix this issue
  • I am willing to submit a PR with failing tests
@yoavbls yoavbls changed the title HCL causing a freeze on the JavaScript engine due to catastrophic backtracking HCL with the JavaScript engine causes browser tab freeze due to catastrophic backtracking Dec 20, 2024
@antfu
Copy link
Member

antfu commented Dec 20, 2024

/cc @slevithan

@slevithan
Copy link
Collaborator

The best solution will be for https://github.com/hashicorp/syntax to publish a patch release with the change. Can you ask them to do that? If they do so, it can then be quickly picked up for new tm-grammars and Shiki releases.

Up to @antfu if he's okay with you submitting a PR with one of your alternative solutions in the meantime.

@yoavbls can you confirm whether Shiki runs into similar runaway backtracking when using Oniguruma directly via the WASM engine? Presumably yes unless Oniguruma is running some prevention or early-fail strategy that is not built into browser regexp engines.

@slevithan
Copy link
Collaborator

It might be possible to build protection against runaway backtracking into future versions of Shiki's JS engine (at least when it's running in Node, which has an API for killing processes) but that is beyond the scope of the underlying oniguruma-to-es library that I maintain.

Also note that Oniguruma has features for backtracking control like atomic groups and possessive quantifiers, and if you use them, oniguruma-to-es knows how to emulate them.

@yoavbls
Copy link
Author

yoavbls commented Dec 20, 2024

Thank you! they merged the fix now and it seems like Shiki reference to the updated file here.

So I think that releasing a new version of Shiki will solve it 🙂

@slevithan about the Oniguruma engine, it seems like the time does go up on every additional word but it's probably not exponentially because it is still super fast even with gigantic and complicated HCL files.
I assume they do have some kind of protections or limitations against it

@slevithan
Copy link
Collaborator

slevithan commented Dec 20, 2024

So I think that releasing a new version of Shiki will solve it 🙂

The change first needs to show up in https://github.com/shikijs/textmate-grammars-themes/ and then Shiki's version of tm-grammars can be bumped. Feel free to ping me when that happens.

about the Oniguruma engine, it seems like the time does go up on every additional word but it's probably not exponentially because it is still super fast even with gigantic and complicated HCL files.

Size and complexity of the file are not relevant, since the thing with exponential backtracking is that any extreme cases are triggered by almost-but-not-quite matching strings. So it's specifically the frequency and length of inputs that trigger the problem that will affect performance. I haven't looked closely at the problematic regex and your fix for it, but I'm guessing that applies in this case as well.

@slevithan
Copy link
Collaborator

OK, the HCL grammar fix is now in tm-grammars v1.21.9.

@antfu can you publish a new release that includes #872 plus also bumps tm-grammars?

@antfu
Copy link
Member

antfu commented Dec 21, 2024

Thanks! Released as 1.24.4

@antfu antfu closed this as completed Dec 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants