Hack grammar contains invalid Oniguruma token \xff #107

Closed
slevithan opened this issue Nov 23, 2024 · 1 comment

Comments

@slevithan

This line in the Hack grammar is an invalid Oniguruma pattern. The behavior for TextMate grammars when an invalid pattern is encountered (dutifully reproduced by vscode-textmate) is to silently fail, causing the regex to never match. So whatever it's intended to do is not happening.

The error in this pattern is the use of [^...\\x7f-\\xff]. From context, it's easy to figure out that the author meant for this to exclude, among other things, the range from code point U+007F to U+00FF. And although that's what it would do in nearly every modern regex engine, that's not what it's doing in Oniguruma.

In Oniguruma, \xHH matches an "encoded byte value", not a code point value like \x{...} does. That means for values from 0 to 7F, \xHH and \x{HH} work the same, but they diverge for hex values 80 to FF. With \xHH above 7F, the token must be part of a valid encoded byte sequence. For example, the three-byte sequence \xEF\xBB\xBF in Oniguruma is equivalent to the single code point \uFEFF in JavaScript (NOT the same as the three code points \xEF\xBB\xBF in JavaScript) and the same as \x{FEFF} in Oniguruma. \xFF, on its own, is not a valid encoded byte sequence, so it is an error in Oniguruma.
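
The byte-versus-code-point distinction can be illustrated outside Oniguruma. The following is a minimal Python sketch, assuming the pattern encoding is UTF-8 (consistent with the \xEF\xBB\xBF example above). It does not call Oniguruma; it only shows which byte sequences are valid encodings. The helper name `is_valid_utf8` is made up for this illustration.

```python
# Assumption: the regex source is treated as UTF-8, as in the \xEF\xBB\xBF example.

def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` is a complete, valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# The single code point U+FEFF is encoded as three bytes in UTF-8.
bom_bytes = "\uFEFF".encode("utf-8")
print(bom_bytes)                          # b'\xef\xbb\xbf'

# Those three bytes together form one valid encoded sequence.
print(is_valid_utf8(b"\xef\xbb\xbf"))     # True

# Values up to 7F are single-byte (ASCII), so byte and code point coincide.
print(is_valid_utf8(b"\x7f"))             # True

# FF never appears in valid UTF-8, so a lone \xFF has no valid interpretation.
print(is_valid_utf8(b"\xff"))             # False
```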

To fix this, the \\xff should be replaced with \\x{ff}, which matches the code point U+00FF rather than an encoded byte. The pattern will then run in Oniguruma.
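
For anyone who wants to scan other grammars for the same problem, here is a rough Python sketch of a check, again assuming UTF-8. The function name `find_invalid_byte_escapes` and the simplified sample patterns are hypothetical and not taken from the Hack grammar. It works on the regex text after JSON unescaping (so `\xff`, not `\\xff`), and it does not handle an escaped backslash followed by `xHH`, so treat its output as a hint rather than a verdict.

```python
import re

# A run of one or more consecutive \xHH escapes (exactly two hex digits, no braces).
# \x{...} is deliberately not matched, since it denotes a code point.
_BYTE_RUN = re.compile(r"(?:\\x[0-9A-Fa-f]{2})+")
_HEX_PAIR = re.compile(r"\\x([0-9A-Fa-f]{2})")

def find_invalid_byte_escapes(pattern: str) -> list[str]:
    """Return the \\xHH runs in `pattern` that are not valid UTF-8 sequences."""
    bad = []
    for match in _BYTE_RUN.finditer(pattern):
        raw = bytes(int(h, 16) for h in _HEX_PAIR.findall(match.group()))
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(match.group())
    return bad

# Hypothetical, simplified patterns for illustration only.
broken = r"[^a-z\x7f-\xff]"
fixed = broken.replace(r"\xff", r"\x{ff}")

print(find_invalid_byte_escapes(broken))           # ['\\xff']
print(find_invalid_byte_escapes(fixed))            # []
print(find_invalid_byte_escapes(r"\xEF\xBB\xBF"))  # []
```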

@slevithan changed the title from "Hack grammar contains an invalid Oniguruma pattern" to "Hack grammar contains invalid Oniguruma token \xff" on Nov 23, 2024
@slevithan

Reported upstream here.
