feat(engine-js): lazy compile extremely long patterns in precompiled grammars (#916)
slevithan authored Feb 4, 2025
1 parent 79d83ff commit c792c7d
Showing 6 changed files with 31 additions and 29 deletions.
12 changes: 6 additions & 6 deletions docs/references/engine-js-compat.md
@@ -2,20 +2,20 @@

Compatibility reference of all built-in grammars with the [JavaScript RegExp engine](/guide/regex-engines#javascript-regexp-engine).

> Generated on Tuesday, January 21, 2025
> Generated on Monday, February 3, 2025
>
> Version `2.0.3`
> Version `2.2.0`
>
> Runtime: Node.js v22.11.0
> Runtime: Node.js v22.13.1
## Report Summary

| | Count |
| :-------------- | --------------------------: |
| Total Languages | 219 |
| Supported | [214](#supported-languages) |
| Supported | [215](#supported-languages) |
| Mismatched | [0](#mismatched-languages) |
| Unsupported | [5](#unsupported-languages) |
| Unsupported | [4](#unsupported-languages) |

## Supported Languages

@@ -56,6 +56,7 @@ In some edge cases, it's not guaranteed that the highlighting will be 100% the s
| cmake | ✅ OK | 23 | - | |
| cobol | ✅ OK | 868 | - | |
| codeowners | ✅ OK | 4 | - | |
| codeql | ✅ OK | 151 | - | |
| coffee | ✅ OK | 471 | - | |
| common-lisp | ✅ OK | 60 | - | |
| coq | ✅ OK | 26 | - | |
@@ -259,7 +260,6 @@ Languages that throw with the JavaScript RegExp engine, either because they cont

| Language | Highlight Match | Patterns Parsable | Patterns Failed | Diff |
| ---------- | :-------------- | ----------------: | --------------: | ---: |
| codeql | ✅ OK | 150 | 1 | |
| csharp | ❌ Error | 312 | 1 | 137 |
| purescript | ❌ Error | 72 | 1 | |
| razor | ❌ Error | 961 | 1 | |
7 changes: 5 additions & 2 deletions packages/engine-javascript/src/engine-compile.ts
@@ -1,5 +1,5 @@
import type { RegexEngine } from '@shikijs/types'
import type { OnigurumaToEsOptions } from 'oniguruma-to-es'
import type { ToRegExpOptions } from 'oniguruma-to-es'
import type { JavaScriptRegexScannerOptions } from './scanner'
import { toRegExp } from 'oniguruma-to-es'
import { JavaScriptScanner } from './scanner'
@@ -25,12 +25,15 @@ export interface JavaScriptRegexEngineOptions extends JavaScriptRegexScannerOpti
/**
* The default regex constructor for the JavaScript RegExp engine.
*/
export function defaultJavaScriptRegexConstructor(pattern: string, options?: OnigurumaToEsOptions): RegExp {
export function defaultJavaScriptRegexConstructor(pattern: string, options?: ToRegExpOptions): RegExp {
return toRegExp(
pattern,
{
global: true,
hasIndices: true,
// This has no benefit for the standard JS engine, but it avoids a perf penalty for
// precompiled grammars when constructing extremely long patterns that aren't always used
lazyCompileLength: 3000,
rules: {
// Needed since TextMate grammars merge backrefs across patterns
allowOrphanBackrefs: true,
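
The `lazyCompileLength: 3000` option in the hunk above defers construction of very long patterns until they are first used. A minimal sketch of that idea follows. It is not the `oniguruma-to-es` implementation: the `LazyPattern` class, its `isCompiled` getter, and the `>=` threshold comparison are hypothetical names and assumptions made for illustration only.

```typescript
// Hypothetical illustration of lazy compilation; NOT the oniguruma-to-es code.
class LazyPattern {
  source: string
  flags: string
  private compiled: RegExp | null = null

  constructor(source: string, flags: string, lazyCompileLength: number) {
    this.source = source
    this.flags = flags
    // Assumption: patterns at or above the threshold are deferred,
    // shorter ones are compiled immediately.
    if (source.length < lazyCompileLength) {
      this.compiled = new RegExp(source, flags)
    }
  }

  // True once the underlying RegExp has been constructed.
  get isCompiled(): boolean {
    return this.compiled !== null
  }

  exec(input: string): RegExpExecArray | null {
    // Pay the construction cost on first use only.
    if (this.compiled === null) {
      this.compiled = new RegExp(this.source, this.flags)
    }
    return this.compiled.exec(input)
  }
}

// A short pattern is compiled eagerly; a very long one waits for exec().
const shortPattern = new LazyPattern('ab+', 'g', 3000)
const longPattern = new LazyPattern('(?:' + 'a|'.repeat(2000) + 'b)', 'g', 3000)
```

In a precompiled grammar every pattern is instantiated when the module loads, so deferring the rarely used long ones keeps that load cost down, which is the motivation stated in the code comment of the diff.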
@@ -92,7 +92,7 @@ exports[`precompile 1`] = `
},
},
end: /*@__PURE__*/ new EmulatedRegExp("^(?=\\\\P{space})|(?!^)", "dgv", {
strategy: "search_start_clip",
strategy: "clip_search",
}),
patterns: [
{
@@ -113,7 +113,7 @@ exports[`precompile 1`] = `
"1": { name: "punctuation.whitespace.comment.leading.yaml" },
},
end: /*@__PURE__*/ new EmulatedRegExp("(?!^)", "dgv", {
strategy: "search_start_clip",
strategy: "clip_search",
}),
patterns: [
{
@@ -378,7 +378,7 @@ exports[`precompile 1`] = `
property: {
begin: /(?=!|&)/dgv,
end: /*@__PURE__*/ new EmulatedRegExp("(?!^)", "dgv", {
strategy: "search_start_clip",
strategy: "clip_search",
}),
name: "meta.property.yaml",
patterns: [
2 changes: 1 addition & 1 deletion packages/langs-precompiled/scripts/langs.ts
@@ -126,7 +126,7 @@ export function toJsLiteral(value: any, seen = new Set()): string {
}

if (value instanceof EmulatedRegExp) {
return `/*@__PURE__*/ new EmulatedRegExp(${JSON.stringify(value.rawArgs.pattern)},${JSON.stringify(value.rawArgs.flags)},${JSON.stringify(value.rawArgs.options)})`
return `/*@__PURE__*/ new EmulatedRegExp(${JSON.stringify(value.source)},"${value.flags}",${JSON.stringify(value.rawOptions)})`
}

// RegExp
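
The updated `toJsLiteral` branch above serializes an `EmulatedRegExp` from its `source`, `flags`, and `rawOptions` properties. The sketch below shows the same string-building step against a stand-in class so that it runs without the library. `StubEmulatedRegExp` and `emulatedToJsLiteral` are hypothetical names; only the three properties read in the diff are modelled, and the `dg` flags are chosen for broad runtime support rather than taken from the snapshot.

```typescript
// Hypothetical stand-in for EmulatedRegExp; models only source, flags, rawOptions.
class StubEmulatedRegExp extends RegExp {
  rawOptions: Record<string, unknown>

  constructor(pattern: string, flags: string, rawOptions: Record<string, unknown> = {}) {
    super(pattern, flags)
    this.rawOptions = rawOptions
  }
}

// Mirrors the template string in the diff, targeting the stub constructor.
function emulatedToJsLiteral(value: StubEmulatedRegExp): string {
  return `/*@__PURE__*/ new StubEmulatedRegExp(${JSON.stringify(value.source)},"${value.flags}",${JSON.stringify(value.rawOptions)})`
}

const literal = emulatedToJsLiteral(
  new StubEmulatedRegExp('(?!^)', 'dg', { strategy: 'clip_search' }),
)
console.log(literal)
```

Because the flags string contains only flag letters, it can be interpolated between quotes directly, while the pattern source and options object go through `JSON.stringify` to get escaping right.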
31 changes: 15 additions & 16 deletions pnpm-lock.yaml

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion pnpm-workspace.yaml
@@ -57,7 +57,7 @@ catalog:
monaco-editor-core: ^0.52.2
ofetch: ^1.4.1
ohash: ^1.1.4
oniguruma-to-es: ^2.3.0
oniguruma-to-es: ^3.1.0
picocolors: ^1.1.1
pinia: ^2.3.1
pnpm: ^9.15.4