Target ESNext -> ES2025; regex features confirmed for next spec
slevithan committed Nov 10, 2024
1 parent 1ba8f11 commit e87803b
Showing 10 changed files with 26 additions and 26 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -33,7 +33,7 @@ npm install oniguruma-to-es
import {toRegExp} from 'oniguruma-to-es';
const str = '';
const pattern = '';
-// Works with all string/regexp methods since it returns a native JS regexp
+// Works with all string/regexp methods since it returns a native regexp
str.match(toRegExp(pattern));
```

@@ -85,7 +85,7 @@ type Options = {
global?: boolean;
hasIndices?: boolean;
maxRecursionDepth?: number | null;
-target?: 'ES2018' | 'ES2024' | 'ESNext';
+target?: 'ES2018' | 'ES2024' | 'ES2025';
tmGrammar?: boolean;
verbose?: boolean;
};
@@ -172,7 +172,7 @@ Supports slightly fewer features, but the missing features are all relatively un

Supports all features of `strict`, plus the following additional features, depending on `target`:

-- All targets (`ESNext` and earlier):
+- All targets (`ES2025` and earlier):
- Enables use of `\X` using a close approximation of a Unicode extended grapheme cluster.
- Enables recursion (e.g. via `\g<0>`) with a depth limit specified by option `maxRecursionDepth`.
- `ES2024` and earlier:
@@ -233,7 +233,7 @@ Higher limits have no effect on regexes that don't use recursion, so you should

### `target`

-One of `'ES2018'`, `'ES2024'` *(default)*, or `'ESNext'`.
+One of `'ES2018'`, `'ES2024'` *(default)*, or `'ES2025'`.

Sets the JavaScript language version for the generated pattern and flags. Later targets allow faster processing, simpler generated source, and support for additional features.

@@ -246,7 +246,7 @@ Sets the JavaScript language version for the generated pattern and flags. Later
- `ES2024`: Uses JS flag `v`.
- No emulation restrictions.
- Generated regexes require Node.js 20 or any 2023-era browser ([compat table](https://caniuse.com/mdn-javascript_builtins_regexp_unicodesets)).
-- `ESNext`: Uses JS flag `v` and allows use of flag groups and duplicate group names.
+- `ES2025`: Uses JS flag `v` and allows use of flag groups and duplicate group names.
- Benefits: Faster transpilation, simpler generated source, and duplicate group names are preserved across separate alternation paths.
- Generated regexes might use features that require Node.js 23 or a 2024-era browser (except Safari, which lacks support for flag groups).
</details>
@@ -268,7 +268,7 @@ Disables optimizations that simplify the pattern when it doesn't change the mean
Following are the supported features by target. The official Oniguruma [syntax doc](https://github.com/kkos/oniguruma/blob/master/doc/RE) doesn't cover many of the finer details described here.

> [!NOTE]
-> Targets `ES2024` and `ESNext` have the same emulation capabilities. Resulting regexes might have different source and flags, but they match the same strings.
+> Targets `ES2024` and `ES2025` have the same emulation capabilities. Resulting regexes might have different source and flags, but they match the same strings.
Notice that nearly every feature below has at least subtle differences from JavaScript. Some features and subfeatures listed as unsupported are not emulatable using native JavaScript regexes, but support for others might be added in future versions of this library. Unsupported features throw an error.

@@ -953,7 +953,7 @@ Oniguruma-To-ES fully supports mixed case-sensitivity (and handles the Unicode e
Oniguruma-To-ES focuses on being lightweight to make it better for use in browsers. This is partly achieved by not including heavyweight Unicode character data, which imposes a couple of minor/rare restrictions:

- Character class intersection and nested negated character classes are unsupported with target `ES2018`. Use target `ES2024` or later if you need support for these features.
-- With targets before `ESNext`, a handful of Unicode properties that target a specific character case (ex: `\p{Lower}`) can't be used case-insensitively in patterns that contain other characters with a specific case that are used case-sensitively.
+- With targets before `ES2025`, a handful of Unicode properties that target a specific character case (ex: `\p{Lower}`) can't be used case-insensitively in patterns that contain other characters with a specific case that are used case-sensitively.
- In other words, almost every usage is fine, including `A\p{Lower}`, `(?i:A\p{Lower})`, `(?i:A)\p{Lower}`, `(?i:A(?-i:\p{Lower}))`, and `\w(?i:\p{Lower})`, but not `A(?i:\p{Lower})`.
- Using these properties case-insensitively is basically never done intentionally, so you're unlikely to encounter this error unless it's catching a mistake.

2 changes: 1 addition & 1 deletion demo/demo.js
@@ -65,7 +65,7 @@ function showTranspiled() {
return;
}
ui.comparisonInfo.classList.remove('hidden');
-const otherTargetAccuracyCombinations = ['ES2018', 'ES2024', 'ESNext'].flatMap(
+const otherTargetAccuracyCombinations = ['ES2018', 'ES2024', 'ES2025'].flatMap(
t => ['loose', 'default', 'strict'].map(a => ({target: t, accuracy: a}))
).filter(c => c.target !== options.target || c.accuracy !== options.accuracy);
const differents = [];
2 changes: 1 addition & 1 deletion demo/index.html
@@ -41,7 +41,7 @@ <h2>Try it</h2>
<select id="option-target" onchange="setOption('target', this.value)">
<option value="ES2018">ES2018</option>
<option value="ES2024" selected>ES2024</option>
-<option value="ESNext">ESNext</option>
+<option value="ES2025">ES2025</option>
</select>
<code>target</code>
<img src="https://upload.wikimedia.org/wikipedia/commons/9/99/Unofficial_JavaScript_logo_2.svg" width="15" height="15">
2 changes: 1 addition & 1 deletion spec/helpers/features.js
@@ -17,7 +17,7 @@ const patternModsSupported = (() => {
return true;
})();
const maxTestTargetForPatternMods = patternModsSupported ? null : 'ES2024';
-const minTestTargetForPatternMods = patternModsSupported ? 'ESNext' : Infinity;
+const minTestTargetForPatternMods = patternModsSupported ? 'ES2025' : Infinity;

const minTestTargetForFlagV = 'ES2024';

2 changes: 1 addition & 1 deletion spec/helpers/matchers.js
@@ -4,7 +4,7 @@ import {EsVersion} from '../../src/options.js';
function getArgs(actual, expected) {
const max = expected.maxTestTarget;
const min = expected.minTestTarget;
-const targets = ['ES2018', 'ES2024', 'ESNext'];
+const targets = ['ES2018', 'ES2024', 'ES2025'];
const targeted = targets.
filter(target => !max || EsVersion[target] <= EsVersion[max]).
filter(target => !min || (min !== Infinity && EsVersion[target] >= EsVersion[min]));
4 changes: 2 additions & 2 deletions spec/match-backreference.spec.js
@@ -428,7 +428,7 @@ describe('Backreference', () => {

describe('case sensitivity', () => {
it('should match case-insensitive backref to case-sensitive group', () => {
-// Real support with `target` ESNext
+// Real support with `target` ES2025
expect(['aa', 'aA']).toExactlyMatch({
pattern: r`(a)(?i)\1`,
minTestTarget: minTestTargetForPatternMods,
@@ -437,7 +437,7 @@ describe('Backreference', () => {
pattern: r`(a)(?i)\1`,
minTestTarget: minTestTargetForPatternMods,
});
-// Throw with strict `accuracy` if `target` not ESNext
+// Throw with strict `accuracy` if `target` not ES2025
['ES2018', 'ES2024'].forEach(target => {
expect(() => toDetails(r`(a)(?i)\1`, {
accuracy: 'strict',
2 changes: 1 addition & 1 deletion spec/todetails.spec.js
@@ -38,7 +38,7 @@ describe('toDetails', () => {

it('should add flag v for target ES2024+', () => {
expect(toDetails('', {target: 'ES2024'}).flags).toBe('v');
-expect(toDetails('', {target: 'ESNext'}).flags).toBe('v');
+expect(toDetails('', {target: 'ES2025'}).flags).toBe('v');
});

it('should add flag u for target ES2018', () => {
14 changes: 7 additions & 7 deletions src/generate.js
@@ -17,7 +17,7 @@ Generates a Regex+ compatible `pattern`, `flags`, and `options` from a Regex+ AS
function generate(ast, options) {
const opts = getOptions(options);
const minTargetEs2024 = isMinTarget(opts.target, 'ES2024');
-const minTargetEsNext = isMinTarget(opts.target, 'ESNext');
+const minTargetEs2025 = isMinTarget(opts.target, 'ES2025');
const rDepth = opts.maxRecursionDepth;
if (rDepth !== null && (!Number.isInteger(rDepth) || rDepth < 2 || rDepth > 100)) {
throw new Error('Invalid maxRecursionDepth; use 2-100 or null');
@@ -30,7 +30,7 @@ function generate(ast, options) {
// [TODO] Consider gathering this data in the transformer's final traversal to avoid work here
let hasCaseInsensitiveNode = null;
let hasCaseSensitiveNode = null;
-if (!minTargetEsNext) {
+if (!minTargetEs2025) {
const iStack = [ast.flags.ignoreCase];
traverse({node: ast}, {
getCurrentModI: () => iStack.at(-1),
@@ -51,7 +51,7 @@ function generate(ast, options) {
// - Turn global flag i on if a case insensitive node was used and no case sensitive nodes were
// used (to avoid unnecessary node expansion).
// - Turn global flag i off if a case sensitive node was used (since case sensitivity can't be
-// forced without the use of ESNext flag groups)
+// forced without the use of ES2025 flag groups)
ignoreCase: !!((ast.flags.ignoreCase || hasCaseInsensitiveNode) && !hasCaseSensitiveNode),
};
let lastNode = null;
@@ -67,9 +67,9 @@ function generate(ast, options) {
inCharClass: false,
lastNode,
maxRecursionDepth: rDepth,
-useAppliedIgnoreCase: !!(!minTargetEsNext && hasCaseInsensitiveNode && hasCaseSensitiveNode),
-useDuplicateNames: minTargetEsNext,
-useFlagMods: minTargetEsNext,
+useAppliedIgnoreCase: !!(!minTargetEs2025 && hasCaseInsensitiveNode && hasCaseSensitiveNode),
+useDuplicateNames: minTargetEs2025,
+useFlagMods: minTargetEs2025,
useFlagV: minTargetEs2024,
usePostEs2018Properties: minTargetEs2024,
verbose: opts.verbose,
@@ -230,7 +230,7 @@ function genBackreference({ref}, state) {
state.currentFlags.ignoreCase &&
!state.captureFlagIMap.get(ref)
) {
-throw new Error('Use of case-insensitive backref to case-sensitive group requires target ESNext or non-strict accuracy');
+throw new Error('Use of case-insensitive backref to case-sensitive group requires target ES2025 or non-strict accuracy');
}
return '\\' + ref;
}
4 changes: 2 additions & 2 deletions src/options.js
@@ -7,13 +7,13 @@ const Accuracy = /** @type {const} */ ({
const EsVersion = {
ES2018: 2018,
ES2024: 2024,
-ESNext: 2025,
+ES2025: 2025,
};

const Target = /** @type {const} */ ({
ES2018: 'ES2018',
ES2024: 'ES2024',
-ESNext: 'ESNext',
+ES2025: 'ES2025',
});

/**
6 changes: 3 additions & 3 deletions src/transform.js
@@ -21,10 +21,10 @@ import emojiRegex from 'emoji-regex-xs';
*/
/**
Transforms an Oniguruma AST in-place to a [Regex+](https://github.com/slevithan/regex) AST.
-Targets `ESNext`, expecting the generator to then down-convert to the desired JS target version.
+Assumes target ES2025, expecting the generator to down-convert to the desired JS target version.
Regex+'s syntax and behavior is a strict superset of native JavaScript, so the AST is very close
-to representing native ESNext `RegExp` but with some added features (atomic groups, possessive
+to representing native ES2025 `RegExp` but with some added features (atomic groups, possessive
quantifiers, recursion). The AST doesn't use some of Regex+'s extended features like flag `x` or
subroutines because they follow PCRE behavior and work somewhat differently than in Oniguruma. The
AST represents what's needed to precisely reproduce Oniguruma behavior using Regex+.
@@ -46,7 +46,7 @@ function transform(ast, options) {
// approximation based on the target, so produce the appropriate structure here.
accuracy: 'default',
avoidSubclass: false,
-bestEffortTarget: 'ESNext',
+bestEffortTarget: 'ES2025',
...options,
};
// AST changes that work together with a `RegExp` subclass to add advanced emulation
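
Reviewer note: the rename works because targets are compared through the numeric `EsVersion` map in `src/options.js`, where the renamed key keeps the value `2025`. The sketch below illustrates that comparison and how `generate()` derives its feature switches from it. The body of `isMinTarget` is not part of this diff, so the version here is an assumption, and `featuresFor` is a hypothetical helper added only for illustration.

```javascript
// Copied from the post-commit state of src/options.js
const EsVersion = {
  ES2018: 2018,
  ES2024: 2024,
  ES2025: 2025,
};

// ASSUMPTION: the real isMinTarget is not shown in this commit; this is one plausible body.
// Returns true if `target` is at least as new as `min`.
function isMinTarget(target, min) {
  if (!Object.hasOwn(EsVersion, target) || !Object.hasOwn(EsVersion, min)) {
    throw new Error(`Unexpected target "${target}" or "${min}"`);
  }
  return EsVersion[target] >= EsVersion[min];
}

// HYPOTHETICAL helper: mirrors the switches that generate() sets from the target
function featuresFor(target) {
  const minTargetEs2024 = isMinTarget(target, 'ES2024');
  const minTargetEs2025 = isMinTarget(target, 'ES2025');
  return {
    useFlagV: minTargetEs2024,
    useFlagMods: minTargetEs2025,
    useDuplicateNames: minTargetEs2025,
  };
}

// useFlagV is true; useFlagMods and useDuplicateNames are false
console.log(featuresFor('ES2024'));
```

Under this sketch, a caller still passing the old `'ESNext'` string would hit the thrown error rather than silently falling back. Whether the library itself validates the option this way is not shown in the diff.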