feat(parser-js): Initial support for the v flag for unicodeSets
nhaajt committed Sep 12, 2023
1 parent b7e94a3 commit d324e09
Showing 4 changed files with 211 additions and 179 deletions.
129 changes: 67 additions & 62 deletions README.md
@@ -11,7 +11,8 @@ to support [Stryker Mutator](https://github.com/stryker-mutator). Weapon regeX i
JavaScript and Scala and is used in [Stryker4s](https://github.com/stryker-mutator/stryker4s) and
[StrykerJS](https://github.com/stryker-mutator/stryker-js) flavors of Stryker.
The JavaScript version of the library is generated from Scala using [Scala.js](https://www.scala-js.org/).
The generated mutant regular expressions cover human errors, such as edge cases and typos. A list of provided mutators is given below.
The generated mutant regular expressions cover human errors, such as edge cases and typos. A list of provided mutators
is given below.
For an introduction to mutation testing, see [Stryker's website](https://stryker-mutator.io/).

The current supported versions for Scala are: `2.12`, `2.13` and `3`.
@@ -22,7 +23,8 @@ In case you want to incorporate Weapon-regeX into your project.

## Scala

Add Weapon regeX to your `build.sbt` [![Maven Central](https://img.shields.io/maven-central/v/io.stryker-mutator/weapon-regex_3.svg?label=Maven%20Central&colorB=brightgreen)](https://search.maven.org/artifact/io.stryker-mutator/weapon-regex_3):
Add Weapon regeX to
your `build.sbt` [![Maven Central](https://img.shields.io/maven-central/v/io.stryker-mutator/weapon-regex_3.svg?label=Maven%20Central&colorB=brightgreen)](https://search.maven.org/artifact/io.stryker-mutator/weapon-regex_3):

```scala
libraryDependencies += "io.stryker-mutator" %% "weapon-regex" % weaponRegexVersion
@@ -34,8 +36,8 @@ Mutate!
import weaponregex.WeaponRegeX

WeaponRegeX.mutate("^abc(d+|[xyz])$") match {
case Right(mutants) => mutants map (_.pattern)
case Left(e) => throw new RuntimeException(e)
case Right(mutants) => mutants map (_.pattern)
case Left(e) => throw new RuntimeException(e)
}
// res0: Seq[String] = List(
// "abc(d+|[xyz])$",
@@ -72,7 +74,7 @@ import wrx from 'weapon-regex';
let mutants = wrx.mutate('^abc(d+|[xyz])$');

mutants.forEach((mutant) => {
console.log(mutant.pattern);
console.log(mutant.pattern);
});
```

@@ -92,11 +94,11 @@ import weaponregex.mutator.BuiltinMutators
import weaponregex.parser.{ParserFlavor, ParserFlavorJVM}

def mutate(
pattern: String,
mutators: Seq[TokenMutator] = BuiltinMutators.all,
mutationLevels: Seq[Int] = null,
flavor: ParserFlavor = ParserFlavorJVM
): Either[String, Seq[Mutant]] = ???
pattern : String,
mutators : Seq[TokenMutator] = BuiltinMutators.all,
mutationLevels: Seq[Int] = null,
flavor : ParserFlavor = ParserFlavorJVM
): Either[String, Seq[Mutant]] = ???
```

With the `mutators` argument you can give a select list of mutators that should be used in
@@ -109,24 +111,27 @@ based on the levels in the list. If omitted, no filtering takes place.
The `flavor` argument allows setting the parser flavor that will be used to parse the pattern.
Currently, we support a `ParserFlavorJVM` and `ParserFlavorJS`. By default in Scala the JVM flavor is used.

This function will return a `Right` with `Seq[Mutant]` if it can be parsed, or a `Left` with the error message otherwise.
This function will return a `Right` with `Seq[Mutant]` if it can be parsed, or a `Left` with the error message
otherwise.

## JavaScript

The `mutate` function can be called with regular expression flags and an options object to control which mutators and which parser flavor should be used in the mutation process:
The `mutate` function can be called with regular expression flags and an options object to control which mutators and
which parser flavor should be used in the mutation process:

```js
import wrx from 'weapon-regex';

let mutants = wrx.mutate('^abc(d+|[xyz])$', 'u', {
mutators: Array.from(wrx.mutators.values()),
mutationLevels: [1, 2, 3],
flavor: ParserFlavorJS,
mutators: Array.from(wrx.mutators.values()),
mutationLevels: [1, 2, 3],
flavor: ParserFlavorJS,
});
```

All options can be omitted, and have the same functionality as the options described in the Scala
API section. By default in JS the JS parser flavor is used. You can get a map of mutators from the `mutators` attribute of the library. It is
API section. By default in JS the JS parser flavor is used. You can get a map of mutators from the `mutators` attribute
of the library. It is
a `Map<string, Mutator>` from string (mutator name) to a mutator object.

This function will return a JavaScript Array of `Mutant` if it can be parsed, or throw an exception otherwise.
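
Because the JavaScript `mutate` throws on a pattern it cannot parse, callers may want to wrap the call. The following is a minimal sketch, not part of the library: `safeMutate` and `stubMutate` are hypothetical names, and the stub only stands in for `wrx.mutate` so the example runs on its own.

```js
// Hypothetical helper, not part of weapon-regex: wraps a throwing
// `mutate`-style function and returns a result object instead.
function safeMutate(mutate, pattern, flags, options) {
  try {
    return { ok: true, mutants: mutate(pattern, flags, options) };
  } catch (e) {
    return { ok: false, error: e instanceof Error ? e.message : String(e) };
  }
}

// Stub standing in for `wrx.mutate`, for illustration only.
const stubMutate = (pattern) => {
  if (pattern.includes('(') && !pattern.includes(')')) {
    throw new Error('unbalanced group');
  }
  // Pretend the only mutation is removing a leading `^`.
  return [{ pattern: pattern.replace(/^\^/, '') }];
};

console.log(safeMutate(stubMutate, '^abc'));
console.log(safeMutate(stubMutate, '(abc'));
```

In real use one would presumably pass `wrx.mutate` as the first argument; taking the function as a parameter keeps the helper easy to test.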
@@ -135,30 +140,30 @@ This function will return a JavaScript Array of `Mutant` if it can be parsed, or

All the supported mutators and at which mutation level they appear are shown in the table below.

| Name | 1 | 2 | 3 |
| --------------------------------------------------------------- | --- | --- | --- |
| [BOLRemoval](#bolremoval) | ✅ | ✅ | ✅ |
| [EOLRemoval](#eolremoval) | ✅ | ✅ | ✅ |
| [BOL2BOI](#bol2boi) | | ✅ | ✅ |
| [EOL2EOI](#eol2eoi) | | ✅ | ✅ |
| [CharClassNegation](#charclassnegation) | ✅ |
| [CharClassChildRemoval](#charclasschildremoval) | | ✅ | ✅ |
| [CharClassAnyChar](#charclassanychar) | | ✅ | ✅ |
| [CharClassRangeModification](#charclassrangemodification) | | | ✅ |
| [PredefCharClassNegation](#predefcharclassnegation) | ✅ |
| [PredefCharClassNullification](#predefcharclassnullification) | | ✅ | ✅ |
| [PredefCharClassAnyChar](#predefcharclassanychar) | | ✅ | ✅ |
| [UnicodeCharClassNegation](#unicodecharclassnegation) | ✅ |
| [QuantifierRemoval](#quantifierremoval) | ✅ |
| [QuantifierNChange](#quantifiernchange) | | ✅ | ✅ |
| [QuantifierNOrMoreModification](#quantifiernormoremodification) | | ✅ | ✅ |
| [QuantifierNOrMoreChange](#quantifiernormorechange) | | ✅ | ✅ |
| [QuantifierNMModification](#quantifiernmmodification) | | ✅ | ✅ |
| [QuantifierShortModification](#quantifiershortmodification) | | ✅ | ✅ |
| [QuantifierShortChange](#quantifiershortchange) | | ✅ | ✅ |
| [QuantifierReluctantAddition](#quantifierreluctantaddition) | | | ✅ |
| [GroupToNCGroup](#grouptoncgroup) | | ✅ | ✅ |
| [LookaroundNegation](#lookaroundnegation) | ✅ | ✅ | ✅ |
| Name | 1 | 2 | 3 |
|-----------------------------------------------------------------|---|---|---|
| [BOLRemoval](#bolremoval) | ✅ | ✅ | ✅ |
| [EOLRemoval](#eolremoval) | ✅ | ✅ | ✅ |
| [BOL2BOI](#bol2boi) | | ✅ | ✅ |
| [EOL2EOI](#eol2eoi) | | ✅ | ✅ |
| [CharClassNegation](#charclassnegation) | ✅ |
| [CharClassChildRemoval](#charclasschildremoval) | | ✅ | ✅ |
| [CharClassAnyChar](#charclassanychar) | | ✅ | ✅ |
| [CharClassRangeModification](#charclassrangemodification) | | | ✅ |
| [PredefCharClassNegation](#predefcharclassnegation) | ✅ |
| [PredefCharClassNullification](#predefcharclassnullification) | | ✅ | ✅ |
| [PredefCharClassAnyChar](#predefcharclassanychar) | | ✅ | ✅ |
| [UnicodeCharClassNegation](#unicodecharclassnegation) | ✅ |
| [QuantifierRemoval](#quantifierremoval) | ✅ |
| [QuantifierNChange](#quantifiernchange) | | ✅ | ✅ |
| [QuantifierNOrMoreModification](#quantifiernormoremodification) | | ✅ | ✅ |
| [QuantifierNOrMoreChange](#quantifiernormorechange) | | ✅ | ✅ |
| [QuantifierNMModification](#quantifiernmmodification) | | ✅ | ✅ |
| [QuantifierShortModification](#quantifiershortmodification) | | ✅ | ✅ |
| [QuantifierShortChange](#quantifiershortchange) | | ✅ | ✅ |
| [QuantifierReluctantAddition](#quantifierreluctantaddition) | | | ✅ |
| [GroupToNCGroup](#grouptoncgroup) | | ✅ | ✅ |
| [LookaroundNegation](#lookaroundnegation) | ✅ | ✅ | ✅ |

## Boundary Mutators

@@ -167,7 +172,7 @@ All the supported mutators and at which mutation level they appear are shown in
Remove the beginning of line character `^`.

| Original | Mutated |
| -------- | ------- |
|----------|---------|
| `^abc` | `abc` |

[Back to table 🔝](#supported-mutators)
@@ -177,7 +182,7 @@ Remove the beginning of line character `^`.
Remove the end of line character `$`.

| Original | Mutated |
| -------- | ------- |
|----------|---------|
| `abc$` | `abc` |

[Back to table 🔝](#supported-mutators)
@@ -187,7 +192,7 @@ Remove the end of line character `$`.
Change the beginning of line character `^` to a beginning of input character `\A`.

| Original | Mutated |
| -------- | ------- |
|----------|---------|
| `^abc` | `\Aabc` |

[Back to table 🔝](#supported-mutators)
@@ -197,7 +202,7 @@ Change the beginning of line character `^` to a beginning of input character `\A
Change the end of line character `$` to an end of input character `\z`.

| Original | Mutated |
| -------- | ------- |
|----------|---------|
| `abc$` | `abc\z` |

[Back to table 🔝](#supported-mutators)
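
The four boundary mutators described above can be sketched as naive string operations. This is illustrative only: the library works on a parsed regex tree rather than raw strings, `boundaryMutants` and `endsWithEscape` are hypothetical names, and the sketch only looks at an anchor at the very start or end of the pattern.

```js
// Illustrative only: naive string-level versions of the four boundary
// mutators. The real library operates on a parsed regex tree.
function boundaryMutants(pattern) {
  const mutants = [];
  if (pattern.startsWith('^')) {
    const rest = pattern.slice(1);
    mutants.push({ name: 'BOLRemoval', pattern: rest });
    mutants.push({ name: 'BOL2BOI', pattern: '\\A' + rest });
  }
  // Treat a trailing `$` as an anchor only if it is not escaped (`\$`).
  if (pattern.endsWith('$') && !endsWithEscape(pattern.slice(0, -1))) {
    const rest = pattern.slice(0, -1);
    mutants.push({ name: 'EOLRemoval', pattern: rest });
    mutants.push({ name: 'EOL2EOI', pattern: rest + '\\z' });
  }
  return mutants;
}

// True when the string ends in an odd number of backslashes.
function endsWithEscape(s) {
  let count = 0;
  for (let i = s.length - 1; i >= 0 && s[i] === '\\'; i--) count++;
  return count % 2 === 1;
}

console.log(boundaryMutants('^abc$').map((m) => `${m.name}: ${m.pattern}`));
```

Note that `\A` and `\z` are JVM-flavor constructs; the sketch only builds the mutated pattern strings and never compiles them.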
@@ -209,7 +214,7 @@ Change the end of line character `$` to an end of input character `\z`.
Flips the sign of a character class.

| Original | Mutated |
| -------- | -------- |
|----------|----------|
| `[abc]` | `[^abc]` |
| `[^abc]` | `[abc]` |
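
As a rough illustration of the sign flip in the table above, the sketch below handles a pattern that consists of exactly one simple character class. `negateCharClass` is a hypothetical helper, not the library's implementation, and nested or multiple classes are out of scope.

```js
// Illustrative only: flips the sign of a single simple character class,
// e.g. `[abc]` <-> `[^abc]`.
function negateCharClass(charClass) {
  if (!charClass.startsWith('[') || !charClass.endsWith(']')) {
    throw new Error('expected a single character class such as [abc]');
  }
  const body = charClass.slice(1, -1);
  // A leading `^` inside the brackets marks a negated class.
  return body.startsWith('^') ? `[${body.slice(1)}]` : `[^${body}]`;
}

console.log(negateCharClass('[abc]'));
console.log(negateCharClass('[^abc]'));
```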

@@ -220,7 +225,7 @@ Flips the sign of a character class.
Remove a child of a character class.

| Original | Mutated |
| -------- | ------- |
|----------|---------|
| `[abc]` | `[bc]` |
| `[abc]` | `[ac]` |
| `[abc]` | `[ab]` |
@@ -232,7 +237,7 @@ Remove a child of a character class.
Change a character class to a character class which matches any character.

| Original | Mutated |
| -------- | -------- |
|----------|----------|
| `[abc]` | `[\w\W]` |

[Back to table 🔝](#supported-mutators)
@@ -242,7 +247,7 @@ Change a character class to a character class which matches any character.
Change the high and low of a range by one in both directions if possible.

| Original | Mutated |
| -------- | ------- |
|----------|---------|
| `[b-y]` | `[a-y]` |
| `[b-y]` | `[c-y]` |
| `[b-y]` | `[b-z]` |
@@ -257,7 +262,7 @@ Change the high and low of a range by one in both directions if possible.
Flips the sign of a predefined character class. All the predefined character classes are shown in the table below.

| Original | Mutated |
| -------- | ------- |
|----------|---------|
| `\d` | `\D` |
| `\D` | `\d` |
| `\s` | `\S` |
@@ -272,7 +277,7 @@ Flips the sign of a predefined character class. All the predefined character cla
Remove the backslash from a predefined character class such as `\w`.

| Original | Mutated |
| -------- | ------- |
|----------|---------|
| `\d` | `d` |
| `\D` | `D` |
| `\s` | `s` |
@@ -288,7 +293,7 @@ Change a predefined character class to a character class containing the predefin
negation.

| Original | Mutated |
| -------- | -------- |
|----------|----------|
| `\d` | `[\d\D]` |
| `\D` | `[\D\d]` |
| `\s` | `[\s\S]` |
@@ -303,7 +308,7 @@ negation.
Flips the sign of a Unicode character class.

| Original | Mutated |
| ----------- | ----------- |
|-------------|-------------|
| `\p{Alpha}` | `\P{Alpha}` |
| `\P{Alpha}` | `\p{Alpha}` |

@@ -317,7 +322,7 @@ Remove a quantifier. This is done for all possible quantifiers, even ranges, and
and possessive variants.

| Original | Mutated |
| ----------- | ------- |
|-------------|---------|
| `abc?` | `abc` |
| `abc*` | `abc` |
| `abc+` | `abc` |
@@ -338,7 +343,7 @@ and possessive variants.
Change the fixed amount quantifier to a couple of range variants.

| Original | Mutated |
| -------- | ---------- |
|----------|------------|
| `abc{9}` | `abc{0,9}` |
| `abc{9}` | `abc{9,}` |

@@ -350,7 +355,7 @@ Change the `n` to infinity range quantifier to a couple of variants where the lo
incremented and decremented by one.

| Original | Mutated |
| --------- | ---------- |
|-----------|------------|
| `abc{9,}` | `abc{8,}` |
| `abc{9,}` | `abc{10,}` |
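
The increment and decrement shown in the table can be sketched on a pattern that ends in an `{n,}` quantifier. This is a simplified illustration, not the library's code: `nOrMoreModifications` is a hypothetical name, and skipping the decrement when `n` is 0 is an assumption made here to avoid a negative bound.

```js
// Illustrative only: produces the `{n-1,}` and `{n+1,}` variants for a
// pattern that ends in an `{n,}` quantifier.
function nOrMoreModifications(pattern) {
  const match = /^(.*)\{(\d+),\}$/s.exec(pattern);
  if (match === null) return []; // no trailing `{n,}` quantifier
  const [, prefix, digits] = match;
  const n = Number(digits);
  const mutants = [];
  // Assumption: do not decrement below zero.
  if (n > 0) mutants.push(`${prefix}{${n - 1},}`);
  mutants.push(`${prefix}{${n + 1},}`);
  return mutants;
}

console.log(nOrMoreModifications('abc{9,}'));
```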

@@ -361,7 +366,7 @@ incremented and decremented by one.
Turn an `n` or more range quantifier into a fixed number quantifier.

| Original | Mutated |
| --------- | -------- |
|-----------|----------|
| `abc{9,}` | `abc{9}` |

[Back to table 🔝](#supported-mutators)
@@ -372,7 +377,7 @@ Alter the `n` to `m` range quantifier by decrementing or incrementing the high a
range by one.

| Original | Mutated |
| ---------- | ----------- |
|------------|-------------|
| `abc{3,9}` | `abc{2,9}` |
| `abc{3,9}` | `abc{4,9}` |
| `abc{3,9}` | `abc{3,8}` |
@@ -387,7 +392,7 @@ variant (`{0,1}`, `{0,}`, `{1,}`), and applies the same mutations as mentioned i
above.

| Original | Mutated |
| -------- | ---------- |
|----------|------------|
| `abc?` | `abc{1,1}` |
| `abc?` | `abc{0,0}` |
| `abc?` | `abc{0,2}` |
@@ -402,7 +407,7 @@ above.
Change the shorthand quantifiers `*` and `+` to their fixed range quantifier variant.

| Original | Mutated |
| -------- | -------- |
|----------|----------|
| `abc*` | `abc{0}` |
| `abc+` | `abc{1}` |

@@ -413,7 +418,7 @@ Change the shorthand quantifiers `*` and `+` to their fixed range quantifier var
Change greedy quantifiers to reluctant quantifiers.

| Original | Mutated |
| ----------- | ------------ |
|-------------|--------------|
| `abc?` | `abc??` |
| `abc*` | `abc*?` |
| `abc+` | `abc+?` |
@@ -430,7 +435,7 @@ Change greedy quantifiers to reluctant quantifiers.
Change a normal group to a non-capturing group.

| Original | Mutated |
| -------- | --------- |
|----------|-----------|
| `(abc)` | `(?:abc)` |

[Back to table 🔝](#supported-mutators)
@@ -440,7 +445,7 @@ Change a normal group to a non-capturing group.
Flips the sign of a lookaround (lookahead, lookbehind) construct.

| Original | Mutated |
| ---------- | ---------- |
|------------|------------|
| `(?=abc)` | `(?!abc)` |
| `(?!abc)` | `(?=abc)` |
| `(?<=abc)` | `(?<!abc)` |
4 changes: 2 additions & 2 deletions core/src/main/scala/weaponregex/parser/ParserJS.scala
@@ -19,8 +19,8 @@ import weaponregex.model.regextree.*
*/
class ParserJS private[parser] (pattern: String, val flags: Option[String] = None) extends Parser(pattern) {

/** Whether the flags contain the `u` flag for Unicode mode */
private val unicodeMode: Boolean = flags.exists(_.contains("u"))
/** Whether the flags contain the `u` or `v` flag for Unicode mode */
private val unicodeMode: Boolean = flags.exists(f => f.contains("u") || f.contains("v"))

/** Regex special characters
*/
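
For readers more familiar with JavaScript, the changed `unicodeMode` check in this hunk can be mirrored roughly as follows. This is a sketch for illustration; `isUnicodeMode` is a hypothetical name and not part of the library's JavaScript API.

```js
// Illustrative JavaScript counterpart of the Scala `unicodeMode` check:
// `flags` may be undefined or null when no flags are given, which mirrors
// the `Option[String]` in the Scala code.
function isUnicodeMode(flags) {
  return flags !== undefined && flags !== null &&
    (flags.includes('u') || flags.includes('v'));
}

console.log(isUnicodeMode('u'));
console.log(isUnicodeMode('gv'));
console.log(isUnicodeMode('gi'));
console.log(isUnicodeMode(undefined));
```

In JavaScript itself the `u` and `v` flags cannot be combined on one regular expression, so at most one of the two checks matches for a valid flag string.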