Quantifiers
Quantifiers let you match a pattern once, many times, or not at all. For example:
- You want to find a string of consecutive digits, with at least one digit and no maximum.
- You want to find all the words of five letters or more within a string.
With regex, you do this by passing a RegexQuantifier object to an element or group function. For example:
val regex = regex {
    digit(OneOrMore)
}
will give us a regex that matches one or more consecutive digits.
And:
val regex = regex {
    letter(AtLeast(5))
}
will give us a regex matching at least 5 consecutive letters.
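For comparison, here is a minimal sketch of the same two ideas written as raw patterns with Kotlin's standard Regex class. The helper names digitRuns and longWords are hypothetical, and the patterns \d+ and \p{L}{5,} are assumptions about what "digit" and "letter" mean here; they are not necessarily the exact patterns the builder generates.

```kotlin
// Finds every run of one or more consecutive digits (raw quantifier: +).
fun digitRuns(input: String): List<String> =
    Regex("""\d+""").findAll(input).map { it.value }.toList()

// Finds every run of at least five consecutive letters (raw quantifier: {5,}).
// Assumption: "letter" is taken to mean any Unicode letter, written \p{L}.
fun longWords(input: String): List<String> =
    Regex("""\p{L}{5,}""").findAll(input).map { it.value }.toList()

fun main() {
    println(digitRuns("Order 66 shipped in 2024"))      // [66, 2024]
    println(longWords("The quick brown foxes jumped"))  // [quick, brown, foxes, jumped]
}
```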
Quantifiers can be passed into any element-matching method, or to any of the grouping functions, in which case they apply to the whole group. They are specified using the following objects and classes of the RegexQuantifier sealed class.
| Quantifier | Matches | Raw regex equivalent |
|---|---|---|
| ZeroOrOne | Either zero or one occurrence of the element or group | ? |
| ZeroOrMore | Any number of occurrences of the element or group, including none at all | * |
| OneOrMore | At least one occurrence of the element or group, but no maximum | + |
| Exactly(times: Int) | Exactly the specified number of occurrences of the element or group | {x} |
| AtLeast(minimum: Int) | At least the specified minimum number of occurrences of the element or group | {x,} |
| NoMoreThan(maximum: Int) | No more than the specified maximum number of occurrences of the element or group | {0,x} |
| Between(minimum: Int, maximum: Int) | At least the specified minimum, and no more than the specified maximum, number of occurrences of the element or group | {x,y} |
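To make the "Raw regex equivalent" column concrete, the sketch below applies each raw quantifier to the single character a, using only Kotlin's standard Regex class, and reports which repeat counts from 0 to 5 are accepted as a full match. The helper acceptedCounts is hypothetical, and x = 2 and y = 3 are arbitrary values chosen for illustration.

```kotlin
// Returns every repeat count n in 0..maxCount for which "a" repeated n times
// is matched in full by the raw pattern "a" followed by the given quantifier.
fun acceptedCounts(quantifier: String, maxCount: Int = 5): List<Int> {
    val pattern = Regex("a$quantifier")
    return (0..maxCount).filter { n -> pattern.matches("a".repeat(n)) }
}

fun main() {
    // Raw suffixes from the table, with x = 2 and y = 3 (illustrative values).
    val rows = listOf(
        "ZeroOrOne" to "?",
        "ZeroOrMore" to "*",
        "OneOrMore" to "+",
        "Exactly(2)" to "{2}",
        "AtLeast(2)" to "{2,}",
        "NoMoreThan(3)" to "{0,3}",
        "Between(2, 3)" to "{2,3}"
    )
    for ((name, suffix) in rows) {
        println("$name -> $suffix accepts counts ${acceptedCounts(suffix)}")
    }
}
```

Because the check stops at five repetitions, the open-ended quantifiers (*, + and {2,}) show up as accepting every count up to that limit.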
Quantifiers (with the exception of Exactly) are "greedy" by default: that is, they will match the longest possible matching string. Consider this regex:
val regex = regex {
    wordBoundary()
    anyCharacter(OneOrMore)
    wordBoundary()
}
When matched against the string "Hello there, world", this will match "Hello there, world": as many characters as it can possibly find between two word boundaries.
If we want it to match just the first word, we need to make the regex lazy: that is, tell it to match as few characters as possible between two word boundaries. We do that like this:
val regex = regex {
    wordBoundary()
    anyCharacter(OneOrMore.butAsFewAsPossible)
    wordBoundary()
}
Now when matched against "Hello there, world", the regex will match "Hello".
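The same greedy and lazy behaviour can be reproduced with raw patterns and Kotlin's standard Regex class. This is a sketch that assumes the two builders above correspond roughly to \b.+\b and \b.+?\b; the helper firstMatch is hypothetical, and the patterns the library actually generates may differ in detail.

```kotlin
// Returns the text of the first match of the pattern in the input, or null if none.
fun firstMatch(pattern: String, input: String): String? =
    Regex(pattern).find(input)?.value

fun main() {
    val input = "Hello there, world"

    // Greedy: .+ takes as many characters as it can before the final word boundary.
    println(firstMatch("""\b.+\b""", input))   // Hello there, world

    // Lazy: .+? stops at the first position where a word boundary follows.
    println(firstMatch("""\b.+?\b""", input))  // Hello
}
```

In raw regex syntax, the only difference between the two patterns is the ? placed after the quantifier.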
RegexToolbox: Now you can be a hero without knowing regular expressions.