Java Regex DSL

Description

Regular expressions are a powerful and concise way to match text or extract information from it. But they can become lengthy and hard to read, especially when they contain many capturing groups that you want to refer to later in a match. Java Regex DSL uses the builder pattern to provide a fluent API that makes it easier to build large regular expressions from reusable, named components and to extract information from the matches.

How to use Java Regex DSL

Basic Usage

The first step is to create a Regex object that describes the regular expression using RegexBuilder. To create a Regex that matches a string followed by a number, you would write:

Regex regex = RegexBuilder.create()
                     .string("#name1").number("#name2")
                     .build();

To match the created Regex against a text, e.g. "foofoofoo1234" use:

Match match = regex.match("foofoofoo1234");

You can now access the match using the specified names "name1" and "name2":

match.getByName("name1"); //will result in "foofoofoo"
match.getByName("name2"); //will result in "1234"
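
For comparison, here is a rough sketch of the same match written directly against java.util.regex with named capturing groups. It does not use Java Regex DSL and is not how the library is implemented. The class name NamedGroupsSketch and the choice of `[A-Za-z]+` for a "string" and `\d+` for a "number" are assumptions made for this illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupsSketch {
    // Assumption: a "string" is one or more letters, a "number" is one or more digits.
    private static final Pattern PATTERN =
            Pattern.compile("(?<name1>[A-Za-z]+)(?<name2>\\d+)");

    /** Returns the text captured by the named group, or null if the whole input does not match. */
    static String extract(String input, String groupName) {
        Matcher m = PATTERN.matcher(input);
        return m.matches() ? m.group(groupName) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("foofoofoo1234", "name1")); // foofoofoo
        System.out.println(extract("foofoofoo1234", "name2")); // 1234
    }
}
```

Even for two groups, the pattern string already mixes group names, escapes and quantifiers, which is the readability problem the fluent builder is meant to address.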

Groups

You can group expressions using group(), which takes an optional parameter: the name under which the group can be accessed in a match. Every expression following group() is considered part of the group until the group is closed with end(). For example, to parse a timestamp of the format h:m:s.ms, you could write:

Regex regex = RegexBuilder.create()
                     .group("#timestamp")
                          .number("#hour").constant(":").number("#min").constant(":").number("#secs").constant(".").number("#ms")
                     .end()
                     .build();

You can then access the match like this:

Match match = regex.match("10:34:22.234");
match.getByName("timestamp"); //will return the group, i.e. 10:34:22.234
match.getByName("timestamp->ms"); //will return the ms of the group timestamp, i.e. 234
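
To make the group structure concrete, the following sketch parses the same timestamp format with plain java.util.regex. It is only an approximation under stated assumptions: each component is taken to be a run of digits (`\d+`), the class name TimestampSketch and the method parse are made up for this example, and standard Java group names are flat, so the inner group is read as "ms" rather than through a path such as "timestamp->ms".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TimestampSketch {
    // Outer group "timestamp" wraps the four inner groups, mirroring group(...) ... end().
    private static final Pattern TIMESTAMP = Pattern.compile(
            "(?<timestamp>(?<hour>\\d+):(?<min>\\d+):(?<secs>\\d+)\\.(?<ms>\\d+))");

    /** Returns every named part of the timestamp, or an empty map if the input does not match. */
    static Map<String, String> parse(String input) {
        Map<String, String> parts = new LinkedHashMap<>();
        Matcher m = TIMESTAMP.matcher(input);
        if (m.matches()) {
            for (String name : new String[] {"timestamp", "hour", "min", "secs", "ms"}) {
                parts.put(name, m.group(name));
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // Prints each group name with its captured text, outer group first.
        parse("10:34:22.234").forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```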

You can nest as many groups as you like, as this nonsense example shows:

Regex regex = RegexBuilder.create()
                     .group("#g1")
                        .group("#g2")
                            .group("#g3")
                               .string("#myString")
                            .end()
                        .end()
                     .end()
                     .build();

Match match = regex.match("foofoofoo");
match.getByName("g1->g2->g3->myString"); //will return "foofoofoo"

Optionals

To mark an expression as optional, use option(). It is used in exactly the same way as group() and, like a group, has to be closed with end().
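
As an illustration of what an optional part means at the regex level, the sketch below uses plain java.util.regex rather than Java Regex DSL. The pattern, the class name OptionalGroupSketch and the method millis are assumptions for this example. It matches seconds with an optional ".ms" suffix and shows that a named group inside an optional part yields null when that part is absent.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OptionalGroupSketch {
    // The fractional part sits in an optional non-capturing group: (?: ... )?
    private static final Pattern SECONDS =
            Pattern.compile("(?<secs>\\d+)(?:\\.(?<ms>\\d+))?");

    /** Returns the milliseconds part, or null when the optional part is missing. */
    static String millis(String input) {
        Matcher m = SECONDS.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a seconds value: " + input);
        }
        return m.group("ms"); // null if the optional group did not take part in the match
    }

    public static void main(String[] args) {
        System.out.println(millis("22.234")); // 234
        System.out.println(millis("22"));     // null
    }
}
```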

Reuse regexes

You can reuse a regex by nesting it into the builder:

Regex regexToReuse = ... //some regex

Regex regex = RegexBuilder.create()
                      .regex("#reused", regexToReuse) //reuse the previously created regex here
                      .build();

Pattern pass through

To pass a raw regular expression through, use the pattern() expression:

Regex regex = RegexBuilder.create()
                        .pattern("#myPattern", ".*(\\d)") //use any pattern you like
                      .build();
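
When writing raw patterns in Java source, keep in mind that a backslash inside a string literal has to be doubled, so the regex `\d` is written as "\\d". The following sketch runs the pattern `.*(\d)` directly with java.util.regex to show what it captures. The class name PassThroughSketch and the method lastDigit are hypothetical, and the library itself is not involved.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PassThroughSketch {
    // Regex .*(\d) written as a Java string literal: the backslash is doubled.
    private static final Pattern PATTERN = Pattern.compile(".*(\\d)");

    /** Returns the digit captured by group 1, or null if the input does not end in a digit. */
    static String lastDigit(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The greedy .* consumes as much as possible, leaving only the final digit for the group.
        System.out.println(lastDigit("abc123")); // 3
    }
}
```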

List of supported expressions

| Expression | Description |
| --- | --- |
| string(name) | Matches any word character |
| number(name) | Matches any number, including floats (e.g. 0.2345) |
| any() | Matches any character, including whitespace |
| regex(name, regex) | Matches the given Regex |
| pattern(name, pattern) | Matches the given regular expression |
| group(name) | Starts a group (has to be closed with end()) |
| option(name) | Starts an optional expression (has to be closed with end()) |