bbrother92/regex

Regex in Java

docs

Escaping

String st = "+{._ ])";
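// The metacharacters + { . and ) must each be escaped with a backslash (written \\ in a Java string literal); _ the space and ] are literal here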
assertThat(st, matchesPattern("\\+\\{\\._ ]\\)") );

Backslash

To find a backslash in a string such as abc\, the regex itself is \\ (an escaped backslash). In a Java string literal each of those backslashes must be escaped again, which gives four: Pattern.compile("\\\\");
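
A minimal sketch of the double escaping described above. The class and method names (BackslashDemo, containsBackslash) are made up for illustration and are not part of this repository.

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // The regex engine needs two characters (an escaped backslash) to match one
    // literal backslash; the Java compiler needs each of those doubled again.
    static final Pattern BACKSLASH = Pattern.compile("\\\\");

    // Returns true if the input contains at least one backslash character.
    static boolean containsBackslash(String input) {
        return BACKSLASH.matcher(input).find();
    }

    public static void main(String[] args) {
        String withBackslash = "abc\\"; // four characters: a, b, c, backslash
        System.out.println(containsBackslash(withBackslash)); // true
        System.out.println(containsBackslash("abc"));         // false
        System.out.println(BACKSLASH.pattern().length());     // 2
    }
}
```

The compiled pattern is only two characters long, which is an easy way to check how many backslashes actually reach the regex engine.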

Matcher methods

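matches() attempts to match the entire region against the pattern, so to test whether the input merely contains the pattern, wrap it as .*pattern.* or use find() instead
lookingAt() starts at the beginning of the region like matches(), but does not require that the entire region be matched
find() attempts to find the next subsequence of the input sequence that matches the pattern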
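
The difference between the three methods is easiest to see on the same pattern. This is an illustrative sketch; the class MatcherMethodsDemo, its helper methods, and the \d+ pattern are assumptions chosen for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherMethodsDemo {
    // One or more ASCII digits.
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // matches(): the whole input must match.
    static boolean matchesWhole(String s) {
        return DIGITS.matcher(s).matches();
    }

    // lookingAt(): the match must start at the beginning, but may end early.
    static boolean matchesPrefix(String s) {
        return DIGITS.matcher(s).lookingAt();
    }

    // find(): a match anywhere in the input is enough.
    static boolean containsMatch(String s) {
        return DIGITS.matcher(s).find();
    }

    // Repeated find() calls walk through successive matches.
    static int countMatches(String s) {
        Matcher m = DIGITS.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("123abc"));    // false
        System.out.println(matchesPrefix("123abc"));   // true
        System.out.println(containsMatch("abc123"));   // true
        System.out.println(countMatches("a1b22c333")); // 3
    }
}
```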

Pattern

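Pattern.quote() returns a literal pattern string for the given input, so that any metacharacters in it are matched as ordinary characters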
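
A short sketch of why quoting matters when the search text comes from outside the program. The class QuoteDemo and its two helpers are hypothetical names used only for this example.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Treats 'literal' as plain text: metacharacters such as + or . lose their meaning.
    static boolean containsLiteral(String text, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(text).find();
    }

    // Same search without quoting, shown for comparison only.
    static boolean containsAsRegex(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // Quoted: looks for the three characters 1, +, 1.
        System.out.println(containsLiteral("1+1=2", "1+1")); // true
        // Unquoted: 1+1 means "one or more 1s followed by a 1", i.e. at least "11".
        System.out.println(containsAsRegex("1+1=2", "1+1")); // false
    }
}
```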

Flags

Constant                    Equivalent embedded flag expression
Pattern.CANON_EQ            None
Pattern.CASE_INSENSITIVE    (?i)
Pattern.COMMENTS            (?x)
Pattern.MULTILINE           (?m)
Pattern.DOTALL              (?s)
Pattern.LITERAL             None
Pattern.UNICODE_CASE        (?u)
Pattern.UNIX_LINES          (?d)
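
As a rough illustration of the table, a flag can be passed as a constant to Pattern.compile or written inline at the start of the pattern, and constants can be combined with the bitwise OR operator. The class FlagsDemo, its method names, and the sample patterns are assumptions made for this sketch.

```java
import java.util.regex.Pattern;

class FlagsDemo {
    // Flag passed as a constant.
    static boolean matchesWithConstant(String s) {
        return Pattern.compile("hello", Pattern.CASE_INSENSITIVE).matcher(s).matches();
    }

    // Same flag written as an embedded expression.
    static boolean matchesWithEmbedded(String s) {
        return Pattern.compile("(?i)hello").matcher(s).matches();
    }

    // No flag at all: the comparison is case sensitive.
    static boolean matchesWithoutFlag(String s) {
        return Pattern.compile("hello").matcher(s).matches();
    }

    // DOTALL lets . match line terminators as well.
    static boolean dotMatchesNewline(String s) {
        return Pattern.compile("a.b", Pattern.DOTALL).matcher(s).matches();
    }

    // Two flags combined: ^ and $ match at line boundaries, ignoring case.
    static boolean hasLineEqualToB(String s) {
        int flags = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;
        return Pattern.compile("^b$", flags).matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWithConstant("HeLLo")); // true
        System.out.println(matchesWithEmbedded("HELLO")); // true
        System.out.println(matchesWithoutFlag("HELLO"));  // false
        System.out.println(dotMatchesNewline("a\nb"));    // true
        System.out.println(hasLineEqualToB("a\nB"));      // true
    }
}
```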

Unicode

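Use the UNICODE_CHARACTER_CLASS flag (embedded form (?U)) so that predefined classes such as \w match Unicode characters rather than only ASCII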

String chinese = "四";
assertTrue(Pattern.compile("\\w", Pattern.UNICODE_CHARACTER_CLASS).matcher(chinese).matches()); // matches word
assertTrue(Pattern.compile("(?U)\\w").matcher(chinese).matches()); // matches word
assertFalse(Pattern.compile("\\w").matcher(chinese).matches()); // no match: without the flag \w is ASCII-only

Use \p{...} classes to match Unicode scripts (IsLatin), blocks (InGreek) and categories (Lu).

String greek="Ω Δ";
String mixed="Λ d";
String latinUpper="O D";

assertTrue(greek.matches("(?U)\\p{InGreek} \\w")); // matches
assertTrue(greek.matches("(?U)\\p{Lu} \\p{Lu}")); // matches Greek Uppercase

assertFalse(mixed.matches("(?U)\\p{IsLatin} \\p{IsLatin}")); // no match: the first character is Greek, not Latin
assertTrue(latinUpper.matches("\\p{Lu} \\p{Lu}")); // matches Latin Uppercase
assertTrue(mixed.matches("\\P{IsLatin} \\p{IsLatin}")); // matches non latin and latin
