A regular expression (regex) is neither a library nor a programming language. It is a sequence of characters that specifies a search pattern to match against text (a string).
A string can contain almost anything: letters, digits, whitespace, and special characters. As long as the part you are looking for follows some kind of pattern, a regex can describe that pattern and return the matching part of the string.
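As a minimal sketch of that idea, assuming Python's built-in `re` module (the sample string and variable names are made up for illustration and are not from the original write-up):

```python
import re

# Hypothetical sample text; any string with a recognizable pattern would do.
text = "Order 66 shipped on 2021-03-15 to Jane Doe"

# Four digits, a hyphen, two digits, a hyphen, two digits.
match = re.search(r"\d{4}-\d{2}-\d{2}", text)

# re.search returns None when nothing matches, so guard before using it.
shipped_on = match.group() if match else None
print(shipped_on)  # 2021-03-15
```

The pattern skips "66" because it requires four digits in a row before the first hyphen.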
- Escape character:
\
- Any character (except a newline, by default):
.
- Digit:
\d
- Not a digit:
\D
- Word character:
\w
- Not a word character:
\W
- Whitespace:
\s
- Not whitespace:
\S
- Word boundary:
\b
- Not a word boundary:
\B
- Beginning of a string:
^
- End of a string:
$
- Matches characters in brackets:
[ ]
- Matches characters not in brackets:
[^ ]
- Either or:
|
- Capturing group:
( )
- 0 or more:
*
- 1 or more:
+
- 0 or 1:
?
- An exact number of repetitions of the preceding token:
{n}
- A range of repetitions of the preceding token (no space after the comma):
{Minimum,Maximum}
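The tokens above can be tried out in a short script. This is a rough sketch assuming Python's `re` module; other regex engines may differ in small details, and the sample strings are invented for illustration.

```python
import re

# Hypothetical sample string, made up for this illustration.
sample = "Call 555-1234 or 555-987 before 9 am."

# \d with +: runs of one or more digits.
digit_runs = re.findall(r"\d+", sample)
# ['555', '1234', '555', '987', '9']

# {n}: exactly three digits, a hyphen, exactly four digits.
full_numbers = re.findall(r"\d{3}-\d{4}", sample)
# ['555-1234']

# \b with {min,max}: standalone runs of one to three digits.
short_runs = re.findall(r"\b\d{1,3}\b", sample)
# ['555', '555', '987', '9']

# ^, ., *, \. and $: starts with "Call" and ends with a literal dot.
is_sentence = bool(re.search(r"^Call.*\.$", sample))
# True

# ( ) with |: capture either alternative.
meridiem = re.search(r"(am|pm)", sample).group(1)
# 'am'

# ?: the "u" is optional.
spellings = re.findall(r"colou?r", "color colour")
# ['color', 'colour']

# [^ ]: drop every character that is not a digit.
digits_only = re.sub(r"[^0-9]", "", "555-1234")
# '5551234'

# \s with +: split on any run of whitespace.
tokens = re.split(r"\s+", "a  b\tc")
# ['a', 'b', 'c']
```

Raw strings (`r"..."`) are used throughout so that backslashes reach the regex engine unchanged.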
- Phone numbers
- Dates
- Names
- URLs
- Email addresses
- Addresses
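A few of these kinds of data can be pulled out of free text with short patterns. The sketch below assumes Python's `re` module and uses deliberately simplified patterns. They are illustrative assumptions, not complete validators: real phone, email, and URL formats vary far more than this. Names and postal addresses are left out because they rarely follow a pattern this regular.

```python
import re

# Hypothetical sample text invented for illustration.
text = (
    "Contact Jane Doe at jane.doe@example.com or (555) 123-4567. "
    "See https://example.com/invoices/42 for the invoice dated 03/15/2021."
)

# Simplified patterns (assumptions for this example only).
PATTERNS = {
    "phone": r"\(\d{3}\) \d{3}-\d{4}",     # (555) 123-4567
    "date": r"\b\d{2}/\d{2}/\d{4}\b",      # MM/DD/YYYY
    "email": r"[\w.+-]+@[\w-]+\.[\w.-]+",  # local@domain.tld
    "url": r"https?://\S+",                # http or https, up to whitespace
}

# Collect every match for each pattern.
found = {label: re.findall(pattern, text) for label, pattern in PATTERNS.items()}

for label, matches in found.items():
    print(label, matches)
```

Because none of these patterns contains a capturing group, `re.findall` returns the full matched text for each hit.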
Link to full write-up on Towards Data Science here.