<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Regex cheatsheet</title>
<!-- Bootstrap core CSS -->
<link href="css/bootstrap.min.css" rel="stylesheet">
<!-- Bootstrap theme -->
<link href="css/bootstrap-theme.min.css" rel="stylesheet">
<!-- Docs extras -->
<link href="css/docs.min.css" rel="stylesheet">
<!-- Custom styles for this template -->
<link href="css/theme.css" rel="stylesheet">
<style>
td.na {
background: #F4E5E5;
}
/* sidebar */
.bs-docs-sidebar {
padding-left: 20px;
margin-top: 20px;
margin-bottom: 20px;
}
.fixed {
position: fixed;
}
@media print {
/* Overrides Bootstrap's stylesheet: don't show URLs when printing */
a[href]:after {
content: none;
}
td.na {
background: #EEE;
}
/* sidebar */
.bs-docs-sidebar {
display: none;
}
}
</style>
<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="hidden-print">
<a href="https://github.com/remram44/regex-cheatsheet"><img style="position: absolute; top: 0; right: 0; border: 0;" src="https://camo.githubusercontent.com/38ef81f8aca64bb9a64448d0d70f1308ef5341ab/68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f6769746875622f726962626f6e732f666f726b6d655f72696768745f6461726b626c75655f3132313632312e706e67" alt="Fork me on GitHub" data-canonical-src="https://s3.amazonaws.com/github/ribbons/forkme_right_darkblue_121621.png"></a>
</div>
<div class="container" role="main">
<div class="row">
<div class="col-md-9">
<div class="hidden-print">
<section id="intro" class="group">
<h1>Regex cheatsheet</h1>
<p>Many programs use regular expressions to find and replace text. However, each tends to come with its own flavor.</p>
<p>Most modern software and programming languages use some variation of the Perl flavor ("PCRE"); command-line tools (grep, less, ...), however, often use the POSIX flavor (sometimes its extended variant, e.g. <code>egrep</code> or <code>sed -r</code>). Vim also comes with its own syntax (a superset of what Vi accepts).</p>
<p>This cheatsheet lists the respective <a href="#syntax">syntax of each flavor</a>, and the <a href="#programs">software that uses it</a>.</p>
<p>If you spot errors or missing data, or just want to make this prettier/more accurate, don't hesitate to open an <a href="https://github.com/remram44/regex-cheatsheet/issues/new">issue</a> or a <a href="https://github.com/remram44/regex-cheatsheet/compare/">pull request</a>.</p>
</section>
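<p>As a rough illustration of how much the flavor matters, the sketch below runs the same "ab or cd" search through Python's <code>re</code> module twice: once spelled the Perl/ERE way and once spelled the POSIX BRE way, as listed in the syntax table. The helper name <code>matched_text</code> and the sample strings are made up for this example; only <code>re.search</code> is a real API.</p>

```python
import re


def matched_text(pattern, text):
    """Return the text matched by pattern, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.group(0) if match else None


# Perl/ERE-style spelling: bare parentheses group, bare | alternates.
perl_style = r"(ab|cd)"
# BRE-style spelling: in Python's flavor the backslashes turn the
# parentheses and the | into literal characters instead.
bre_style = r"\(ab\|cd\)"

found_perl = matched_text(perl_style, "xcd")
found_bre = matched_text(bre_style, "xcd")
found_bre_literal = matched_text(bre_style, "(ab|cd)")
```

<p>With Python's flavor, the first call should find <code>cd</code>, the second nothing, and the third the literal text <code>(ab|cd)</code>. Going by the table, a BRE tool such as plain <code>grep</code> would read the two spellings the other way around.</p>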
</div>
<section id="syntax" class="group">
<h1>Syntax</h1>
<table class="table table-bordered table-striped">
<thead>
<tr><th>What</th><th><a href="http://perldoc.perl.org/perlre.html">Perl</a>/PCRE</th><th><a href="https://docs.python.org/library/re.html">Python's <code>re</code></a></th><th>POSIX (BRE)</th><th>POSIX extended (ERE)</th><th>Vim</th></tr>
</thead>
<tbody id="syntax-basics" class="subgroup">
<tr><th colspan="6">Basics</th></tr>
<tr><td>Custom character class</td><td><code>[...]</code></td><td><code>[...]</code></td><td><code>[...]</code></td><td><code>[...]</code></td><td><code>[...]</code></td></tr>
<tr><td>Negated custom character class</td><td><code>[^...]</code></td><td><code>[^...]</code></td><td><code>[^...]</code></td><td><code>[^...]</code></td><td><code>[^...]</code></td></tr>
<tr><td>Is <code>\</code> special inside a class?</td><td>yes</td><td>yes</td><td>no; <code>]</code> is literal if it comes first</td><td>no; <code>]</code> is literal if it comes first</td><td>yes</td></tr>
<tr><td>Ranges</td><td><code>[a-z]</code>, <code>-</code> is literal if first or last</td><td><code>[a-z]</code>, <code>-</code> is literal if first or last</td><td><code>[a-z]</code>, <code>-</code> is literal if first or last</td><td><code>[a-z]</code>, <code>-</code> is literal if first or last</td><td><code>[a-z]</code>, <code>-</code> is literal if first or last</td></tr>
<tr><td>Alternation</td><td><code>|</code></td><td><code>|</code></td><td><code>\|</code></td><td><code>|</code></td><td><code>\|</code> <code>\&amp;</code> (low precedence)</td></tr>
<tr><td>Escaped character</td><td><code>\033</code> <code>\x1B</code> <code>\x{1234}</code> <code>\N{name}</code> <code>\N{U+263D}</code></td><td><code>\x12</code></td><td class="na"></td><td class="na"></td><td><code>\%d123</code> <code>\%x2A</code> <code>\%u1234</code> <code>\%U1234ABCD</code></td></tr>
</tbody>
<tbody id="syntax-characters" class="subgroup">
<tr><th colspan="6">Character classes</th></tr>
<tr><td>Any character (except newline)</td><td><code>.</code></td><td><code>.</code></td><td><code>.</code></td><td><code>.</code></td><td><code>.</code></td></tr>
<tr><td>Any character (including newline)</td><td class="na"></td><td class="na"></td><td class="na"></td><td class="na"></td><td><code>\_.</code></td></tr>
<tr><td>Match a "word" character (alphanumeric plus <code>_</code>)</td><td><code>\w</code> <code>[[:word:]]</code></td><td><code>\w</code></td><td><code>\w</code></td><td><code>\w</code></td><td><code>\w</code></td></tr>
<tr><td>Case</td><td><code>[[:upper:]]</code> / <code>[[:lower:]]</code></td><td class="na"></td><td><code>[[:upper:]]</code> / <code>[[:lower:]]</code></td><td><code>[[:upper:]]</code> / <code>[[:lower:]]</code></td><td><code>\u</code> <code>[[:upper:]]</code> / <code>\l</code> <code>[[:lower:]]</code></td></tr>
<tr><td>Match a non-"word" character</td><td><code>\W</code></td><td><code>\W</code></td><td class="na"></td><td class="na"></td><td><code>\W</code></td></tr>
<tr><td>Match a whitespace character (except newline)</td><td class="na"></td><td class="na"></td><td><code>\s</code> <code>[[:space:]]</code></td><td><code>\s</code> <code>[[:space:]]</code></td><td><code>\s</code> <code>[[:space:]]</code></td></tr>
<tr><td>Whitespace including newline</td><td><code>\s</code> <code>[[:space:]]</code></td><td><code>\s</code></td><td class="na"></td><td class="na"></td><td><code>\_s</code></td></tr>
<tr><td>Match a non-whitespace character</td><td><code>\S</code></td><td><code>\S</code></td><td><code>[^[:space:]]</code></td><td><code>[^[:space:]]</code></td><td><code>\S</code> <code>[^[:space:]]</code></td></tr>
<tr><td>Match a digit character</td><td><code>\d</code> <code>[[:digit:]]</code></td><td><code>\d</code></td><td><code>[[:digit:]]</code></td><td><code>[[:digit:]]</code></td><td><code>\d</code> <code>[[:digit:]]</code></td></tr>
<tr><td>Match a non-digit character</td><td><code>\D</code></td><td><code>\D</code></td><td><code>[^[:digit:]]</code></td><td><code>[^[:digit:]]</code></td><td><code>\D</code> <code>[^[:digit:]]</code></td></tr>
<tr><td>Any hexadecimal digit</td><td><code>[[:xdigit:]]</code></td><td class="na"></td><td><code>[[:xdigit:]]</code></td><td><code>[[:xdigit:]]</code></td><td><code>\x</code> <code>[[:xdigit:]]</code></td></tr>
<tr><td>Any octal digit</td><td></td><td class="na"></td><td></td><td></td><td><code>\o</code></td></tr>
<tr><td>Any graphical character excluding "word" characters</td><td><code>[[:punct:]]</code></td><td class="na"></td><td><code>[[:punct:]]</code></td><td><code>[[:punct:]]</code></td><td><code>[[:punct:]]</code></td></tr>
<tr><td>Any alphabetical character</td><td><code>[[:alpha:]]</code></td><td class="na"></td><td><code>[[:alpha:]]</code></td><td><code>[[:alpha:]]</code></td><td><code>\a</code> <code>[[:alpha:]]</code></td></tr>
<tr><td>Non-alphabetical character</td><td></td><td class="na"></td><td><code>[^[:alpha:]]</code></td><td><code>[^[:alpha:]]</code></td><td><code>\A</code> <code>[^[:alpha:]]</code></td></tr>
<tr><td>Any alphanumerical character</td><td><code>[[:alnum:]]</code></td><td class="na"></td><td><code>[[:alnum:]]</code></td><td><code>[[:alnum:]]</code></td><td><code>[[:alnum:]]</code></td></tr>
<tr><td>ASCII</td><td><code>[[:ascii:]]</code></td><td class="na"></td><td class="na"></td><td class="na"></td><td class="na"></td></tr>
<tr><td>Character equivalents (e = é = è) (as per locale)</td><td></td><td class="na"></td><td><code>[[=e=]]</code></td><td><code>[[=e=]]</code></td><td><code>[[=e=]]</code></td></tr>
</tbody>
<tbody id="syntax-assert" class="subgroup">
<tr><th colspan="6">Zero-width assertions</th></tr>
<tr><td>Word boundary</td><td><code>\b</code></td><td><code>\b</code></td><td><code>\b</code> / <code>\&lt;</code> (start) / <code>\&gt;</code> (end)</td><td><code>\b</code> / <code>\&lt;</code> (start) / <code>\&gt;</code> (end)</td><td><code>\&lt;</code> (start) / <code>\&gt;</code> (end)</td></tr>
<tr><td>Anywhere but word boundary</td><td><code>\B</code></td><td><code>\B</code></td><td><code>\B</code></td><td><code>\B</code></td><td class="na"></td></tr>
<tr><td>Beginning of line/string</td><td><code>^</code> / <code>\A</code></td><td><code>^</code> / <code>\A</code></td><td><code>^</code></td><td><code>^</code></td><td><code>^</code> (beginning of pattern) <code>\_^</code></td></tr>
<tr><td>End of line/string</td><td><code>$</code> / <code>\Z</code></td><td><code>$</code> / <code>\Z</code></td><td><code>$</code></td><td><code>$</code></td><td><code>$</code> (end of pattern) <code>\_$</code></td></tr>
</tbody>
<tbody id="syntax-groups" class="subgroup">
<tr><th colspan="6">Captures and groups</th></tr>
<tr><td>Capturing group</td><td><code>(...)</code> <code>(?&lt;name&gt;...)</code></td><td><code>(...)</code> <code>(?P&lt;name&gt;...)</code></td><td><code>\(...\)</code></td><td><code>(...)</code></td><td><code>\(...\)</code></td></tr>
<tr><td>Non-capturing group</td><td><code>(?:...)</code></td><td><code>(?:...)</code></td><td class="na"></td><td class="na"></td><td><code>\%(...\)</code></td></tr>
<tr><td>Backreference to a specific group</td><td><code>\1</code> <code>\g1</code> <code>\g{-1}</code></td><td><code>\1</code></td><td><code>\1</code></td><td><code>\1</code> (unofficial extension)</td><td><code>\1</code></td></tr>
<tr><td>Named backreference</td><td><code>\g{name}</code> <code>\k&lt;name&gt;</code></td><td><code>(?P=name)</code></td><td class="na"></td><td class="na"></td><td class="na"></td></tr>
</tbody>
<tbody id="syntax-lookaround" class="subgroup">
<tr><th colspan="6">Look-around</th></tr>
<tr><td>Positive look-ahead</td><td><code>(?=...)</code></td><td><code>(?=...)</code></td><td class="na"></td><td class="na"></td><td><code>\(...\)\@=</code></td></tr>
<tr><td>Negative look-ahead</td><td><code>(?!...)</code></td><td><code>(?!...)</code></td><td class="na"></td><td class="na"></td><td><code>\(...\)\@!</code></td></tr>
<tr><td>Positive look-behind</td><td><code>(?&lt;=...)</code></td><td><code>(?&lt;=...)</code></td><td class="na"></td><td class="na"></td><td><code>\(...\)\@&lt;=</code></td></tr>
<tr><td>Negative look-behind</td><td><code>(?&lt;!...)</code></td><td><code>(?&lt;!...)</code></td><td class="na"></td><td class="na"></td><td><code>\(...\)\@&lt;!</code></td></tr>
</tbody>
<tbody id="syntax-multiplicity" class="subgroup">
<tr><th colspan="6">Multiplicity</th></tr>
<tr><td>0 or 1</td><td><code>?</code></td><td><code>?</code></td><td><code>\?</code></td><td><code>?</code></td><td><code>\?</code></td></tr>
<tr><td>0 or more</td><td><code>*</code></td><td><code>*</code></td><td><code>*</code></td><td><code>*</code></td><td><code>*</code></td></tr>
<tr><td>1 or more</td><td><code>+</code></td><td><code>+</code></td><td class="na"></td><td><code>+</code></td><td><code>\+</code></td></tr>
<tr><td>Specific number</td><td><code>{n}</code> <code>{n,m}</code> <code>{n,}</code></td><td><code>{n}</code> <code>{n,m}</code> <code>{n,}</code></td><td><code>\{n\}</code> <code>\{n,m\}</code> <code>\{n,\}</code></td><td><code>{n}</code> <code>{n,m}</code> <code>{n,}</code></td><td><code>\{n}</code> <code>\{n,m}</code> <code>\{n,}</code></td></tr>
<tr><td>0 or 1, non-greedy</td><td><code>??</code></td><td><code>??</code></td><td class="na"></td><td class="na"></td><td><code>\{-,1}</code></td></tr>
<tr><td>0 or more, non-greedy</td><td><code>*?</code></td><td><code>*?</code></td><td class="na"></td><td class="na"></td><td><code>\{-}</code></td></tr>
<tr><td>1 or more, non-greedy</td><td><code>+?</code></td><td><code>+?</code></td><td class="na"></td><td class="na"></td><td><code>\{-1,}</code></td></tr>
<tr><td>Specific number, non-greedy</td><td><code>{n,m}?</code> <code>{n,}?</code></td><td><code>{n,m}?</code> <code>{n,}?</code></td><td class="na"></td><td class="na"></td><td><code>\{-n,m}</code> <code>\{-n,}</code></td></tr>
<tr><td>0 or 1, possessive (don't give back on backtrack)</td><td><code>?+</code></td><td><code>?+</code> (Python 3.11+)</td><td class="na"></td><td class="na"></td><td class="na"></td></tr>
<tr><td>0 or more, possessive (don't give back on backtrack)</td><td><code>*+</code></td><td><code>*+</code> (Python 3.11+)</td><td class="na"></td><td class="na"></td><td class="na"></td></tr>
<tr><td>1 or more, possessive (don't give back on backtrack)</td><td><code>++</code></td><td><code>++</code> (Python 3.11+)</td><td class="na"></td><td class="na"></td><td class="na"></td></tr>
<tr><td>Specific number, possessive (don't give back on backtrack)</td><td><code>{n,m}+</code> <code>{n,}+</code></td><td><code>{n,m}+</code> <code>{n,}+</code> (Python 3.11+)</td><td class="na"></td><td class="na"></td><td class="na"></td></tr>
</tbody>
<tbody id="syntax-other" class="subgroup">
<tr><th colspan="6">Other</th></tr>
<tr><td>Independent non-backtracking pattern</td><td><code>(?&gt;...)</code></td><td><code>(?&gt;...)</code> (Python 3.11+)</td><td class="na"></td><td class="na"></td><td><code>\(...\)\@&gt;</code></td></tr>
<tr><td>Make case-insensitive/sensitive</td><td><code>(?i)</code> / <code>(?-i)</code></td><td><code>(?i)</code> / <code>(?-i:...)</code></td><td class="na"></td><td class="na"></td><td><code>\c</code> / <code>\C</code></td></tr>
</tbody>
</table>
</section>
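<p>A few rows of the table can be tried out directly. The following is a minimal sketch for the Python <code>re</code> column only, showing greedy versus non-greedy repetition, a positive look-ahead and a backreference; the sample strings and variable names are invented for illustration.</p>

```python
import re

sample = "[a][b]"

# Greedy: .* grabs as much as possible, so the match runs to the last ].
greedy = re.search(r"\[.*\]", sample).group(0)
# Non-greedy: .*? stops at the first ] that lets the pattern succeed.
lazy = re.search(r"\[.*?\]", sample).group(0)

# Positive look-ahead: match a word only if a colon follows,
# without including the colon in the match.
keys = re.findall(r"\w+(?=:)", "host: example, port: 80")

# Backreference: \1 must repeat whatever group 1 captured.
doubled = re.search(r"(\w)\1", "hello").group(0)
```

<p>Under these assumptions, <code>greedy</code> is the whole string <code>[a][b]</code>, <code>lazy</code> is just <code>[a]</code>, <code>keys</code> holds <code>host</code> and <code>port</code>, and <code>doubled</code> is <code>ll</code>.</p>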
<section id="programs" class="group">
<h1>Programs</h1>
<table class="table table-bordered table-striped">
<thead>
<tr><th>What</th><th>Syntax</th><th>Comments/gotchas</th></tr>
</thead>
<tbody id="programs-languages" class="subgroup">
<tr><th colspan="3">Programming languages</th></tr>
<tr><td><a href="http://perldoc.perl.org/perlre.html">Perl</a></td><td>PCRE</td><td>PCRE is actually a separate implementation from Perl's, with <a href="http://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expressions#Differences_from_Perl">slight differences</a></td></tr>
<tr><td><a href="https://docs.python.org/library/re.html">Python's <code>re</code> standard lib</a></td><td>Python's own syntax (Perl-inspired)</td><td></td></tr>
<tr><td><a href="http://ruby-doc.org/core-2.2.0/Regexp.html">Ruby</a></td><td>Ruby's own syntax (Perl-inspired)</td><td></td></tr>
<tr><td><a href="http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html">Java's java.util.regex</a></td><td>Almost PCRE</td><td></td></tr>
<tr><td><a href="http://www.boost.org/doc/libs/1_49_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html">Boost.Regex</a></td><td>PCRE</td><td></td></tr>
</tbody>
<tbody id="programs-editors" class="subgroup">
<tr><th colspan="3">Text editors</th></tr>
<tr><td><a href="http://www.eclipse.org/tptp/home/downloads/installguide/gla_42/ref/rregexp.html">Eclipse</a></td><td>PCRE</td><td></td></tr>
<tr><td>Emacs</td><td>?</td><td></td></tr>
<tr><td>Netbeans</td><td>PCRE</td><td></td></tr>
<tr><td>Notepad++</td><td>PCRE (Boost.Regex)</td><td></td></tr>
<tr><td>PyCharm</td><td>PCRE</td><td>Perl-inspired</td></tr>
<tr><td>Sublime Text</td><td>?</td><td></td></tr>
<tr><td>UltraEdit</td><td>PCRE</td><td></td></tr>
<tr><td>Vim</td><td>Vim</td><td>In <a href="https://vimdoc.sourceforge.net/htmldoc/pattern.html#/magic">"magic" mode</a> (default)</td></tr>
</tbody>
<tbody id="programs-cmdline" class="subgroup">
<tr><th colspan="3">Command-line tools</th></tr>
<tr><td>awk</td><td>ERE</td><td>might depend on the implementation</td></tr>
<tr><td>grep</td><td>BRE, <code>egrep</code> or <code>grep -E</code> for ERE, <code>grep -P</code> for PCRE (optional)</td><td></td></tr>
<tr><td>less</td><td>ERE</td><td>usually; man page says "regular expression library supplied by your system"</td></tr>
<tr><td>screen</td><td>plain text</td><td></td></tr>
<tr><td>sed</td><td>BRE, <code>-E</code> switches to ERE</td><td></td></tr>
</tbody>
</table>
</section>
</div>
<nav class="col-md-3 bs-docs-sidebar">
<ul id="sidebar" class="nav nav-stacked fixed">
<li>
<a href="#intro">Introduction</a>
</li>
<li>
<a href="#syntax">Syntax</a>
<ul class="nav nav-stacked">
<li><a href="#syntax-basics">Basics</a></li>
<li><a href="#syntax-characters">Character classes</a></li>
<li><a href="#syntax-assert">Zero-width assertions</a></li>
<li><a href="#syntax-groups">Captures and groups</a></li>
<li><a href="#syntax-lookaround">Look-around</a></li>
<li><a href="#syntax-multiplicity">Multiplicity</a></li>
<li><a href="#syntax-other">Other</a></li>
</ul>
</li>
<li>
<a href="#programs">Programs</a>
<ul class="nav nav-stacked">
<li><a href="#programs-languages">Programming languages</a></li>
<li><a href="#programs-editors">Text editors</a></li>
<li><a href="#programs-cmdline">Command-line tools</a></li>
</ul>
</li>
</ul>
</nav>
</div>
</div>
<!-- Bootstrap core JavaScript
================================================== -->
<!-- Placed at the end of the document so the pages load faster -->
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<script src="js/bootstrap.min.js"></script>
<!--<script src="js/docs.min.js"></script>-->
<!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
<script src="js/ie10-viewport-bug-workaround.js"></script>
<script>
$('body').scrollspy({
target: '.bs-docs-sidebar',
offset: 40
});
</script>
</body>
</html>