http://regexpal.com/
http://www.regular-expressions.info/
http://mundogeek.net/archivos/2004/07/29/javascript-expresiones-regulares/
http://javascript.espaciolatino.com/lengjs/jsgram/expregulares.htm
http://www.javascriptkit.com/javatutors/redev.shtml
Regular expressions allow us to search and manipulate text in a very powerful way.
A regular expression consists of:
- A pattern used to locate the text that fits it
- Modifiers (optional) that indicate how to apply the pattern
In JavaScript, regular expressions are objects that we can create:
- With the constructor function RegExp: new RegExp("j.*t")
- With the literal notation: /j.*t/
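The two notations above build equivalent objects. The constructor is mainly useful when the pattern is only known at run time. The following is a minimal sketch: buildWordRegex is a hypothetical helper, and it assumes the word it receives contains only letters and digits.

```javascript
// Literal notation: the pattern is fixed when the script is parsed.
const literal = /j.*t/;

// Constructor notation: the pattern is an ordinary string, so it can be
// assembled at run time. Backslashes must be doubled inside the string.
const fromString = new RegExp("j.*t");
const digits = new RegExp("\\d+"); // equivalent to /\d+/

// Hypothetical helper: builds a case-insensitive whole-word pattern.
// Assumes word contains only letters and digits (no special characters).
function buildWordRegex(word) {
  return new RegExp("\\b" + word + "\\b", "i");
}

console.log(literal.source === fromString.source);          // true
console.log(digits.test("R2-D2"));                          // true
console.log(buildWordRegex("script").test("Java Script"));  // true
```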
Regular expression objects have the following properties:
- global: With false (the default) only the first match is returned. With true all the matches are returned
- ignoreCase: With true the matching ignores the difference between uppercase and lowercase (false by default)
- multiline: With true, ^ and $ also match at the beginning and end of every line (false by default)
- lastIndex: The position where the next search begins (0 by default)
- source: Contains the text of the pattern
These properties (except lastIndex) cannot be modified once the object has been created.
The first three properties correspond to the modifiers of the regular expression:
- g: global
- i: ignoreCase
- m: multiline
>>> var re = new RegExp('j.*t', 'gmi');
undefined
>>> re.global
true
>>> re.global = false;
false
>>> re.global
true
>>> var re = /j.*t/ig;
undefined
>>> re.global
true
>>> re.source
"j.*t"
RegExp objects have the following methods:
- test(): Returns true if it finds a match and false if it doesn't
- exec(): Returns an array with the matched string followed by the text captured by each group, or null if there is no match
>>> /j.*t/.test("Javascript")
false
>>> /j.*t/i.test("Javascript")
true
>>> /s(amp)le/i.exec("Sample text")
["Sample", "amp"]
>>> /a(b+)a/g.exec("_abbba_aba_")
["abbba", "bbb"]
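With the g modifier, exec() returns one match per call and uses lastIndex to remember where to continue, so it is usually called in a loop. A minimal sketch of that loop follows; collectMatches is a hypothetical helper written for this example, and it only records the first captured group.

```javascript
// collectMatches is a hypothetical helper written for this example.
function collectMatches(re, text) {
  if (!re.global) {
    // Without g, lastIndex never advances and the loop below would not end.
    throw new Error("collectMatches needs a regular expression with the g modifier");
  }
  re.lastIndex = 0; // always start from the beginning of the text
  const found = [];
  let m;
  // Each exec() call resumes at re.lastIndex and returns null when nothing is left.
  while ((m = re.exec(text)) !== null) {
    found.push({ match: m[0], group: m[1], index: m.index });
    // Step over an empty match so the loop cannot get stuck.
    if (m[0] === "") re.lastIndex++;
  }
  return found;
}

console.log(collectMatches(/a(b+)a/g, "_abbba_aba_"));
// [ { match: "abbba", group: "bbb", index: 1 },
//   { match: "aba", group: "b", index: 7 } ]
```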
http://www.javascriptkit.com/javatutors/re3.shtml
The String object has the following methods to search inside a text by using regular expressions:
- match(): Returns an array of matches (or null if there are none)
- search(): Returns the position of the first match (or -1 if there is none)
- replace(): Replaces the matched text with another string
- split(): Accepts a regular expression as the separator used to split a string into the elements of an array
- If we omit the g modifier, replace() only replaces the first match
- We can include the matched string in the replacement by using $&
- When the regular expression contains groups, we can access the text captured by each group with $1, $2, etc.
- The replacement can also be a function, which receives the following parameters:
  - The first parameter is the matched string
  - The last parameter is the string where the search is taking place
  - The penultimate parameter is the position of the match
  - The rest of the parameters are the text captured by each group of the pattern
>>> var s = "HelloJavaScriptWorld"
undefined
>>> s.match(/a/);
["a"]
>>> s.match(/a/g);
["a", "a"]
>>> s.match(/j.*a/i);
["Java"]
>>> s.search(/j.*a/i);
5
>>> s.replace(/[A-Z]/g, '');
"elloavacriptorld"
>>> s.replace(/[A-Z]/, '');
"elloJavaScriptWorld"
>>> s.replace(/[A-Z]/g, "_$&");
"_Hello_Java_Script_World"
>>> s.replace(/([A-Z])/g, "_$1");
"_Hello_Java_Script_World"
>>> "juanmanuel.garrido@softonic.com".replace(/(.*)@.*/, "$1");
"juanmanuel.garrido"
>>> function replaceCallback(match){return "_" + match.toLowerCase();}
undefined
>>> s.replace(/[A-Z]/g, replaceCallback);
"_hello_java_script_world"
>>> var sMail = "juanmanuel.garrido@softonic.com";
undefined
>>> var rRegExp = /(.*)@(.*)\.(.*)/;
undefined
>>> var fCallback = function () { args = arguments; return args[1] + " de " + args[2].toUpperCase(); }
undefined
>>> sMail.replace( rRegExp, fCallback);
"juanmanuel.garrido de SOFTONIC"
>>> args
["juanmanuel.garrido@softonic.com", "juanmanuel.garrido",
"softonic", "com", 0, "juanmanuel.garrido@softonic.com"]
>>> var csv = 'one, two,three ,four';
>>> csv.split(',');
["one", " two", "three ", "four"]
>>> csv.split(/\s*,\s*/)
["one", "two", "three", "four"]
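The function form of replace() described earlier is handy when the replacement depends on the captured text. The following is a small sketch under stated assumptions: fillTemplate and the {name} placeholder syntax are invented for this example and do not come from any library.

```javascript
// fillTemplate and the {name} placeholder syntax are assumptions of this sketch.
function fillTemplate(template, values) {
  return template.replace(/\{(\w+)\}/g, function (match, key) {
    // match is the whole placeholder, key is the text captured by (\w+).
    // Unknown placeholders are left untouched.
    return Object.prototype.hasOwnProperty.call(values, key)
      ? String(values[key])
      : match;
  });
}

console.log(fillTemplate("Hello {name}, you have {count} messages",
                         { name: "Ann", count: 3 }));
// "Hello Ann, you have 3 messages"
```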
>>> "test".replace('t', 'r')
"rest"
>>> "test".replace(new RegExp('t'), 'r')
"rest"
>>> "test".replace(/t/, 'r')
"rest"
https://en.wikipedia.org/wiki/Regular_expression
http://www.addedbytes.com/cheat-sheets/regular-expressions-cheat-sheet/
http://www.visibone.com/regular-expressions/
###[abc]
Matches any one of the characters listed between the brackets
>>> "some text".match(/[otx]/g)
["o", "t", "x", "t"]
###[a-z]
Matches any character in the given range
- [a-d] is the same as [abcd]
- [a-z] matches any lowercase letter
- [a-zA-Z0-9_] matches any letter, digit, or the underscore
>>> "Some Text".match(/[a-z]/g)
["o", "m", "e", "e", "x", "t"]
>>> "Some Text".match(/[a-zA-Z]/g)
["S", "o", "m", "e", "T", "e", "x", "t"]
###[^abc]
Matches any character that is NOT listed between the brackets
>>> "Some Text".match(/[^a-z]/g)
["S", " ", "T"]
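Character sets and negated sets are often enough for simple counting and clean-up tasks. A minimal sketch follows; countVowels and onlyDigits are hypothetical helpers written for this example.

```javascript
// Hypothetical helpers written for this example.

// Counts the vowels with a character set; match() returns null when
// nothing is found, so we fall back to an empty array.
function countVowels(text) {
  return (text.match(/[aeiou]/gi) || []).length;
}

// Removes every character that is not a digit with a negated set.
function onlyDigits(text) {
  return text.replace(/[^0-9]/g, "");
}

console.log(countVowels("Some Text"));     // 3
console.log(onlyDigits("(555) 123-4567")); // "5551234567"
```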
###a|b
Matches a or b (the bar means OR)
>>> "Some Text".match(/t|T/g);
["T", "t"]
>>> "Some Text".match(/t|T|Some/g);
["Some", "T", "t"]
###a(?=b)
Matches a only if it is followed by b
>>> "Some Text".match(/Some(?=Tex)/g);
null
>>> "Some Text".match(/Some(?= Tex)/g);
["Some"]
###a(?!b)
Matches a only if it is NOT followed by b
>>> "Some Text".match(/Some(?! Tex)/g);
null
>>> "Some Text".match(/Some(?!Tex)/g);
["Some"]
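In both forms the text inside the parentheses is checked but is not part of the match. The sketch below shows two possible uses; pixelValues and withThousands are hypothetical helpers, and withThousands assumes a non-negative integer.

```javascript
// Hypothetical helpers written for this example.

// Returns the numbers that are immediately followed by "px".
// The "px" itself is not included in the matches.
function pixelValues(css) {
  return css.match(/\d+(?=px)/g) || [];
}

// Inserts a comma at every position that is followed by groups of exactly
// three digits up to the end of the number. Assumes a non-negative integer.
function withThousands(n) {
  return String(n).replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

console.log(pixelValues("10px 20em 30px")); // ["10", "30"]
console.log(withThousands(1234567));        // "1,234,567"
```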
###\
Escape character, used to match as literals the characters that have a special meaning in a pattern
>>> "R2-D2".match(/[2-3]/g)
["2", "2"]
>>> "R2-D2".match(/[2\-3]/g)
["2", "-", "2"]
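When a pattern is built from text that we do not control, every special character in that text has to be escaped first. The following is a minimal sketch of one common way to do it; escapeRegExp is a helper name chosen for this example, not a built-in function.

```javascript
// escapeRegExp is a helper written for this example, not a built-in function.
// It puts a backslash before each character that is special in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const unsafe = new RegExp("a.c");               // the dot matches any character
const safe = new RegExp(escapeRegExp("a.c"));   // the dot is now a literal dot

console.log(escapeRegExp("1+1=2")); // "1\+1=2"
console.log(unsafe.test("abc"));    // true
console.log(safe.test("abc"));      // false
console.log(safe.test("a.c"));      // true
```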
###\n
New line
###\r
Carriage return (a new line is written as \r\n on Windows, \n on Unix, and \r on classic Mac OS)
###\f
Form feed (new page)
###\t
Tab
###\v
Vertical tab
###\s
A blank space or any of the previous 5 sequences
>>> "R2\n D2".match(/\s/g)
["\n", " "]
###\S
The opposite of \s. Matches everything except blank spaces and the 5 escape sequences above. The same as [^\s]
>>> "R2\n D2".match(/\S/g)
["R", "2", "D", "2"]
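A typical use of \s is cleaning up the spacing of a text. A minimal sketch follows; normalizeSpaces is a hypothetical helper written for this example.

```javascript
// normalizeSpaces is a hypothetical helper written for this example.
function normalizeSpaces(text) {
  return text
    .replace(/^\s+|\s+$/g, "") // remove whitespace at both ends
    .replace(/\s+/g, " ");     // collapse each inner run into one space
}

console.log(normalizeSpaces("  R2 \t\n D2  ")); // "R2 D2"
```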
###\w
Any letter, digit, or underscore. The same as [A-Za-z0-9_]
>>> "Some text!".match(/\w/g)
["S", "o", "m", "e", "t", "e", "x", "t"]
###\W
The opposite of \w
>>> "Some text!".match(/\W/g)
[" ", "!"]
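Since \W matches everything that is not a letter, digit, or underscore, it can be used to turn a title into a simple identifier. The following is a sketch only; slugify is a hypothetical helper, and it treats accented letters as separators because \w only covers A-Z, a-z, 0-9 and _.

```javascript
// slugify is a hypothetical helper written for this example.
// Accented letters count as \W here and are replaced too.
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/\W+/g, "-")   // each run of non-word characters becomes one hyphen
    .replace(/^-|-$/g, ""); // drop a hyphen left at either end
}

console.log(slugify("Some text!")); // "some-text"
```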
###\d
Matches a digit. The same as [0-9]
>>> "R2-D2 and C-3PO".match(/\d/g)
["2", "2", "3"]
###\D
The opposite of \d. It matches non-digit characters. The same as [^0-9] or [^\d]
>>> "R2-D2 and C-3PO".match(/\D/g)
["R", "-", "D", " ", "a", "n", "d", " ", "C", "-", "P", "O"]
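\d and \D complement each other: one selects the digits and the other can act as the separator between them. A minimal sketch follows; allDigits and dateParts are hypothetical helpers, and dateParts assumes the numbers are separated by a single non-digit character.

```javascript
// Hypothetical helpers written for this example.

// Joins every digit found in the text into one string.
function allDigits(text) {
  return (text.match(/\d/g) || []).join("");
}

// Splits on any single non-digit character and converts each part to a number.
// Assumes a text such as "2004/07/29" or "2004-07-29".
function dateParts(text) {
  return text.split(/\D/).map(Number);
}

console.log(allDigits("R2-D2 and C-3PO")); // "223"
console.log(dateParts("2004/07/29"));      // [2004, 7, 29]
```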
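###\b
A word boundary: the position between a word character and a non-word character (space, punctuation, hyphen...) or the beginning or end of the string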
>>> "R2D2 and C-3PO".match(/[RD]2/g)
["R2", "D2"]
>>> "R2D2 and C-3PO".match(/[RD]2\b/g)
["D2"]
>>> "R2-D2 and C-3PO".match(/[RD]2\b/g)
["R2", "D2"]
###\B
The opposite of \b
>>> "R2-D2 and C-3PO".match(/[RD]2\B/g)
null
>>> "R2D2 and C-3PO".match(/[RD]2\B/g)
["R2"]
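A common use of \b is to find a whole word without also finding it inside longer words. A minimal sketch follows; countWord is a hypothetical helper, and it assumes the word contains only letters and digits.

```javascript
// countWord is a hypothetical helper written for this example.
// Assumes word contains only letters and digits (no special characters).
function countWord(word, text) {
  const re = new RegExp("\\b" + word + "\\b", "gi");
  return (text.match(re) || []).length;
}

const sentence = "The cat bathes the other cat";
console.log(countWord("the", sentence));  // 2 ("bathes" and "other" do not count)
console.log(countWord("bath", sentence)); // 0
```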
###^
Represents the beginning of the string being searched.
With the m modifier it represents the beginning of every line.
>>> "regular\nregular\nexpression".match(/r/g);
["r", "r", "r", "r", "r"]
>>> "regular\nregular\nexpression".match(/^r/g);
["r"]
>>> "regular\nregular\nexpression".match(/^r/mg);
["r", "r"]
###$
Represents the end of the string being searched.
With the m modifier it represents the end of every line.
>>> "regular\nregular\nexpression".match(/r$/g);
null
>>> "regular\nregular\nexpression".match(/r$/mg);
["r", "r"]
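Used together, ^ and $ force the pattern to cover the whole string (or the whole line with the m modifier). A minimal sketch follows; isInteger and commentLines are hypothetical helpers, and the # comment syntax is an assumption of this example.

```javascript
// Hypothetical helpers written for this example.

// True only when the whole string is made of digits.
function isInteger(text) {
  return /^\d+$/.test(text);
}

// Returns the lines that start with "#". With the m modifier,
// ^ and $ match at the beginning and end of every line.
function commentLines(source) {
  return source.match(/^#.*$/mg) || [];
}

console.log(isInteger("42"));                     // true
console.log(isInteger("42px"));                   // false
console.log(commentLines("# one\ncode\n# two"));  // ["# one", "# two"]
```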
###.
Represents any character except the new line and the carriage return
>>> "regular".match(/r./g);
["re"]
>>> "regular".match(/r.../g);
["regu"]
###*
Matches if the preceding pattern occurs 0 or more times
/.*/ matches everything, including "nothing" (the empty string)
>>> "".match(/.*/)
[""]
>>> "anything".match(/.*/)
["anything"]
>>> "anything".match(/n.*h/)
["nyth"]
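Note that * takes as many characters as it can, which is why /n.*h/ would run up to the last h if there were several. The sketch below shows the effect and one way to limit it with a negated set; the sample text and variable names are invented for this example.

```javascript
// Sample text invented for this example.
const html = "<b>one</b> and <b>two</b>";

// .* is greedy: it runs up to the LAST </b> in the text.
const greedy = html.match(/<b>.*<\/b>/);

// [^<]* cannot cross a "<", so every pair of tags is matched separately.
const limited = html.match(/<b>[^<]*<\/b>/g);

console.log(greedy[0]); // "<b>one</b> and <b>two</b>"
console.log(limited);   // ["<b>one</b>", "<b>two</b>"]
```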
###?
Matches if the preceding pattern occurs 0 times or once
>>> "anything".match(/ny?/g)
["ny", "n"]
###+
Matches if the preceding pattern occurs 1 or more times (at least once)
>>> "anything".match(/ny+/g)
["ny"]
>>> "R2-D2 and C-3PO".match(/[a-z]/gi)
["R", "D", "a", "n", "d", "C", "P", "O"]
>>> "R2-D2 and C-3PO".match(/[a-z]+/gi)
["R", "D", "and", "C", "PO"]
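Two everyday uses of ? and +: accepting an optional letter, and grouping consecutive characters into words. A minimal sketch follows; spellings and words are hypothetical helpers written for this example.

```javascript
// Hypothetical helpers written for this example.

// The "u" is optional, so both spellings are found.
function spellings(text) {
  return text.match(/colou?r/g) || [];
}

// \w+ groups consecutive word characters into one match.
function words(text) {
  return text.match(/\w+/g) || [];
}

console.log(spellings("color colour")); // ["color", "colour"]
console.log(words("R2-D2 and C-3PO"));  // ["R2", "D2", "and", "C", "3PO"]
```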
###{n}
Matches if the preceding pattern occurs exactly n times
>>> "regular expression".match(/s/g)
["s", "s"]
>>> "regular expression".match(/s{2}/g)
["ss"]
>>> "regular expression".match(/\b\w{3}/g)
["reg", "exp"]
###{min,max}
Matches if the preceding pattern occurs between min and max times.
max can be omitted (then only the minimum applies); min cannot be omitted.
>>> "doooooooooodle".match(/o/g)
["o", "o", "o", "o", "o", "o", "o", "o", "o", "o"]
>>> "doooooooooodle".match(/o{2}/g)
["oo", "oo", "oo", "oo", "oo"]
>>> "doooooooooodle".match(/o{2,}/g)
["oooooooooo"]
>>> "doooooooooodle".match(/o{2,6}/g)
["oooooo", "oooo"]
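Combined with ^ and $, the {n} and {min,max} forms are useful to check the length of an input. The following is a sketch only; isHexColor and isUsername are hypothetical helpers, and the 3 to 16 character limit is an arbitrary choice for this example.

```javascript
// Hypothetical helpers written for this example.

// "#" followed by exactly six hexadecimal digits.
function isHexColor(text) {
  return /^#[0-9a-f]{6}$/i.test(text);
}

// Between 3 and 16 letters, digits or underscores (limits chosen arbitrarily).
function isUsername(text) {
  return /^\w{3,16}$/.test(text);
}

console.log(isHexColor("#1a2B3c")); // true
console.log(isHexColor("#12345"));  // false
console.log(isUsername("r2_d2"));   // true
console.log(isUsername("ab"));      // false
```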
###(pattern)
When a pattern is between parentheses, the text it matches is captured and stored so it can be used in substitutions (capture groups).
These captures are available as $1, $2, ... $9
>>> "regular expression".replace(/(r)/g, '$1$1')
"rregularr exprression"
>>> "regular expression".replace(/(r)(e)/g, '$2$1')
"ergular experssion"
###(?:pattern)
Non-capturing group: the pattern is grouped but not captured (not available as $1, $2, ...)
>>> "regular expression".replace(/(?:r)(e)/g, '$1$1')
"eegular expeession"