When I was running the sessions, I started by talking about a text editor or word processor's typical search functionality and then offering up "queries" that we could express easily as people but which these applications couldn't handle. For example:
Every word that begins with a number
Uncapitalized words at the beginning of a sentence.
Phone numbers
You could tell a human to search a Word document for these things and they'd understand what to do, notwithstanding how tedious the task would be. I then said that regular expressions are a language that allows us to articulate more sophisticated patterns like these.
It achieves this by allowing us to do two things that Word's search functionality doesn't:
The ability to express character classes, i.e., instead of searching for a fixed character, we can search for one of the N characters contained in a character class.
The ability to express the idea of repetition, i.e., look for an X repeated 0 or more times, 1 or more times, exactly 5 times, 0 or 1 times, and so on.
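As a rough sketch of how those two ideas cover the three example queries, here is one way they might look using Python's `re` module. The sample strings, the variable names, and the `ddd-ddd-dddd` phone format are assumptions made for illustration, and `\b` (a word boundary) is an extra piece of notation not covered in these notes.

```python
import re

# Query 1: every word that begins with a number.
# [0-9] is a character class (exactly one digit);
# [A-Za-z0-9]* repeats a letter-or-digit class 0 or more times.
notes = "We moved 3rd-party boxes to 2nd Street in 1999."
starts_with_digit = re.findall(r"\b[0-9][A-Za-z0-9]*", notes)
print(starts_with_digit)  # ['3rd', '2nd', '1999']

# Query 2: uncapitalized words at the beginning of a sentence.
# Sentence-ending punctuation, 1 or more spaces, then 1 or more lowercase letters.
# (This simple version does not check the very first word of the text.)
story = "It rained. we stayed in. Then we left! nobody minded."
uncapitalized = re.findall(r"[.!?] +[a-z]+", story)
print(uncapitalized)  # ['. we', '! nobody']

# Query 3: phone numbers, assuming a ddd-ddd-dddd layout.
# {3} and {4} mean "exactly 3 times" and "exactly 4 times".
message = "Call 555-123-4567 or 555-987-6543 after 5."
phone_numbers = re.findall(r"[0-9]{3}-[0-9]{3}-[0-9]{4}", message)
print(phone_numbers)  # ['555-123-4567', '555-987-6543']
```

Real phone numbers come in many more formats than this, so the third pattern is a teaching example rather than something to validate input with.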
Before introducing the short-hand notation for common character classes like \d, \s, and so on, I made sure they understood regular expressions like
After that, I went into more typical regexes. You can see in the notes where I spelled out the "vocabulary."
. = any single character (in most regex flavors, anything except a newline by default)
? = 0 or 1 times
* = 0 or more times
+ = 1 or more times
{N} = exactly N times
{X,Y} = between X and Y times
Character classes, e.g., [aeiou], [01234]
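To make the vocabulary concrete, here is a small sketch that checks each symbol against a few strings. It assumes Python's `re` module; `fully_matches` is a hypothetical helper written for this example, and it uses `re.fullmatch` so that the whole string has to fit the pattern.

```python
import re

def fully_matches(pattern, s):
    """Return True if the ENTIRE string s fits the pattern."""
    return re.fullmatch(pattern, s) is not None

# .  = any single character
print(fully_matches(r"c.t", "cat"), fully_matches(r"c.t", "ct"))             # True False

# ?  = 0 or 1 times
print(fully_matches(r"colou?r", "color"), fully_matches(r"colou?r", "colouur"))  # True False

# *  = 0 or more times
print(fully_matches(r"ab*", "a"), fully_matches(r"ab*", "abbb"))             # True True

# +  = 1 or more times
print(fully_matches(r"ab+", "a"), fully_matches(r"ab+", "ab"))               # False True

# {N} = exactly N times
print(fully_matches(r"a{3}", "aaa"), fully_matches(r"a{3}", "aaaa"))         # True False

# {X,Y} = between X and Y times
print(fully_matches(r"a{2,4}", "aa"), fully_matches(r"a{2,4}", "aaaaa"))     # True False

# Character class combined with repetition: one or more vowels
print(fully_matches(r"[aeiou]+", "aeiou"), fully_matches(r"[aeiou]+", "xyz"))  # True False
```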
There are "convenience" shorthand ways of writing many common character classes:
\d = [0123456789], i.e., any digit
\D = [^0123456789], i.e., any non-digit
\s = (any whitespace character, incl. space, tab, newline)
\S = any non-whitespace character
[A-Za-z0-9]
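A quick way to convince yourself that the shorthands are just abbreviations is to run both spellings on the same text. This sketch assumes Python's `re` module and an invented sample string. One caveat specific to Python 3: on ordinary strings `\d` also matches non-ASCII digits unless the `re.ASCII` flag is passed, so the flag is used here to make `\d` mean exactly `[0123456789]`.

```python
import re

sample = "Room 101,\tfloor 7\nBuilding B"

# \d versus the spelled-out digit class
digits_shorthand = re.findall(r"\d", sample, flags=re.ASCII)
digits_explicit = re.findall(r"[0123456789]", sample)
print(digits_shorthand == digits_explicit, digits_shorthand)  # True ['1', '0', '1', '7']

# \D versus the negated digit class
non_digits_shorthand = re.findall(r"\D", sample, flags=re.ASCII)
non_digits_explicit = re.findall(r"[^0123456789]", sample)
print(non_digits_shorthand == non_digits_explicit)  # True

# \s picks out the space, tab, and newline characters
whitespace = re.findall(r"\s", sample)
print(whitespace)  # [' ', '\t', ' ', '\n', ' ']

# \S+ grabs each run of non-whitespace characters
chunks = re.findall(r"\S+", sample)
print(chunks)  # ['Room', '101,', 'floor', '7', 'Building', 'B']
```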
We can also express and combine ranges where it makes sense:
[0-9] = [0123456789]
[a-z] = [abcdefghijklmnopqrstuvwxyz]
[A-Z] = [ABCDEFGHIJKLMNOPQRSTUVWXYZ]
and combine them, e.g.,
[a-z,] = [abcdefghijklmnopqrstuvwxyz,]
[A-Za-z] = [ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz]
[a-z0-9] = [abcdefghijklmnopqrstuvwxyz0123456789]
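The equivalences above can be checked mechanically. This is a minimal sketch, assuming Python's `re` and `string` modules; `chars_matched` is a hypothetical helper that tries a class against every printable ASCII character and collects the ones it accepts.

```python
import re
import string

def chars_matched(char_class):
    """Return the set of printable ASCII characters that the class matches."""
    return {c for c in string.printable if re.fullmatch(char_class, c)}

# A range matches the same characters as the spelled-out class.
print(chars_matched("[0-9]") == chars_matched("[0123456789]"))                 # True
print(chars_matched("[a-z]") == chars_matched("[abcdefghijklmnopqrstuvwxyz]"))  # True

# Combined ranges are the union of their parts.
print(chars_matched("[A-Za-z]") == set(string.ascii_letters))                   # True
print(chars_matched("[a-z0-9]") == set(string.ascii_lowercase + string.digits))  # True

# A range plus a single extra character: the comma is just one more member.
print(chars_matched("[a-z,]") == set(string.ascii_lowercase) | {","})           # True
```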
Create some kind of supplemental kata/code/project that demonstrates how to use regex.
Things that I think could be useful to emphasize:
.match()
.sub()
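As a possible starting point for such a kata, here is a short sketch of the two methods, assuming they refer to Python's `re` module. The phone-number pattern and sample strings are invented for illustration, and `.search()` is included only as a contrast because `.match()` is easy to misread.

```python
import re

# Compile once, reuse the pattern object's methods.
phone = re.compile(r"\d{3}-\d{3}-\d{4}")

# .match() succeeds only if the pattern matches at the START of the string.
at_start = phone.match("555-123-4567 is my number")
not_at_start = phone.match("My number is 555-123-4567")
print(at_start.group())  # 555-123-4567
print(not_at_start)      # None

# .search() finds the first match anywhere in the string.
found = phone.search("My number is 555-123-4567")
print(found.group())     # 555-123-4567

# .sub() replaces every match with the given replacement text.
redacted = phone.sub("XXX-XXX-XXXX", "Call 555-123-4567 or 555-987-6543.")
print(redacted)          # Call XXX-XXX-XXXX or XXX-XXX-XXXX.
```

The `.match()` versus `.search()` difference is probably worth its own exercise, since a `None` result from `.match()` often surprises people who expect it to scan the whole string.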