###########################################################
103.7 Search text files using regular expressions
###########################################################
Candidates should be able to manipulate files and text data using regular expressions. This objective includes creating simple regular expressions containing several notational elements. It also includes using regular expression tools to perform searches through a filesystem or file content.
Objectives
& Create simple regular expressions containing several notational elements.
& Use regular expression tools to perform searches through a filesystem or file content.
& grep
& egrep
& fgrep
& sed
& regex
################################
Regex
################################
A regular expression (regex or regexp) is a pattern that describes what we want to match in a text. Here we will discuss the form of regex used with the grep command (the name comes from the ed editor command g/re/p: globally search for a regular expression and print the matching lines).
GNU grep supports two kinds of regex: Basic and Extended.
################################
Basic blocks
################################
1- Adding two expressions
If you need to add (concat) two expressions, just write them after each other.
Regex    Will match
a        ali, mina, hamid, jamshid
na       nasim, mina, nananana batman, mona
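The table above can be checked directly with grep. This is a minimal sketch; the list of names is sample data invented for the illustration.

```shell
# Hypothetical sample data: one name per line.
names=$(printf '%s\n' ali mina hamid nasim mona bob)

# 'na' is simply 'n' followed by 'a'; grep prints every line that contains it.
matches=$(printf '%s\n' "$names" | grep 'na')
printf '%s\n' "$matches"
```

Only mina, nasim and mona are printed, because they are the only lines where an n is immediately followed by an a.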
2- Repeating
& The * means repeating the previous character 0 or more times
& The + means repeating the previous character 1 or more times
& The ? means zero or one repeat
& {n,m} means the item is matched at least n times, but not more than m times
In basic regex, GNU grep needs a backslash before +, ? and the braces (\+, \?, \{n,m\}); in extended regex they are written without it.
Regex    Will match                   Note
a*b      ab, aaab, aaaaab, aaabthis
a*b      b, mobser                    Because there is a b here with zero a before it
a+b      ab, aab, aaabenz             Won't match sober or b because there needs to be at least one a
a?b      ab, aab, b, batman           batman matches as zero a, then b
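A short sketch of the repetition operators in action. The word list is invented for the example, and -E is used so that + and {n,m} can be written without backslashes.

```shell
# Hypothetical sample data.
words=$(printf '%s\n' ab aaab b sober mobser aabenz)

# a*b: zero or more 'a' then 'b', so every line containing a 'b' matches.
star=$(printf '%s\n' "$words" | grep -c 'a*b')

# a+b: at least one 'a' directly before the 'b'.
plus=$(printf '%s\n' "$words" | grep -E 'a+b')

# a{2,3}b: two or three 'a' directly before the 'b'.
interval=$(printf '%s\n' "$words" | grep -E 'a{2,3}b')

printf 'a*b matched %s lines\n' "$star"
printf '%s\n' "$plus"
printf '%s\n' "$interval"
```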
3- Alternation (|)
In GNU basic regex, a\|b matches a or b; in extended regex the same thing is written a|b.
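As an illustration, the two spellings of alternation can be compared side by side. The sample lines are made up, and the \| form assumes GNU grep.

```shell
# Hypothetical sample data.
pets=$(printf '%s\n' 'black cat' 'brown dog' 'red bird')

# Basic regex: the alternation bar must be escaped (GNU grep).
basic=$(printf '%s\n' "$pets" | grep 'cat\|dog')

# Extended regex: the bar is written as is.
extended=$(printf '%s\n' "$pets" | grep -E 'cat|dog')

printf '%s\n' "$basic"
```

Both commands select the same two lines.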
4- Character Classes
The dot (.) means any single character, so .. will match anything with at least two characters in it. You can also create your own classes: [abc] matches a or b or c, and [a-z] matches any character from a to z.
Perl-style regex refers to digits with \d, but GNU grep understands that only with the -P option; in basic and extended regex use [0-9] instead.
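A small sketch of the dot and of bracket classes, using invented sample lines. Digits are matched with [0-9], which works in both basic and extended regex.

```shell
# Hypothetical sample data.
items=$(printf '%s\n' a ab cat cot cut c9t)

# '..' needs at least two characters on the line, so the single 'a' is skipped.
two_chars=$(printf '%s\n' "$items" | grep -c '..')

# [ao] matches exactly one character, either 'a' or 'o'.
vowels=$(printf '%s\n' "$items" | grep 'c[ao]t')

# [0-9] matches exactly one digit.
digit=$(printf '%s\n' "$items" | grep 'c[0-9]t')

printf '%s\n' "$two_chars" "$vowels" "$digit"
```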
5- Ranges
There are easy ways to refer to commonly used classes. Named classes open with [: and close with :], and they are used inside a bracket expression, as in [[:digit:]].
Range                      Meaning
[:alnum:]                  Alphanumeric characters
[:blank:]                  Space and tab characters
[:digit:]                  The digits 0 through 9 (equivalent to 0-9)
[:upper:] and [:lower:]    Upper and lower case letters, respectively
^ (negation)               As the first character after [ in a character class, negates the sense of the remaining characters
A common form is .* which matches any run of characters (of zero or any length).
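The named classes can be tried out as follows. The sample lines are invented; note the double brackets, since the class name sits inside a bracket expression.

```shell
# Hypothetical sample data.
places=$(printf '%s\n' 'room 101' lobby Floor2)

# Lines containing at least one digit.
with_digit=$(printf '%s\n' "$places" | grep '[[:digit:]]')

# Lines containing at least one upper case letter.
with_upper=$(printf '%s\n' "$places" | grep '[[:upper:]]')

# Negation: lines containing a character that is NOT alphanumeric (here, the space).
non_alnum=$(printf '%s\n' "$places" | grep '[^[:alnum:]]')

printf '%s\n' "$with_digit" "$with_upper" "$non_alnum"
```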
6- Matching specific locations
& The caret ^ means the beginning of the line (grep and sed test one line at a time)
& The dollar $ means the end of the line
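A quick sketch of the two anchors with grep, which tests each input line separately. The fruit list is sample data made up for the example.

```shell
# Hypothetical sample data.
fruits=$(printf '%s\n' apple banana grape avocado)

# ^a: the line must start with 'a'.
starts=$(printf '%s\n' "$fruits" | grep '^a')

# e$: the line must end with 'e'.
ends=$(printf '%s\n' "$fruits" | grep 'e$')

# Both anchors together: the whole line must be exactly 'grape'.
whole=$(printf '%s\n' "$fruits" | grep '^grape$')

printf '%s\n' "$starts" "$ends" "$whole"
```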
################################
Samples
################################
These samples use extended regex syntax (grep -E).
Regex             Meaning
^a.*              Matches anything that starts with a
^a.*b$            Matches anything that starts with a and ends with b
^a.*[0-9]+.*b$    Matches anything that starts with a, has some digits in the middle and ends with b
^(l|b)oo          Matches anything that starts with l or b, followed by oo
[f-hF-H]$         The last character should be f to h (capital or small)
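Some of these samples can be run through grep -E as a sanity check. This is only a sketch: the input lines are invented, and the digit sample is written with [0-9] because grep -E does not understand \d.

```shell
# Hypothetical sample data.
lines=$(printf '%s\n' a1b ab axb look book took)

# Starts with 'a' and ends with 'b'.
a_to_b=$(printf '%s\n' "$lines" | grep -E '^a.*b$')

# Same, but with at least one digit somewhere in the middle.
a_digit_b=$(printf '%s\n' "$lines" | grep -E '^a.*[0-9]+.*b$')

# Starts with 'l' or 'b', followed by 'oo'.
l_or_b=$(printf '%s\n' "$lines" | grep -E '^(l|b)oo')

printf '%s\n' "$a_to_b" "$a_digit_b" "$l_or_b"
```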
################################
grep
################################
The grep command searches inside files and prints the lines that match.
$ grep p friends
payam
pedram
$
These are the most important switches:
switch   meaning
-c       just show the count of matching lines
-v       invert the search (show the lines that do not match)
-n       show line numbers
-l       show only file names
-i       case insensitive
$ grep p *
friends:payam
friends:pedram
what_I_have.txt:laptop 2
what_I_have.txt:pillow 5
what_I_have.txt:apple 2
$ grep p * -n
friends:12:payam
friends:15:pedram
what_I_have.txt:2:laptop 2
what_I_have.txt:3:pillow 5
what_I_have.txt:4:apple 2
$ grep p * -l
friends
what_I_have.txt
$ grep p * -c
friends:2
what_I_have.txt:3
$
It is very common to combine grep and find:
$ find . -type f -print0 | xargs -0 grep -c a | grep -v ali
This counts the lines containing a in every file under the current directory, then drops the output lines that contain ali.
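To see what this pipeline really prints, it can be run against a throwaway directory. The file names and contents below are assumptions made up for the sketch; sort is added only to make the output order predictable.

```shell
# Build a hypothetical directory tree in a temporary location.
dir=$(mktemp -d)
printf 'ali\npayam\n' > "$dir/friends"
printf 'banana\n'     > "$dir/notes.txt"
printf 'data\n'       > "$dir/ali.txt"
printf 'xyz\n'        > "$dir/other.txt"

# grep -c prints path:count for every file, including files with a count of 0.
# grep -v ali then removes the output lines containing "ali" (here: ./ali.txt:1).
result=$(cd "$dir" && find . -type f -print0 | xargs -0 grep -c a | grep -v ali | LC_ALL=C sort)
printf '%s\n' "$result"

rm -rf "$dir"
```

Note that other.txt still shows up with a count of 0, and that the second grep filters on the whole output line (file name included), not on the file contents.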
################################
extended grep
################################
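Extended regular expressions do not need the backslash escaping and are much easier to read. They can be used with the -E option or with the egrep command, which is equivalent to grep -E. The -E option is specified by POSIX; recent GNU grep releases mark egrep as obsolescent in favor of grep -E.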
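The difference in escaping can be seen with a single pattern. A sketch with invented input; the \+ form assumes GNU grep.

```shell
# Hypothetical sample data: note the line with a literal plus sign.
input=$(printf '%s\n' aab 'a+b' b)

# Basic regex: an unescaped + is just a literal plus character.
literal=$(printf '%s\n' "$input" | grep 'a+b')

# Basic regex with the GNU escape: one or more 'a' then 'b'.
escaped=$(printf '%s\n' "$input" | grep 'a\+b')

# Extended regex: + is a repetition operator without any backslash.
extended=$(printf '%s\n' "$input" | grep -E 'a+b')

printf '%s\n' "$literal" "$escaped" "$extended"
```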
################################
Fixed grep
################################
If you need to search for an exact string (and not have it interpreted as a regex), use grep -F or fgrep. For example, fgrep this$ won't treat the $ as the end of the line, so it will find this$that too.
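The this$ case can be reproduced as follows. A minimal sketch with made-up input lines, using grep -F as the equivalent of fgrep.

```shell
# Hypothetical sample data; single quotes keep the shell from touching the $.
input=$(printf '%s\n' 'this$that' 'not this' 'this')

# As a regex, $ anchors the match to the end of the line.
as_regex=$(printf '%s\n' "$input" | grep 'this$')

# As a fixed string, $ is an ordinary character.
as_fixed=$(printf '%s\n' "$input" | grep -F 'this$')

printf '%s\n' "$as_regex" "$as_fixed"
```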
################################
sed
################################
In previous lessons we saw simple sed usage. Here I have great news for you: sed understands regex! It is good to use the -r switch to tell sed that we are using extended regex (GNU sed also accepts -E for the same purpose).
$ sed -r "s/^(a|b)/STARTS WITH A OR B/" friends
STARTS WITH A OR Bmir
mina
jafar
STARTS WITH A OR Bita
STARTS WITH A OR Bli
hassan
Main switches:
switch   meaning
-r       use extended regex
-n       suppress the automatic output; add p after the pattern ( /something/p ) to print only the matching lines
$ sed -rn "/^(a|b)/p" friends
amir
bita
ali
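To show what the extended regex switch saves, the same selection can be written both ways. This is a sketch: the friends list is recreated from the output shown above, -E is used as the equivalent of -r, and the escaped \| form assumes GNU sed.

```shell
# Sample data recreated from the friends file used in this section.
friends=$(printf '%s\n' amir mina jafar bita ali hassan)

# Basic regex: the group and the alternation both need backslashes.
basic=$(printf '%s\n' "$friends" | sed -n '/^\(a\|b\)/p')

# Extended regex: the same pattern without the escaping.
extended=$(printf '%s\n' "$friends" | sed -En '/^(a|b)/p')

printf '%s\n' "$extended"
```

Both commands print amir, bita and ali.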