-
Notifications
You must be signed in to change notification settings - Fork 11
/
Copy pathoptimization.texi
328 lines (272 loc) · 9.75 KB
/
optimization.texi
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
@SECTION Source and Bytecode Optimization
The bytecode optimizer works on two levels: source-to-source
optimizing transformations and LAP-level peephole optimizations. It's
possible to control optimization level by setting the
@var{byte-optimize} variable (defined in @file{bytecomp.el}) to
@code{nil}, @code{t}, @code{source} or @code{byte}.
The optimizing framework itself resides in @file{byte-opt.el}.
The source-to-source optimizer entry point is the
@code{byte-optimize-form} function. It works on macro-expanded forms,
recursively looking for function calls and special forms that have
optimising functions defined for them.
The peephole optimizer (see @code{byte-optimize-lapcode}) function
works on LAP directly. It goes through the instruction sequence using
a window of 3 instructions to look for subsequences that can be
dropped or replaced with more efficient versions.
Both optimizers work until a fixpoint is reached, i.e. all the
transformations are applied until there are not further changes
possible.
@menu
* Source-to-source Transformations::
* Peephole Optimization::
* Adding an Optimizing Transformation::
* Constant Folding::
* Dead Code Elimination::
* Strength Reduction::
@end menu
@c TODO:
@c
@c @itemize
@c @item
@c constant folding
@c @itemize
@c @item
@c removal of unreachable code;
@c @item
@c detecting and replacing sequences of operations with an equivalent primitive
@c @item
@c removal of calls to side-effectless functions whose return-value is unused;
@c @item
@c compile-time evaluation of safe constant forms, such as
@c (@code{consp}, @code{nil}, and @code{(ash 1 6)}
@c @item
@c open-coding of literal lambdas;
@c @item
@c peephole optimization of emitted code;
@c @item
@c trivial functions are left uncompiled for speed.
@c @end itemize
@c @item
@c support for inline functions;
@c @item
@c compile-time evaluation of arbitrary expressions;
@c @item
@c compile-time warning messages for:
@c @itemize
@c @item
@c functions being redefined with incompatible arglists;
@c @item
@c functions being redefined as macros, or vice-versa;
@c @item
@c functions or macros defined multiple times in the same file;
@c @item
@c functions being called with the incorrect number of arguments;
@c @item
@c functions being called which are not defined globally, in the
@c file, or as autoloads;
@c @item
@c assignment and reference of undeclared free variables;
@c @item
@c various syntax errors;
@c @end itemize
@c @item
@c correct compilation of nested @code{defuns}, @code{defmacros},
@c @code{defvars} and @code{defsubsts};
@c @item
@c correct compilation of top-level uses of macros;
@c @item
@c the ability to generate a histogram of functions called.
@c @end itemize
@FirstSubsection Source-to-source Transformations
All source-to-source transformations are performed by the
@code{byte-optimize-form} function. It's possible to run
@code{byte-optimize-form} on any well-formed form. E.g.,
@code{(byte-optimize-form '(if t 1 2))} returns @code{1}.
Given a form the @code{byte-optimize-form} function performs
transformations in two ways: first it runs the
@code{byte-optimize-form-code-walker} function; then it checks if the
current form is both a function application and the function symbol
has a specialized optimizing function defined for it in the
@code{byte-optimizer} property.
@code{byte-optimize-form-code-walker} recognizes special forms
(@code{if}, @code{let}, @code{let*}, @code{lambda}, @code{closure},
...) and, having performed some sanity checks, recursively runs
@code{byte-optimize-form} on subforms. Here's an example
transformation of the @code{if} special form:
@example
;; sanity checks
(when (< (length form) 3)
(byte-compile-warn "too few arguments for `if'"))
;; simplify the form by optimizing subforms
(cons fn
(cons (byte-optimize-form (nth 1 form) nil)
(cons
(byte-optimize-form (nth 2 form) for-effect)
(byte-optimize-body (nthcdr 3 form) for-effect))))
@end example
The @code{byte-optimizer} symbol property is defined for most of the
built-in side-effect-free functions. E.g. it is defined for some of
the basic math functions in @file{byte-opt.el}:
@example
(put '+ 'byte-optimizer 'byte-optimize-plus)
(put '* 'byte-optimizer 'byte-optimize-multiply)
(put '- 'byte-optimizer 'byte-optimize-minus)
(put '/ 'byte-optimizer 'byte-optimize-divide)
(put 'max 'byte-optimizer 'byte-optimize-associative-math)
(put 'min 'byte-optimizer 'byte-optimize-associative-math)
@end example
This is how constant folding (@pxref{Constant Folding}) works in the
compiler. The same approach is used for dead code elimination
(@pxref{Dead Code Elimination}) in built-in forms:
@example
(put 'and 'byte-optimizer 'byte-optimize-and)
(put 'or 'byte-optimizer 'byte-optimize-or)
(put 'cond 'byte-optimizer 'byte-optimize-cond)
(put 'if 'byte-optimizer 'byte-optimize-if)
(put 'while 'byte-optimizer 'byte-optimize-while)
@end example
@FirstSubsection Peephole Optimization
Peephole optimization is implemented by the
@code{byte-optimize-lapcode} function. Patterns to be replaced are
represented as @code{cond} condition cases. The instruction window is
3 instructions long.
There are two groups of replacements: transformations applied until no
further work can be done and transformations that should only be
applied once.
Example transformation dropping side-effect-free instructions followed
by the @code{discard} (drop the top of the stack) instruction:
@example
;; <side-effect-free> pop --> <deleted>
;; ...including:
;; const-X pop --> <deleted>
;; varref-X pop --> <deleted>
;; dup pop --> <deleted>
((and ;; drop the top of the stack in the 2nd op?
(eq 'byte-discard (car lap1))
;; and the 1st op has no side effects?
(memq (car lap0) side-effect-free))
;; safely discard both instructions
;; make sure there will be one more pass
(setq keep-going t)
;; we'll need to correct stack alignment info
(setq tmp (aref byte-stack+-info (symbol-value (car lap0))))
;; next pass will start from the next instruction
(setq rest (cdr rest))
;; drop instructions depending on what should be on the stack
(cond ((= tmp 1)
(byte-compile-log-lap
" %s discard\t-->\t<deleted>" lap0)
(setq lap (delq lap0 (delq lap1 lap))))
((= tmp 0)
(byte-compile-log-lap
" %s discard\t-->\t<deleted> discard" lap0)
(setq lap (delq lap0 lap)))
((= tmp -1)
(byte-compile-log-lap
" %s discard\t-->\tdiscard discard" lap0)
(setcar lap0 'byte-discard)
(setcdr lap0 0))
((error "Optimizer error: too much on the
stack"))))
;; more transformations...
@end example
@FirstSubsection Adding an Optimizing Transformation
TODO
@Subsection Constant Folding
In cases were constants can be evaluated at compile time to come up
with simpler results, that is done.
@code{(defun constant-fold-eg() (+ 1 2))} generates:
@c ((lexical . t) (optimize . nil))
@example
PC Byte Instruction
0 192 constant[0] 1
1 193 constant[1] 2
2 92 plus
3 135 return
Constants Vector: [1 2]
@end example
while with optimization we get:
@c @code{(defun constant-fold-eg() (+ 1 2))} generates:
@c ((lexical . t) (optimize . t))
@example
PC Byte Instruction
0 192 constant[0] 3
1 135 return
Constants Vector: [3]
@end example
In
@uref{https://lists.gnu.org/archive/html/emacs-devel/2018-04/msg00018.html,
Floating-point constant folding in Emacs byte compile} the notion was
put forth that optimization has to be portable over improving
code. (The issue here was compiling Emacs with larger integers allowed
for larger possibiles of constant folding).
Although Emacs can be compiled with different for integers and floats
depending the setting of @code{--with-wide-int}, for portability,
Emacs will assume in bytecode the smaller value of integers and will
skip opportunities that would assume larger integers.
@Subsection Dead Code Elimination
If there is no way code can be reached, it is removed. This optimization
interacts with the previous optimization: constant folding.
With bytecode optimization and lexicals scoping off:
@c @code{(defun dead-code-or-eg(a) (or t a))} generates:
@c @c ((lexical . nil) (optimize . nil))
@example
@group
(defun dead-code-eg(a)
(or t a))
@end group
@end example
generates:
@example
@group
PC Byte Instruction
0 193 constant[1] t
1 134 goto-if-not-nil-else-pop [5]
5
0
4 8 varref[0] a
5 135 return
Constants Vector: [a t]
@end group
@end example
On the other hand, with bytecode-optimization we get:
@c @code{(defun dead-code-or-eg(a) (or t a))} generates:
@c ((lexical . t) (optimize . t))
@example
@group
PC Byte Instruction
0 192 constant[0] t
1 135 return
Constants Vector: [t]
@end group
@end example
@Subsection Strength Reduction
The optimizer can recognize when there is primative instructions that
implements an equivalent longer set of instructions.
For example without optimization:
@code{(defun strength-reduce-eg(a) (+ a 1))} generates:
@c ((lexical . nil) (optimize . nil))
@example
@group
PC Byte Instruction
0 8 varref[0] a
1 193 constant[1] 1
2 92 plus
3 135 return
Constants Vector: [a 1]
@end group
@end example
However with optimizaion
@code{(defun strength-reduce-opt-eg(a) (+ a 1))} generates:
@c ((lexical . nil) (optimize . t))
@example
@group
PC Byte Instruction
0 8 varref[0] a
1 84 add1
2 135 return
Constants Vector: [a]
@end group
@end example
Notice that the optimizer took advantage of the commutative property of addition
and treated @code{(+ a 1)} as the same thing as @code{(+ 1 a)}.