\chapter{Design space}
% ``technology transfer'' from PLT to TC
This chapter will attempt to locate technical computing within the large
design space of programming languages.
It will discuss how the priorities introduced in section~\ref{sec:tcproblem}
relate to type systems and dispatch systems.
% concludes with an intro to julia etc.
Our lofty goal is to preserve the flexibility and ease of popular high level
languages, while adding the expressiveness needed for performant generic
programming.
% it is important to bear in mind the many meanings of the word ``type''
% often people use it just when they want a different ``kind of thing'',
% and have not yet thought about how that relates to the means of
% expression available.
\section{A binding time dilemma}
\label{sec:bindingtimedilemma}
Consider the following fragment of a function called \texttt{factorize},
which numerically analyzes a matrix to discover structure that
can be exploited to solve problems efficiently:
\begin{singlespace}
\begin{lstlisting}[language=julia]
if istril(A)
    istriu(A) && return Diagonal(A)
    utri1 = true
    for j = 1:n-1, i = j+1:m
        if utri1 && A[i,j] != 0
            utri1 = i == j + 1
        end
    end
    utri1 && return Bidiagonal(diag(A), diag(A, -1), false)
    return LowerTriangular(A)
end
\end{lstlisting}
\end{singlespace}
\noindent
The code returns different structured matrix types based on the input
data.
% TODO a bit more about how it's used
This sort of pattern can be implemented in object-oriented languages
by defining an interface or base class (perhaps called \texttt{StructuredMatrix}),
and declaring \texttt{Diagonal}, \texttt{Bidiagonal}, and so on as
subclasses.
Unfortunately, there are several problems with this solution.
Class-based OO ties together variation and dynamic dispatch: to allow
a value's representation to vary, each use of it needs to perform an
indirect call.
A major reason to use structured matrix types like \texttt{Bidiagonal}
is performance, so we'd rather avoid the overhead of indirect calls.
%for operations like element access.
A second problem is that most code operating on \texttt{Bidiagonal} would
be implementing algorithms that exploit its representation.
These algorithms would have to be implemented as methods of
\texttt{Bidiagonal}.
However, this is not natural, since one cannot expect every function
that might be applied to bidiagonal matrices to be defined in
one place.
This kind of computing is function-oriented.
% other examples: CSC vs. CSR
Let's try a different abstraction mechanism: templates and overloading.
That would allow us to write code like this:
\begin{singlespace}
\begin{lstlisting}[language=julia]
if diagonal
    solve(Diagonal(M), x)
elseif bidiagonal
    solve(Bidiagonal(M), x)
end
\end{lstlisting}
\end{singlespace}
\noindent
Here, structured matrix types are used to immediately select an
implementation of \texttt{solve}.
But the part that selects a matrix type can't be abstracted away.
We would like to be able to write \texttt{solve(factorize(M), x)}.
% TODO: or tagged unions, but then you have to handle every case
%... no wonder people use prepackaged version of this functionality
Given these constraints, the traditional design of technical
computing systems makes sense: pre-generate a large number of
optimized routines for various cases, and be able to dispatch
to them at run time.
It's clear what's going on here: mathematics is dependently typed.
If a compiler could prove that a certain matrix were bidiagonal, or
symmetric, there would be no problem.
But knowing what a given object \emph{is} in a relevant sense can
require arbitrary proofs, and in research computing these proofs
might refer to objects that are not yet well understood.
% In numerical computing, if \emph{anybody} were able to prove the
% relevant property, there would be no point in running the program.
Most languages offer a strict divide between early and late binding,
separated by what the type system can statically prove.
The less the type system models the domain of interest, the more it
gets in the way.
% talk about the premature optimization of assuming
% (generate code for) = (known at compile time)
The compiler's ability to generate specialized code is also coupled
to the static type system, preventing it from being reused for domains
the type system does not handle well.
In general, of course, lifting this restriction implies a need to
generate code at run time, which is not always acceptable.
However, run time code generation might not be necessary in every case.
A compiler could, for example, automatically generate a range of
specialized routines and select among them at run time, falling back
to slower generic code if necessary.
Consider the trade-off this entails.
The programmer is liberated from worrying about the binding time of
different language constructs.
With a JIT compiler, performance will probably be good.
But without JIT, performance will be unpredictable, with no clear
syntactic rules for what is expected to be fast.
So it is not surprising that software engineering languages shun this
trade-off.
However, we believe it is the trade-off technical computing
users want by default, and it has not generally been available
in any language.
Designing for this trade-off requires a change of perspective
within the compiler.
Most compilers focus on, and make decisions according to, the
language's source-level type system.
Secondarily, they will perform flow analysis to discover
properties not captured by the type system.
Given our goals, we need to invert this relationship.
The focus needs to be on understanding program behavior
\emph{as thoroughly as possible}, since no one set of rules
will cover all cases.
Usually changing a language's type system is a momentous event,
but in the analysis-focused perspective the compiler can evolve
rapidly, with ``as thoroughly as possible'' handling more cases
all the time.
This approach is still compatible with static typing, although we will not
discuss it much.
We could pick a well-defined subset of the system,
or a separate type system, to give errors about.
The notion of ``as thorough as possible'' analysis is formalized
by domain theory.
%Instead of a split between static and dynamic resolution, we can
%instead focus on program \emph{analysis}.
%The goal is to understand a program as well as possible.
%2 problems?
% 1 - flexibility implies late binding, slowness. e.g. CSR/CSC as classes
% 2 - given a simple type system, programs will not match the lattice
% and analysis diverges
% issues:
% - how programs make choices
% - how compiler-understandable are those choices
% - when do you get specialized code
\section{Domain theory}
In the late 1960s Dana Scott and Christopher Strachey asked how to assign
meanings to programs, which otherwise just appear to be lists of symbols
\cite{scott1971toward}.
For example, given a program computing the factorial function, we
want a process by which we can assign the meaning ``factorial'' to it.
This led to the invention of domain theory, which can be interpreted
as modeling the behavior of a program without running it.
A ``domain'' is essentially a partial order of sets of values that a
program might manipulate.
The model works as follows: a program starts with no
information, the lowest point in the partial order (``bottom'').
Computation steps accumulate information, gradually moving higher through
the order.
This model provides a way to think about the meaning of a program without
running the program.
Even the ``result'' of a non-terminating program has a representation
(the bottom element).
%Other elements of the partial order might refer to intermediate results.
The idea of ``running a program without running it'' is of great interest
in compiler implementation.
A compiler would like to discover as much information as possible about a
program without running it, since running it might take a long time, or
produce side effects that the programmer does not want yet, or, of course,
not terminate.
The general recipe for doing this is to design a partial order (lattice)
that captures program properties of interest, and then describe all
primitive operations in the language based on how they transform
elements of this lattice.
Then an input program can be executed, in an approximate sense, until
it converges to a fixed point.
%For example, given a program that outputs an integer, we might decide
%that we only care whether this integer is even or odd.
%Then our posets are the even and odd integers, and we will classify operations
%in the program according to whether they are evenness-preserving,
%evenness-inverting, always even, always odd, of uncertain evenness,
%etc.
Most modern compilers use a technique like this
(sometimes called abstract interpretation \cite{abstractinterp})
to semi-decide
interesting properties like whether a variable is a constant, whether
a variable might be used before it is initialized, whether a variable's
value is never used, and so on.
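To make this recipe concrete, here is a minimal sketch (illustrative only,
not drawn from any particular compiler) of an abstract domain for integer
parity, using the lattice $\bot < \{\texttt{even},\texttt{odd}\} < \top$ and
a transfer function describing how addition acts on abstract values:
\begin{singlespace}
\begin{lstlisting}[language=julia]
# Abstract values: :bottom (no information), :even, :odd, :top (unknown).
function lub(a, b)              # least upper bound (lattice join)
    a == b       && return a
    a == :bottom && return b
    b == :bottom && return a
    return :top
end

# Transfer function: how `+` acts on abstract values.
function abstract_add(a, b)
    (a == :bottom || b == :bottom) && return :bottom
    (a == :top    || b == :top)    && return :top
    return a == b ? :even : :odd   # even+even = odd+odd = even
end

# "Run" a few steps abstractly, starting from a value known to be odd:
x = :odd
y = abstract_add(x, x)     # :even
z = abstract_add(y, :odd)  # :odd
w = lub(z, :even)          # :top -- joining the results of two branches
\end{lstlisting}
\end{singlespace}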
%Domain theory gave rise to the study of denotational semantics and the
%design of type systems. However, the original theory is quite general
%and invites us to invent any domains we wish for any sort of language.
%Abstract interpretation \cite{abstractinterp} is an especially
%elegant and general implementation of this idea.
%It should be clear that this sort of analysis, while clearly related
%to type systems, is fairly different from what most programmers
%think of as type \emph{checking}.
Given the general undecidability of questions about programs, analyses
like these are \emph{conservative}.
If our goal is to warn programmers about use of uninitialized
variables, we want to be ``better safe than sorry'' and print a warning
if such a use \emph{might} occur.
Such a use corresponds to any lattice element greater than or equal to
the element representing ``uninitialized''.
Such conservative analyses are the essence of compiler transformations
for performance (optimizations): we only want to perform an optimization
if the program property it relies on holds for sure.
% and not if there is any uncertainty.
The generality of this approach
allows us to discover a large variety of program properties as long as we are
willing to accept some uncertainty.
%Of course, many programmers and language designers prefer to maximize
%safety, leading to different approaches that trade away some precision
%(e.g. syntactic type systems such as those in the ML language
%family \cite{hindley1969principal, MLtypeinf}).
Even if static guarantees are not the priority, or if a language considers
itself unconcerned with ``types'', the domain-theoretic model is
still our type system in some sense, whether we like it or not
(as implied by \cite{scott1976data}).
%- make it easier to ``follow the lattice''
% analyses don't work as well with programs not written to be
% ``type conscious''.
% the solution is just to make it easier to write type conscious programs.
% this doesn't require any restrictions, just a stylistic change.
% you would think if types are also just data values there would be no
% gain, but there is.
% there are run time things, and just flow-sensitive things
\subsection{Programs and domains}
\label{sec:programsanddomains}
% - checking isbidiag() vs. having a type
% - checking every array elt for integer - maybe for ^ function
Program analyses of this kind have been applied to high-level
languages many times.
A common pitfall is that the analysis can easily \emph{diverge}
for programs that make distinctions not modeled by the lattice.
Consider the following program that repeatedly applies elementwise
exponentiation (\texttt{.\^}) to an array:
% TODO: what should `f` be?
% maybe A = [sum(A.^A)]
\begin{singlespace}
\begin{lstlisting}[language=julia]
A = [n]
for i = 1:niter
    A = f(A .^ A)
end
return A
\end{lstlisting}
\end{singlespace}
\noindent
We will try to analyze this program using the lattice in figure~\ref{fig:arraylattice}.
% TODO maybe add Union(Array{Int},Array{BigInt}) to this
\begin{figure}[!t]
\begin{center}
\begin{tikzpicture}[node distance=2cm]
\node(top) {$\top$};
\node(Array) [below=0.5cm of top] {\texttt{Array}};
\node(ArrayInteger) [below=0.5cm of Array] {$\exists\ T<:\texttt{Integer}\ \texttt{Array\{T\}}$};
\node(ArrayInt) [below left=1cm of ArrayInteger] {\texttt{Array\{Int\}}};
\node(ArrayBigInt) [below right=1cm of ArrayInteger] {\texttt{Array\{BigInt\}}};
\node(ArrayFloat) [right=1.25cm of ArrayBigInt] {\texttt{Array\{Float\}}};
\node(bot) [below=4.5cm of top] {$\bot$};
\draw(top) -- (Array);
\draw(Array) -- (ArrayInteger);
\draw(ArrayInteger) -- (ArrayInt);
\draw(ArrayInteger) -- (ArrayBigInt);
\draw(Array) -- (ArrayFloat);
\draw(ArrayInt) -- (bot);
\draw(ArrayBigInt) -- (bot);
\draw(ArrayFloat) -- (bot);
\end{tikzpicture}
\end{center}
\caption{
A lattice of array types
}
\label{fig:arraylattice}
\end{figure}
The result depends strongly on how the \texttt{.\^} and \texttt{\^} functions
are defined (assume that \texttt{.\^} calls \texttt{\^} on every element of an array).
The code might look like this:
\vspace{-4ex}
\begin{singlespace}
\begin{multicols}{2}
\begin{lstlisting}[language=julia]
function ^(x, y)
    if trunc(y) == y
        if overflows(x, y)
            x = widen(x)
        end
        # use repeated squaring
    else
        # floating point algorithm
    end
end
function f(a)
    if all(trunc(a) .== a)
        # integer case
    else
        # general case
    end
end
\end{lstlisting}
\end{multicols}
\end{singlespace}
\noindent
This code implements the user-friendly behavior of automatically switching to the
\texttt{BigInt} type on overflow, by calling \texttt{widen}.
Assume we initially know that \texttt{n} is an \texttt{Int}, and therefore
that \texttt{A} is an \texttt{Array\{Int\}}.
Although \texttt{A} really does have this type, the code does not mention
types anywhere.
Next, the type of an element taken from \texttt{A} (\texttt{Int}) will
flow to the \texttt{y} argument of \texttt{\^}.
The function's behavior crucially depends on the test \texttt{trunc(y) == y}.
It is always true for integer arguments, but it is unlikely that the
analysis can determine this.
Therefore \texttt{\^} might return an \texttt{Int}, \texttt{BigInt}, or
\texttt{Float}, and we can only conclude that \texttt{A.\^{}A} is an
\texttt{Array}.
When function \texttt{f} is called, it performs a fairly expensive test
to see if every element is an integer.
Only through human reasoning about the whole program do we see that
this test is unnecessary.
However, when the analysis considers \texttt{f} applied to type
\texttt{Array}, our type information is likely to diverge further to
\texttt{$\top$}, since we don't know anything about the array's elements.
The problem is that the program is not written in terms of the
underlying value domain, even though it could be.
We might have written \texttt{isa(y,Integer)} instead of \texttt{trunc(y) == y}.
However, there are reasons that code like this might exist.
The programmer might not be aware of the type system, or the conditional
might refer to a property that was not originally reflected in the type system
but is later, as the language and libraries evolve.
Adding type declarations is not an ideal solution since it can restrict
polymorphism, and the programmer might not know the type of \texttt{A.\^{}A}
or even the type of \texttt{n}.
\iffalse
How can we fix this?
One solution is to add type annotations.
But the author of the original code does not know the type of
\texttt{A.\^{}A}, and might not even know the type of \texttt{n}.
%%%%% is this accurate?
Another solution is to improve the analysis.
But we will never finish adding cases to the compiler.
Perhaps we can handle \texttt{trunc(y) == y}, but will we also be
able to understand \texttt{trunc(y) == 1*y}?
\fi
Our solution is to give some simple tools to the library developer
(who, of course, is not necessarily a different person).
In Julia, one can implement \texttt{\^} and \texttt{f} as follows:
\begin{singlespace}
\begin{lstlisting}[language=julia]
function ^(x, y)
    # original code
end
function ^(x, y::Integer)
    if overflows(x, y)
        x = widen(x)
    end
    # ...
end
function f(a)
    # original code
end
function f{T<:Integer}(a::Array{T})
    # integer case
end
\end{lstlisting}
\end{singlespace}
\noindent
The syntax \texttt{y::Integer} for a function argument is a dispatch specification.
Now, from the same source program, we can conclude that only
\texttt{\^{}(x, y::Integer)} is applicable, so we know that the
result of \texttt{A.\^{}A} is some integer array.
The library writer can intercept this case with the definition
\texttt{f\{T<:Integer\}(a::Array\{T\})}\footnote{
Although the signature of this method happens to match one of the
lattice elements in figure~\ref{fig:arraylattice}, this is a coincidence.
Method applicability is determined dynamically.
}, avoiding the check \texttt{all(trunc(a) .== a)}.
Since we started with an integer array, the whole program behaves
exactly as before.
We have neither added restrictions, nor asked for redundant type annotations.
All we did is add a dispatch system, and encourage people to use it.
Using dispatch in this way is optional, but comes with benefits for
structuring programs.
For example, the function \texttt{f} above might perform much better
on integer arrays.
We can implement this case separately from the checks and extra logic
that might be needed for other data types, leading to cleaner code
and also making the core definition of \texttt{f} smaller and therefore
more likely to be inlined.
% TODO something about why we use an extensibility mechanism for collecting
% type info.
% 1 - the extra cases can be added by somebody else
% 2 - a library can pick a type to return to user code, and later
% ``intercept'' operations on it
The original version of this code can also be written in our system,
though it probably will not perform as well.
However, it might still perform better than one would expect, since the
functions that it calls are in turn defined using the same dispatch
mechanism.
This mechanism is expressive enough to be used down to the lowest levels of
the system, providing leverage one would not get from an object system
or performance add-on.
% reverse flow
Having type information attached to user or library definitions
increases the value of \emph{reverse} data flow analysis.
One way to describe why analyses of highly dynamic code diverge
is that everything in the program monotonically increases the
number of cases we need to consider; there is no way to
narrow possibilities.
But if we know that a function is only defined on type \texttt{T},
every value that passes through its argument list is narrowed to
\texttt{T}.
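For example (a minimal sketch, with \texttt{halve} and \texttt{g} as
hypothetical names), suppose the only method of \texttt{halve} accepts an
\texttt{Int}:
\begin{singlespace}
\begin{lstlisting}[language=julia]
halve(x::Int) = x >> 1   # the only definition of halve

function g(v)
    y = halve(v)   # if execution continues past this call, v was an Int,
                   # so v can be narrowed to Int in the code below
    return v + y
end
\end{lstlisting}
\end{singlespace}
\noindent
Even with no prior information about \texttt{v}, an analysis that propagates
information backwards from the call to \texttt{halve} can conclude that the
final addition operates on two \texttt{Int}s.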
% there seems to be some psychology to this: ``write method definitions''
% is somehow an easier performance model than ``use conditions only involving
% type checks''
% instead of requiring that the compiler be able to resolve things, we
% just want to make it more likely.
% isa(x,Int) is simple enough, but for more complicated checks it becomes
% much harder to predict. e.g. imagine testing for everything being the
% same type.
% there has to be a split between what is specialized on and what is not.
% in c++ this requires switching to templates.
% next we will consider how this dispatch system fits in
\section{Dispatch systems}
It would be unpleasant if every piece of every program we wrote were forced
to do only one specific task.
Every time we wanted to do something slightly different, we'd have to write
a different program.
But if a language allows the same program element to do different things at
different times, we can write whole classes of programs at once.
This kind of capability is one of the main reasons object-oriented programming
is popular: it provides a way to automatically select different behaviors
according to some structured criteria
(we use the non-standard term ``criteria'' deliberately, in order
to clarify our point of view, which is independent of any particular
object system).
However, in class-based OO there is essentially \emph{no way} to add a new
operation that dispatches on existing types without modifying their definitions.
This has been called ``the expression problem''~\cite{wadler1998expression}.
While many kinds of object-oriented programs can ignore or work around
this problem, technical programs cannot.
In this domain most programs deal with the same
few types (e.g.\ numbers and arrays), and might sensibly want to write new
operations that dispatch on them.
% The loss of encapsulation due to multimethods weighed in \cite{binarymethods}
% is less of a problem for technical computing, and in some cases even
% advantageous.
%Somewhat unfortunately, the term \emph{object-oriented} has many
%connotations, and the \emph{object-oriented} methodology tries to address
%multiple software engineering problems --- for example modularity,
%encapsulation, implementation hiding, and code reuse. These issues are
%important, but it may be because of them that over time too little
%emphasis has been placed on expressive power.
\subsection{A hierarchy of dispatch}
%The sophistication of the available ``selection criteria'' account for a
%large part of the perceived ``power'' or leverage provided by a language.
It is possible to illustrate a hierarchy of such mechanisms.
As an example, consider a simple simulation, and how it can be written
under a series of increasingly powerful paradigms. First, written-out
imperative code:
\vspace{-3ex}
\begin{singlespace}
\begin{verbatim}
while running
    for a in animals
        b = nearby_animal(a)
        if a isa Rabbit
            if b isa Wolf then run(a)
            if b isa Rabbit then mate(a,b)
        else if a isa Wolf
            if b isa Rabbit then eat(a,b)
            if b isa Wolf then follow(a,b)
        end
    end
end
\end{verbatim}
\end{singlespace}
We can see how this would get tedious as we add more kinds of animals
and more behaviors.
Another problem is that the animal behavior is
implemented directly inside the control loop, so it is hard to see
what parts are simulation control logic and what parts are animal
behavior.
Adding a simple object system leads to a nicer implementation\footnote{A
perennial problem with simple examples is that better implementations often
make the code longer.}:
\vspace{-3ex}
\begin{singlespace}
\begin{verbatim}
class Rabbit
    method encounter(b)
        if b isa Wolf then run()
        if b isa Rabbit then mate(b)
    end
end
class Wolf
    method encounter(b)
        if b isa Rabbit then eat(b)
        if b isa Wolf then follow(b)
    end
end
while running
    for a in animals
        b = nearby_animal(a)
        a.encounter(b)
    end
end
\end{verbatim}
\end{singlespace}
Here all of the simulation's animal behavior has been
compressed into a single program point: \texttt{a.encounter(b)}
leads to all of the behavior by selecting an implementation based
on the first argument, \texttt{a}.
This kind of criterion is essentially indexed lookup; we can imagine
that \texttt{a} could be an integer index into a table of operations.
The next enhancement to ``selection criteria'' adds a hierarchy
of behaviors, to provide further opportunities to avoid repetition.
Here \texttt{A<:B} is used to declare a subclass relationship; it
says that an \texttt{A} is a kind of \texttt{B}:
\vspace{-3ex}
\begin{singlespace}
\begin{multicols}{2}
\begin{verbatim}
abstract class Animal
    method nearby()
        # search within some radius
    end
end
class Rabbit <: Animal
    method encounter(b::Animal)
        if b isa Wolf then run()
        if b isa Rabbit then mate(b)
    end
end
class Wolf <: Animal
    method encounter(b::Animal)
        if b isa Rabbit then eat(b)
        if b isa Wolf then follow(b)
    end
end
while running
    for a in animals
        b = a.nearby()
        a.encounter(b)
    end
end
\end{verbatim}
\end{multicols}
\end{singlespace}
We are still essentially doing table lookup, but the tables have
more structure: every \texttt{Animal} has the \texttt{nearby}
method, and can inherit a general purpose implementation.
This brings us roughly to the level of most popular object-oriented
languages.
But still more can be done.
Notice that in the first transformation we replaced one level of \texttt{if}
statements with method lookup.
However, inside of these methods a structured set of \texttt{if} statements
remains.
We can replace these by adding another level of dispatch.
\vspace{-3ex}
\begin{singlespace}
\begin{verbatim}
class Rabbit <: Animal
    method encounter(b::Wolf) = run()
    method encounter(b::Rabbit) = mate(b)
end
class Wolf <: Animal
    method encounter(b::Rabbit) = eat(b)
    method encounter(b::Wolf) = follow(b)
end
\end{verbatim}
\end{singlespace}
We now have a \emph{double dispatch} system, where a method call
uses two lookups, first on the first argument and then on the
second argument.
This syntax might be considered a bit nicer, but the design
raises a question: why is $n=2$ special?
It isn't, and we could consider even more method arguments as part of
dispatch.
But at that point, why is the first argument special?
Why separate methods in a special way based on the first argument?
It seems arbitrary, and indeed we can remove the special treatment:
\vspace{-3ex}
\begin{singlespace}
\begin{verbatim}
abstract class Animal
end
class Rabbit <: Animal
end
class Wolf <: Animal
end
nearby(a::Animal) = # search
encounter(a::Rabbit, b::Wolf) = run(a)
encounter(a::Rabbit, b::Rabbit) = mate(a,b)
encounter(a::Wolf, b::Rabbit) = eat(a, b)
encounter(a::Wolf, b::Wolf) = follow(a, b)
while running
    for a in animals
        b = nearby(a)
        encounter(a, b)
    end
end
\end{verbatim}
\end{singlespace}
Here we made two major changes: the methods have been moved ``outside''
of any classes, and all arguments are listed explicitly.
This is sometimes called \emph{external dispatch}.
This change has significant implications.
Since methods no longer need to be ``inside'' classes, there is no syntactic
limit to where definitions may appear.
Now it is easier to add new methods after a class has been defined.
Methods also now naturally operate on combinations of objects, not single objects.
%There may be software engineering reasons to want ``ownership'' of methods
%by objects, but strictly speaking this coupling does not seem correct.
%It ought to be possible to define function behaviors independently of
%data hiding concerns.
The shift to thinking about combinations of objects is fairly revolutionary.
Many interesting properties only apply to combinations of objects, and not
individuals.
We are also now free to think of more exotic kinds of combinations.
We can define a method for \emph{any number} of objects:
\begin{verbatim}
encounter(ws::Wolf...) = pack(ws)
\end{verbatim}
\noindent
We can also abstract over more subtle properties, like whether the
arguments are two animals of the same type:
\begin{verbatim}
encounter{T<:Animal}(a::T, b::T) = mate(a, b)
\end{verbatim}
\noindent
Some systems push dispatch expressiveness even further.
% TODO more
\subsection{Predicate dispatch}
%Patterns are very powerful, but the tradeoff is that there is not
%necessarily a useful relationship between what your program does and
%what a static analysis (based on a finite-height partial order over
%patterns) can discover. Maybe julia could be considered a sweet spot
%somewhere in between.
Predicate dispatch is a powerful object-oriented mechanism that allows
methods to be selected based on arbitrary predicates \cite{ErnstKC98}.
It is, in some sense, the most powerful \emph{possible} dispatch system,
since any computation may be done as part of method selection.
Since a predicate denotes a set (the set of values for which it is true),
it also denotes a set-theoretic type.
Some type systems of this kind, notably that of Common
Lisp~\cite{steele1990common:types}, have actually included predicates as types.
However, such a type system is obviously undecidable, since it
requires computing the predicates themselves or, even worse, computing
predicate implication.\footnote{
Many type systems involving bounded quantification, such as system $F_{<:}$,
are already undecidable \cite{Pierce1994131}.
However, they seem to terminate for most practical programs, and also admit
minor variations that yield decidable systems \cite{Castagna:1994:DBQ:174675.177844}.
It is fair to say they are ``just barely'' undecidable, while predicates
are ``very'' undecidable.
}
For a language that is willing to do run time type checks anyway, the
undecidability of predicate dispatch is not a problem.
Interestingly, it can also pose no problem for \emph{static} type systems
that wish to prove that every call site has an applicable method.
Even without evaluating predicates, one can prove that the available methods
are exhaustive (e.g.\ methods for both $p$ and $\neg p$ exist).
In contrast, and most relevant to this thesis, predicate types \emph{are} a
problem for code \emph{specialization}.
Static method lookup would require evaluating the predicates, and optimal code
generation would require understanding something about what the predicates mean.
One approach would be to include a list of satisfied predicates in a type.
However, to the extent such a system is decidable, it is essentially equivalent
to multiple inheritance.
Another approach would be to separate predicates into a second ``level'' of the
type system.
The compiler would combine methods with the same ``first level'' type, and then
generate branches to evaluate the predicates.
Such a system would be useful, and could be
combined with a language like Julia or, indeed, most object-oriented
languages (this has been done for Java~\cite{Millstein:2009:EMP:1462166.1462168}).
However, this comes at the expense of making predicates second-class
citizens of the type system.
In considering the problems of predicate dispatch for code specialization,
we seem to be up against a fundamental obstacle: some sets of values are
simply more robust under evaluation than others.
Programs that map integers to integers abound, but programs that map, say,
even integers to even integers are rare to the point of irrelevance.
With predicate dispatch, the first version of the code in
section~\ref{sec:programsanddomains} could have been rearranged to use
dispatch instead of \texttt{if} statements.
This might have advantages for readability and extensibility, but not for
performance.
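For concreteness, such a rearrangement might look like the following
(hypothetical syntax, not valid Julia; the guard after \texttt{when} is an
arbitrary predicate evaluated during method selection):
\vspace{-3ex}
\begin{singlespace}
\begin{verbatim}
^(x, y) when trunc(y) == y       = # repeated squaring, widening on overflow
^(x, y)                          = # floating point algorithm
f(a)    when all(trunc(a) .== a) = # integer case
f(a)                             = # general case
\end{verbatim}
\end{singlespace}
Each branch of the original \texttt{if} becomes a separate method, but
selecting among these methods still requires evaluating the same predicates
at run time, so nothing is gained for specialization.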
\subsection{Symbolic programming}
Systems based on symbolic rewrite rules arguably occupy a further tier of
dispatch sophistication.
In these systems, you can dispatch on essentially anything, including arbitrary
values and structures.
Depending on details, their power is roughly equal to that of predicate
dispatch.
%These systems are typically powerful enough to concisely define the kinds of
%behaviors we are interested in.
However, symbolic programming lacks data abstraction: the concrete
representations of values are exposed to the dispatch system.
In such a system, there is no difference between being a list and being
something \emph{represented} as a list.
If the representation of a value changes, the value can be inadvertently
``captured'' by a dispatch rule that was not intended to apply to it,
violating abstraction.
% we use 2-part values instead; symbolic part and data part effectively
There has always been a divide between ``numeric'' and ``symbolic''
languages in the world of technical computing.
To many people the distinction is fundamental, and we should happily live
with both kinds of languages.
But if one insists on an answer as to which approach is the right one,
then the answer is: symbolic.
Programs are ultimately symbolic artifacts.
Computation is limited only by our ability to describe it, and
symbolic tools are needed to generate, manipulate, and query these
descriptions.
For example in numerical computing, a successful approach has been to
create high-performance kernels for important problems.
From there, the limiting factor is knowing \emph{when} to use each
kernel, which can depend on many factors from problem structure to
data size.
Symbolic approaches can be used to automate this.
We will see some examples of this in chapter~\ref{chap:casestudies}.
\subsection{Choices, choices}
\label{sec:choices}
\newcommand{\chk}{{\Large \checkmark}}
\begin{table}[!t]
\begin{center}
\begin{tabular}{|c||c|c|c|c|c|c|}
\hline
& Domain & Dynamic & Spec. & Pattern & S.T.S. & S.C. \\
\hline
\hline
Methods & $O(1)$ & & & & \chk & \chk \\
\hline
Virtual methods & $O(1)$ & \chk & & & \chk & \chk \\
\hline
Overloading & $O(n)$ & & & & \chk & \chk \\
\hline
Templates & $O(n)$ & & \chk & & \chk & \\
\hline
Closures & $O(1)$ & \chk & & & \chk & \chk \\
\hline
Duck typing & $O(1)$ & \chk & & & & \chk \\
\hline
Multimethods & $O(n)$ & \chk & & & ? & ? \\
\hline
Predicate dispatch & $O(n m)$ & \chk & & \small{1} & ? & ? \\
\hline
Typeclasses & $O(m)$ & \small{2} & \small{3} & & \chk & \chk \\
\hline
Term rewriting & $O(n m)$ & \chk & & \chk & & \\
\hline
Julia & $O(n m)$ & \chk & \chk & & ? & \\
\hline
\end{tabular}
\end{center}
\begin{singlespace}
\caption[Attributes of code selection features]{
\small{
Attributes of several code selection features.
Spec.\ stands for specialization.
S.T.S.\ stands for statically type safe.
S.C.\ stands for separate compilation.
1.\ Depending on design details, 2.\ When combined with existential types,
3.\ Optionally, with declarations.
% no, because it can only pattern match on types
%4.\ With the \texttt{FlexibleInstances} option.
}
}
\label{table:dispatch}
\end{singlespace}
\end{table}
% message of the table: you have to know a really large amount to
% pick which one of these to use. there is no best one.
% ``domain'' languages are all about avoiding knowledge of this table.
Table~\ref{table:dispatch} compares 11 language features.
Each provides some sort of control flow indirection, packaged into a
structure designed to facilitate reasoning (ideally human reasoning,
but often the compiler's reasoning is prioritized).
The ``domain'' column describes the amount of information considered
by the dispatch system, where $n$ is the number of arguments and
$m$ is the amount of relevant information per argument.
$O(1)$ means each use of the feature considers basically the same
amount of information.
$O(n)$ extends the process to every argument.
$O(m)$ means that one value is considered, but its structure is
taken into account.
Squares with question marks are either not fully understood, or too
sensitive to design details to answer definitively.
%% it may be that the ``power'' of a language is measured by the complexity
%% of the criteria used by the language's run time dispatch mechanisms.
This table illustrates how many combinations of dispatch semantics have
been tried.
Keeping track of these distinctions is a distraction when one is focused
on a non-programming problem.
Including more than one row of this table makes a language especially
complex and difficult to learn.
% compare to julia tradeoffs
\section{Subtyping}
\label{sec:chap3subtyping}
So far we have seen some reasons why dispatch contributes significantly
to flexibility and performance.
However we have only actually dispatched on fairly simple properties like
whether a value is an integer.
How powerful should dispatch be, exactly?
Each method signature describes some set of values to which it applies.
Some signatures might be more specific than others.
The theory governing such properties is subtyping.
% TODO
% informal convexity property. ``any # of integers'' is ok, but
% ``any # of integers except 3'' is not.
%% reflect on level of power: this dispatch system is both more and less
%% powerful than previous ones in various ways.
% normally this is used for type safety
% in our case it is used to form an ``analyzable subset'' of the language
It turns out that a lot of relevant work on this has been done in the
context of type systems for XML~\cite{hosoya2000xduce, BCF03}.
%Something about semantic subtyping and type systems for processing XML.
XML at first seems unrelated to numerical computing, and indeed it
was quite a while before we discovered these papers and noticed the
overlap.
However if one substitutes ``symbolic expression'' for ``XML document'',
the similarity becomes clearer.
In XML processing, programs match documents against patterns in order
to extract information of interest or validate document structure.
These patterns resemble regular expressions, and so also denote sets.
%and admit a subset (subtype) relation.
In our context, some argument lists and the properties of some data
structures are sufficiently complex to warrant such a treatment.
For example, consider a \texttt{SubArray} type that describes a
selection or ``view'' of part of another array.
Here is part of its definition in Julia's standard library:
\begin{singlespace}
\begin{lstlisting}[language=julia]
const ViewIndex = Union(Int, Colon, Range{Int}, UnitRange{Int},
                        Array{Int,1})
immutable SubArray{T, N, P<:AbstractArray,
                   I<:Tuple{ViewIndex...}} <: AbstractArray{T,N}
\end{lstlisting}
\end{singlespace}
\noindent
The properties of the \texttt{SubArray} type are declared within curly
braces.
A \texttt{SubArray} has an element type, number of dimensions,
underlying storage type, and a tuple of indexes (one index per dimension).
Limiting indexes to the types specified by \texttt{ViewIndex}
documents what kind of indexes can be supported efficiently.
Different index tuples can yield drastically different performance
characteristics.
Without the ability to describe these properties at the type level,
it would be difficult to implement an efficient \texttt{SubArray}.
In section~\ref{sec:programsanddomains} we only needed to test fairly
simple conditions, but the checks here would involve looping over
indexes to determine which dimensions to drop, or to determine whether
stride-1 algorithms can be used, and so on.
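With the index tuple reflected in the type, such questions can instead be
answered by dispatch.
The following is a minimal sketch (a hypothetical helper, not part of the
standard library) written against the \texttt{SubArray} definition above:
\begin{singlespace}
\begin{lstlisting}[language=julia]
# Sketch only: a view whose first index is a unit range and whose remaining
# indexes are scalars is contiguous along its first dimension (assuming a
# dense parent), so a stride-1 algorithm applies.
isstride1{T,N,P,I<:Tuple{UnitRange{Int},Vararg{Int}}}(
    v::SubArray{T,N,P,I}) = true
isstride1(v::SubArray) = false
\end{lstlisting}
\end{singlespace}
\noindent
An optimized method can be written against the first signature and a generic
fallback against the second, with no run time loop over the indexes needed to
choose between them.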
% TODO more
% our language is in many ways dual to ML. that family shuns subtyping,
% but in the lattice theoretic model it's inescapable.
%\cite{hindley1969principal, MLtypeinf}
\section{Specialization}
\subsection{Parametric vs.\ ad hoc polymorphism}
The term \emph{polymorphism} refers generally to reuse of code or data
structures in different contexts.
A further distinction is often made between \emph{parametric} polymorphism
and \emph{ad hoc} polymorphism.
Parametric polymorphism refers to code that is reusable for many types
because its behavior does not depend on argument types (for example,
the identity function).
%reusing the \emph{same} code for different purposes, while
Ad hoc polymorphism refers to selecting
\emph{different} code for different circumstances.
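For example (a minimal sketch; the function names are illustrative only):
\begin{singlespace}
\begin{lstlisting}[language=julia]
# Parametric polymorphism: one definition whose behavior does not
# depend on the type of its argument.
ident(x) = x

# Ad hoc polymorphism: different code is selected for different
# argument types.
describe(x::Integer) = "an integer"
describe(x::Real)    = "some other kind of real number"
describe(x)          = "something else entirely"
\end{lstlisting}
\end{singlespace}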