\chapter{Case studies}
\label{chap:casestudies}
This chapter discusses several examples that illustrate the effectiveness of
Julia's abstractions for technical computing.
The first three sections provide an ``explication through elimination'' of
core features (numbers and arrays) that have usually been built in to
technical computing environments.
The remaining sections introduce some libraries and real-world applications
that benefit from Julia's design.
\section{Conversion and comparison}
\label{sec:conversion}
Type conversion provides a classic example of a binary method.
Multiple dispatch allows us to avoid deciding whether the converted-to type or the
converted-from type is responsible for defining the operation.
Defining a specific conversion is straightforward, and might look like this
in Julia syntax:
\begin{verbatim}
convert(::Type{Float64}, x::Int32) = ...
\end{verbatim}
\noindent
A call to this method would look like \texttt{convert(Float64, 1)}.
Using conversion in generic code requires more sophisticated definitions.
For example we might need to convert one value to the type of another,
by writing \texttt{convert(typeof(y), x)}.
What set of definitions must exist to make that call work in all reasonable cases?
Clearly we don't want to write all $O(n^2)$ possibilities.
We need abstract definitions that cover many points in the dispatch matrix.
One such family of points is particularly important: those that describe converting
a value to a type it is already an instance of.
In our system this can be handled by a single definition that performs ``triangular''
dispatch:
\begin{verbatim}
convert{T,S<:T}(::Type{T}, x::S) = x
\end{verbatim}
\noindent
``Triangular'' refers to the rough shape of the dispatch matrix covered
by such a definition: for all types \texttt{T} in the first argument slot,
match values of any type less than it in the second argument slot.
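For example (a hedged sketch; the results shown in the comments are what the
definitions above imply rather than captured output), converting a value to an
abstract type it already belongs to needs no specific method:
\begin{verbatim}
convert(Float64, 1)     # handled by a specific method (like the Int32
                        # definition above), giving 1.0
convert(Number, 1)      # triangular definition: T = Number, S = Int <: Number,
                        # so the value is returned unchanged
convert(Integer, 0x01)  # likewise returns 0x01 as-is
\end{verbatim}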
A similar trick is useful for defining equivalence relations.
It is most likely unavoidable for programming languages to need multiple
notions of equality.
Two in particular are natural: an \emph{intensional}
notion that equates objects that look identical, and an \emph{extensional}
notion that equates objects that mean the same thing for some standard
set of purposes.
Intensional equality (\texttt{===} in Julia, described
in section~\ref{sec:corecalc}) lends itself to being implemented once
by the language implementer, since it can work by directly comparing
the representations of objects.
Extensional equality (\texttt{==} in Julia), on the other hand, must be
extensible to user-defined data types.
The latter function must call the former in order to have any basis for
its comparisons.
As with conversion, we would like to provide default definitions for
\texttt{==} that cover families of cases.
Numbers are a reasonable domain to pick, since all numbers should be
equality-comparable to each other.
We might try (\texttt{Number} is the abstract supertype of all numeric types):
\begin{singlespace}
\begin{lstlisting}[language=julia]
==(x::Number, y::Number) = x === y
\end{lstlisting}
\end{singlespace}
\noindent
meaning that number comparison can simply fall back to intensional
equality.
However this definition is rather dubious.
It gets the wrong answer every time the arguments are different representations
(e.g.\ integer and floating point) of the same quantity.
We might hope that its behavior will be ``patched up'' later by more specific
definitions for various concrete number types, but it still covers a dangerous
amount of dispatch space.
If later definitions somehow miss a particular combination of number types,
we could get a silent wrong answer instead of an error.
(Note that statically checking method exhaustiveness is no help here.)
``Diagonal'' dispatch lets us improve the definition:
\begin{singlespace}
\begin{lstlisting}[language=julia]
=={T<:Number}(x::T, y::T) = x === y
\end{lstlisting}
\end{singlespace}
\noindent
Now \texttt{===} will only be used on arguments of the same type,
making it far more likely to give the right answer.
Even better, any case where it does not give the right answer can be fixed with
a single definition, i.e.\ \texttt{==(x::S, y::S)} for some
concrete type \texttt{S}.
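IEEE floating point is a concrete instance of this (a hedged sketch;
\texttt{float\_eq} stands for a hypothetical numeric-comparison primitive, not
an actual standard library function).
Comparing bit patterns would report \texttt{-0.0 != 0.0} and
\texttt{NaN == NaN}, so a single same-type definition restores the intended
semantics:
\begin{singlespace}
\begin{lstlisting}[language=julia]
# hypothetical: compare Float64 values numerically rather than by bit
# pattern, so that -0.0 == 0.0 holds and NaN == NaN does not
==(x::Float64, y::Float64) = float_eq(x, y)
\end{lstlisting}
\end{singlespace}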
The more general \texttt{(Number, Number)} case is left open, and in the next
section we will take advantage of this to implement ``automatic'' type promotion.
%- The abstractions of equality and comparison. Different equivalence classes between
%is/===, isequal and ==
%- Numeric vs lexicographic ordering?
%cmp, lexcmp, vs isless, <
\section{Numeric types and embeddings}
\label{sec:embeddings}
We might prefer ``number'' to be a single, concrete concept, but the history of
mathematics has seen the concept extended many times, from integers to rationals to
reals, and then to complex, quaternion, and more.
These constructions tend to follow a pattern: each new set of numbers has a subset
isomorphic to an existing set.
For example, the reals are isomorphic to the complex numbers with zero imaginary part.
Human beings happen to be good at equating and moving between isomorphic sets,
so it is easy to imagine that the reals and complexes with zero imaginary
part are one and the same.
But a computer forces us to be specific, and admit
that a real number is not complex, and a complex number is not real.
And yet the close relationship between them is too compelling not to model in a
computer somehow.
Here we have a numerical analog to the famous ``circle and ellipse'' problem in
object-oriented programming~\cite{cline1995c++}: the set of circles is
isomorphic to the set of ellipses with equal axes, yet neither ``is a''
relationship in a class hierarchy seems fully correct.
An ellipse is not a circle, and in general a circle cannot serve as an ellipse
(for example, the set of circles is not closed under the same operations that the
set of ellipses is, so a program written for ellipses might not work on circles).
This problem implies that a single built-in type hierarchy is not
sufficient: we want to model custom \emph{kinds} of relationships between
types (e.g.\ ``can be embedded in'' in addition to ``is a'').
%% Two further problems should also be kept in mind. First, the natural isomorphisms
%% between sets of numbers might not be isomorphisms on a real computer. For example,
%% due to the behavior of floating-point arithmetic, an operation on complex numbers
%% with zero imaginary part might not give an answer equal to the same operation on
%% real numbers. Second, the contexts that demand use of one type of number or
%% another are often not easily described by type systems.
% ex: generic sum function accumulating result in at least
% double precision. just using +::T->T->T doesn't work.
% \subsection{Current approaches}
Numbers tend to be among the most complex features of a language.
Numeric types usually need to be a special case: in a typical language with
built-in numeric types, describing their behavior is beyond the expressive power
of the language itself.
For example, in C arithmetic operators like \texttt{+} accept multiple types of
arguments (ints and floats), but no user-defined C function can do this (the situation
is of course improved in C++).
In Python, a special arrangement is made for \texttt{+} to call either an
\texttt{\_\_add\_\_} or \texttt{\_\_radd\_\_} method,
effectively providing double-dispatch for arithmetic in a language that is
idiomatically single-dispatch.
\subsubsection{Implementing type embeddings}
\label{sec:promotion}
Most functions are naturally implemented in the value domain, but some are
actually easier to implement in the type domain.
One reason is that there is a bottom element, which most data types lack.
It has been suggested on theoretical grounds \cite{categorytheoryoperators}
that generic binary operators should have ``key variants'' where the
arguments are of the same type.
We implement this in Julia with a default definition that uses diagonal
dispatch:
\begin{singlespace}
\begin{lstlisting}[language=julia]
+{T<:Number}(x::T, y::T) = no_op_err("+", T)
\end{lstlisting}
\end{singlespace}
Then we can implement a more general definition for different-type arguments
that tries to promote the arguments to a common type by calling \texttt{promote}:
\begin{singlespace}
\begin{lstlisting}[language=julia]
+(x::Number, y::Number) = +(promote(x,y)...)
\end{lstlisting}
\end{singlespace}
\noindent
\texttt{promote} returns a pair of the values of its arguments after conversion
to a common type, so that a ``key variant'' can be invoked.
\texttt{promote} is designed to be extended by new numeric types.
A full-featured promotion operator is a tall order.
We would like the following:
\begin{itemize}
\item Each combination of types only needs to be defined in one order; we
don't want to redundantly define the behavior of \texttt{(T,S)} and \texttt{(S,T)}.
\item It falls back to join for types without any defined promotion.
\item It must prevent promotion above a certain point to avoid circularity.
%in promoting fallback definitions of operators.
\end{itemize}
The core of the mechanism is three functions: \texttt{promote\_type}, which
performs promotion in the type domain only, \texttt{promote\_rule}, which is
the function defined by libraries, and a utility function \texttt{promote\_result}.
The default definition of \texttt{promote\_rule} returns \texttt{Bottom},
indicating no promotion is defined.
\texttt{promote\_type} calls \texttt{promote\_rule} with its arguments in
both possible orders.
If one order is defined to give \texttt{X} and the other is not defined,
\texttt{promote\_result} recursively promotes \texttt{X} and \texttt{Bottom},
giving \texttt{X}.
If neither order is defined, the last definition below is called, which
falls back to \texttt{typejoin}:
\begin{singlespace}
\begin{lstlisting}[language=julia]
promote{T,S}(x::T, y::S) =
    (convert(promote_type(T,S),x), convert(promote_type(T,S),y))
promote_type{T,S}(::Type{T}, ::Type{S}) =
    promote_result(T, S, promote_rule(T,S), promote_rule(S,T))
promote_rule(T, S) = Bottom
promote_result(t,s,T,S) = promote_type(T,S)
# If no promote_rule is defined, both directions give Bottom.
# In that case use typejoin on the original types instead.
promote_result{T,S}(::Type{T}, ::Type{S},
                    ::Type{Bottom}, ::Type{Bottom}) = typejoin(T, S)
\end{lstlisting}
\end{singlespace}
The obvious way to extend this system is to define \texttt{promote\_rule},
but it can also be extended in a less obvious way.
For the purpose of manipulating container types, we would like e.g.\ \texttt{Int}
and \texttt{Real} to promote to \texttt{Real}.
However, we do not want to introduce a rule that promotes arbitrary pairs of
numeric types to \texttt{Number}, since that would make the above
default definition of \texttt{+} circular.
The following definitions accomplish this, by promoting to the larger
numeric type if one is a subtype of the other:
\begin{samepage}
\begin{singlespace}
\begin{lstlisting}[language=julia]
promote_result{T<:Number,S<:Number}(::Type{T}, ::Type{S},
                                    ::Type{Bottom}, ::Type{Bottom}) =
    promote_sup(T, S, typejoin(T,S))
# promote numeric types T and S to typejoin(T,S) if T<:S or S<:T
# for example this makes promote_type(Integer,Real) == Real without
# promoting arbitrary pairs of numeric types to Number.
promote_sup{T<:Number          }(::Type{T},::Type{T},::Type{T}) = T
promote_sup{T<:Number,S<:Number}(::Type{T},::Type{S},::Type{T}) = T
promote_sup{T<:Number,S<:Number}(::Type{T},::Type{S},::Type{S}) = S
promote_sup{T<:Number,S<:Number}(::Type{T},::Type{S},::Type)    =
    error("no promotion exists for ", T, " and ", S)
\end{lstlisting}
\end{singlespace}
\end{samepage}
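With these pieces in place, the behavior of the mechanism can be summarized by
a few calls (a hedged sketch; the results in the comments follow from the
definitions above together with the standard library's \texttt{promote\_rule}
methods):
\begin{singlespace}
\begin{lstlisting}[language=julia]
promote(1, 2.5)               # (1.0, 2.5): both arguments become Float64
promote_type(Int64, Float64)  # Float64, via a library-defined promote_rule
promote_type(Integer, Real)   # Real, via the promote_sup definitions above
\end{lstlisting}
\end{singlespace}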
\subsubsection{Application to ranges}
\emph{Ranges} illustrate an interesting application of type promotion.
A range data type, notated \texttt{a:s:b}, represents a sequence of values
starting at \texttt{a} and ending at \texttt{b}, with a distance of \texttt{s}
between elements (internally, this notation is translated to
\texttt{colon(a, s, b)}).
Ranges seem simple enough, but a reliable, efficient, and generic implementation
is difficult to achieve.
We propose the following requirements:
\begin{itemize}
\item The start and stop values can be passed as different types, but internally
should be of the same type.
\item Ranges should work with ordinal types, not just numbers (examples include
characters, pointers, and calendar dates).
\item If any of the arguments is a floating-point number, a special
\texttt{FloatRange} type designed to cope well with roundoff is returned.
\end{itemize}
In the case of ordinal types, the step value is naturally of a different type
than the elements of the range.
For example, one may add 1 to a character to get the ``next'' encoded character,
but it does not make sense to add two characters.
It turns out that the desired behavior can be achieved with six definitions.
First, given three floats of the same type we can construct a \texttt{FloatRange}
right away:
\begin{verbatim}
colon{T<:FloatingPoint}(start::T, step::T, stop::T) =
    FloatRange{T}(start, step, stop)
\end{verbatim}
Next, if \texttt{a} and \texttt{b} are of the same type and there are no floats,
we can construct a general range:
\begin{verbatim}
colon{T}(start::T, step, stop::T) = StepRange(start, step, stop)
\end{verbatim}
Now there is a problem to fix: if the first and last arguments are of some
non-floating-point numeric type, but the step is floating point, we want to
promote all arguments to a common floating point type.
We must also do this if the first and last arguments are floats, but the step
is some other kind of number:
\begin{verbatim}
colon{T<:Real}(a::T, s::FloatingPoint, b::T) = colon(promote(a,s,b)...)
colon{T<:FloatingPoint}(a::T, s::Real, b::T) = colon(promote(a,s,b)...)
\end{verbatim}
These two definitions are correct, but ambiguous: if the step is a float
of a different type than \texttt{a} and \texttt{b} both definitions are
equally applicable.
We can add the following disambiguating definition:
\begin{verbatim}
colon{T<:FloatingPoint}(a::T, s::FloatingPoint, b::T) =
    colon(promote(a,s,b)...)
\end{verbatim}
All of these five definitions require \texttt{a} and \texttt{b} to be of the
same type.
If they are not, we must promote just those two arguments, and leave
the step alone (in case we are dealing with ordinal types):
\begin{verbatim}
colon{A,B}(a::A, s, b::B) =
    colon(convert(promote_type(A,B),a), s, convert(promote_type(A,B),b))
\end{verbatim}
This example shows that it is not always sufficient to have a built-in set of
``promoting operators''.
Library functions like this \texttt{colon} need more control.
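As a usage sketch (hedged; the handling of the two-argument form \texttt{a:b},
which supplies a default step, is glossed over), consider which definition each
of these calls reaches:
\begin{verbatim}
1:0.5:3    # colon(1, 0.5, 3): Int endpoints with a Float64 step are
           # promoted, yielding a FloatRange{Float64}
'a':'c'    # same-type endpoints with an integer step reach the
           # StepRange definition, keeping the ordinal step type intact
\end{verbatim}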
\subsubsection{On implicit conversion}
In order to facilitate human styles of thinking, many programming languages offer
some flexibility in when types are required to match.
%This often takes the form of implicit conversion.
For example, a C function declared to accept type \texttt{float}
can also be called with an \texttt{int}, and the compiler will
insert a conversion automatically.
We have found that most cases where this behavior is desirable
are handled simply by method applicability, as shown in this and the
previous section.
Most of the remaining cases involve assignment.
Julia has a notion of a \emph{typed location}, which is a variable
or mutable object field with an associated type.
The core language requires types to match (as determined by subtyping)
on assignment.
However, updating assignments that are syntactically identifiable
(via use of \texttt{=}) cause the front-end to insert calls to
\texttt{convert} to try to convert the assignment right-hand side
to the type of the assigned location.
Abstract data types can easily mimic this behavior by (optionally)
calling \texttt{convert} in their implementations of mutating
operations.
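For instance (a hedged sketch; the lowered form is indicated only approximately
in the comments), a local variable declared with a type has its assignments
rewritten by the front-end:
\begin{singlespace}
\begin{lstlisting}[language=julia]
function deposit(amount)
    total::Float64 = 0   # typed local; this and later assignments are
                         # lowered to roughly total = convert(Float64, ...)
    total += amount
    return total
end
deposit(3)               # returns 3.0 even though the argument is an Int
\end{lstlisting}
\end{singlespace}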
In our experience, this arrangement provides the needed convenience
without complicating the language.
Program analyses consuming lowered syntax trees (or any lower level
representation) do not need to know anything about conversion
or coercion.
Fortress~\cite{fortresspec} also provides powerful type conversion
machinery.
However, it does so through a special \texttt{coerce} feature.
This allows conversions to happen during dispatch, so a method can
become applicable if arguments can be converted to its declared type.
There is also a built-in notion of type widening.
This allows conversions to be inserted between nested operator
applications, which can improve the accuracy of mixed-precision code.
It is not possible to provide these features in Julia without doing
violence to our ``everything is a method call'' semantics.
We feel that it is worth giving up these features for a simpler
mental model.
We also believe that the ability to provide so much ``magic'' numeric
type behavior only through method definitions is compelling.
\subsubsection{Number-like types in practice}
Originally, our reasons for implementing all numeric types at the library
level were not entirely practical.
We had a principled opposition to including such definitions in a compiler,
and guessed that being able to define numeric types would help ensure the
language was powerful enough.
However, defining numeric and number-like types and their interactions turns
out to be surprisingly useful.
Once such types are easy to obtain, people find more and more uses for them.
Even among basic types that might reasonably be built in, there is enough
complexity to require an organizational strategy.
We might have
\begin{itemize}
\item Ordinal types \texttt{Pointer}, \texttt{Char}, \texttt{Enum}
\item Integer types \texttt{Bool}, \texttt{Int8}, \texttt{Int16}, \texttt{Int32}, \texttt{Int64}, \texttt{Int128}, \texttt{UInt8}, \texttt{UInt16}, \texttt{UInt32}, \texttt{UInt64}, \texttt{UInt128}, \texttt{BigInt}
\item Floating point types \texttt{Float16}, \texttt{Float32}, \texttt{Float64}, \texttt{Float80}$^*$, \texttt{Float128}$^*$, \texttt{BigFloat}, \texttt{DoubleDouble}
\item Extensions \texttt{Rational}, \texttt{Complex}, \texttt{Quaternion}
\end{itemize}
\noindent
Types with $^*$ have not been implemented yet, but the rest have.
In external packages, there are types for interval arithmetic,
dual and hyper-dual numbers for computing first and second derivatives,
finite fields, and decimal floating-point.
Some applications benefit in performance from fixed-point arithmetic.
This has been implemented in a package as \texttt{Fixed32\{b\}}, where
the number of fraction bits is a parameter.
A problem in the design of image processing libraries was solved by
defining a new kind of fixed-point type \cite{ufixed}.
The problem is that image scientists often want to work with fractional
pixel values in the interval $[0,1]$, but most graphics libraries (and memory
efficiency concerns) require 8-bit integer pixel components with
values in the interval $[0,255]$.
The solution is a \texttt{Ufixed8} type that uses an unsigned 8-bit integer
as its representation, but behaves like a fraction over $255$.
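The idea can be sketched as follows (a hedged illustration only; the actual
package's type is richer, and \texttt{MyUfixed8} is a made-up name):
\begin{singlespace}
\begin{lstlisting}[language=julia]
# the stored byte i is interpreted as the fraction i/255
immutable MyUfixed8 <: Real
    i::UInt8
end
convert(::Type{Float64}, x::MyUfixed8) = x.i / 255
promote_rule(::Type{MyUfixed8}, ::Type{Float64}) = Float64
# with just these methods, the generic promoting fallbacks give
# MyUfixed8(0xff) + 0.5 == 1.5, while an Array{MyUfixed8} still
# stores one byte per element
\end{lstlisting}
\end{singlespace}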
Many real-world quantities are not numbers exactly, but benefit from the
same mechanisms in their implementation.
Examples include colors (which form a vector space, and where many different
useful bases have been standardized), physical units, and DNA nucleotides.
Date and time arithmetic is especially intricate and irregular, and
benefits from permissiveness in operator definitions.
% (more on this in section~\ref{sec:dates}).
% dates: different cardinal and ordinal behavior
% musical notes
\section{Multidimensional array indexing}
\label{sec:ndindexing}
One-dimensional arrays are a simple and essential data structure found in
most programming languages.
The multi-dimensional arrays required in scientific computing, however, are
a different beast entirely.
Allowing any number of dimensions entails a significant increase in complexity.
Why?
The essential reason is that core properties of the data structure no
longer fit in a constant amount of space.
The space needed to store the sizes of the dimensions (the array shape) is
proportional to the number of dimensions.
This does not seem so bad, but becomes a large problem due to three additional
facts:
\begin{enumerate}
\item Code that operates on the dimension sizes needs to be highly efficient.
Typically the overhead of a loop is unacceptable, and such code needs to be fully
unrolled.
\item In some code the number of dimensions is a \emph{dynamic} property --- it is
only known at run time.
\item Programs may wish to treat arrays with different numbers of dimensions
differently.
A vector (1d) might have rather different behaviors than a matrix (2d)
(for example, when computing a norm).
This kind of behavior makes the number of dimensions a crucial part of program
semantics, preventing it from remaining a compiler implementation detail.
\end{enumerate}
These facts pull in different directions.
The first fact asks for static analysis.
The second fact asks for run time flexibility.
The third fact asks for dimensionality to be part of the type system, but
partly determined at run time (for example, via virtual method dispatch).
Current approaches choose a compromise.
In some systems, the number of dimensions has a strict limit (e.g.\ 3 or 4),
%% TODO: examples of systems limited to n==3
so that separate classes for each case can be written out in full.
Other systems choose flexibility, and just accept that most
or all operations will be dynamically dispatched.
Still others provide flexibility only at compile time, for example a
template library where the number of dimensions must be statically known.
Whatever trade-off is made, rules must be defined for how various operators
act on dimensions.
Here we focus on indexing, since selecting parts of arrays has particularly
rich behavior with respect to dimensionality.
For example, if a single row or column of a matrix is
selected, does the result have one or two dimensions?
Array implementations prefer to invoke general rules to answer such questions.
Such a rule might say ``dimensions indexed with scalars are dropped'', or ``trailing
dimensions of size one are dropped'', or ``the rank of the result
is the sum of the ranks of the indexes'' (as in APL).
A significant amount of work has been done on inferring properties like these
for existing array-based languages
(e.g.\ \cite{Joisha:2006:AAS:1152649.1152651,Garg:2014:JSI:2627373.2627382},
although these methods go further and attempt to infer complete size
information).
C++ and Haskell are both examples of languages with sufficiently powerful
static semantics to support defining efficient and reasonably flexible
multi-dimensional arrays in libraries.
In both languages, implementing these libraries is fairly difficult and
places some limits on how indexing can behave.
In C++, the popular Boost libraries include a multi-array
type~\cite{garcia2005multiarray}.
To index an array with this library, one builds up an \texttt{index\_gen}
object by repeatedly applying the \texttt{[]} operator.
On each application, if the argument is a \texttt{range} then the
corresponding dimension is kept, and otherwise it is dropped.
This is implemented with a template as follows:
\begin{singlespace}
\begin{lstlisting}[language=c++,style=ttcode]
template <int NumRanges, int NumDims>
struct index_gen {
    index_gen<NumRanges+1,NumDims+1> operator[](const range& r) const
    { ... }
    index_gen<NumRanges+1,NumDims> operator[](index idx) const
    { ... }
};
\end{lstlisting}
\end{singlespace}
\noindent
The number of dimensions in the result is determined by arithmetic
on template parameters and static overloading.
Handling arguments one at a time recursively is a common pattern
for feeding more complex type information to compilers.
In fact, the Haskell library Repa~\cite{Keller:2010rs} uses
essentially the same approach:
\begin{singlespace}
\begin{lstlisting}[language=haskell,style=ttcode]
instance Slice sl => Slice (sl :. Int) where
    sliceOfFull (fsl :. _) (ssl :. _) = sliceOfFull fsl ssl

instance Slice sl => Slice (sl :. All) where
    sliceOfFull (fsl :. All) (ssl :. s) = sliceOfFull fsl ssl :. s
\end{lstlisting}
\end{singlespace}
\noindent
The first instance declaration says that given an \texttt{Int} index,
the dimension corresponding to \texttt{\_} is dropped.
The second declaration says that given an \texttt{All} index,
dimension \texttt{s} is kept.
\iffalse
Our goal here is a bit unusual: we are not concerned with which rules
might work best, but merely with how they can be specified, so that
domain experts can experiment.
In fact different domains want different things.
Working with images, each dimension might be quite different, e.g.\ representing
time, space, or color, so you don't want to drop or rearrange dimensions.
\fi
%Here are our ground rules:
%\begin{enumerate}
%\item You can't manually implement the behavior inside the compiler
%\item The compiler must be able to reasonably understand the program
%\item The code must be reasonably easy to write
%\end{enumerate}
\iffalse
%How are such rules implemented?
In a language with built-in multidimensional arrays, the compiler will
analyze indexing expressions and determine an answer using hard-coded
logic.
% TODO: example from one of these compilers
However, this approach is not satisfying: we would rather
implement the behavior in libraries, so that different kinds of arrays
may be defined, or so that rules of similar complexity may be
defined for other kinds of objects.
In fact different domains want different things.
Working with images, each dimension might be quite different, e.g.\ representing
time, space, or color, so you don't want to drop or rearrange dimensions.
%But these kinds of rules are unusually difficult to implement in libraries.
%If a library writes out its indexing logic using imperative code, the host
%language compiler is not likely to be able to analyze it.
Using compile time abstraction (templates) provides better performance, but
such libraries tend to be difficult to write (and read), and the full
complement of indexing behavior expected by technical users strains the
capabilities of such systems.
\fi
Our dispatch mechanism permits a novel solution \cite{Bezanson2014}.
Indexing behavior can be defined with method signatures, using a combination
of variadic methods and argument ``splatting'' (the ability
to pass a structure of $n$ values as $n$ separate arguments to a function,
known as \texttt{apply} in Lisp dialects, and written as \texttt{f(xs...)}
in Julia).
This solution is still a compromise among the factors outlined above,
but it is a new compromise that provides a modest increment of power.
% TODO more
Below we define a function \texttt{index\_shape} that computes the
shape of a result array given a series of index arguments.
We show three versions, each implementing a different rule that users in
different domains might want:
% TODO: point out array = (shape, data), so those are the two parts
% we need to handle. In julia the ``data'' part is not a first-class
% object; it is not directly exposed to the user, but this is more
% of an implementation detail.
\vspace{-3ex}
\begin{singlespace}
\begin{verbatim}
# drop dimensions indexed with scalars
index_shape() = ()
index_shape(i::Real, I...) = index_shape(I...)
index_shape(i, I...) = (length(i), index_shape(I...)...)
# drop trailing dimensions indexed with scalars
index_shape(i::Real...) = ()
index_shape(i, I...) = (length(i), index_shape(I...)...)
# rank summing (APL)
index_shape() = ()
index_shape(i, I...) = (size(i)..., index_shape(I...)...)
\end{verbatim}
\end{singlespace}
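As a usage sketch (hedged; assuming only the first, scalar-dropping rule is
defined), the shape computation proceeds as follows:
\begin{singlespace}
\begin{verbatim}
index_shape(2, 1:3, 4)     # -> (3,): dimensions indexed by scalars vanish
index_shape(1:2, 3, 1:6)   # -> (2, 6)
\end{verbatim}
\end{singlespace}
Hence \texttt{A[2, 1:3, 4]} would produce a one-dimensional result of length
3 under this rule.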
Inferring the length of the result of \texttt{index\_shape} is sufficient
to infer the rank of the result array.
These definitions are concise, easy to write, and possible for a
compiler to understand fully using straightforward techniques.
The result type is determined using only data flow type inference, plus a
rule for splicing an immediate container (the type of \texttt{f((a,b)...)} is
the type of \texttt{f(a,b)}). Argument list destructuring takes place inside
the type intersection operator used to combine argument types with method
signatures.
This approach does not depend on any heuristics. Each call to \texttt{index\_shape}
simply requires one recursive invocation of type inference. This process reaches
the base case \texttt{()} for these definitions, since each recursive call
handles a shorter argument list (for less-well-behaved definitions, we might
end up invoking a widening operator instead).
%% \subsection{Even more elimination?}
%% Some features of the language could be even further eliminated. For example data
%% types could be implemented in terms of lambda abstractions. But certain patterns
%% are so useful that they might as well be provided in a standard form. It also
%% probably makes the compiler much more efficient not to need to pass around and
%% repeatedly analyze full representations of the meanings of such ubiquitous constructs.
\iffalse
\section{Array views}
tim holy in issue 8839:
``without staged functions in my initial post in 8235. The take-home message: generating all methods through dimension 8 resulted in more than 5000 separate methods, and required over 4 minutes of parsing \& lowering time (i.e., a 4-minute delay while compiling julia). By comparison, the stagedfunction implementation loads in a snap, and of course can go even beyond 8 dimensions.''
\fi
\section{Numerical linear algebra}
\label{sec:linalg}
The crucial roles of code selection and code specialization in technical
computing are captured well in the linear algebra software engineer's dilemma,
described in the context of LAPACK and ScaLAPACK by Demmel and Dongarra,
\textit{et al.}~\cite{lawn181}:
\vspace{-4ex}
\begin{singlespace}
\begin{small}
\begin{verbatim}
(1) for all linear algebra problems (linear systems, eigenproblems, ...)
(2) for all matrix types (general, symmetric, banded, ...)
(3) for all data types (real, complex, single, double, higher precision)
(4) for all machine architectures
      and communication topologies
(5) for all programming interfaces
(6) provide the best algorithm(s) available in terms of
      performance and accuracy ("algorithms" is plural
      because sometimes no single one is always best)
\end{small}
\end{singlespace}
Organizing this functionality has been a challenge.
LAPACK is associated with dense matrices, but in reality supports 28
matrix data types (e.g.\ diagonal, tridiagonal, symmetric, symmetric
positive definite, triangular, triangular packed, and other combinations).
Considerable gains in performance and accuracy are possible by exploiting
these structured cases.
LAPACK exposes these data types by assigning each a 2-letter code used to
prefix function names.
This approach, while lacking somewhat in abstraction, has the highly redeeming
feature of making LAPACK easy to call from almost any language.
Still there is a usability cost, and an abstraction layer is needed.
\begin{singlespace}
\begin{table}[!t]
\begin{center}
\begin{tabular}{|l|ll|}\hline
Structured matrix types & \multicolumn{2}{|c|}{Factorization types} \\
\hline
Bidiagonal & Cholesky & \\
Diagonal & CholeskyPivoted & \\
SymmetricRFP & QR & \\
TriangularRFP & QRCompactWY & \\
Symmetric & QRPivoted & \\
Hermitian & Hessenberg & \\
AbstractTriangular & Eigen & \\
LowerTriangular & GeneralizedEigen & \\
UnitLowerTriangular & SVD & \\
UpperTriangular & Schur & \\
UnitUpperTriangular & GeneralizedSchur & \\
SymTridiagonal & LDLt & \\
Tridiagonal & LU & \\
UniformScaling & CholeskyDenseRFP & \\
\hline
\end{tabular}
\end{center}
\caption[Structured matrices and factorizations]{
\small{
Left column: structured array storage formats currently defined in Julia's
standard library.
Right column: data types representing the results of various matrix
factorizations.
}
}
\label{tab:matrixtypes}
\end{table}
\end{singlespace}
% matrixlike but not arrays or factorizations:
% QRPackedQ QRPackedWYQ QRCompactWYQ HessenbergQ
Table~\ref{tab:matrixtypes} lists some types that have been defined in
Julia (largely by Andreas Noack) for working with structured matrices
and matrix factorizations.
The structured matrix types can generally be used anywhere a 2-d
array is expected.
The factorization types are generally used for efficiently re-solving
systems.
% ? Cholesky has inv
Sections \ref{sec:whydynamictyping} and \ref{sec:bindingtimedilemma}
alluded to some of the ways that these types interact with language
features.
There are essentially two modes of use: ``manual'', where a programmer
explicitly constructs one of these types, and ``automatic'', where a
library returns one of them to user code written only in terms of generic
functions.
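A hedged sketch of the two modes (using function names from the standard
library of that era; output omitted):
\begin{singlespace}
\begin{lstlisting}[language=julia]
A = rand(4,4)
S = Symmetric(A)    # "manual": wrap a dense array (its upper triangle is used)
F = lufact(A)       # "automatic": the library hands back an LU object
x = F \ rand(4)     # later solves reuse the stored factorization
\end{lstlisting}
\end{singlespace}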
Structured matrices have their own algebra.
There are embedding relationships like those discussed in section~\ref{sec:embeddings}:
diagonal is embedded in bidiagonal, which is embedded in tridiagonal.
Bidiagonal is also embedded in triangular.
After promotion, these types are closed under \texttt{+} and \texttt{-}, but
generally not under \texttt{*}.
% matrix types not closed under operators:
\iffalse
function +(A::Bidiagonal, B::Bidiagonal)
if A.isupper==B.isupper
Bidiagonal(A.dv+B.dv, A.ev+B.ev, A.isupper)
else
Tridiagonal((A.isupper ? (B.ev,A.dv+B.dv,A.ev) : (A.ev,A.dv+B.dv,B.ev))...)
end
end
\fi
\iffalse
Multiple dispatch on special matrices
28 LAPACK types via composition of 9 types, issue 8240
storage formats:
Matrix
Banded
Packed
RFP (rectangular full packed, LAWN 199)
symmetry labels:
Hermitian
Hessenberg
Symmetric
Triangular
Trapezoid
Bidiagonal{T} <: AbstractMatrix{T}
Diagonal{T} <: AbstractMatrix{T}
SymmetricRFP{T<:BlasFloat} <: AbstractMatrix{T}
TriangularRFP{T<:BlasFloat} <: AbstractMatrix{T}
Symmetric{T,S<:AbstractMatrix} <: AbstractMatrix{T}
Hermitian{T,S<:AbstractMatrix} <: AbstractMatrix{T}
AbstractTriangular{T,S<:AbstractMatrix} <: AbstractMatrix{T}
LowerTriangular{T,S<:AbstractMatrix} <: AbstractTriangular{T,S}
UnitLowerTriangular{T,S<:AbstractMatrix} <: AbstractTriangular{T,S}
UpperTriangular{T,S<:AbstractMatrix} <: AbstractTriangular{T,S}
UnitUpperTriangular{T,S<:AbstractMatrix} <: AbstractTriangular{T,S}
SymTridiagonal{T} <: AbstractMatrix{T}
Tridiagonal{T} <: AbstractMatrix{T}
UniformScaling{T<:Number}
abstract Factorization{T}
Cholesky{T,S<:AbstractMatrix} <: Factorization{T}
CholeskyPivoted{T,S<:AbstractMatrix} <: Factorization{T}
QR{T,S<:AbstractMatrix} <: Factorization{T}
QRCompactWY{S,M<:AbstractMatrix} <: Factorization{S}
QRPivoted{T,S<:AbstractMatrix} <: Factorization{T}
QRPackedQ{T,S<:AbstractMatrix} <: AbstractMatrix{T}
QRPackedWYQ{S,M<:AbstractMatrix} <: AbstractMatrix{S}
QRCompactWYQ{S, M<:AbstractMatrix} <: AbstractMatrix{S}
Hessenberg{T,S<:AbstractMatrix} <: Factorization{T}
HessenbergQ{T,S<:AbstractMatrix} <: AbstractMatrix{T}
Eigen{T,V,S<:AbstractMatrix,U<:AbstractVector} <: Factorization{T}
GeneralizedEigen{T,V,S<:AbstractMatrix,U<:AbstractVector} <: Factorization{T}
SVD{T<:BlasFloat,Tr,M<:AbstractArray} <: Factorization{T}
Schur{Ty<:BlasFloat, S<:AbstractMatrix} <: Factorization{Ty}
GeneralizedSchur{Ty<:BlasFloat, M<:AbstractMatrix} <: Factorization{Ty}
LDLt{T,S<:AbstractMatrix} <: Factorization{T}
LU{T,S<:AbstractMatrix} <: Factorization{T}
CholeskyDenseRFP{T<:BlasFloat} <: Factorization{T}
\fi
\iffalse
QR factorization: want to support it on all input types, but many
types (integer, rational) are not closed under the needed operations.
compare to eigen~\cite{guennebaud2014eigen}: in C++, if you want a qrfact
of an integer matrix,
you might get a compile time error, or it might work but all the values
will be truncated when stored. this is a big generic programming
challenge. you can't expect types to have a ``type to use for QR fact''
trait.
qrfact computes the type of the result needed:
\begin{singlespace}
\begin{lstlisting}[language=julia]
function qrfact{T}(A::StridedMatrix{T}; pivot=false)
S = typeof(one(T)/norm(one(T)))
if S != T
qrfact!(convert(AbstractMatrix{S},A), pivot=pivot)
else
qrfact!(copy(A),pivot=pivot))
end
end
\end{lstlisting}
\end{singlespace}
This code is not statically typeable, and yet with specialization a
compiler could in fact determine the type of each call site.
It just happens to be convenient to specify this behavior with a
branch.
\fi
\subsubsection{Calling BLAS and LAPACK}
\label{sec:callingblas}
Writing interfaces to BLAS~\cite{blas} and LAPACK is an avowed tradition
in technical computing language implementation.
Julia's type system is capable of describing the cases where routines
from these libraries are applicable.
For example the \texttt{axpy} operation in BLAS (computing $y\leftarrow \alpha x + y$)
supports arrays with a single stride, and four numeric types.
In Julia this can be expressed as follows:
\begin{singlespace}
\begin{lstlisting}[language=julia]
const BlasFloat = Union(Float64,Float32,Complex128,Complex64)
axpy!{T<:BlasFloat}(alpha::Number,
                    x::Union(DenseArray{T},StridedVector{T}),
                    y::Union(DenseArray{T},StridedVector{T}))
\end{lstlisting}
\end{singlespace}
%\noindent
%TODO
\section{Units}
Despite their enormous importance in science, unit quantities have
not reached widespread use in programming.
This is not surprising considering the technical difficulties involved.
Units are symbolic objects, so attaching them to numbers can bring
significant overhead.
To restore peak performance, units need to be ``lifted'' into a type
system of some kind, to move their overhead to compile time.
However at that point we encounter a trade-off similar to that present
in multi-dimensional arrays.
Will it be possible to have an array of numbers with different units?
What if a program wants to return different units based on criteria
the compiler cannot resolve?
Julia's automatic blending of binding times is again helpful.
The SIUnits package by Keno Fischer~\cite{Fischer:2014si} implements
Julia types for SI units and unit quantities:
\begin{singlespace}
\begin{lstlisting}[language=julia]
immutable SIUnit{m,kg,s,A,K,mol,cd} <: Number
end
immutable SIQuantity{T<:Number,m,kg,s,A,K,mol,cd} <: Number
    val::T
end
\end{lstlisting}
\end{singlespace}
\noindent
This implementation uses a type parameter to store the exponent (an integer,
or perhaps in the near future a rational number) of each base unit associated
with a value.
An \texttt{SIUnit} has only the symbolic part, and can be used to mention
units without committing to a representation.
Its size, as a data type, is zero.
An \texttt{SIQuantity} contains the same symbolic information, but wraps a
number used to represent the scalar part of the unit quantity.
Definitions like the following are used to provide convenient names for units:
\begin{singlespace}
\begin{lstlisting}[language=julia]
const Meter = SIUnit{1,0,0,0,0,0,0}()
const KiloGram = SIUnit{0,1,0,0,0,0,0}()
\end{lstlisting}
\end{singlespace}
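Presumably (a hedged usage sketch, consistent with the REPL output shown
further below), multiplying a plain number by one of these unit constants is
what wraps it into an \texttt{SIQuantity}:
\begin{singlespace}
\begin{lstlisting}[language=julia]
2Meter   # an SIQuantity{Int64,1,0,0,0,0,0,0} holding the value 2, i.e. "2 m"
\end{lstlisting}
\end{singlespace}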
\noindent
The approach described in section~\ref{sec:promotion} is used to combine
numbers with units.
Arithmetic is implemented as follows:
\begin{singlespace}
\begin{lstlisting}[language=julia]
function +{T,S,m,kg,s,A,K,mol,cd}(x::SIQuantity{T,m,kg,s,A,K,mol,cd},
                                  y::SIQuantity{S,m,kg,s,A,K,mol,cd})
    val = x.val+y.val
    SIQuantity{typeof(val),m,kg,s,A,K,mol,cd}(val)
end
function +(x::SIQuantity, y::SIQuantity)
    error("Unit mismatch.")
end
\end{lstlisting}
\end{singlespace}
\noindent
In the first definition, the representation types of the two
arguments do not have to match, but the units do (checked via diagonal
dispatch).
Any combination of \texttt{SIQuantity}s that is not otherwise implemented
is a unit mismatch error.
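Multiplication follows the same pattern (a hedged sketch; the package's actual
definition may differ), except that the exponents add, so the result type
generally differs from the argument types:
\begin{singlespace}
\begin{lstlisting}[language=julia]
function *{T,S,m1,kg1,s1,A1,K1,mol1,cd1,m2,kg2,s2,A2,K2,mol2,cd2}(
           x::SIQuantity{T,m1,kg1,s1,A1,K1,mol1,cd1},
           y::SIQuantity{S,m2,kg2,s2,A2,K2,mol2,cd2})
    val = x.val*y.val
    SIQuantity{typeof(val),m1+m2,kg1+kg2,s1+s2,A1+A2,
               K1+K2,mol1+mol2,cd1+cd2}(val)
end
\end{lstlisting}
\end{singlespace}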
When storing unit quantities in arrays, the array constructor in Julia's
standard library is able to automatically select an appropriate type.
If all elements have the same units, the symbolic unit information is
stored only once in the array's type tag, and the array data uses a
compact representation:
\vspace{-2ex}
\begin{singlespace}
\begin{verbatim}
julia> a = [1m,2m,3m]
3-element Array{SIQuantity{Int64,1,0,0,0,0,0,0},1}:
1 m
2 m
3 m
julia> reinterpret(Int,a)
3-element Array{Int64,1}:
1
2
3
\end{verbatim}
\end{singlespace}
\noindent
If different units are present, a wider type is chosen via the array
constructor invoking \texttt{promote\_type}:
\vspace{-2ex}
\begin{singlespace}
\begin{verbatim}
julia> a = [1m,2s]
2-element Array{SIQuantity{Int64,m,kg,s,A,K,mol,cd},1}:
1 m
2 s
\end{verbatim}
\end{singlespace}
Unit quantities are different from most number types in that they are not
closed under multiplication.
Nevertheless, generic functions behave as expected.
Consider a generic \texttt{prod} function that multiplies elements of a
collection.
With units, its result type can depend on the size of the collection: