<?xml version="1.0" encoding="UTF-8"?>
<!--********************************************************************
Copyright 2017 Georgia Institute of Technology
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation. A copy of
the license is included in gfdl.xml.
*********************************************************************-->
<section xml:id="matrix-multiplication">
<title>Matrix Multiplication</title>
<objectives>
<ol>
<li>Understand compositions of transformations.</li>
<li>Understand the relationship between matrix products and compositions of matrix transformations.</li>
<li>Become comfortable doing basic algebra involving matrices.</li>
<li><em>Recipe:</em> matrix multiplication (two ways).</li>
<li><em>Picture:</em> composition of transformations.</li>
<li><em>Vocabulary word:</em> <term>composition</term>.</li>
</ol>
</objectives>
<introduction>
<p>
In this section, we study compositions of transformations. As we will see, composition is a way of chaining transformations together. The composition of matrix transformations corresponds to a notion of <em>multiplying</em> two matrices together. We also discuss addition and scalar multiplication of transformations and of matrices.
</p>
</introduction>
<subsection>
<title>Composition of linear transformations</title>
<p>Composition means the same thing in linear algebra as it does in Calculus. Here is the definition.</p>
<definition>
<idx><h>Transformation</h><h>composition of</h></idx>
<idx><h>Linear transformation</h><h>composition of</h><see>Transformation</see></idx>
<idx><h>Matrix transformation</h><h>composition of</h><see>Transformation</see></idx>
<idx><h>Matrix transformation</h><h>composition of</h><see>Linear transformation</see></idx>
<statement>
<p>
Let <m>T\colon\R^n\to\R^m</m> and <m>U\colon\R^p\to\R^n</m> be transformations. Their <term>composition</term> is the transformation <m>T\circ U\colon\R^p\to\R^m</m> defined by
<me>(T\circ U)(x) = T(U(x)).</me>
</p>
</statement>
</definition>
<p>
Composing two transformations means chaining them together: <m>T\circ U</m> is the transformation that first applies <m>U</m>, then applies <m>T</m> (note the order of operations). More precisely, to evaluate <m>T\circ U</m> on an input vector <m>x</m>, first you evaluate <m>U(x)</m>, then you take this output vector of <m>U</m> and use it as an input vector of <m>T</m>: that is, <m>(T\circ U)(x) = T(U(x))</m>. Of course, this only makes sense when the outputs of <m>U</m> are valid inputs of <m>T</m>, that is, when the range of <m>U</m> is contained in the domain of <m>T</m>.
<latex-code>
\begin{tikzpicture}[every label/.append style={text=black, thin border}, scale=.65]
\draw[grid lines] (-3,-3) grid (3,3)
(0,-3.5) node[black] {$\R^p$}
(2,1) node[point, "$x$" left] (x) {};
\draw[grid lines,xshift=8cm] (-3,-3) grid (3,3)
(0,-3.5) node[black] {$\R^n$}
(-1,-2) node[point, "$U(x)$" above] (Ux) {};
\draw[grid lines,xshift=16cm] (-3,-3) grid (3,3)
(0,-3.5) node[black] {$\R^m$}
(-2,2) node[point, "$T\circ U(x)$" below right] (TUx) {};
\draw[|->, shorten=1.5pt, out=0, in=180]
(x) to node[below,midway] {$U$} (Ux);
\draw[|->, shorten=1.5pt, out=0, in=180]
(Ux) to node[below,midway] {$T$} (TUx);
\draw[shorten=1.5pt, out=0, in=180]
(x) to node[above=2pt,midway,thin border] {$T\circ U$} (TUx);
\end{tikzpicture}
</latex-code>
Here is a picture of the composition <m>T\circ U</m> as a <q>machine</q> that first runs <m>U</m>, then takes its output and feeds it into <m>T</m>; there is a similar picture in this <xref ref="matrix-trans-transformations"/>.
<latex-code>
<![CDATA[
\tikzset{
pics/gear/.style args={#1/#2}{
code = { % #1 = number of gears, #2 = tooth length
\filldraw[fill=black!30] (1cm-#2/2,0)
let \n{angle} = {360/#1} in
\foreach \gear [evaluate=\gear as \startangle using \gear*\n{angle}]
in {1,...,#1}
{
arc[radius=1cm-#2/2, start angle=\startangle-\n{angle},
delta angle=\n{angle}/4]
-- (\startangle-3*\n{angle}/4+\n{angle}/10:1cm+#2/2)
arc[radius=1cm+#2/2,
start angle=\startangle-3*\n{angle}/4+\n{angle}/10,
end angle =\startangle- \n{angle}/4-\n{angle}/10]
-- (\startangle- \n{angle}/4:1cm-#2/2)
arc[radius=1cm-#2/2, start angle=\startangle-\n{angle}/4,
delta angle=\n{angle}/4]
};
\draw (0,0) circle[radius=.4cm];
}
},
% machine is an input/output machine for illustrating functions
machine/.pic = {
\filldraw[rounded corners=.3mm, fill=steel!30] (1.5, -1)
-- (1.5, -.5) -- (1.5-.2, -.5) -- (1.5-.2, .4) -- (1.5, .4)
-- (1.5, 1) -- (-1.5, 1) -- (-1.5, -1) -- cycle;
\filldraw[fill=steel!30] (-1.5, -.2)
-- (-1.5-.5, -.5) -- (-1.5-.5, .5) -- (-1.5, .2);
\coordinate (-input) at (-1.5-.5, 0);
%\fill (-1.5-.4, -.05) rectangle (-1.5+.5, .05);
%\fill (-1.5+.3, -.15) -- (-1.5+.7, 0) -- (-1.5+.3, .15);
\filldraw[yshift=-.1cm, fill=black!30] (1.5-.2, -.3)
-- (1.5+.4, -.3)
arc[radius=.15, start angle=-90, end angle=90]
-- (1.5-.2,0);
\draw[yshift=-.1cm] (1.5+.4, -.15) circle[radius=.1];
%\fill (1.5-.7, -.05) rectangle (1.5+.2, .05);
%\fill (1.5+.0, -.15) -- (1.5+.4, 0) -- (1.5+.0, .15);
\coordinate (-output) at (1.5+.5, 0);
\pic[transform shape, scale=.4] at (-.4, -.4) {gear={15/.2cm}};
\pic[transform shape, scale=.4] at ( .4, -.4) {gear={15/.2cm}};
% Need to expand \tikzpictextoptions *first* so as not to confuse \pgfkeys
\expandafter\node\expandafter[\tikzpictextoptions] at (0, .5) {\tikzpictext};
},
machine2/.pic = {
\filldraw[rounded corners=.3mm, fill=steel!30] (3.5, -1)
-- (3.5, -.5) -- (3.5-.2, -.5) -- (3.5-.2, .4) -- (3.5, .4)
-- (3.5, 1) -- (-1.5, 1) -- (-1.5, -1) -- cycle;
\filldraw[fill=steel!30] (-1.5, -.2)
-- (-1.5-.3, -.4) -- (-1.5-.3, .4) -- (-1.5, .2);
\coordinate (-input) at (-1.5-.3, 0);
%\fill (-1.5-.4, -.05) rectangle (-1.5+.5, .05);
%\fill (-1.5+.3, -.15) -- (-1.5+.7, 0) -- (-1.5+.3, .15);
\filldraw[yshift=-.1cm, fill=black!30] (3.5-.2, -.3)
-- (3.5+.1, -.3)
arc[radius=.15, start angle=-90, end angle=90]
-- (3.5-.2,0);
\draw[yshift=-.1cm] (3.5+.1, -.15) circle[radius=.1];
%\fill (1.5-.7, -.05) rectangle (1.5+.2, .05);
%\fill (1.5+.0, -.15) -- (1.5+.4, 0) -- (1.5+.0, .15);
\coordinate (-output) at (3.5-.2, 0);
% Need to expand \tikzpictextoptions *first* so as not to confuse \pgfkeys
\expandafter\node\expandafter[\tikzpictextoptions] at (1, .75) {\tikzpictext};
},
}
\begin{tikzpicture}[thin border nodes]
\pic[scale=2, "$T\circ U$"] (machine) at (0, 0) {machine2};
\filldraw[dashed, rounded corners=3mm, thick, fill=white!96!black]
(-2.5, -1.2) rectangle (6, 1.2);
\pic[scale=.75, "$U$"] (machine1) at (-.5, 0) {machine};
\pic[scale=.75, "$T$"] (machine2) at (4, 0) {machine};
\draw[thick, shorten=1mm, arrows={<[scale=1.5]-}]
(machine-input) -- node[above=.7mm] {$\R^p$} ++(-1cm,0) node[left] {$x$};
\draw[thick, shorten=1mm, arrows={-[scale=1.5]>}]
(machine-output) -- node[above=.7mm] {$\R^m$} ++(1.1cm,0) node[right] {$T\circ U(x)$};
\draw[shorten=1mm, ->] (machine-input) -- (machine1-input);
\draw[shorten=1mm, ->] (machine1-output) --
node[above=1mm, font=\small] {$U(x)$}
node[below=1mm, font=\small] {$\R^n$} (machine2-input);
\draw[shorten=1mm, ->] (machine2-output) -- (machine-output);
\end{tikzpicture}
]]>
</latex-code>
</p>
<bluebox>
<title>Domain and codomain of a composition</title>
<p>
<ul>
<li>In order for <m>T\circ U</m> to be defined, the codomain of <m>U</m> must equal the domain of <m>T</m>.</li>
<li>The domain of <m>T\circ U</m> is the domain of <m>U</m>.</li>
<li>The codomain of <m>T\circ U</m> is the codomain of <m>T</m>.</li>
</ul>
</p>
</bluebox>
<example>
<title>Functions of one variable</title>
<p>
Define <m>f\colon\R\to\R</m> by <m>f(x) = x^2</m> and <m>g\colon\R\to\R</m> by <m>g(x) = x^3</m>. The composition <m>f\circ g\colon\R\to\R</m> is the transformation defined by the rule
<me>f\circ g(x) = f(g(x)) = f(x^3) = (x^3)^2 = x^6.</me>
For instance, <m>f\circ g(-2) = f(-8) = 64.</m>
</p>
</example>
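The one-variable rule above is easy to check numerically. Here is a minimal Python sketch (the helper name `compose` is ours, not part of the text):

```python
def compose(T, U):
    """Return the composition T ∘ U: first apply U, then T."""
    return lambda x: T(U(x))

f = lambda x: x ** 2   # f(x) = x^2
g = lambda x: x ** 3   # g(x) = x^3

fg = compose(f, g)     # (f ∘ g)(x) = (x^3)^2 = x^6
print(fg(-2))          # f(g(-2)) = f(-8) = 64
```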
<example hide-type="true">
<title>Interactive: A composition of matrix transformations</title>
<p>
Define <m>T\colon\R^3\to\R^2</m> and <m>U\colon\R^2\to\R^3</m> by
<me>T(x) = \mat{1 1 0; 0 1 1}x \sptxt{and}
U(x) = \mat{1 0; 0 1; 1 0}x.</me>
Their composition is a transformation <m>T\circ U\colon\R^2\to\R^2</m>; it turns out to be the matrix transformation associated to the matrix <m>\bigl(\begin{smallmatrix}1\amp1\\1\amp1\end{smallmatrix}\bigr)</m>.
</p>
<figure>
<caption>A composition of two matrix transformations, i.e., a transformation performed in two steps. On the left is the domain of <m>U</m>/the domain of <m>T\circ U</m>; in the middle is the codomain of <m>U</m>/the domain of <m>T</m>, and on the right is the codomain of <m>T</m>/the codomain of <m>T\circ U</m>. The vector <m>x</m> is the input of <m>U</m> and of <m>T\circ U</m>; the vector in the middle is the output of <m>U</m>/the input of <m>T</m>, and the vector on the right is the output of <m>T</m>/of <m>T\circ U</m>. Click and drag <m>x</m>.</caption>
<mathbox source="demos/compose3d.html?mat2=1,1,0:0,1,1&mat1=1,0:0,1:1,0&rangeT=off&rangeU=off&rangeTU=off&range=5&closed=true" height="500px"/>
</figure>
</example>
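The claim in this example can be verified with a short NumPy computation (an illustrative sketch, not part of the text): applying <m>B</m> and then <m>A</m> to a vector gives the same result as applying the single matrix <m>AB</m>, which here works out to the matrix of all ones.

```python
import numpy as np

A = np.array([[1, 1, 0],
              [0, 1, 1]])   # standard matrix of T: R^3 -> R^2
B = np.array([[1, 0],
              [0, 1],
              [1, 0]])      # standard matrix of U: R^2 -> R^3

x = np.array([2, -1])
# Evaluating the composition means applying U first, then T:
assert np.array_equal(A @ (B @ x), (A @ B) @ x)
print(A @ B)   # [[1 1]
               #  [1 1]]
```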
<example hide-type="true">
<title>Interactive: A transformation defined in steps</title>
<p>
Let <m>S\colon\R^3\to\R^3</m> be the linear transformation that first reflects over the <m>xy</m>-plane and then projects onto the <m>yz</m>-plane, as in this <xref ref="linear-trans-in-steps"/>. The transformation <m>S</m> is the composition <m>T\circ U</m>, where <m>U\colon\R^3\to\R^3</m> is the transformation that reflects over the <m>xy</m>-plane, and <m>T\colon\R^3\to\R^3</m> is the transformation that projects onto the <m>yz</m>-plane.
</p>
<figure>
<caption>Illustration of a transformation defined in steps. On the left is the domain of <m>U</m>/the domain of <m>S</m>; in the middle is the codomain of <m>U</m>/the domain of <m>T</m>, and on the right is the codomain of <m>T</m>/the codomain of <m>S</m>. The vector <m>u</m> is the input of <m>U</m> and of <m>S</m>; the vector in the middle is the output of <m>U</m>/the input of <m>T</m>, and the vector on the right is the output of <m>T</m>/of <m>S</m>. Click and drag <m>u</m>.</caption>
<mathbox source="demos/steps.html?out=S" height="500px"/>
</figure>
</example>
<example hide-type="true">
<title>Interactive: A transformation defined in steps</title>
<p>
Let <m>S\colon\R^3\to\R^3</m> be the linear transformation that first projects onto the <m>xy</m>-plane, and then projects onto the <m>xz</m>-plane. The transformation <m>S</m> is the composition <m>T\circ U</m>, where <m>U\colon\R^3\to\R^3</m> is the transformation that projects onto the <m>xy</m>-plane, and <m>T\colon\R^3\to\R^3</m> is the transformation that projects onto the <m>xz</m>-plane.
</p>
<figure>
<caption>Illustration of a transformation defined in steps. Note that projecting onto the <m>xy</m>-plane, followed by projecting onto the <m>xz</m>-plane, is the projection onto the <m>x</m>-axis.</caption>
<mathbox source="demos/compose3d.html?mat1=1,0,0:0,1,0:0,0,0&mat2=1,0,0:0,0,0:0,0,1&rangeT=off&rangeU=off&rangeTU=off&range=5&closed=true" height="500px"/>
</figure>
</example>
<p>
Recall from this <xref ref="matrix-trans-identity"/> that the <em>identity transformation</em> is the transformation <m>\operatorname{Id}_{\R^n}\colon\R^n\to\R^n</m> defined by <m>\operatorname{Id}_{\R^n}(x) = x</m> for every vector <m>x</m>.
</p>
<note hide-type="true" xml:id="matrix-mult-compose-trans">
<title>Properties of composition</title>
<idx><h>Identity transformation</h><h>and composition</h></idx>
<p>
Let <m>S,T,U</m> be transformations and let <m>c</m> be a scalar. Suppose that <m>T\colon\R^n\to\R^m</m>, and that in each of the following identities, the domains and the codomains are compatible when necessary for the composition to be defined. The following properties are easily verified:
<md>
<mrow>
S\circ(T+U) \amp= S\circ T+S\circ U \amp
(S + T)\circ U \amp= S\circ U + T\circ U \amp
</mrow>
<mrow>
c(T\circ U) \amp= (cT)\circ U \amp
c(T\circ U) \amp= T\circ(cU) \rlap{\;\;\text{ if $T$ is linear}}
</mrow>
<mrow>
T\circ\operatorname{Id}_{\R^n} \amp= T \amp
\operatorname{Id}_{\R^m}\circ T \amp= T
</mrow>
<mrow>
\amp \amp S\circ(T\circ U) \amp= (S\circ T)\circ U \amp
</mrow>
</md>
</p>
</note>
<p>
The final property is called <term>associativity</term>. Unwrapping both sides, it says:
<me>S\circ(T\circ U)(x) = S(T\circ U(x)) = S(T(U(x))) = S\circ T(U(x)) = (S\circ T)\circ U(x).</me>
In other words, both <m>S\circ (T\circ U)</m> and <m>(S\circ T)\circ U</m> are the transformation defined by first applying <m>U</m>, then <m>T</m>, then <m>S</m>.
</p>
<bluebox>
<idx><h>Transformation</h><h>composition of</h><h>noncommutativity of</h></idx>
<p>
Composition of transformations is <em>not</em> commutative in general. That is, in general, <m>T\circ U\neq U\circ T</m>, even when both compositions are defined.
</p>
</bluebox>
<example>
<title>Functions of one variable</title>
<p>
Define <m>f\colon\R\to\R</m> by <m>f(x) = x^2</m> and <m>g\colon\R\to\R</m> by <m>g(x) = e^x</m>. The composition <m>f\circ g\colon\R\to\R</m> is the transformation defined by the rule
<me>f\circ g(x) = f(g(x)) = f(e^x) = (e^x)^2 = e^{2x}.</me>
The composition <m>g\circ f\colon\R\to\R</m> is the transformation defined by the rule
<me>g\circ f(x) = g(f(x)) = g(x^2) = e^{x^2}.</me>
Note that <m>e^{x^2}\neq e^{2x}</m> in general; for instance, if <m>x=1</m> then <m>e^{x^2} = e</m> and <m>e^{2x} = e^2</m>. Thus <m>f \circ g</m> is not equal to <m>g \circ f</m>, and we can already see with functions of one variable that composition of functions is not commutative.
</p>
</example>
<example xml:id="matrix-mult-compose-noncom">
<title>Non-commutative composition of transformations</title>
<p>
Define matrix transformations <m>T,U\colon\R^2\to\R^2</m> by
<me>T(x) = \mat{1 1; 0 1}x \sptxt{and}
U(x) = \mat{1 0; 1 1}x.</me>
Geometrically, <m>T</m> is a shear in the <m>x</m>-direction, and <m>U</m> is a shear in the <m>y</m>-direction. We evaluate
<me>T\circ U\vec{1 0} = T\vec{1 1} = \vec{2 1}</me>
and
<me>U\circ T\vec{1 0} = U\vec{1 0} = \vec{1 1}.</me>
Since <m>T\circ U</m> and <m>U\circ T</m> have different outputs for the input vector <m>1\choose 0</m>, they are different transformations. (See this <xref ref="matrix-mult-compose-noncom2"/>.)
</p>
<figure>
<caption>Illustration of the composition <m>T\circ U</m>.</caption>
<mathbox source="demos/compose2d.html?mat1=1,0,1,1&mat2=1,1,0,1&closed&vec=1,0&show1=on" height="500px"/>
</figure>
<figure>
<caption>Illustration of the composition <m>U\circ T</m>.</caption>
<mathbox source="demos/compose2d.html?mat2=1,0,1,1&mat1=1,1,0,1&closed&vec=1,0&names=U,T&show1=on" height="500px"/>
</figure>
</example>
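The two evaluations in this example can be reproduced with NumPy (a sketch for checking the arithmetic; the variable names are ours):

```python
import numpy as np

T = np.array([[1, 1], [0, 1]])  # shear in the x-direction
U = np.array([[1, 0], [1, 1]])  # shear in the y-direction
e1 = np.array([1, 0])

print(T @ (U @ e1))  # T(U(e1)) = [2, 1]
print(U @ (T @ e1))  # U(T(e1)) = [1, 1]
```

Since the two outputs differ, <m>T\circ U\neq U\circ T</m>.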
</subsection>
<subsection>
<title>Matrix multiplication</title>
<p>In this subsection, we introduce a seemingly unrelated operation on matrices, namely, matrix multiplication. As we will see in the next subsection, matrix multiplication exactly corresponds to the composition of the corresponding linear transformations. First we need some terminology.</p>
<definition type-name="Notation">
<notation><usage>a_{ij}</usage><description>The <m>i,j</m> entry of a matrix</description></notation>
<statement>
<p>
Let <m>A</m> be an <m>m\times n</m> matrix. We will generally write <m>a_{ij}</m> for the entry in the <m>i</m>th row and the <m>j</m>th column. It is called the <term><m>i,j</m> entry</term> of the matrix.
<latex-code>
<![CDATA[
\def\spvdots{\vphantom{\vbox{\hbox{(}\kern0pt}}\smash{\vdots}}
\begin{tikzpicture}[scale=1.2]
\matrix[math matrix] (aij)
{
a_{11} \& \cdots \& a_{1j} \& \cdots \& a_{1n} \\
\spvdots \& \& \spvdots \& \& \spvdots \\
a_{i1} \& \cdots \& a_{ij} \& \cdots \& a_{in} \\
\spvdots \& \& \spvdots \& \& \spvdots \\
a_{m1} \& \cdots \& a_{mj} \& \cdots \& a_{mn} \& \\
};
\node[fit=(aij-1-3) (aij-5-3), inner sep=2pt,
draw=blue!50, thick, rounded corners,
label={[text=blue!50]below:\small$j$th column}] {};
\node[fit=(aij-3-1) (aij-3-5), inner sep=2pt,
draw=green!70!black, thick, rounded corners] (row) {};
\node[text=green!70!black, rotate=90, anchor=north, yshift=.5mm, font=\small]
at (row.east) {$i$th row};
\end{tikzpicture}
]]>
</latex-code>
</p>
</statement>
</definition>
<definition xml:id="matrix-mult-defn-of">
<title>Matrix multiplication</title>
<idx><h>Matrix multiplication</h><h>definition of</h></idx>
<idx><h>Matrix</h><h>multiplication</h><see>Matrix multiplication</see></idx>
<statement>
<p>
Let <m>A</m> be an <m>m\times n</m> matrix and let <m>B</m> be an <m>n\times p</m> matrix. Denote the columns of <m>B</m> by <m>v_1,v_2,\ldots,v_p</m>:
<me>B = \mat{| | ,, |; v_1 v_2 \cdots, v_p; | | ,, |}.</me>
The <term>product</term> <m>AB</m> is the <m>m\times p</m> matrix with columns <m>Av_1,Av_2,\ldots,Av_p</m>:
<me>AB = \mat{| | ,, |; Av_1 Av_2 \cdots, Av_p; | | ,, |}.</me>
</p>
</statement>
</definition>
<p>
In other words, matrix multiplication is defined column-by-column, or <q>distributes over the columns of <m>B</m>.</q>
</p>
<example xml:id="matrix-mult-eg-mult1">
<p>
<me>
\begin{split}
\mat{1 1 0; 0 1 1}\mat{1 0; 0 1; 1 0}
\amp= \mat{\mat{1 1 0; 0 1 1}\vec{1,0,1} \mat{1 1 0; 0 1 1}\vec{0,1,0}} \\
\amp= \mat{\vec{1,1} \vec{1,1}} = \mat{1 1 ; 1 1}
\end{split}
</me>
</p>
</example>
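The column-by-column definition translates directly into code. The following NumPy sketch (the function name is ours) builds <m>AB</m> exactly as the definition says: column <m>j</m> of <m>AB</m> is <m>A</m> times column <m>j</m> of <m>B</m>.

```python
import numpy as np

def matmul_by_columns(A, B):
    """Build AB one column at a time: column j of AB is A @ (column j of B)."""
    return np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

A = np.array([[1, 1, 0],
              [0, 1, 1]])
B = np.array([[1, 0],
              [0, 1],
              [1, 0]])
print(matmul_by_columns(A, B))  # [[1 1]
                                #  [1 1]]
```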
<p>
In order for the vectors <m>Av_1,Av_2,\ldots,Av_p</m> to be defined, the number of rows of <m>B</m> has to equal the number of columns of <m>A</m>.
</p>
<bluebox>
<title>The sizes of the matrices in the matrix product</title>
<idx><h>Matrix multiplication</h><h>size of matrices</h></idx>
<p>
<ul>
<li>
In order for <m>AB</m> to be defined, the number of rows of <m>B</m> has to equal the number of columns of <m>A</m>.
</li>
<li>
<latex-code mode="bare">
\def\r{\textcolor{red}}\def\b{\textcolor{blue}}
</latex-code>
The product of an <m>\b m\times\r n</m> matrix and an <m>\r n\times\b p</m> matrix is an <m>\b m\times\b p</m> matrix.</li>
</ul>
</p>
</bluebox>
<p>
<idx><h>Matrix multiplication</h><h>and the matrix-vector product</h></idx>
<idx><h>Matrix-vector product</h><h>and matrix multiplication</h></idx>
If <m>B</m> has only one column, then <m>AB</m> also has one column. A matrix with one column is the same as a vector, so the definition of the matrix product generalizes the definition of the matrix-vector product from this <xref ref="matrixeq-defn-Ax1"/>.
</p>
<p>
<idx><h>Matrix multiplication</h><h>powers</h></idx>
<idx><h>Power of a matrix</h><see>Matrix multiplication</see></idx>
If <m>A</m> is a square matrix, then we can multiply it by itself; we define its <term>powers</term> to be
<me>A^2 = AA \qquad A^3 = AAA \qquad \text{etc.}</me>
</p>
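Powers of a square matrix are computed by repeated multiplication; in NumPy, for example, one can multiply directly or use `np.linalg.matrix_power` (a quick illustrative check):

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
print(A @ A)                         # A^2 = [[1 2], [0 1]]
print(np.linalg.matrix_power(A, 3))  # A^3 = A @ A @ A = [[1 3], [0 1]]
```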
<paragraphs>
<title>The row-column rule for matrix multiplication</title>
<p>
Recall from this <xref ref="matrixeq-row-column-prod"/> that the product of a row vector and a column vector is the scalar
<me>\mat{a_1 a_2 \cdots, a_n} \vec{x_1 x_2 \vdots, x_n}
= a_1x_1 + a_2x_2 + \cdots + a_nx_n.</me>
The following procedure for finding the matrix product is much better adapted to computations by hand; the previous <xref ref="matrix-mult-defn-of"/> is more suitable for proving theorems, such as this <xref ref="matrix-mult-comp-is-prod"/> below.
</p>
<bluebox xml:id="matrix-mult-row-column">
<title>Recipe: The row-column rule for matrix multiplication</title>
<idx><h>Matrix multiplication</h><h>row-column rule</h></idx>
<p>
Let <m>A</m> be an <m>m\times n</m> matrix, let <m>B</m> be an <m>n\times p</m> matrix, and let <m>C = AB</m>. Then the <m>ij</m> entry of <m>C</m> is the <m>i</m>th row of <m>A</m> times the <m>j</m>th column of <m>B</m>:
<me>c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}.</me>
Here is a diagram:
<me>
<![CDATA[
\def\spvdots{\vphantom{\vbox{\hbox{(}\kern0pt}}\smash{\vdots}}
\begin{tikzpicture}[baseline, scale=.95]
\matrix[math matrix] (aij)
{
a_{11} \& \cdots \& a_{1k} \& \cdots \& a_{1n} \\
\spvdots \& \& \spvdots \& \& \spvdots \\
a_{i1} \& \cdots \& a_{ik} \& \cdots \& a_{in} \\
\spvdots \& \& \spvdots \& \& \spvdots \\
a_{m1} \& \cdots \& a_{mk} \& \cdots \& a_{mn} \\
};
\node[fit=(aij-3-1) (aij-3-5), inner sep=2pt,
draw=green!70!black, thick, rounded corners] (row) {};
\node[text=green!70!black, rotate=90, anchor=north, yshift=2mm, font=\small]
at (row.east) {$i$th row};
\end{tikzpicture}\hskip-1mm
\begin{tikzpicture}[baseline, scale=.95]
\matrix[math matrix] (bij)
{
b_{11} \& \cdots \& b_{1j} \& \cdots \& b_{1p} \\
\spvdots \& \& \spvdots \& \& \spvdots \\
b_{k1} \& \cdots \& b_{kj} \& \cdots \& b_{kp} \\
\spvdots \& \& \spvdots \& \& \spvdots \\
b_{n1} \& \cdots \& b_{nj} \& \cdots \& b_{np} \\
};
\node[fit=(bij-1-3) (bij-5-3), inner sep=2pt,
draw=blue!50, thick, rounded corners,
label={[text=blue!50]below:\small$j$th column}] {};
\end{tikzpicture}
\hskip-4pt=\hskip-4pt
\begin{tikzpicture}[baseline, scale=.95]
\matrix[math matrix,
label=below:$\textcolor{green!70!black}i\textcolor{blue!50}j$ entry]
(cij)
{
c_{11} \& \cdots \& c_{1j} \& \cdots \& c_{1p} \\
\spvdots \& \& \spvdots \& \& \spvdots \\
c_{i1} \& \cdots \& c_{ij} \& \cdots \& c_{ip} \\
\spvdots \& \& \spvdots \& \& \spvdots \\
c_{m1} \& \cdots \& c_{mj} \& \cdots \& c_{mp} \\
};
\draw[thick, green!70!black]
(cij-3-3.center) circle[radius=1.35ex];
\draw[thick, blue!50]
(cij-3-3.center) circle[radius=1.35ex+\pgflinewidth];
\end{tikzpicture}
]]>
</me>
</p>
</bluebox>
<!-- If we don't end <paragraphs> here then the proof shows up twice... -->
</paragraphs>
<proof>
<p>
The <xref ref="matrixeq-row-column">row-column rule for matrix-vector multiplication</xref> says that if <m>A</m> has rows <m>r_1,r_2,\ldots,r_m</m> and <m>x</m> is a vector, then
<me>Ax = \mat[c]{ \matrow{r_1};
\matrow{r_2};
\vdots ;
\matrow{r_m}}
x
= \vec{r_1x r_2x \vdots, r_mx}.
</me>
The <xref ref="matrix-mult-defn-of"/> of matrix multiplication is
<me>A\mat{| | ,, |; c_1 c_2 \cdots, c_p; | | ,, |} =
\mat{| | ,, |; Ac_1 Ac_2 \cdots, Ac_p; | | ,, |}.</me>
It follows that
<me>
\mat[c]{ \matrow{r_1};
\matrow{r_2};
\vdots ;
\matrow{r_m}}
\mat{| | ,, |; c_1 c_2 \cdots, c_p; | | ,, |}
= \mat{ r_1c_1 r_1c_2 \cdots, r_1c_p;
r_2c_1 r_2c_2 \cdots, r_2c_p;
\vdots, \vdots, , \vdots;
r_mc_1 r_mc_2 \cdots, r_mc_p}.
</me>
</p>
</proof>
<example>
<p>
The row-column rule allows us to compute the product matrix one entry at a time:
<me>
\def\g{\textcolor{green!70!black}}
\def\b{\textcolor{blue!50}}
\begin{aligned}
\mat{\g1 \g2 \g3; 4 5 6}
\mat{\b1 -3; \b2 -2; \b3 -1}
\amp= \mat{\g1\cdot\b1+\g2\cdot\b2+\g3\cdot\b3 \fbox{\phantom 8};
\fbox{\phantom 8} \fbox{\phantom 8}}
= \mat{\textcolor{purple}{14} \fbox{\phantom 8}; \fbox{\phantom 8} \fbox{\phantom 8}} \\
\mat{1 2 3; \g4 \g5 \g6}
\mat{\b1 -3; \b2 -2; \b3 -1}
\amp= \mat{\fbox{\phantom 8} \fbox{\phantom 8};
\g4\cdot\b1+\g5\cdot\b2+\g6\cdot\b3 \fbox{\phantom 8}}
= \mat{14 \fbox{\phantom 8}; \textcolor{purple}{32} \fbox{\phantom 8}}
\end{aligned}
</me>
You should try to fill in the other two boxes!
</p>
</example>
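The row-column rule is also straightforward to implement. The sketch below (the function name is ours) computes each entry <m>c_{ij}</m> as the <m>i</m>th row of <m>A</m> times the <m>j</m>th column of <m>B</m>, and can be used to check the boxed entries above.

```python
import numpy as np

def row_column_product(A, B):
    """Compute C = AB entry by entry: c_ij = (row i of A) . (column j of B)."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "number of columns of A must equal number of rows of B"
    C = np.zeros((m, p), dtype=A.dtype)
    for i in range(m):
        for j in range(p):
            C[i, j] = A[i, :] @ B[:, j]
    return C

A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[1, -3],
              [2, -2],
              [3, -1]])
print(row_column_product(A, B))  # top-left entry 14 and bottom-left entry 32,
                                 # as computed in the example above
```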
<p>
Although matrix multiplication satisfies many of the properties one would expect (see the end of the section), one must be careful when doing matrix arithmetic, as there are several properties that are not satisfied in general.
</p>
<bluebox>
<title>Matrix multiplication caveats</title>
<idx><h>Matrix multiplication</h><h>caveats</h></idx>
<p>
<ul>
<li>
Matrix multiplication is not commutative: <m>AB</m> is not usually equal to <m>BA</m>, even when both products are defined and have the same size. See this <xref ref="matrix-mult-compose-noncom2"/>.
</li>
<li>
Matrix multiplication does not satisfy the cancellation law: <m>AB=AC</m> does not imply <m>B=C</m>, even when <m>A\neq 0</m>. For example,
<me>\mat{1 0; 0 0}\mat{1 2; 3 4} = \mat{1 2; 0 0}
= \mat{1 0; 0 0}\mat{1 2; 5 6}.</me>
</li>
<li>
It is possible for <m>AB=0</m>, even when <m>A\neq 0</m> and <m>B\neq 0</m>. For example,
<me>\mat{1 0; 1 0}\mat{0 0; 1 1} = \mat{0 0; 0 0}.</me>
</li>
</ul>
</p>
</bluebox>
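Both caveats are easy to witness numerically; the following NumPy sketch reproduces the two counterexamples above:

```python
import numpy as np

A = np.array([[1, 0], [0, 0]])
B = np.array([[1, 2], [3, 4]])
C = np.array([[1, 2], [5, 6]])
# AB = AC even though B != C: the cancellation law fails.
print(A @ B)  # [[1 2], [0 0]]
print(A @ C)  # [[1 2], [0 0]]

D = np.array([[1, 0], [1, 0]])
E = np.array([[0, 0], [1, 1]])
print(D @ E)  # the zero matrix, although D != 0 and E != 0
```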
<p>
While matrix multiplication is not commutative in general, there are examples of matrices <m>A</m> and <m>B</m> with <m>AB=BA</m>. For example, this always works when <m>A</m> is the zero matrix, or when <m>A=B</m>. The reader is encouraged to find other examples.
</p>
<example xml:id="matrix-mult-compose-noncom2">
<title>Non-commutative multiplication of matrices</title>
<idx><h>Matrix multiplication</h><h>noncommutativity of</h></idx>
<p>
Consider the matrices
<me>A = \mat{1 1; 0 1} \sptxt{and}
B = \mat{1 0; 1 1},</me>
as in this <xref ref="matrix-mult-compose-noncom"/>.
The matrix <m>AB</m> is
<me>\mat{1 1; 0 1}\mat{1 0; 1 1} = \mat{2 1; 1 1},</me>
whereas the matrix <m>BA</m> is
<me>\mat{1 0; 1 1}\mat{1 1; 0 1} = \mat{1 1; 1 2}.</me>
In particular, we have
<me>AB \neq BA.</me>
And so matrix multiplication is not always commutative. It is not a coincidence that this example agrees with the previous <xref ref="matrix-mult-compose-noncom"/>; we are about to see that multiplication of matrices corresponds to composition of transformations.
</p>
</example>
<example hide-type="true" xml:id="matrix-mult-order-ops">
<title>Order of Operations</title>
<idx><h>Matrix multiplication</h><h>order of operations</h></idx>
<idx><h>Transformation</h><h>composition of</h><h>order of operations</h></idx>
<p>
Let <m>T\colon\R^n\to\R^m</m> and <m>U\colon\R^p\to\R^n</m> be linear transformations, and let <m>A</m> and <m>B</m> be their standard matrices, respectively. Recall that <m>T\circ U(x)</m> is the vector obtained by first applying <m>U</m> to <m>x</m>, and then <m>T</m>.
</p>
<p>
On the matrix side, the standard matrix of <m>T\circ U</m> is the product <m>AB</m>, so <m>T\circ U(x) = (AB)x</m>. By associativity of matrix multiplication, we have <m>(AB)x = A(Bx)</m>, so the product <m>(AB)x</m> can be computed by first multiplying <m>x</m> by <m>B</m>, <em>then</em> multiplying the product by <m>A</m>.
</p>
<p>
Therefore, matrix multiplication happens in the same order as composition of transformations. In other words, <em>both matrices and transformations are written in the order opposite from the order in which they act.</em> But matrix multiplication and composition of transformations are written in the same order as each other: the matrix for <m>T\circ U</m> is <m>AB</m>.
</p>
</example>
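The order-of-operations identity <m>(AB)x = A(Bx)</m> can be confirmed with a quick NumPy check (an illustrative sketch using the matrices from the earlier composition example):

```python
import numpy as np

A = np.array([[1, 1, 0],
              [0, 1, 1]])          # standard matrix of T
B = np.array([[1, 0],
              [0, 1],
              [1, 0]])             # standard matrix of U
x = np.array([3, -2])

# (AB)x can be computed by first applying B, then A:
assert np.array_equal((A @ B) @ x, A @ (B @ x))
print(A @ (B @ x))
```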
</subsection>
<subsection>
<title>Composition and Matrix Multiplication</title>
<p>
The point of this subsection is to show that matrix multiplication corresponds to composition of transformations, that is, the standard matrix for <m>T \circ U</m> is the product of the standard matrices for <m>T</m> and for <m>U</m>. It should be hard to believe that our complicated formula for matrix multiplication actually means something intuitive such as <q>chaining two transformations together</q>!
</p>
<theorem xml:id="matrix-mult-comp-is-prod">
<idx><h>Matrix multiplication</h><h>and composition of transformations</h></idx>
<idx><h>Linear transformation</h><h>composition of</h><h>linearity of</h></idx>
<idx><h>Linear transformation</h><h>composition of</h><h>and matrix multiplication</h></idx>
<statement>
<p>Let <m>T\colon\R^n\to\R^m</m> and <m>U\colon\R^p\to\R^n</m> be linear transformations, and let <m>A</m> and <m>B</m> be their standard matrices, respectively, so <m>A</m> is an <m>m\times n</m> matrix and <m>B</m> is an <m>n\times p</m> matrix. Then <m>T\circ U\colon\R^p\to\R^m</m> is a linear transformation, and its standard matrix is the product <m>AB</m>.</p>
</statement>
<proof>
<p>
First we verify that <m>T\circ U</m> is linear. Let <m>u,v</m> be vectors in <m>\R^p</m>. Then
<me>
\begin{split}
T\circ U(u+v) \amp= T(U(u+v)) = T(U(u)+U(v)) \\
\amp= T(U(u))+T(U(v)) = T\circ U(u) + T\circ U(v).
\end{split}
</me>
If <m>c</m> is a scalar, then
<me>T\circ U(cv) = T(U(cv)) = T(cU(v)) = cT(U(v)) = cT\circ U(v).</me>
Since <m>T\circ U</m> satisfies the two <xref ref="linear-trans-defn">defining properties</xref>, it is a linear transformation.
</p>
<p>
Now that we know that <m>T\circ U</m> is linear, it makes sense to compute its standard matrix. Let <m>C</m> be the standard matrix of <m>T\circ U</m>, so <m>T(x) = Ax,</m> <m>U(x) = Bx</m>, and <m>T\circ U(x) = Cx</m>. By this <xref ref="matrix-of-transformation"/>, the first column of <m>C</m> is <m>Ce_1</m>, and the first column of <m>B</m> is <m>Be_1</m>. We have
<me>T\circ U(e_1) = T(U(e_1)) = T(Be_1) = A(Be_1).</me>
By definition, the first column of the product <m>AB</m> is the product of <m>A</m> with the first column of <m>B</m>, which is <m>Be_1</m>, so
<me>Ce_1 = T\circ U(e_1) = A(Be_1) = (AB)e_1.</me>
It follows that <m>C</m> has the same first column as <m>AB</m>. The same argument applied to the <m>i</m>th standard coordinate vector <m>e_i</m> shows that <m>C</m> and <m>AB</m> have the same <m>i</m>th column; since they have the same columns, they are the same matrix.
</p>
</proof>
</theorem>
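The theorem lends itself to a numerical spot-check. The sketch below (not part of the text; it uses NumPy and hypothetical sizes) recovers the standard matrix of <m>T\circ U</m> column by column from the images of the standard coordinate vectors, exactly as in the proof, and compares it with the product <m>AB</m>.

```python
import numpy as np

# Hypothetical example: T: R^3 -> R^2 with matrix A, U: R^4 -> R^3 with matrix B.
rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(2, 3))
B = rng.integers(-3, 4, size=(3, 4))

T = lambda x: A @ x
U = lambda x: B @ x

# The i-th column of the standard matrix of T∘U is (T∘U)(e_i).
n = B.shape[1]
C = np.column_stack([T(U(np.eye(n, dtype=int)[:, i])) for i in range(n)])

# The standard matrix of the composition is the product AB.
assert np.array_equal(C, A @ B)
```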
<p>
The theorem justifies our choice of definition of the matrix product. This is the one and only reason that matrix products are defined in this way. To rephrase:
</p>
<bluebox>
<title>Products and compositions</title>
<p>
The matrix of the composition of two linear transformations is the product of the matrices of the transformations.
</p>
</bluebox>
<example>
<title>Composition of rotations</title>
<idx><h>Rotation</h><h>composition of</h></idx>
<p>
In this <xref ref="linear-trans-rotation-matrix"/>, we showed that the standard matrix for the counterclockwise rotation of the plane by an angle of <m>\theta</m> is
<me>A = \mat{\cos\theta, -\sin\theta; \sin\theta, \cos\theta}.</me>
Let <m>T\colon\R^2\to\R^2</m> be counterclockwise rotation by <m>45^\circ</m>, and let <m>U\colon\R^2\to\R^2</m> be counterclockwise rotation by <m>90^\circ</m>. The matrices <m>A</m> and <m>B</m> for <m>T</m> and <m>U</m> are, respectively,
<md>
<mrow>
A \amp= \mat{\cos(45^\circ) -\sin(45^\circ); \sin(45^\circ) \cos(45^\circ)}
= \frac 1{\sqrt 2}\mat{1 -1; 1 1}
</mrow>
<mrow>
B \amp= \mat{\cos(90^\circ) -\sin(90^\circ); \sin(90^\circ) \cos(90^\circ)}
= \mat{0 -1; 1 0}.
</mrow>
</md>
Here we used the trigonometric identities
<md>
<mrow>
\cos(45^\circ) \amp= \frac 1{\sqrt2} \amp
\sin(45^\circ) \amp= \frac 1{\sqrt2}
</mrow>
<mrow>
\cos(90^\circ) \amp= 0 \amp
\sin(90^\circ) \amp= 1.
</mrow>
</md>
The standard matrix of the composition <m>T\circ U</m> is
<me>AB = \frac 1{\sqrt 2}\mat{1 -1; 1 1}\mat{0 -1; 1 0}
= \frac 1{\sqrt 2}\mat{-1 -1; 1 -1}.
</me>
This is consistent with the fact that <m>T\circ U</m> is counterclockwise rotation by <m>90^\circ + 45^\circ = 135^\circ</m>: we have
<me>\mat{\cos(135^\circ) -\sin(135^\circ); \sin(135^\circ) \cos(135^\circ)}
= \frac 1{\sqrt 2}\mat{-1 -1; 1 -1}</me>
because <m>\cos(135^\circ) = -1/\sqrt2</m> and <m>\sin(135^\circ) = 1/\sqrt2</m>.
</p>
</example>
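The computation in this example can be replayed numerically. The following sketch (a NumPy aside, not part of the text) builds the rotation matrix from the formula above and checks that rotating by <m>90^\circ</m> and then by <m>45^\circ</m> is the same as rotating by <m>135^\circ</m>.

```python
import numpy as np

def rot(deg):
    """Standard matrix of counterclockwise rotation by the given angle in degrees."""
    t = np.radians(deg)
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

AB = rot(45) @ rot(90)   # first rotate by 90°, then by 45°

expected = (1 / np.sqrt(2)) * np.array([[-1, -1],
                                        [ 1, -1]])
assert np.allclose(AB, expected)     # matches the matrix computed in the example
assert np.allclose(AB, rot(135))     # 45° after 90° is rotation by 135°
```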
<remark type-name="Challenge">
<p>
Derive the trigonometric identities
<me>\sin(\alpha\pm\beta) = \sin(\alpha)\cos(\beta) \pm \cos(\alpha)\sin(\beta)</me>
and
<me>\cos(\alpha\pm\beta) = \cos(\alpha)\cos(\beta) \mp \sin(\alpha)\sin(\beta)</me>
using the above <xref ref="matrix-mult-comp-is-prod"/> as applied to rotation transformations, as in the previous example.
</p>
</remark>
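Before attempting the challenge on paper, one can spot-check the claimed identities numerically: by the theorem, the entries of <m>\mat{\cos\alpha, -\sin\alpha; \sin\alpha, \cos\alpha}\mat{\cos\beta, -\sin\beta; \sin\beta, \cos\beta}</m> must be the entries of the rotation matrix for <m>\alpha+\beta</m>. A small sketch (arbitrary test angles, NumPy, not part of the text):

```python
import numpy as np

def rot(t):
    """Standard matrix of counterclockwise rotation by t radians."""
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

a, b = 0.7, 2.1          # arbitrary angles in radians
R = rot(a) @ rot(b)      # by the theorem, this is rotation by a + b

# Reading off entries gives the angle-sum identities:
assert np.isclose(R[1, 0], np.sin(a + b))  # sin(a+b) = sin a cos b + cos a sin b
assert np.isclose(R[0, 0], np.cos(a + b))  # cos(a+b) = cos a cos b - sin a sin b
```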
<example hide-type="true">
<title>Interactive: A composition of matrix transformations</title>
<p>
Define <m>T\colon\R^3\to\R^2</m> and <m>U\colon\R^2\to\R^3</m> by
<me>T(x) = \mat{1 1 0; 0 1 1}x \sptxt{and}
U(x) = \mat{1 0; 0 1; 1 0}x.</me>
Their composition is a linear transformation <m>T\circ U\colon\R^2\to\R^2</m>. By the <xref ref="matrix-mult-comp-is-prod"/>, its standard matrix is
<me> \mat{1 1 0; 0 1 1}\mat{1 0; 0 1; 1 0} = \mat{1 1; 1 1},</me>
as we computed in the above <xref ref="matrix-mult-eg-mult1"/>.
</p>
<figure>
<caption>The matrix of the composition <m>T\circ U</m> is the product of the matrices for <m>T</m> and <m>U</m>.</caption>
<mathbox source="demos/compose3d.html?mat2=1,1,0:0,1,1&mat1=1,0:0,1:1,0&rangeT=off&rangeU=off&rangeTU=off&range=5&closed=true" height="500px"/>
</figure>
</example>
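The product computed in this example is easy to confirm with NumPy (an aside, not part of the text):

```python
import numpy as np

T = np.array([[1, 1, 0],
              [0, 1, 1]])      # standard matrix of T: R^3 -> R^2
U = np.array([[1, 0],
              [0, 1],
              [1, 0]])         # standard matrix of U: R^2 -> R^3

# The standard matrix of T∘U is the product TU.
assert np.array_equal(T @ U, np.array([[1, 1],
                                       [1, 1]]))
```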
<example hide-type="true">
<title>Interactive: A transformation defined in steps</title>
<p>
Let <m>S\colon\R^3\to\R^3</m> be the linear transformation that first reflects over the <m>xy</m>-plane and then projects onto the <m>yz</m>-plane, as in this <xref ref="linear-trans-in-steps"/>. The transformation <m>S</m> is the composition <m>T\circ U</m>, where <m>U\colon\R^3\to\R^3</m> is the transformation that reflects over the <m>xy</m>-plane, and <m>T\colon\R^3\to\R^3</m> is the transformation that projects onto the <m>yz</m>-plane.
</p>
<p>
Let us compute the matrix <m>B</m> for <m>U</m>.
<latex-code mode="bare">
\def\drawarrow#1{
\begin{tikzpicture}[myxyz, y={(1cm,-.28cm)}, scale=1, thin border nodes, baseline]
\path[clip, resetxy] (-2,-2) rectangle (2,2);
\begin{scope}[transformxy]
\fill[blue, opacity=.05] (-1.5,-1.5) rectangle (1.5,1.5);
\draw[step=1cm, help lines, blue!40!black]
(-1.5, -1.5) grid (1.5, 1.5);
\end{scope}
\node[blue!40!black,right] at (-.5,.8,-.5) {$xy$};
\begin{scope}[transformyz]
\fill[green, opacity=.05] (-1.5,-1.5) rectangle (1.5,1.5);
\draw[step=1cm, help lines, green!40!black]
(-1.5, -1.5) grid (1.5, 1.5);
\end{scope}
\node[green!40!black] at (.75,-1,1) {$yz$};
#1
\end{tikzpicture}
}
</latex-code>
<latex-code>
\drawarrow{\draw[vector] (0,0,0) -- (1,0,0) node[above left, pos=.5] {$e_1$};}
$\xrightarrow{\text{reflect $xy$}}$
\drawarrow{\draw[vector] (0,0,0) -- (1,0,0) node[above left, pos=.5] {$U(e_1)$};}
</latex-code>
Since <m>e_1</m> lies on the <m>xy</m>-plane, reflecting it over the <m>xy</m>-plane does not move it:
<me>U(e_1) = \vec{1 0 0}.</me>
<latex-code>
\drawarrow{\draw[vector] (0,0,0) -- (0,1,0) node[below left, pos=.5] {$e_2$};}
$\xrightarrow{\text{reflect $xy$}}$
\drawarrow{\draw[vector] (0,0,0) -- (0,1,0) node[below left, pos=.5] {$U(e_2)$};}
</latex-code>
Since <m>e_2</m> lies on the <m>xy</m>-plane, reflecting over the <m>xy</m>-plane does not move it either:
<me>U(e_2) = e_2 = \vec{0 1 0}.</me>
<latex-code>
\drawarrow{\draw[vector] (0,0,0) -- (0,0,1) node[left, pos=.4] {$e_3$};}
$\xrightarrow{\text{reflect $xy$}}$
\drawarrow{\draw[vector] (0,0,0) -- (0,0,-1) node[left, pos=.5] {$U(e_3)$};}
</latex-code>
Since <m>e_3</m> is perpendicular to the <m>xy</m>-plane, reflecting over the <m>xy</m>-plane takes <m>e_3</m> to its negative:
<me>U(e_3) = -e_3 = \vec{0 0 -1}.</me>
We have computed all of the columns of <m>B</m>:
<me>B = \mat{| | |; U(e_1) U(e_2) U(e_3); | | |}
= \mat{1 0 0; 0 1 0; 0 0 -1}.</me>
By a similar method, we find
<me>A = \mat{0 0 0; 0 1 0; 0 0 1}.</me>
It follows that the matrix for <m>S = T\circ U</m> is
<me>
\begin{split}
AB \amp= \mat{0 0 0; 0 1 0; 0 0 1}\mat{1 0 0; 0 1 0; 0 0 -1} \\
\amp= \mat{\mat{0 0 0; 0 1 0; 0 0 1}\vec{1,0,0}
\mat{0 0 0; 0 1 0; 0 0 1}\vec{0,1,0}
\mat{0 0 0; 0 1 0; 0 0 1}\vec{0,0,-1}} \\
\amp= \mat{0 0 0; 0 1 0; 0 0 -1},
\end{split}
</me>
as we computed in this <xref ref="linear-trans-in-steps"/>.
</p>
<figure>
<caption></caption>
<mathbox source="demos/compose3d.html?mat1=1,0,0:0,1,0:0,0,-1&mat2=0,0,0:0,1,0:0,0,1&rangeT=off&rangeU=off&rangeTU=off&range=5&closed=true" height="500px"/>
</figure>
</example>
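The matrices found in this example can be checked numerically. The sketch below (a NumPy aside, not part of the text) composes the reflection with the projection and confirms the product computed above.

```python
import numpy as np

B = np.array([[1, 0,  0],
              [0, 1,  0],
              [0, 0, -1]])     # U: reflection over the xy-plane
A = np.array([[0, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])      # T: projection onto the yz-plane

S = A @ B                      # S = T∘U: first reflect, then project
assert np.array_equal(S, np.array([[0, 0,  0],
                                   [0, 1,  0],
                                   [0, 0, -1]]))
```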
<p>
Recall from this <xref ref="linear-trans-identity-mat"/> that the <em>identity matrix</em> is the <m>n\times n</m> matrix <m>I_n</m> whose columns are the standard coordinate vectors in <m>\R^n</m>. The identity matrix is the standard matrix of the identity transformation: that is, <m>x = \operatorname{Id}_{\R^n}(x) = I_nx</m> for all vectors <m>x</m> in <m>\R^n</m>. For any linear transformation <m>T : \R^n \to \R^m</m> we have <me>\operatorname{Id}_{\R^m} \circ T = T,</me>
and by the same token, for any <m>m \times n</m> matrix <m>A</m> we have <me>I_mA=A.</me> Similarly, we have <m>T \circ \operatorname{Id}_{\R^n} = T</m> and <m>AI_n=A</m>.
</p>
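Both identity properties are immediate to verify on a concrete matrix (a NumPy aside, not part of the text):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # an m x n matrix with m = 2, n = 3

# Multiplying by the identity on either side leaves A unchanged.
assert np.array_equal(np.eye(2, dtype=int) @ A, A)   # I_m A = A
assert np.array_equal(A @ np.eye(3, dtype=int), A)   # A I_n = A
```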
</subsection>
<subsection>
<title>The algebra of transformations and matrices</title>
<p>
In this subsection we describe two more operations that one can perform on transformations: addition and scalar multiplication. We then translate these operations into the language of matrices. This is analogous to what we did for the composition of linear transformations, but much less subtle.
</p>
<definition>
<idx><h>Transformation</h><h>addition of</h></idx>
<idx><h>Linear transformation</h><h>addition of</h><see>Transformation</see></idx>
<idx><h>Matrix transformation</h><h>addition of</h><see>Transformation</see></idx>
<idx><h>Transformation</h><h>scalar multiplication of</h></idx>
<idx><h>Linear transformation</h><h>scalar multiplication of</h><see>Transformation</see></idx>
<idx><h>Matrix transformation</h><h>scalar multiplication of</h><see>Transformation</see></idx>
<statement>
<p>
<ul>
<li>
Let <m>T,U\colon\R^n\to\R^m</m> be two transformations. Their <term>sum</term> is the transformation <m>T+U\colon\R^n\to\R^m</m> defined by
<me>(T+U)(x) = T(x) + U(x).</me>
Note that addition of transformations is only defined when both transformations have the same domain and codomain. <em></em><!-- workaround css bug -->
</li>
<li>
Let <m>T\colon\R^n\to\R^m</m> be a transformation, and let <m>c</m> be a scalar. The <term>scalar product</term> of <m>c</m> with <m>T</m> is the transformation <m>cT\colon\R^n\to\R^m</m> defined by
<me>(cT)(x) = c\cdot T(x).</me>
</li>
</ul>
</p>
</statement>
</definition>
<p>
To emphasize, the sum of two transformations <m>T,U\colon\R^n\to\R^m</m> is another transformation called <m>T+U</m>; its value on an input vector <m>x</m> is the sum of the outputs of <m>T</m> and <m>U</m>. Similarly, the product of <m>T</m> with a scalar <m>c</m> is another transformation called <m>cT</m>; its value on an input vector <m>x</m> is the vector <m>c\cdot T(x)</m>.
</p>
<example>
<title>Functions of one variable</title>
<p>
Define <m>f\colon\R\to\R</m> by <m>f(x) = x^2</m> and <m>g\colon\R\to\R</m> by <m>g(x) = x^3</m>. The sum <m>f+g\colon\R\to\R</m> is the transformation defined by the rule
<me>(f+g)(x) = f(x) + g(x) = x^2 + x^3.</me>
For instance, <m>(f+g)(-2) = (-2)^2 + (-2)^3 = -4</m>.
</p>
<p>
Define <m>\exp\colon\R\to\R</m> by <m>\exp(x) = e^x</m>. The product <m>2\exp\colon\R\to\R</m> is the transformation defined by the rule
<me>(2\exp)(x) = 2\cdot\exp(x) = 2e^x.</me>
For instance, <m>(2\exp)(1) = 2\cdot\exp(1) = 2e</m>.
</p>
</example>
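The pointwise sum and scalar product in this example translate directly into code. The sketch below (a Python aside, not part of the text) builds the new functions from the old ones and checks the two values computed above.

```python
import math

f = lambda x: x**2
g = lambda x: x**3

# Pointwise operations: the value of the new function is computed from the outputs.
add = lambda f, g: (lambda x: f(x) + g(x))   # (f+g)(x) = f(x) + g(x)
scale = lambda c, f: (lambda x: c * f(x))    # (cf)(x) = c * f(x)

assert add(f, g)(-2) == -4                           # (-2)^2 + (-2)^3 = 4 - 8
assert math.isclose(scale(2, math.exp)(1), 2 * math.e)  # (2 exp)(1) = 2e
```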
<note hide-type="true" xml:id="matrix-mult-add-mult-trans">
<title>Properties of addition and scalar multiplication for transformations</title>
<p>
Let <m>S,T,U\colon\R^n\to\R^m</m> be transformations and let <m>c,d</m> be scalars. The following properties are easily verified:
<md>
<mrow>
T + U \amp= U + T \amp
S + (T + U) \amp= (S + T) + U
</mrow>
<mrow>
c(T + U) \amp= cT + cU \amp
(c + d)T \amp= cT + dT
</mrow>
<mrow>
c(dT) \amp= (cd)T \amp
T + 0 \amp= T
</mrow>
</md>
</p>
</note>
<p>
<notation><usage>0</usage><description>The zero transformation</description></notation>
In one of the above properties, we used <m>0</m> to denote the transformation <m>\R^n\to\R^m</m> that is zero on every input vector: <m>0(x) = 0</m> for all <m>x</m>. This is called the <term>zero transformation</term>.
</p>
<p>We now give the analogous operations for matrices.</p>
<definition>
<idx><h>Matrix</h><h>addition of</h></idx>
<idx><h>Matrix</h><h>scalar multiplication of</h></idx>
<statement>
<p>
<ul>
<li>
The <term>sum</term> of two <m>m\times n</m> matrices <m>A</m> and <m>B</m> is the matrix <m>A+B</m> obtained by summing the entries of <m>A</m> and <m>B</m> individually:
<me>
\mat{a_{11} a_{12} a_{13}; a_{21} a_{22} a_{23}} +
\mat{b_{11} b_{12} b_{13}; b_{21} b_{22} b_{23}} =
\mat{a_{11}+b_{11} a_{12}+b_{12} a_{13}+b_{13};
a_{21}+b_{21} a_{22}+b_{22} a_{23}+b_{23}}
</me>
In other words, the <m>i,j</m> entry of <m>A+B</m> is the sum of the <m>i,j</m> entries of <m>A</m> and <m>B</m>. Note that addition of matrices is only defined when both matrices have the same size.<em></em> <!-- workaround css bug -->
</li>
<li>
The <term>scalar product</term> of a scalar <m>c</m> with a matrix <m>A</m> is obtained by scaling all entries of <m>A</m> by <m>c</m>:
<me>
\def\r{\textcolor{red}}
\r c\mat{a_{11} a_{12} a_{13}; a_{21} a_{22} a_{23}} =
\mat{\r ca_{11} \r ca_{12} \r ca_{13}; \r ca_{21} \r ca_{22} \r ca_{23}}
</me>
In other words, the <m>i,j</m> entry of <m>cA</m> is <m>c</m> times the <m>i,j</m> entry of <m>A</m>.
</li>
</ul>
</p>
</statement>
</definition>
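Both operations are entrywise, which is exactly how NumPy implements them (an aside, not part of the text):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[10, 20, 30],
              [40, 50, 60]])

# The (i, j) entry of A + B is the sum of the (i, j) entries of A and B.
assert np.array_equal(A + B, np.array([[11, 22, 33],
                                       [44, 55, 66]]))

# The (i, j) entry of cA is c times the (i, j) entry of A.
assert np.array_equal(3 * A, np.array([[ 3,  6,  9],
                                       [12, 15, 18]]))
```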
<fact>
<p>
Let <m>T,U\colon\R^n\to\R^m</m> be linear transformations with standard matrices <m>A,B</m>, respectively, and let <m>c</m> be a scalar.
<ul>
<li>The standard matrix for <m>T+U</m> is <m>A+B</m>.</li>
<li>The standard matrix for <m>cT</m> is <m>cA</m>.</li>
</ul>
</p>
</fact>
<p>
In view of the above fact, the following properties are consequences of the corresponding <xref ref="matrix-mult-add-mult-trans">properties</xref> of transformations. They are easily verified directly from the definitions as well.
</p>
<note hide-type="true">
<title>Properties of addition and scalar multiplication for matrices</title>
<p>
Let <m>A,B,C</m> be <m>m\times n</m> matrices and let <m>c,d</m> be scalars. Then:
<md>
<mrow>
A + B \amp= B + A \amp
C + (A + B) \amp= (C + A) + B
</mrow>
<mrow>
c(A + B) \amp= cA + cB \amp
(c + d)A \amp= cA + dA
</mrow>
<mrow>
c(dA) \amp= (cd)A \amp
A + 0 \amp= A
</mrow>
</md>
</p>
</note>
<p>
<notation><usage>0</usage><description>The zero matrix</description></notation>
In one of the above properties, we used <m>0</m> to denote the <m>m\times n</m> matrix whose entries are all zero. This is the standard matrix of the zero transformation, and is called the <term>zero matrix</term>.
</p>
<p>
We can also combine addition and scalar multiplication of matrices with multiplication of matrices. Since matrix multiplication corresponds to composition of transformations (<xref ref="matrix-mult-comp-is-prod"/>), the following properties are consequences of the corresponding <xref ref="matrix-mult-compose-trans">properties</xref> of transformations.
</p>
<note hide-type="true">
<title>Properties of matrix multiplication</title>
<idx><h>Identity matrix</h><h>and matrix multiplication</h></idx>
<idx><h>Matrix multiplication</h><h>properties of</h></idx>
<idx><h>Matrix multiplication</h><h>associativity of</h></idx>
<p>
Let <m>A,B,C</m> be matrices and let <m>c</m> be a scalar. Suppose that <m>A</m> is an <m>m\times n</m> matrix, and that the sizes of <m>B</m> and <m>C</m> are such that each expression in the following identities is defined. Then:
<md>
<mrow>
C(A+B) \amp= C A+C B \amp
(A + B) C \amp= A C + B C \amp
</mrow>
<mrow>
c(A B) \amp= (cA) B \amp
c(A B) \amp= A(cB)
</mrow>
<mrow>
A I_n \amp= A \amp
I_m A \amp= A
</mrow>
<mrow>
(A B)C \amp= A (BC) \amp
</mrow>
</md>
</p>
</note>
<p>
Most of the above properties are easily verified directly from the definitions. The <em>associativity</em> property <m>(AB)C=A(BC)</m>, however, is not (try it!). It is much easier to prove by relating matrix multiplication to composition of transformations, and using the obvious fact that composition of transformations is associative.
</p>
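Associativity is easy to confirm numerically, even though the direct entry-by-entry proof is tedious. The sketch below (a NumPy aside with hypothetical sizes, not part of the text) checks <m>(AB)C = A(BC)</m> on random integer matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-5, 6, size=(2, 3))
B = rng.integers(-5, 6, size=(3, 4))
C = rng.integers(-5, 6, size=(4, 2))

# Both groupings compose the same three transformations, so they must agree.
assert np.array_equal((A @ B) @ C, A @ (B @ C))
```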
</subsection>
</section>