<?xml version="1.0" encoding="UTF-8"?>
<!--********************************************************************
Copyright 2017 Georgia Institute of Technology
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation. A copy of
the license is included in gfdl.xml.
*********************************************************************-->
<section xml:id="projections">
<title>Orthogonal Projection</title>
<objectives>
<ol>
<li>Understand the orthogonal decomposition of a vector with respect to a subspace.</li>
<li>Understand the relationship between orthogonal decomposition and orthogonal projection.</li>
<li>Understand the relationship between orthogonal decomposition and the closest vector on / distance to a subspace.</li>
<li>Learn the basic properties of orthogonal projections as linear transformations and as matrix transformations.</li>
<li><em>Recipes:</em> orthogonal projection onto a line, orthogonal decomposition by solving a system of equations, orthogonal projection via a complicated matrix product.</li>
<li><em>Pictures:</em> orthogonal decomposition, orthogonal projection.</li>
<li><em>Vocabulary words:</em> <term>orthogonal decomposition</term>, <term>orthogonal projection</term>.</li>
</ol>
</objectives>
<introduction>
<p>
Let <m>W</m> be a subspace of <m>\R^n</m> and let <m>x</m> be a vector in <m>\R^n</m>. In this section, we will learn to compute the <em>closest vector</em> <m>x_W</m> to <m>x</m> in <m>W</m>. The vector <m>x_W</m> is called the <em>orthogonal projection</em> of <m>x</m> onto <m>W</m>. This is exactly what we will use to almost solve matrix equations, as discussed in the introduction to <xref ref="chap-orthogonality"/>.
</p>
</introduction>
<subsection>
<title>Orthogonal Decomposition</title>
<p>
We begin by fixing some notation.
</p>
<definition type-name="Notation" xml:id="projections-closest-notn">
<statement>
<p>
Let <m>W</m> be a subspace of <m>\R^n</m> and let <m>x</m> be a vector in <m>\R^n</m>. We denote the closest vector to <m>x</m> on <m>W</m> by <m>x_W</m>.
</p>
</statement>
</definition>
<p>
To say that <m>x_W</m> is the closest vector to <m>x</m> on <m>W</m> means that the difference <m>x-x_W</m> is <em>orthogonal</em> to the vectors in <m>W</m>:
<latex-code>
\begin{tikzpicture}[myxyz, thin border nodes]
\coordinate (u) at (0,1,0);
\coordinate (v) at (1.1,0,-.2);
\coordinate (uxv) at (.2,0,1.1);
\coordinate (x) at ($-1.1*(u)+(v)+1.5*(uxv)$);
\begin{scope}[x=(u),y=(v),transformxy]
\fill[seq-violet!30] (-2,-2) rectangle (2,2);
\draw[seq-violet, help lines] (-2,-2) grid (2,2);
\node[seq-violet] at (2.6,1) {$W$};
\end{scope}
\point[seq-blue, "$x_W$" {below,text=seq-blue}] (y) at ($-1.1*(u)+1*(v)$);
\coordinate (yu) at ($(y)+(u)$);
\coordinate (yv) at ($(y)+(v)$);
\pic[draw, right angle len=3mm] {right angle=(x)--(y)--(yu)};
\pic[draw, right angle len=3mm] {right angle=(x)--(y)--(yv)};
\point[seq-red, "$x$" {above,text=seq-red}] (xx) at (x);
\draw[vector, thin] (y) -- node[auto] {$x-x_W$} (xx);
\point at (0,0,0);
\end{tikzpicture}
</latex-code>
In other words, if <m>x_{W^\perp} = x - x_W</m>, then we have <m>x = x_W + x_{W^\perp}</m>, where <m>x_W</m> is in <m>W</m> and <m>x_{W^\perp}</m> is in <m>W^\perp</m>. The first order of business is to prove that the closest vector always exists.
</p>
<theorem xml:id="projections-decomp-exists">
<title>Orthogonal decomposition</title>
<idx><h>Orthogonal decomposition</h><see>Orthogonal projection</see></idx>
<idx><h>Orthogonal projection</h><h>existence of</h></idx>
<statement>
<p>
Let <m>W</m> be a subspace of <m>\R^n</m> and let <m>x</m> be a vector in <m>\R^n</m>. Then we can write <m>x</m> uniquely as
<me>x = x_W + x_{W^\perp}</me>
where <m>x_W</m> is the closest vector to <m>x</m> on <m>W</m> and <m>x_{W^\perp}</m> is in <m>W^\perp</m>.
</p>
</statement>
<proof>
<p>
Let <m>m = \dim(W)</m>, so <m>n-m = \dim(W^\perp)</m> by this <xref ref="orthocomp-facts-basic"/>. Let <m>v_1,v_2,\ldots,v_m</m> be a basis for <m>W</m> and let <m>v_{m+1},v_{m+2},\ldots,v_n</m> be a basis for <m>W^\perp</m>. We showed in the proof of this <xref ref="orthocomp-facts-basic"/> that <m>\{v_1,v_2,\ldots,v_m,v_{m+1},v_{m+2},\ldots,v_n\}</m> is linearly independent, so it forms a basis for <m>\R^n</m>. Therefore, we can write
<me>
x = (c_1v_1 + \cdots + c_mv_m) + (c_{m+1}v_{m+1} + \cdots + c_nv_n)
= x_W + x_{W^\perp},
</me>
where <m>x_W = c_1v_1 + \cdots + c_mv_m</m> and <m>x_{W^\perp} = c_{m+1}v_{m+1} + \cdots + c_nv_n</m>. Since <m>x_{W^\perp}</m> is orthogonal to <m>W</m>, the vector <m>x_W</m> is the closest vector to <m>x</m> on <m>W</m>, so this proves that such a decomposition exists.
</p>
<p>
As for uniqueness, suppose that
<me>x = x_W + x_{W^\perp} = y_W + y_{W^\perp}</me>
for <m>x_W,y_W</m> in <m>W</m> and <m>x_{W^\perp},y_{W^\perp}</m> in <m>W^\perp</m>. Rearranging gives
<me>x_W - y_W = y_{W^\perp} - x_{W^\perp}.</me>
Since <m>W</m> and <m>W^\perp</m> are subspaces, the left side of the equation is in <m>W</m> and the right side is in <m>W^\perp</m>. Therefore, <m>x_W-y_W</m> is in <m>W</m> and in <m>W^\perp</m>, so it is orthogonal to itself, which implies <m>x_W-y_W=0</m>. Hence <m>x_W = y_W</m> and <m>x_{W^\perp} = y_{W^\perp}</m>, which proves uniqueness.
</p>
</proof>
</theorem>
<definition xml:id="projections-defn-of">
<idx><h>Orthogonal projection</h><h>definition of</h></idx>
<notation><usage>x_W</usage><description>Orthogonal projection of <m>x</m> onto <m>W</m></description></notation>
<notation><usage>x_{W^\perp}</usage><description>Orthogonal part of <m>x</m> with respect to <m>W</m></description></notation>
<statement>
<p>
Let <m>W</m> be a subspace of <m>\R^n</m> and let <m>x</m> be a vector in <m>\R^n</m>. The expression
<me>x = x_W + x_{W^\perp}</me>
for <m>x_W</m> in <m>W</m> and <m>x_{W^\perp}</m> in <m>W^\perp</m>, is called the <term>orthogonal decomposition</term> of <m>x</m> with respect to <m>W</m>, and the closest vector <m>x_W</m> is the <term>orthogonal projection</term> of <m>x</m> onto <m>W</m>.
</p>
</statement>
</definition>
<p>
Since <m>x_W</m> is the closest vector on <m>W</m> to <m>x</m>, the distance from <m>x</m> to the subspace <m>W</m> is the length of the vector from <m>x_W</m> to <m>x</m>, i.e., the length of <m>x_{W^\perp}</m>. To restate:
</p>
<bluebox xml:id="projections-closest-vector">
<title>Closest vector and distance</title>
<idx><h>Orthogonal projection</h><h>is the closest vector</h></idx>
<idx><h>Orthogonal projection</h><h>distance from</h></idx>
<p>
Let <m>W</m> be a subspace of <m>\R^n</m> and let <m>x</m> be a vector in <m>\R^n</m>.
<ul>
<li>
The orthogonal projection <m>x_W</m> is the closest vector to <m>x</m> in <m>W</m>.
</li>
<li>
The distance from <m>x</m> to <m>W</m> is <m>\|x_{W^\perp}\|</m>.
</li>
</ul>
</p>
</bluebox>
<example xml:id="projections-eg-xy-plane">
<title>Orthogonal decomposition with respect to the <m>xy</m>-plane</title>
<p>
Let <m>W</m> be the <m>xy</m>-plane in <m>\R^3</m>, so <m>W^\perp</m> is the <m>z</m>-axis. It is easy to compute the orthogonal decomposition of a vector with respect to this <m>W</m>:
<me>
\begin{split}
x \amp= \vec{1 2 3} \implies x_W = \vec{1 2 0} \quad x_{W^\perp} = \vec{0 0 3}\\
x \amp= \vec{a b c} \implies x_W = \vec{a b 0} \quad x_{W^\perp} = \vec{0 0 c}.
\end{split}
</me>
We see that the orthogonal decomposition in this case expresses a vector in terms of a <q>horizontal</q> component (in the <m>xy</m>-plane) and a <q>vertical</q> component (on the <m>z</m>-axis).
<latex-code>
\begin{tikzpicture}[myxyz, thin border nodes]
\draw (0,0,-2)--(0,0,0);
\fill[transformxy, help lines, seq-violet!20, opacity=.7]
(-2.5,-2.5) rectangle (2.5,2.5);
\draw[transformxy, help lines, seq-violet!50] (-2.5,-2.5) grid (2.5,2.5);
\draw[->] (-2.5,0,0)--(2.5,0,0);
\draw[->] (0,-2.5,0)--(0,2.5,0);
\draw[->] (0,0,0)--(0,0,2);
\point[black, "$x$" {above, text=black}] (x) at (2,1,3);
\coordinate (xW) at (2,1,0);
\draw[vector, seq-violet] (0,0,0) -- node [right=1pt, text=seq-violet] {$x_W$} (xW);
\point (o) at (0,0,0);
\draw[vector, seq-green] (xW) --
node[auto, seq-green!80!black] {$x_{W^\perp}$} (x);
\coordinate (xWu) at ($(xW)+(0,1,0)$);
\coordinate (xWv) at ($(xW)+(1,0,0)$);
\pic[draw] {right angle=(x)--(xW)--(xWu)};
\pic[draw] {right angle=(x)--(xW)--(xWv)};
\node[seq-violet] at (1,3,0) {$W$};
\end{tikzpicture}
</latex-code>
</p>
<figure>
<caption>Orthogonal decomposition of a vector with respect to the <m>xy</m>-plane in <m>\R^3</m>. Note that <m>\color{seq-violet}x_W</m> is in the <m>xy</m>-plane and <m>\color{seq-green}x_{W^\perp}</m> is in the <m>z</m>-axis. Click and drag the head of the vector <m>x</m> to see how the orthogonal decomposition changes.</caption>
<mathbox source="demos/projection.html?u1=1,0,0&amp;u2=0,1,0&amp;vec=-1.1,2,1.5&amp;range=3&amp;mode=decomp&amp;closed" height="500px"/>
</figure>
</example>
<example xml:id="projections-eg-inW">
<title>Orthogonal decomposition of a vector in <m>W</m></title>
<idx><h>Orthogonal projection</h><h>of a vector in <m>W</m></h></idx>
<p>
If <m>x</m> is in a subspace <m>W</m>, then the closest vector to <m>x</m> in <m>W</m> is itself, so <m>x = x_W</m> and <m>x_{W^\perp} = 0</m>. Conversely, if <m>x = x_W</m> then <m>x</m> is contained in <m>W</m> because <m>x_W</m> is contained in <m>W</m>.
</p>
</example>
<example xml:id="projections-eg-inWperp">
<title>Orthogonal decomposition of a vector in <m>W^\perp</m></title>
<idx><h>Orthogonal projection</h><h>of a vector in <m>W^\perp</m></h></idx>
<p>
If <m>W</m> is a subspace and <m>x</m> is in <m>W^\perp</m>, then the orthogonal decomposition of <m>x</m> is <m>x = 0 + x</m>, where <m>0</m> is in <m>W</m> and <m>x</m> is in <m>W^\perp</m>. It follows that <m>x_W = 0</m>. Conversely, if <m>x_W = 0</m> then the orthogonal decomposition of <m>x</m> is <m>x = x_W + x_{W^\perp} = 0 + x_{W^\perp}</m>, so <m>x = x_{W^\perp}</m> is in <m>W^\perp</m>.
</p>
</example>
<example hide-type="true">
<title>Interactive: Orthogonal decomposition in <m>\R^2</m></title>
<figure>
<caption>Orthogonal decomposition of a vector with respect to a line <m>W</m> in <m>\R^2</m>. Note that <m>\color{seq-violet}x_W</m> is in <m>W</m> and <m>\color{seq-green}x_{W^\perp}</m> is in the line perpendicular to <m>W</m>. Click and drag the head of the vector <m>x</m> to see how the orthogonal decomposition changes.</caption>
<mathbox source="demos/projection.html?u1=1,.5&amp;vec=1,-1&amp;range=3&amp;mode=decomp&amp;closed&amp;subname=W" height="500px"/>
</figure>
</example>
<example hide-type="true">
<title>Interactive: Orthogonal decomposition in <m>\R^3</m></title>
<figure>
<caption>Orthogonal decomposition of a vector with respect to a plane <m>W</m> in <m>\R^3</m>. Note that <m>\color{seq-violet}x_W</m> is in <m>W</m> and <m>\color{seq-green}x_{W^\perp}</m> is in the line perpendicular to <m>W</m>. Click and drag the head of the vector <m>x</m> to see how the orthogonal decomposition changes.</caption>
<mathbox source="demos/projection.html?u1=1,0,0&amp;u2=0,1.1,-.2&amp;vec=-1.1,2,1.5&amp;range=3&amp;mode=decomp&amp;closed" height="500px"/>
</figure>
</example>
<example hide-type="true">
<title>Interactive: Orthogonal decomposition in <m>\R^3</m></title>
<figure>
<caption>Orthogonal decomposition of a vector with respect to a line <m>W</m> in <m>\R^3</m>. Note that <m>\color{seq-violet}x_W</m> is in <m>W</m> and <m>\color{seq-green}x_{W^\perp}</m> is in the plane perpendicular to <m>W</m>. Click and drag the head of the vector <m>x</m> to see how the orthogonal decomposition changes.</caption>
<mathbox source="demos/projection.html?u1=0,1.1,.2&amp;vec=-1.1,2,1.5&amp;range=3&amp;mode=decomp&amp;closed&amp;subname=W" height="500px"/>
</figure>
</example>
<p>
Now we turn to the problem of computing <m>x_W</m> and <m>x_{W^\perp}</m>. Of course, since <m>x_{W^\perp} = x - x_W</m>, really all we need is to compute <m>x_W</m>. The following theorem gives a method for computing the orthogonal projection onto a column space. To compute the orthogonal projection onto a general subspace, usually it is best to rewrite the subspace as the column space of a matrix, as in this <xref ref="subspace-is-col-or-nul"/>.
</p>
<theorem xml:id="projections-ATA-formula">
<idx><h>Orthogonal projection</h><h>computation of</h><h>row reduction</h></idx>
<statement>
<p>
Let <m>A</m> be an <m>m \times n</m> matrix, let <m>W = \Col(A)</m>, and let <m>x</m> be a vector in <m>\R^m</m>. Then the matrix equation
<me>
A^TAc=A^Tx
</me>
in the unknown vector <m>c</m> is consistent, and <m>x_W</m> is equal to <m>Ac</m> for any solution <m>c</m>.
</p>
</statement>
<proof visible="true">
<p>
Let <m>x = x_W + x_{W^\perp}</m> be the orthogonal decomposition with respect to <m>W</m>. By definition <m>x_W</m> lies in <m>W=\Col(A)</m> and so there is a vector <m>c</m> in <m>\R^n</m> with <m>Ac = x_W</m>. Choose any such vector <m>c</m>. We know that <m>x-x_W=x-Ac</m> lies in <m>W^\perp</m>, which is equal to <m>\Nul(A^T)</m> by this <xref ref="shortcuts-for-orthog-comp"/>. We thus have
<me>
0=A^T(x-Ac) = A^Tx-A^TAc
</me>
and so
<me>
A^TAc = A^Tx.
</me>
This shows that the equation <m>A^TAc = A^Tx</m> is consistent. If <m>c</m> is any solution to <m>A^TAc=A^Tx</m>, then by reversing the above logic, we conclude that <m>x_W = Ac</m>.
</p>
</proof>
</theorem>
<specialcase xml:id="projections-onto-line">
<title>Orthogonal projection onto a line</title>
<p>
Let <m>L = \Span\{u\}</m> be a line in <m>\R^n</m> and let <m>x</m> be a vector in <m>\R^n</m>. By the <xref ref="projections-ATA-formula"/>, to find <m>x_L</m> we must solve the matrix equation <m>u^Tuc = u^Tx</m>, where we regard <m>u</m> as an <m>n\times 1</m> matrix (the column space of this matrix is exactly <m>L</m>!). But <m>u^Tu = u\cdot u</m> and <m>u^Tx = u\cdot x</m>, so <m>c = (u\cdot x)/(u\cdot u)</m> is a solution of <m>u^Tuc = u^Tx</m>, and hence
<m>x_L = uc = (u\cdot x)/(u\cdot u)\,u.</m>
<latex-code>
\begin{tikzpicture}[thin border nodes]
\draw[seq-violet] (-3,-2) -- node[below right, very near start] {$L$} (3,2);
\draw[vector] (0,0) -- node[below right] {$u$} (1.5,1);
\point[seq-red] (x) at (-3,2);
\point (o) at (0,0);
\draw[vector,seq-red] (o) -- node[auto,swap] {$x$} (x);
\point[seq-blue, "$x_L = \dfrac{u\cdot x}{u\cdot u}\,u$" {below right,seq-blue}]
(p) at (${-2.5/(1.5*1.5+1)}*(1.5,1)$);
\draw[vector, seq-green] (p) -- node[below left] {$x_{L^\perp}$} (x);
\pic[draw] {right angle=(x)--(p)--(o)};
\end{tikzpicture}
</latex-code>
</p>
</specialcase>
<p>To reiterate:</p>
<bluebox>
<title>Recipe: Orthogonal projection onto a line</title>
<idx><h>Orthogonal projection</h><h>onto a line</h></idx>
<idx><h>Line</h><h>orthogonal projection onto</h></idx>
<p>
If <m>L = \Span\{u\}</m> is a line, then
<me>
x_L = \frac{u\cdot x}{u\cdot u}\,u
\sptxt{and}
x_{L^\perp} = x - x_L
</me>
for any vector <m>x</m>.
</p>
</bluebox>
<remark xml:id="simple-proof-for-projection-to-line">
<title>Simple proof for the formula for projection onto a line</title>
<p>
In the special case where we are projecting a vector <m>x</m> in <m>\R^n</m> onto a line <m>L = \Span\{u\}</m>, our formula for the projection can be derived very directly and simply. The vector <m>x_L</m> is a multiple of <m>u</m>, say <m>x_L=cu</m>. This multiple is chosen so that <m>x-x_L=x-cu</m> is perpendicular to <m>u</m>, as in the following picture.
<latex-code>
\begin{tikzpicture}[thin border nodes]
\draw[seq-violet] (-3,-2) -- node[below right, very near start] {$L$} (3,2);
\draw[vector] (0,0) -- node[below right] {$u$} (1.5,1);
\draw[vector] (0,0) -- node[below right] {$cu$} (-15/13,-10/13);
\point[seq-red] (x) at (-3,2);
\point (o) at (0,0);
\draw[vector,seq-red] (o) -- node[auto,swap] {$x$} (x);
\point[seq-blue, "$c = \dfrac{u\cdot x}{u\cdot u}$" {below right,seq-blue}]
(p) at (${-2.5/(1.5*1.5+1)}*(1.5,1)$);
\draw[vector, seq-green] (p) -- node[below left] {$x-cu$} (x);
\pic[draw] {right angle=(x)--(p)--(o)};
\end{tikzpicture}
</latex-code>
In other words,
<me>
(x-cu) \cdot u = 0.
</me>
Using the distributive property for the dot product and isolating the variable <m>c</m> gives us that
<me>
c = \frac{u\cdot x}{u\cdot u}
</me>
and so
<me>x_L = cu = \frac{u\cdot x}{u\cdot u}\,u.</me>
</p>
</remark>
<example xml:id="projections-onto-line2">
<title>Projection onto a line in <m>\R^2</m></title>
<statement>
<p>
Compute the orthogonal projection of <m>x = {-6\choose 4}</m> onto the line <m>L</m> spanned by <m>u = {3\choose 2}</m>, and find the distance from <m>x</m> to <m>L</m>.
</p>
</statement>
<solution>
<p>
First we find
<me>
x_L = \frac{x\cdot u}{u\cdot u}\,u = \frac{-18+8}{9+4}\vec{3 2} = -\frac{10}{13}\vec{3 2}
\qquad
x_{L^\perp} = x - x_L = \frac 1{13}\vec{-48 72}.
</me>
The distance from <m>x</m> to <m>L</m> is
<me>
\|x_{L^\perp}\| = \frac 1{13}\sqrt{48^2 + 72^2} \approx 6.656.
</me>
<latex-code>
\begin{tikzpicture}[thin border nodes]
\draw[seq-violet] (-3,-2) -- node[below right, very near start] {$L$} (3,2);
\draw[vector] (0,0) -- node[below right, at end] {$\vec{3 2}$} (1.5,1);
\point[seq-red] (x) at (-3,2);
\point (o) at (0,0);
\draw[vector,seq-red] (o) -- node[at end, left=1mm] {$\vec{-6 4}$} (x);
\point[seq-blue, "$\displaystyle-\frac{10}{13}\vec{3 2}$" {below right,xshift=-2mm,yshift=2mm,seq-blue}]
(p) at (${-2.5/(1.5*1.5+1)}*(1.5,1)$);
\draw (p) -- (x);
\pic[draw] {right angle=(x)--(p)--(o)};
\end{tikzpicture}
</latex-code>
</p>
<figure>
<caption>Distance from the line <m>L</m>.</caption>
<mathbox source="demos/projection.html?u1=3,2&amp;vec=-6,4&amp;labels=u&amp;closed&amp;mode=distance" height="500px"/>
</figure>
</solution>
</example>
<example xml:id="projections-onto-line3">
<title>Projection onto a line in <m>\R^3</m></title>
<statement>
<p>
Let
<me>
x = \vec{-2 3 -1} \qquad u = \vec{-1 1 1},
</me>
and let <m>L</m> be the line spanned by <m>u</m>. Compute <m>x_L</m> and <m>x_{L^\perp}</m>.
</p>
</statement>
<solution>
<p>
<me>
x_L = \frac{x\cdot u}{u\cdot u}\,u = \frac{2+3-1}{1+1+1}\vec{-1 1 1} = \frac{4}{3}\vec{-1 1 1}
\qquad
x_{L^\perp} = x - x_L = \frac 13\vec{-2 5 -7}.
</me>
</p>
<figure>
<caption>Orthogonal projection onto the line <m>L</m>.</caption>
<mathbox source="demos/projection.html?u1=-1,1,1&amp;vec=-2,3,-1&amp;labels=u&amp;range=3.5&amp;closed" height="500px"/>
</figure>
</solution>
</example>
<p>
When <m>A</m> is a matrix with more than one column, computing the orthogonal projection of <m>x</m> onto <m>W = \Col(A)</m> means solving the matrix equation <m>A^TAc = A^Tx</m>. In other words, we can compute the closest vector by <em>solving a system of linear equations</em>. To be explicit, we state the <xref ref="projections-ATA-formula"/> as a recipe:
</p>
<bluebox>
<title>Recipe: Compute an orthogonal decomposition</title>
<idx><h>Orthogonal projection</h><h>computation of</h><h>row reduction</h></idx>
<p>
Let <m>W</m> be a subspace of <m>\R^m</m>. Here is a method to compute the orthogonal decomposition of a vector <m>x</m> with respect to <m>W</m>:
<ol start="0">
<li>
Rewrite <m>W</m> as the column space of a matrix <m>A</m>. In other words, find a spanning set for <m>W</m>, and let <m>A</m> be the matrix with those vectors as columns.
</li>
<li>
Compute the matrix <m>A^TA</m> and the vector <m>A^Tx</m>.
</li>
<li>
Form the augmented matrix for the matrix equation <m>A^TAc = A^Tx</m> in the unknown vector <m>c</m>, and row reduce.
</li>
<li>
This equation is always consistent; choose one solution <m>c</m>. Then
<me>
x_W = Ac \qquad x_{W^\perp} = x - x_W.
</me>
</li>
</ol>
</p>
</bluebox>
<example xml:id="projections-onto-xy-plane">
<title>Projection onto the <m>xy</m>-plane</title>
<statement>
<p>
Use the <xref ref="projections-ATA-formula"/> to compute the orthogonal decomposition of a vector with respect to the <m>xy</m>-plane in <m>\R^3</m>.
</p>
</statement>
<solution>
<p>
A basis for the <m>xy</m>-plane is given by the two standard coordinate vectors
<me>
e_1 = \vec{1 0 0} \qquad e_2 = \vec{0 1 0}.
</me>
Let <m>A</m> be the matrix with columns <m>e_1,e_2</m>:
<me>
A = \mat{1 0; 0 1; 0 0}.
</me>
Then
<me>
A^TA = \mat{1 0; 0 1} = I_2 \qquad
A^T\vec{x_1 x_2 x_3} = \mat{1 0 0; 0 1 0}\vec{x_1 x_2 x_3} = \vec{x_1 x_2}.
</me>
It follows that the unique solution <m>c</m> of <m>A^TAc = I_2c = A^Tx</m> is given by the first two coordinates of <m>x</m>, so
<me>
x_W = A\vec{x_1 x_2} = \mat{1 0; 0 1; 0 0}\vec{x_1 x_2} = \vec{x_1 x_2 0}
\qquad
x_{W^\perp} = x - x_W = \vec{0 0 x_3}.
</me>
We have recovered this <xref ref="projections-eg-xy-plane"/>.
</p>
</solution>
</example>
<example xml:id="projections-onto-plane">
<title>Projection onto a plane in <m>\R^3</m></title>
<statement>
<p>
Let
<me>
W = \Span\left\{\vec{1 0 -1},\;\vec{1 1 0}\right\}
\qquad
x = \vec{1 2 3}.
</me>
Compute <m>x_W</m> and the distance from <m>x</m> to <m>W</m>.
</p>
</statement>
<solution>
<p>
We have to solve the matrix equation <m>A^TAc = A^Tx</m>, where
<me>A = \mat{1 1; 0 1; -1 0}.</me>
We have
<me>
A^TA = \mat{2 1; 1 2} \qquad A^Tx = \vec{-2 3}.
</me>
We form an augmented matrix and row reduce:
<me>
\amat{2 1 -2; 1 2 3} \rref
\amat{1 0 -7/3; 0 1 8/3}
\implies
c = \frac 13\vec{-7 8}.
</me>
It follows that
<me>
x_W = Ac = \frac 13\vec{1 8 7}
\qquad
x_{W^\perp} = x - x_W = \frac 13\vec{2 -2 2}.
</me>
The distance from <m>x</m> to <m>W</m> is
<me>
\|x_{W^\perp}\| = \frac 1{3}\sqrt{4+4+4} \approx 1.155.
</me>
</p>
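<p>
As a check, the vector <m>x_{W^\perp}</m> computed above is indeed orthogonal to both spanning vectors of <m>W</m>:
<me>
\vec{2 -2 2}\cdot\vec{1 0 -1} = 2 + 0 - 2 = 0
\qquad
\vec{2 -2 2}\cdot\vec{1 1 0} = 2 - 2 + 0 = 0.
</me>
</p>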
<figure>
<caption>Orthogonal projection onto the plane <m>W</m>.</caption>
<mathbox source="demos/projection.html?u1=1,0,-1&amp;u2=-1,-2,-1&amp;vec=1,2,3&amp;labels=v1,v2&amp;range=3.5&amp;closed&amp;mode=decomp" height="500px"/>
</figure>
</solution>
</example>
<example xml:id="projections-onto-plane-2">
<title>Projection onto another plane in <m>\R^3</m></title>
<statement>
<p>
Let
<me>
W = \left\{\vec{x_1 x_2 x_3}\bigm|x_1 - 2x_2 = x_3\right\}
\sptxt{and}
x = \vec{1 1 1}.
</me>
Compute <m>x_W</m>.
</p>
</statement>
<solution>
<p>
<em>Method 1:</em> First we need to find a spanning set for <m>W</m>. We notice that <m>W</m> is the solution set of the homogeneous equation <m>x_1 - 2x_2 - x_3 = 0</m>, so <m>W = \Nul\mat{1 -2 -1}</m>. We know how to compute a basis for a null space: we row reduce and find the parametric vector form. The matrix <m>\mat{1 -2 -1}</m> is already in reduced row echelon form. The parametric form is <m>x_1 = 2x_2 + x_3</m>, so the parametric vector form is
<me>\vec{x_1 x_2 x_3} = x_2\vec{2 1 0} + x_3\vec{1 0 1},</me>
and hence a basis for <m>W</m> is given by
<me>\left\{\vec{2 1 0},\;\vec{1 0 1}\right\}.</me>
We let <m>A</m> be the matrix whose columns are our basis vectors:
<me>A = \mat{2 1; 1 0; 0 1}.</me>
Hence <m>\Col(A) = \Nul\mat{1 -2 -1} = W</m>.
</p>
<p>
Now we can continue with step 1 of the recipe. We compute
<me>
A^TA = \mat{5 2; 2 2} \qquad A^Tx = \vec{3 2}.
</me>
We write the linear system <m>A^TAc = A^Tx</m> as an augmented matrix and row reduce:
<me>
\amat{5 2 3; 2 2 2} \rref \amat{1 0 1/3; 0 1 2/3}.
</me>
Hence we can take <m>c = {1/3\choose 2/3}</m>, so
<me>
x_W = Ac = \mat{2 1; 1 0; 0 1}\vec{1/3 2/3} = \frac 13\vec{4 1 2}.
</me>
</p>
<figure>
<caption>Orthogonal projection onto the plane <m>W</m>.</caption>
<mathbox source="demos/projection.html?u1=1,1,-1&amp;u2=1,0,1&amp;vec=1,1,1&amp;labels=v1,v2&amp;range=3.5&amp;closed&amp;mode=decomp" height="500px"/>
</figure>
<p>
<em>Method 2:</em> In this case, it is easier to compute <m>x_{W^\perp}</m>. Indeed, since <m>W = \Nul\mat{1 -2 -1},</m> the orthogonal complement is the line
<me>V = W^\perp = \Col\vec{1 -2 -1}.</me>
Using the formula for <xref ref="projections-onto-line">projection onto a line</xref> gives
<me>
x_{W^\perp} = x_V = \frac{\vec{1 1 1}\cdot\vec{1 -2 -1}}{\vec{1 -2 -1}\cdot\vec{1 -2 -1}}\vec{1 -2 -1} = \frac 13\vec{-1 2 1}.
</me>
Hence we have
<me>
x_W = x - x_{W^\perp} = \vec{1 1 1} - \frac 13\vec{-1 2 1} = \frac 13\vec{4 1 2},
</me>
as above.
</p>
</solution>
</example>
<example>
<title>Projection onto a <m>3</m>-space in <m>\R^4</m></title>
<statement>
<p>
Let
<me>
W = \Span\left\{\vec{1 0 -1 0},\;\vec{0 1 0 -1},\;\vec{1 1 1 -1}\right\}
\qquad
x = \vec{0 1 3 4}.
</me>
Compute the orthogonal decomposition of <m>x</m> with respect to <m>W</m>.
</p>
</statement>
<solution>
<p>
We have to solve the matrix equation <m>A^TAc = A^Tx</m>, where
<me>
A = \mat[r]{1 0 1; 0 1 1; -1 0 1; 0 -1 -1}.
</me>
We compute
<me>
A^TA = \mat{2 0 0; 0 2 2; 0 2 4} \qquad
A^Tx = \vec{-3 -3 0}.
</me>
We form an augmented matrix and row reduce:
<me>
\amat{2 0 0 -3; 0 2 2 -3; 0 2 4 0}
\rref
\amat{1 0 0 -3/2; 0 1 0 -3; 0 0 1 3/2}
\implies
c = \frac 12\vec{-3 -6 3}.
</me>
It follows that
<me>
x_W = Ac = \frac 12\vec{0 -3 6 3} \qquad
x_{W^\perp} = \frac 12\vec{0 5 0 5}.
</me>
</p>
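<p>
As a quick check, the two pieces are orthogonal to each other, as they must be:
<me>
\frac 12\vec{0 -3 6 3}\cdot\frac 12\vec{0 5 0 5} = \frac 14(0 - 15 + 0 + 15) = 0.
</me>
</p>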
</solution>
</example>
<p>
In the context of the above recipe, if we start with a <em>basis</em> of <m>W</m>, then it turns out that the square matrix <m>A^TA</m> is automatically invertible! (It is always the case that <m>A^TA</m> is square and the equation <m>A^TAc = A^Tx</m> is consistent, but <m>A^TA</m> need not be invertible in general.)
</p>
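<p>
For instance, if the columns of <m>A</m> are linearly <em>dependent</em>, say
<me>
A = \mat{1 1; 1 1} \implies A^TA = \mat{2 2; 2 2},
</me>
then <m>A^TA</m> is not invertible, although the equation <m>A^TAc = A^Tx</m> is still consistent.
</p>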
<corollary xml:id="projections-ATA-formula2">
<idx><h>Orthogonal projection</h><h>computation of</h><h>complicated matrix formula</h></idx>
<idx><h>Basis</h><h>and orthogonal projection</h></idx>
<statement>
<p>
Let <m>A</m> be an <m>m \times n</m> matrix with linearly independent columns and let <m>W = \Col(A)</m>. Then the <m>n\times n</m> matrix <m>A^TA</m> is invertible, and for all vectors <m>x</m> in <m>\R^m</m>, we have
<me>
x_W = A(A^TA)\inv A^Tx.
</me>
</p>
</statement>
<proof visible="true">
<p>
We will show that <m>\Nul(A^TA)=\{0\}</m>, which implies invertibility by the <xref ref="imt-2"/>. Suppose that <m>A^TAc = 0</m>. Then <m>A^TAc = A^T0</m>, so <m>0_W = Ac</m> by the <xref ref="projections-ATA-formula"/>. But <m>0_W = 0</m> (the orthogonal decomposition of the zero vector is just <m>0 = 0 + 0</m>), so <m>Ac = 0</m>, and therefore <m>c</m> is in <m>\Nul(A)</m>. Since the columns of <m>A</m> are linearly independent, we have <m>c=0</m>, so <m>\Nul(A^TA)=\{0\}</m>, as desired.
</p>
<p>
Let <m>x</m> be a vector in <m>\R^m</m> and let <m>c</m> be a solution of <m>A^TAc = A^Tx</m>. Then <m>c = (A^TA)\inv A^Tx</m>, so <m>x_W = Ac = A(A^TA)\inv A^Tx</m>.
</p>
</proof>
</corollary>
<p>
The corollary applies in particular to the case where we have a subspace <m>W</m> of <m>\R^m</m>, and a basis <m>v_1,v_2,\ldots,v_n</m> for <m>W</m>. To apply the corollary, we take <m>A</m> to be the <m>m\times n</m> matrix with columns <m>v_1,v_2,\ldots,v_n</m>.
</p>
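<p>
In particular, for a line <m>L = \Span\{u\}</m> with <m>u\neq 0</m>, taking <m>A = u</m> (a single nonzero column) recovers the formula for <xref ref="projections-onto-line">projection onto a line</xref>:
<me>
x_L = u(u^Tu)\inv u^Tx = \frac{u\cdot x}{u\cdot u}\,u,
</me>
since <m>u^Tu = u\cdot u</m> is a nonzero scalar.
</p>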
<example xml:id="projections-onto-plane2">
<title>Computing a projection</title>
<statement>
<p>
Continuing with the above <xref ref="projections-onto-plane"/>, let
<me>
W = \Span\left\{\vec{1 0 -1},\;\vec{1 1 0}\right\}
\qquad
x = \vec{x_1 x_2 x_3}.
</me>
Compute <m>x_W</m> using the formula <m>x_W = A(A^TA)\inv A^Tx</m>.
</p>
</statement>
<solution>
<p>
Clearly the spanning vectors are noncollinear, so according to the <xref ref="projections-ATA-formula2"/>, we have <m>x_W = A(A^TA)\inv A^Tx</m>, where
<me>A = \mat{1 1; 0 1; -1 0}.</me>
We compute
<me>
A^TA = \mat{2 1; 1 2} \implies (A^TA)\inv = \frac 13\mat{2 -1; -1 2},
</me>
so
<me>
\spalignsysdelims()\spalignsystabspace=0pt
\begin{split}
x_W \amp= A(A^TA)\inv A^Tx
= \mat{1 1; 0 1; -1 0}\frac 13\mat{2 -1; -1 2}\mat{1 0 -1; 1 1 0}\vec{x_1 x_2 x_3} \\
\amp= \frac 13\mat{2 1 -1; 1 2 1; -1 1 2}\vec{x_1 x_2 x_3}
= \frac 13\syseq{2x_1 + x_2 - x_3; x_1 + 2x_2 + x_3; -x_1 + x_2 + 2x_3}.
\end{split}
</me>
So, for example, if <m>x=(1,0,0)</m>, this formula tells us that <m>x_W = \frac 13(2,1,-1)</m>.
</p>
</solution>
</example>
</subsection>
<subsection>
<title>Orthogonal Projection</title>
<idx><h>Orthogonal projection</h><h>as a transformation</h></idx>
<p>
In this subsection, we change perspective and think of the orthogonal projection <m>x_W</m> as a <em>function</em> of <m>x</m>. This function turns out to be a linear transformation with many nice properties, and is a good example of a linear transformation which is not originally defined as a matrix transformation.
</p>
<proposition hide-type="true" xml:id="projections-properties-of">
<title>Properties of Orthogonal Projections</title>
<idx><h>Orthogonal projection</h><h>properties of</h></idx>
<idx><h>Orthogonal projection</h><h>linearity of</h></idx>
<idx><h>Orthogonal projection</h><h>composed with itself</h></idx>
<idx><h>Orthogonal projection</h><h>range of</h></idx>
<statement>
<p>
Let <m>W</m> be a subspace of <m>\R^n</m>, and define <m>T\colon\R^n\to\R^n</m> by <m>T(x) = x_W</m>. Then:
<ol>
<li><m>T</m> is a linear transformation.</li>
<li><m>T(x)=x</m> if and only if <m>x</m> is in <m>W</m>.</li>
<li><m>T(x)=0</m> if and only if <m>x</m> is in <m>W^\perp</m>.</li>
<li><m>T\circ T = T</m>.</li>
<li>The range of <m>T</m> is <m>W</m>.</li>
</ol>
</p>
</statement>
<proof>
<p>
<ol>
<li>
We have to verify the <xref ref="linear-trans-defn">defining properties of linearity</xref>. Let <m>x,y</m> be vectors in <m>\R^n</m>, and let <m>x = x_W + x_{W^\perp}</m> and <m>y = y_W + y_{W^\perp}</m> be their orthogonal decompositions. Since <m>W</m> and <m>W^\perp</m> are subspaces, the sums <m>x_W+y_W</m> and <m>x_{W^\perp}+y_{W^\perp}</m> are in <m>W</m> and <m>W^\perp</m>, respectively. Therefore, the orthogonal decomposition of <m>x+y</m> is <m>(x_W+y_W)+(x_{W^\perp}+y_{W^\perp})</m>, so
<me>
T(x+y) = (x+y)_W = x_W+y_W = T(x) + T(y).
</me>
Now let <m>c</m> be a scalar. Then <m>cx_W</m> is in <m>W</m> and <m>cx_{W^\perp}</m> is in <m>W^\perp</m>, so the orthogonal decomposition of <m>cx</m> is <m>cx_W + cx_{W^\perp}</m>, and therefore,
<me>
T(cx) = (cx)_W = cx_W = cT(x).
</me>
Since <m>T</m> satisfies the two <xref ref="linear-trans-defn">defining properties</xref>, it is a linear transformation.
</li>
<li>
See this <xref ref="projections-eg-inW"/>.
</li>
<li>
See this <xref ref="projections-eg-inWperp"/>.
</li>
<li>
For any <m>x</m> in <m>\R^n</m> the vector <m>T(x)</m> is in <m>W</m>, so <m>T\circ T(x) = T(T(x)) = T(x)</m> by 2.
</li>
<li>
Any vector <m>x</m> in <m>W</m> is in the range of <m>T</m>, because <m>T(x) = x</m> for such vectors. On the other hand, for any vector <m>x</m> in <m>\R^n</m> the output <m>T(x) = x_W</m> is in <m>W</m>, so <m>W</m> is the range of <m>T</m>.
</li>
</ol>
</p>
</proof>
</proposition>
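<p>
For instance, if <m>T</m> is the orthogonal projection onto the <m>xy</m>-plane in <m>\R^3</m>, as in this <xref ref="projections-eg-xy-plane"/>, then <m>T(a,b,c) = (a,b,0)</m>, and applying <m>T</m> a second time gives <m>T(a,b,0) = (a,b,0)</m> again, as property 4 predicts.
</p>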
<p>
<idx><h>Orthogonal projection</h><h>standard matrix of</h></idx>
<idx><h>Linear transformation</h><h>standard matrix of</h><h>orthogonal projection</h></idx>
We compute the standard matrix of the orthogonal projection in the same way as for <xref ref="matrix-of-transformation" text="title">any other transformation</xref>: by evaluating on the standard coordinate vectors. In this case, this means projecting the standard coordinate vectors onto the subspace.
</p>
<example>
<title>Matrix of a projection</title>
<statement>
<p>
Let <m>L</m> be the line in <m>\R^2</m> spanned by the vector <m>u = {3\choose 2}</m>, and define <m>T\colon\R^2\to\R^2</m> by <m>T(x)=x_L</m>. Compute the standard matrix <m>B</m> for <m>T</m>.
</p>
</statement>
<solution>
<p>
The columns of <m>B</m> are <m>T(e_1) = (e_1)_L</m> and <m>T(e_2) = (e_2)_L</m>. We have
<me>
\left.
\begin{split}
(e_1)_L \amp= \frac{u\cdot e_1}{u\cdot u}\,u
= \frac{3}{13}\vec{3 2} \\
(e_2)_L \amp= \frac{u\cdot e_2}{u\cdot u}\,u
= \frac{2}{13}\vec{3 2}
\end{split}
\right\} \quad\implies\quad
B = \frac 1{13}\mat{9 6; 6 4}.
</me>
</p>
</solution>
</example>
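<p>
Alternatively, since <m>\{u\}</m> is a basis for <m>L</m>, the <xref ref="projections-ATA-formula2"/> applied with <m>A = u</m> produces the same matrix directly:
<me>
B = u(u^Tu)\inv u^T = \frac 1{u\cdot u}\,uu^T = \frac 1{13}\mat{9 6; 6 4}.
</me>
</p>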
<example>
<title>Matrix of a projection</title>
<statement>
<p>
Let <m>L</m> be the line in <m>\R^3</m> spanned by the vector
<me>
u = \vec{-1 1 1},
</me>
and define <m>T\colon\R^3\to\R^3</m> by <m>T(x)=x_L</m>. Compute the standard matrix <m>B</m> for <m>T</m>.
</p>
</statement>
<solution>
<p>
The columns of <m>B</m> are <m>T(e_1) = (e_1)_L</m>, <m>T(e_2) = (e_2)_L</m>, and <m>T(e_3) = (e_3)_L</m>. We have
<me>
\left.
\begin{split}
(e_1)_L \amp= \frac{u\cdot e_1}{u\cdot u}\,u
= \frac{-1}{3}\vec{-1 1 1} \\
(e_2)_L \amp= \frac{u\cdot e_2}{u\cdot u}\,u
= \frac{1}{3}\vec{-1 1 1} \\
(e_3)_L \amp= \frac{u\cdot e_3}{u\cdot u}\,u
= \frac{1}{3}\vec{-1 1 1}
\end{split}
\right\} \quad\implies\quad
B = \frac 1{3}\mat{1 -1 -1; -1 1 1; -1 1 1}.
</me>
</p>
</solution>
</example>
<example xml:id="projections-onto-plane3">
<title>Matrix of a projection</title>
<statement>
<p>
Continuing with this <xref ref="projections-onto-plane"/>, let
<me>
W = \Span\left\{\vec{1 0 -1},\;\vec{1 1 0}\right\},
</me>
and define <m>T\colon\R^3\to\R^3</m> by <m>T(x)=x_W</m>. Compute the standard matrix <m>B</m> for <m>T</m>.
</p>
</statement>
<solution>
<p>
The columns of <m>B</m> are <m>T(e_1) = (e_1)_W</m>, <m>T(e_2) = (e_2)_W</m>, and <m>T(e_3) = (e_3)_W</m>. Let
<me>
A = \mat{1 1; 0 1; -1 0}.
</me>
To compute each <m>(e_i)_W</m>, we solve the matrix equation <m>A^TAc = A^Te_i</m> for <m>c</m>, then use the equality <m>(e_i)_W = Ac</m>. First we note that
<me>
A^TA = \mat{2 1; 1 2};
\qquad
A^Te_i = \text{the $i$th column of } A^T = \mat{1 0 -1; 1 1 0}.
</me>
For <m>e_1</m>, we form an augmented matrix and row reduce:
<me>
\amat{2 1 1; 1 2 1} \rref
\amat{1 0 1/3; 0 1 1/3}
\implies
(e_1)_W = A\vec{1/3 1/3} = \frac 13\vec{2 1 -1}.
</me>
We do the same for <m>e_2</m>:
<me>
\amat{2 1 0; 1 2 1} \rref
\amat{1 0 -1/3; 0 1 2/3}
\implies
(e_2)_W = A\vec{-1/3 2/3} = \frac 13\vec{1 2 1}
</me>
and for <m>e_3</m>:
<me>
\amat{2 1 -1; 1 2 0} \rref
\amat{1 0 -2/3; 0 1 1/3}
\implies
(e_3)_W = A\vec{-2/3 1/3} = \frac 13\vec{-1 1 2}.
</me>
It follows that
<me>
B = \frac 13\mat{2 1 -1; 1 2 1; -1 1 2}.
</me>
</p>
</solution>
</example>
<p>
In the previous <xref ref="projections-onto-plane3"/>, we could have used the fact that
<me>
\left\{\vec{1 0 -1},\;\vec{1 1 0}\right\}
</me>
forms a <em>basis</em> for <m>W</m>, so that
<me>
T(x) = x_W = \bigl[A(A^TA)\inv A^T\bigr]x
\sptxt{for}
A = \mat{1 1; 0 1; -1 0}
</me>
by the <xref ref="projections-ATA-formula2"/>. In this case, we have already expressed <m>T</m> as a matrix transformation with matrix <m>A(A^TA)\inv A^T</m>. See this <xref ref="projections-onto-plane2"/>.
</p>
<bluebox>
<idx><h>Orthogonal projection</h><h>standard matrix of</h><h>complicated matrix formula</h></idx>
<p>
Let <m>W</m> be a subspace of <m>\R^n</m> with basis <m>v_1,v_2,\ldots,v_m</m>, and let <m>A</m> be the matrix with columns <m>v_1,v_2,\ldots,v_m</m>. Then the standard matrix for <m>T(x) = x_W</m> is
<me>
A(A^TA)\inv A^T.
</me>
</p>
</bluebox>
<p>
We can translate the above <xref ref="projections-properties-of">properties of orthogonal projections</xref> into properties of the associated standard matrix.
</p>
<proposition hide-type="true" xml:id="projections-matrices-properties-of">
<title>Properties of Projection Matrices</title>
<idx><h>Projection matrix</h><see>Orthogonal projection, standard matrix of</see></idx>
<idx><h>Matrix</h><h>projection</h><see>Orthogonal projection, standard matrix of</see></idx>
<idx><h>Orthogonal projection</h><h>standard matrix of</h><h>properties of</h></idx>
<idx><h>Orthogonal projection</h><h>standard matrix of</h><h>column space of</h></idx>
<idx><h>Column space</h><h>of an orthogonal projection</h></idx>
<idx><h>Orthogonal projection</h><h>standard matrix of</h><h>null space of</h></idx>
<idx><h>Null space</h><h>of an orthogonal projection</h></idx>
<idx><h>Orthogonal projection</h><h>standard matrix of</h><h>square of</h></idx>
<idx><h>Orthogonal projection</h><h>standard matrix of</h><h>diagonalizability of</h></idx>
<idx><h>Orthogonal projection</h><h>standard matrix of</h><h>eigenvalues of</h></idx>
<idx><h>Orthogonal projection</h><h>standard matrix of</h><h>eigenvectors of</h></idx>
<idx><h>Orthogonal projection</h><h>standard matrix of</h><h>noninvertibility of</h></idx>
<idx><h>Eigenvalue</h><h>of a projection matrix</h></idx>
<idx><h>Eigenvector</h><h>of a projection matrix</h></idx>
<idx><h>Eigenspace</h><h>of a projection matrix</h></idx>
<idx><h>Diagonalizability</h><h>of a projection matrix</h></idx>
<statement>
<p>
Let <m>W</m> be a subspace of <m>\R^n</m>, define <m>T\colon\R^n\to\R^n</m> by <m>T(x) = x_W</m>, and let <m>B</m> be the standard matrix for <m>T</m>. Then:
<ol>
<li><m>\Col(B) = W.</m></li>
<li><m>\Nul(B) = W^\perp.</m></li>
<li><m>B^2 = B.</m></li>
<li>If <m>W \neq \{0\}</m>, then 1 is an eigenvalue of <m>B</m> and the 1-eigenspace for <m>B</m> is <m>W</m>.</li>
<li>If <m>W \neq \R^n</m>, then 0 is an eigenvalue of <m>B</m> and the 0-eigenspace for <m>B</m> is <m>W^\perp</m>.</li>
<li><m>B</m> is similar to the diagonal matrix with <m>m</m> ones and <m>n-m</m> zeros on the diagonal, where <m>m = \dim(W).</m></li>
</ol>
</p>
</statement>
<proof>
<p>
The first four assertions are translations of <xref ref="projections-properties-of">properties 5, 3, 4, and 2</xref>, respectively, using this <xref ref="matrix-trans-dictionary"/> and this <xref ref="matrix-mult-comp-is-prod"/>. The fifth assertion is equivalent to the second, by this <xref ref="evecs-eval0"/>.
</p>
<p>
For the final assertion, we showed in the proof of this <xref ref="projections-decomp-exists"/> that there is a basis of <m>\R^n</m> of the form <m>\{v_1,\ldots,v_m,v_{m+1},\ldots,v_n\}</m>, where <m>\{v_1,\ldots,v_m\}</m> is a basis for <m>W</m> and <m>\{v_{m+1},\ldots,v_n\}</m> is a basis for <m>W^\perp</m>. Each <m>v_i</m> is an eigenvector of <m>B</m>: indeed, for <m>i\leq m</m> we have
<me>
Bv_i = T(v_i) = v_i = 1\cdot v_i
</me>
because <m>v_i</m> is in <m>W</m>, and for <m>i > m</m> we have
<me>
Bv_i = T(v_i) = 0 = 0\cdot v_i
</me>
because <m>v_i</m> is in <m>W^\perp</m>. Therefore, we have found a basis of eigenvectors, with associated eigenvalues <m>1,\ldots,1,0,\ldots,0</m> (<m>m</m> ones and <m>n-m</m> zeros). Now we use the <xref ref="diagonalization-thm"/>.
</p>
</proof>
</proposition>
<p>We emphasize that the <xref ref="projections-matrices-properties-of">properties of projection matrices</xref> would be very hard to prove in terms of matrices. By translating all of the statements into statements about linear transformations, they become much more transparent. For example, consider the projection matrix we found in this <xref ref="projections-onto-plane3"/>. Just by looking at the matrix it is not at all obvious that when you square the matrix you get the same matrix back.</p>
<example xml:id="projections-onto-plane4">
<p>
Continuing with the above <xref ref="projections-onto-plane3"/>, we showed that
<me>
B = \frac 13\mat{2 1 -1; 1 2 1; -1 1 2}
</me>
is the standard matrix of the orthogonal projection onto
<me>
W = \Span\left\{\vec{1 0 -1},\;\vec{1 1 0}\right\}.
</me>
One can verify by hand that <m>B^2=B</m> (try it!). We compute <m>W^\perp</m> as the null space of
<me>
\mat{1 0 -1; 1 1 0} \rref \mat{1 0 -1; 0 1 1}.
</me>
The free variable is <m>x_3</m>, and the parametric form is <m>x_1 = x_3,\,x_2 = -x_3</m>, so that
<me>
W^\perp = \Span\left\{\vec{1 -1 1}\right\}.
</me>
It follows that <m>B</m> has eigenvectors
<me>
\vec{1 0 -1},\qquad \vec{1 1 0},\qquad \vec{1 -1 1}
</me>
with eigenvalues <m>1,1,0</m>, respectively, so that
<me>
B = \mat{1 1 1; 0 1 -1; -1 0 1}\mat{1 0 0; 0 1 0; 0 0 0}\mat{1 1 1; 0 1 -1; -1 0 1}\inv.
</me>
</p>
</example>
<remark>
<p>
As we saw in this <xref ref="projections-onto-plane4"/>, if you are willing to compute bases for <m>W</m> and <m>W^\perp</m>, then this provides a third way of finding the standard matrix <m>B</m> for projection onto <m>W</m>: indeed, if <m>\{v_1,v_2,\ldots,v_m\}</m> is a basis for <m>W</m> and <m>\{v_{m+1},v_{m+2},\ldots,v_n\}</m> is a basis for <m>W^\perp</m>, then
<me>
B = \mat{| | ,, |; v_1 v_2 \cdots, v_n; | | ,, |}
\mat{
1 \cdots, 0 0 \cdots, 0;
\vdots, \ddots, \vdots, \vdots, \ddots, \vdots;
0 \cdots, 1 0 \cdots, 0;
0 \cdots, 0 0 \cdots, 0;
\vdots, \ddots, \vdots, \vdots, \ddots, \vdots;
0 \cdots, 0 0 \cdots, 0}
\mat{| | ,, |; v_1 v_2 \cdots, v_n; | | ,, |}\inv,
</me>
where the middle matrix in the product is the diagonal matrix with <m>m</m> ones and <m>n-m</m> zeros on the diagonal. However, since you already have a basis for <m>W</m>, it is faster to multiply out the expression <m>A(A^TA)\inv A^T</m> as in the <xref ref="projections-ATA-formula2"/>.
</p>
</remark>
<remark>
<title>Reflections</title>
<idx><h>Reflection</h><h>in general</h></idx>
<p>
Let <m>W</m> be a subspace of <m>\R^n</m>, and let <m>x</m> be a vector in <m>\R^n</m>. The <em>reflection</em> of <m>x</m> over <m>W</m> is defined to be the vector
<me>
\refl_W(x) = x - 2x_{W^\perp}.
</me>
In other words, to find <m>\refl_W(x)</m> one starts at <m>x</m>, then moves to <m>x-x_{W^\perp} = x_W</m>, then continues in the same direction one more time, to end on the opposite side of <m>W</m>.
<latex-code>
\begin{tikzpicture}[thin border nodes]
\draw[seq-violet] (-3,-2) -- node[below right, very near start] {$W$} (3,2);
\coordinate (x) at (-3,2);
\coordinate (o) at (0,0);
\draw[vector,seq-red] (o) -- node[auto,swap] {$x$} (x);
\point[seq-blue, "$x_W$" {seq-blue, below, yshift=-2mm}] (p) at (${-2.5/(1.5*1.5+1)}*(1.5,1)$);
\coordinate (p2) at ($-1*(x)+2*(p)$);
\draw[vector, seq-green] (x) -- node[below left] {$-x_{W^\perp}$} (p);
\draw[vector, seq-green] (p) -- node[below left] {$-x_{W^\perp}$} (p2);
\pic[draw] {right angle=(o)--(p)--(p2)};
\draw[vector, seq-orange] (o) -- node[right=2pt] {$\refl_W(x)$} (p2);
\point at (o);
\end{tikzpicture}
</latex-code>
Since <m>x_{W^\perp} = x - x_W</m>, we also have
<me>
\refl_W(x) = x - 2(x - x_W) = 2x_W - x.
</me>
We leave it to the reader to check using the definition that:
<ol>
<li><m>\refl_W\circ\refl_W = \Id_{\R^n}.</m></li>
<li>The <m>1</m>-eigenspace of <m>\refl_W</m> is <m>W</m>, and the <m>-1</m>-eigenspace of <m>\refl_W</m> is <m>W^\perp</m>.</li>
<li><m>\refl_W</m> is similar to the diagonal matrix with <m>m = \dim(W)</m> ones on the diagonal and <m>n-m</m> negative ones.</li>
</ol>
</p>
</remark>
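<p>
For example, let <m>W</m> and <m>x</m> be the plane and vector from this <xref ref="projections-onto-plane"/>, where we computed
<me>
x_W = \frac 13\vec{1 8 7} \qquad x_{W^\perp} = \frac 13\vec{2 -2 2}.
</me>
Then
<me>
\refl_W(x) = 2x_W - x = \frac 23\vec{1 8 7} - \vec{1 2 3} = \frac 13\vec{-1 10 5}.
</me>
</p>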
</subsection>
</section>