<?xml version="1.0" encoding="UTF-8"?>

<!--********************************************************************
Copyright 2017 Georgia Institute of Technology

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation.  A copy of
the license is included in gfdl.xml.
*********************************************************************-->

<section xml:id="linear-transformations">
  <title>Linear Transformations</title>

  <objectives>
    <ol>
      <li>Learn how to verify that a transformation is linear, or prove that a transformation is not linear.</li>
      <li>Understand the relationship between linear transformations and matrix transformations.</li>
      <li><em>Recipe:</em> compute the matrix of a linear transformation.</li>
      <li><em>Theorem:</em> linear transformations and matrix transformations.</li>
      <li><em>Notation:</em> the <term>standard coordinate vectors</term> <m>e_1,e_2,\ldots</m>.</li>
      <li><em>Vocabulary words:</em> <term>linear transformation</term>, <term>standard matrix</term>, <term>identity matrix</term>.</li>
    </ol>
  </objectives>

  <introduction>
    <p>
      In <xref ref="matrix-transformations"/>, we studied the geometry of matrices by regarding them as functions, i.e., by considering the associated <em>matrix transformations</em>.  We defined some vocabulary (domain, codomain, range), and asked a number of natural questions about a transformation.  For a matrix transformation, these translate into questions about matrices, which we have many tools to answer.
    </p>
    <p>
      In this section, we make a change in perspective.  Suppose that we are given a <em>transformation</em> that we would like to study.  If we can prove that our transformation is a matrix transformation, then we can use linear algebra to study it.  This raises two important questions:
      <ol>
        <li>
          How can we tell if a transformation is a matrix transformation?
        </li>
        <li>
          If our transformation is a matrix transformation, how do we find its matrix?
        </li>
      </ol>
      For example, we saw in this <xref ref="matrix-trans-eg-rotation"/> that the matrix transformation
      <me>T\colon\R^2\To\R^2 \qquad T(x) = \mat{0 -1; 1 0}x</me>
      is a counterclockwise rotation of the plane by <m>90^\circ</m>.  However, we could have <em>defined</em> <m>T</m> in this way:
      <me>T\colon\R^2\To\R^2 \qquad T(x) = \text{the counterclockwise rotation of $x$ by $90^\circ$}.</me>
      Given this definition, it is not at all obvious that <m>T</m> is a matrix transformation, or what matrix it is associated to.
    </p>
  </introduction>

  <subsection>
    <title>Linear Transformations: Definition</title>

    <p>
      In this section, we introduce the class of transformations that come from matrices.
    </p>

    <definition xml:id="linear-trans-defn">
      <idx><h>Linear transformation</h><h>definition of</h></idx>
      <idx><h>Transformation</h><h>linear</h><see>Linear transformation</see></idx>
      <statement>
        <p>
          A <term>linear transformation</term> is a transformation <m>T\colon\R^n\to\R^m</m> satisfying
          <md>
            <mrow>T(u + v) &amp;= T(u) + T(v)</mrow>
            <mrow>T(cu) &amp;= cT(u)</mrow>
          </md>
          for all vectors <m>u,v</m> in <m>\R^n</m> and all scalars <m>c</m>.
        </p>
      </statement>
    </definition>

    <p>
      <idx><h>Matrix transformation</h><h>linearity of</h></idx>
      Let <m>T\colon\R^n\to\R^m</m> be a matrix transformation: <m>T(x)=Ax</m> for an <m>m\times n</m> matrix <m>A</m>.  By this <xref ref="matrix-linearity"/>, we have
      <md>
        <mrow>T(u + v) = A(u + v) &amp;= Au + Av = T(u) + T(v)</mrow>
        <mrow>T(cu) = A(cu) &amp;= cAu = cT(u)</mrow>
      </md>
      for all vectors <m>u,v</m> in <m>\R^n</m> and all scalars <m>c</m>.  Since a matrix transformation satisfies the two defining properties, it is a linear transformation.
    </p>
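    <p>
      The computation above can be spot-checked numerically.  The following Python sketch is an illustration only: the matrix <c>A</c>, the vectors <c>u</c> and <c>v</c>, the scalar <c>c</c>, and the helper names <c>mat_vec</c>, <c>add</c>, and <c>scale</c> are our own choices rather than anything fixed by the text, and testing one sample is of course not a proof.
    </p>

```python
# Spot-check (not a proof) that T(x) = Ax satisfies both defining properties.
# The matrix A and the sample vectors below are illustrative assumptions.

def mat_vec(A, x):
    """Return the matrix-vector product Ax, with A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def add(u, v):
    """Return the vector sum u + v."""
    return [ui + vi for ui, vi in zip(u, v)]

def scale(c, u):
    """Return the scalar multiple cu."""
    return [c * ui for ui in u]

A = [[1, 2, 3],
     [4, 5, 6]]          # a 2 x 3 matrix, so T maps R^3 to R^2

def T(x):
    return mat_vec(A, x)

u = [1, 0, -2]
v = [3, 5, 7]
c = 4

additive = T(add(u, v)) == add(T(u), T(v))        # first property
homogeneous = T(scale(c, u)) == scale(c, T(u))    # second property
print(additive, homogeneous)
```

    <p>
      Integer entries are used so that the comparisons are exact; with floating-point entries one would compare up to a tolerance instead.
    </p>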

    <p>
      We will see in the next <xref ref="linear-trans-is-matrix-trans"/> that the converse is true: every linear transformation is a matrix transformation; we just haven't computed its matrix yet.
    </p>

    <fact hide-type="true">
      <title>Facts about linear transformations</title>
      <idx><h>Linear transformation</h><h>basic facts</h></idx>
      <statement>
        <p>
          Let <m>T\colon\R^n\to\R^m</m> be a linear transformation.  Then:
          <ol>
            <li>
              <m>T(0)=0</m>.
            </li>
            <li>
              For any vectors <m>v_1,v_2,\ldots,v_k</m> in <m>\R^n</m> and scalars <m>c_1,c_2,\ldots,c_k</m>, we have
              <me>T\bigl( c_1v_1 + c_2v_2 + \cdots + c_kv_k \bigr)
              = c_1 T(v_1) + c_2 T(v_2) + \cdots + c_k T(v_k).</me>
            </li>
          </ol>
        </p>
      </statement>
      <proof>
        <p>
          <ol>
            <li>
              Since <m>0 = -0</m>, we have
              <me>T(0) = T(-0) = -T(0)</me>
              by the second <xref ref="linear-trans-defn">defining property</xref>.  The only vector <m>w</m> such that <m>w=-w</m> is the zero vector.
            </li>
            <li>
              Let us suppose for simplicity that <m>k=2</m>.  Then
              <md>
                <mrow>
                  T(c_1v_1 + c_2v_2) &amp;= T(c_1v_1) + T(c_2v_2) &amp; \text{first property}
                </mrow>
                <mrow>
                  &amp;= c_1T(v_1) + c_2T(v_2) &amp; \text{second property.}
                </mrow>
              </md>
            </li>
          </ol>
        </p>
      </proof>
    </fact>
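    <p>
      The proof of the second fact treats only <m>k=2</m>.  Larger values of <m>k</m> follow by peeling off one term at a time.  As a sketch of how the general induction would go, here is the case <m>k=3</m>:
      <md>
        <mrow>
          T(c_1v_1 + c_2v_2 + c_3v_3) &amp;= T(c_1v_1 + c_2v_2) + T(c_3v_3) &amp; \text{first property}
        </mrow>
        <mrow>
          &amp;= c_1T(v_1) + c_2T(v_2) + T(c_3v_3) &amp; \text{case $k=2$}
        </mrow>
        <mrow>
          &amp;= c_1T(v_1) + c_2T(v_2) + c_3T(v_3) &amp; \text{second property.}
        </mrow>
      </md>
    </p>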

    <p>
      <idx><h>Superposition principle</h></idx>
      In engineering, the second fact is called the <em>superposition principle</em>; it should remind you of the distributive property.
      For example, <m>T(cu + dv) = cT(u) + dT(v)</m> for any vectors <m>u,v</m> and any scalars <m>c,d</m>.  To restate the first fact:
    </p>

    <bluebox xml:id="linear-trans-T0-0">
      <p>A linear transformation necessarily takes the zero vector to the zero vector.</p>
    </bluebox>

    <example xml:id="linear-trans-eg-T0-non0">
      <title>A non-linear transformation</title>
      <statement>
        <p>
          Define <m>T\colon\R\to\R</m> by <m>T(x) = x+1</m>.  Is <m>T</m> a linear transformation?
        </p>
      </statement>
      <solution>
        <p>
          We have <m>T(0) = 0 + 1 = 1</m>.  Since any linear transformation necessarily takes zero to zero by the above <xref ref="linear-trans-T0-0"/>, we conclude that <m>T</m> is <em>not</em> linear (even though its graph is a line).
        </p>
        <p>
          <em>Note:</em> in this case, it was not necessary to test either of the <xref ref="linear-trans-defn">defining properties</xref> directly: since <m>T(0)=0</m> is a consequence of these properties, at least one of them must fail.  (In fact, this <m>T</m> satisfies neither.)
        </p>
      </solution>
    </example>
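    <p>
      To see concretely that <m>T(x)=x+1</m> fails each defining property, one counterexample per property is enough.  The values <m>u=v=1</m> and <m>c=2</m> below are an arbitrary choice; many others work equally well.
      <md>
        <mrow>T(1 + 1) &amp;= 3 \neq 4 = T(1) + T(1)</mrow>
        <mrow>T(2\cdot 1) &amp;= 3 \neq 4 = 2\,T(1).</mrow>
      </md>
    </p>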

    <example>
      <title>Verifying linearity: dilation</title>
      <statement>
        <p>Define <m>T\colon\R^2\to\R^2</m> by <m>T(x)=1.5x</m>.  Verify that <m>T</m> is linear.</p>
      </statement>
      <solution>
        <p>
          We have to check the <xref ref="linear-trans-defn">defining properties</xref> for <em>all</em> vectors <m>u,v</m> and <em>all</em> scalars <m>c</m>.  In other words, we have to treat <m>u,v,</m> and <m>c</m> as <em>unknowns</em>.  The only thing we are allowed to use is the definition of <m>T</m>.
          <md>
            <mrow>
              T(u + v) &amp;= 1.5(u + v) = 1.5u + 1.5v = T(u) + T(v)
            </mrow>
            <mrow>
              T(cu) &amp;= 1.5(cu) = c(1.5u) = cT(u).
            </mrow>
          </md>
          Since <m>T</m> satisfies both defining properties, <m>T</m> is linear.
        </p>
        <p>
          <em>Note:</em> we know from this <xref ref="matrix-trans-eg-dilation"/> that <m>T</m> is a matrix transformation: in fact,
          <me>T(x) = \mat{1.5 0; 0 1.5}x.</me>
          Since a matrix transformation is a linear transformation, this is another proof that <m>T</m> is linear.
        </p>
      </solution>
    </example>

    <example>
      <title>Verifying linearity: rotation</title>
      <statement>
        <p>
          Define <m>T\colon\R^2\to\R^2</m> by
          <me>T(x) = \text{ the vector $x$ rotated counterclockwise by the angle $\theta$}.</me>
          Verify that <m>T</m> is linear.
        </p>
      </statement>
      <solution>
        <p>
          Since <m>T</m> is defined geometrically, we give a geometric argument.
          For the first property, <m>T(u) + T(v)</m> is the sum of the vectors obtained by rotating <m>u</m> and <m>v</m> by <m>\theta</m>.  On the other side of the equation, <m>T(u+v)</m> is the vector obtained by rotating the sum of the vectors <m>u</m> and <m>v</m>.  But it does not matter whether we sum or rotate first, as the following picture shows.
          <latex-code mode="bare">
            \usetikzlibrary{angles}
          </latex-code>
          <latex-code>
\begin{tikzpicture}[scale=3, thin border nodes]
  \draw[->] (-.3,0) -- (1.3,0);
  \draw[->] (0,-.3) -- (0,1);

  \draw[vector, seq-blue]  (0,0) to["$u$"'] (.8,.2);
  \draw[vector, seq-green] (.8, .2) to["$v$"'] (1.1, .7);
  \draw[vector, seq-red] (0, 0) to["$u+v$"] (1.1, .7);

  \draw[->, thick] (1.3, .5) to[bend left, "$T$"] (2.2, .5);

  \begin{scope}[xshift=2.5cm]
    \draw[->] (-.3,0) -- (1.3,0);
    \draw[->] (0,-.3) -- (0,1);

    \draw[seq-blue, vector, very thin, opacity=.5] (0, 0) -- (.8, .2);

    \begin{scope}[rotate=24]
      \coordinate (C) at (.8, .2);
      \draw[vector, seq-blue]  (0,0) to["$T(u)$"' {whitebg, behind path, pos=.9}] (.8,.2);
      \draw[vector, seq-green] (.8, .2) to["$T(v)$"'] (1.1, .7);
      \draw[vector, seq-red] (0, 0) to["$T(u+v)$" {whitebg, behind path}] (1.1, .7);
    \end{scope}

    \coordinate (A) at (.8, .2);
    \coordinate (B) at (0, 0);

    \pic[draw, "$\theta$" font=\tiny, angle radius=1cm, angle eccentricity=.8] {angle=A--B--C};

  \end{scope}
\end{tikzpicture}
          </latex-code>
          For the second property, <m>cT(u)</m> is the vector obtained by rotating <m>u</m> by the angle <m>\theta</m>, then changing its length by a factor of <m>c</m> (reversing direction if <m>c&lt;0</m>).  On the other hand, <m>T(cu)</m> first changes the length of <m>u</m> by a factor of <m>c</m>, then rotates.  But it does not matter in which order we do these two operations.
          <latex-code mode="bare">
            \usetikzlibrary{angles}
          </latex-code>
          <latex-code>
\begin{tikzpicture}[scale=3, thin border nodes]
  \draw[->] (-.3,0) -- (1.3,0);
  \draw[->] (0,-.3) -- (0,.8);

  \draw[vector, seq-blue]  (0,0) to["$u$"] (.8,.2);
  \draw[vector, seq-red, thin] (0, 0) to["$cu$" {pos=.9, below right=2pt}] ($1.3*(.8, .2)$);

  \draw[->, thick] (1.3, .5) to[bend left, "$T$"] (2.2, .5);

  \begin{scope}[xshift=2.5cm]
    \draw[->] (-.3,0) -- (1.3,0);
    \draw[->] (0,-.3) -- (0,.8);

    \draw[seq-blue, vector, very thin, opacity=.5] (0, 0) -- (.8, .2);

    \begin{scope}[rotate=24]
      \coordinate (C) at (.8, .2);
      \draw[vector, seq-blue]  (0,0) to["$T(u)$" {whitebg, behind path}] (.8,.2);
      \draw[vector, seq-red, thin] (0, 0) to["$T(cu)$" {pos=.9, below right}] ($1.3*(.8, .2)$);
    \end{scope}

    \coordinate (A) at (.8, .2);
    \coordinate (B) at (0, 0);

    \pic[draw, "$\theta$" font=\tiny, angle radius=1cm, angle eccentricity=.8] {angle=A--B--C};

  \end{scope}
\end{tikzpicture}
          </latex-code>
          This verifies that <m>T</m> is a linear transformation.  We will find its matrix in the next <xref ref="linear-trans-is-matrix-trans"/>.  Note however that it is not at all obvious that <m>T</m> can be expressed as multiplication by a matrix.
        </p>
      </solution>
    </example>

    <example>
      <title>A transformation defined by a formula</title>
      <statement>
        <p>
          Define <m>T\colon\R^2\to\R^3</m> by the formula
          <me>T\vec{x y} = \vec{3x-y y x}.</me>
          Verify that <m>T</m> is linear.
        </p>
      </statement>
      <solution>
        <p>
          We have to check the <xref ref="linear-trans-defn">defining properties</xref> for <em>all</em> vectors <m>u,v</m> and <em>all</em> scalars <m>c</m>.  In other words, we have to treat <m>u,v,</m> and <m>c</m> as <em>unknowns</em>; the only thing we are allowed to use is the definition of <m>T</m>.  Since <m>T</m> is defined in terms of the coordinates of <m>u,v</m>, we need to give those names as well; say <m>u={x_1\choose y_1}</m> and <m>v={x_2\choose y_2}</m>.  For the first property, we have
          <me>
            \begin{split}
              T\left(\vec{x_1 y_1} + \vec{x_2 y_2}\right)
              &amp;= T\vec{x_1+x_2 y_1+y_2}
              = \vec{3(x_1+x_2)-(y_1+y_2) y_1+y_2 x_1+x_2} \\
              &amp;= \vec{(3x_1-y_1)+(3x_2-y_2) y_1+y_2 x_1+x_2} \\
              &amp;= \vec{3x_1-y_1 y_1 x_1} + \vec{3x_2-y_2 y_2 x_2}
              = T\vec{x_1 y_1} + T\vec{x_2 y_2}.
            \end{split}
          </me>
          For the second property,
          <me>
            \begin{split}
            T\left(c\vec{x_1 y_1}\right)
            &amp;= T\vec{cx_1 cy_1}
            = \vec{3(cx_1)-(cy_1) cy_1 cx_1} \\
            &amp;= \vec{c(3x_1-y_1) cy_1 cx_1}
            = c\vec{3x_1-y_1 y_1 x_1}
            = cT\vec{x_1 y_1}.
            \end{split}
          </me>
          Since <m>T</m> satisfies the <xref ref="linear-trans-defn">defining properties</xref>, <m>T</m> is a linear transformation.
        </p>
        <p>
          <em>Note:</em> we will see in this <xref ref="linear-trans-matrix-formula"/> below that
          <me>T\vec{x y} = \mat{3 -1; 0 1; 1 0}\vec{x y}.</me>
          Hence <m>T</m> is in fact a matrix transformation.
        </p>
      </solution>
    </example>

    <p>
      <idx><h>Linear transformation</h><h>when defined by a formula</h></idx>
      One can show that, if a transformation is defined by formulas in the coordinates as in the above example, then the transformation is linear if and only if each coordinate is a linear expression in the variables with no constant term.
    </p>

    <example>
      <title>A translation</title>
      <p>
        Define <m>T\colon\R^3\to\R^3</m> by
        <me>T(x) = x + \vec{1 2 3}.</me>
        This kind of transformation is called a <term>translation</term>.
        As in a previous <xref ref="linear-trans-eg-T0-non0"/>, this <m>T</m>
        is not linear, because <m>T(0)</m> is not the zero vector.
      </p>
    </example>

    <example>
      <title>More non-linear transformations</title>
      <idx><h>Linear transformation</h><h>verifying nonlinearity</h></idx>
      <statement>
        <p>
          Verify that the following transformations from <m>\R^2</m> to <m>\R^2</m> are not linear:
          <me>
            T_1\vec{x y} = \vec{|x| y} \qquad
            T_2\vec{x y} = \vec{xy y} \qquad
            T_3\vec{x y} = \vec{2x+1 x-2y}.
          </me>
        </p>
      </statement>
      <solution>
        <p>
          In order to verify that a transformation <m>T</m> is <em>not</em> linear, we have to show that <m>T</m> does not satisfy <em>at least one</em> of the two <xref ref="linear-trans-defn">defining properties</xref>.  For the first, the negation of the statement <q><m>T(u+v)=T(u)+T(v)</m> for all vectors <m>u,v</m></q> is <q>there exists at least one pair of vectors <m>u,v</m> such that <m>T(u+v)\neq T(u)+T(v)</m>.</q>  In other words, it suffices to find <em>one example</em> of a pair of vectors <m>u,v</m> such that <m>T(u+v)\neq T(u)+T(v)</m>.  Likewise, for the second, the negation of the statement <q><m>T(cu) = cT(u)</m> for all vectors <m>u</m> and all scalars <m>c</m></q> is <q>there exists some vector <m>u</m> and some scalar <m>c</m> such that <m>T(cu)\neq cT(u)</m>.</q>  In other words, it suffices to find <em>one</em> vector <m>u</m> and <em>one</em> scalar <m>c</m> such that <m>T(cu)\neq cT(u)</m>.
        </p>
        <p>
          For the first transformation, we note that
          <me>T_1\left(-\vec{1 0}\right) = T_1\vec{-1 0} = \vec{|-1| 0} = \vec{1 0}</me>
          but that
          <me>-T_1\vec{1 0} = -\vec{|1| 0} = -\vec{1 0} = \vec{-1 0}.</me>
          Therefore, this transformation does not satisfy the second property.
        </p>
        <p>
          For the second transformation, we note that
          <me>T_2\left(2\vec{1 1}\right) = T_2\vec{2 2} = \vec{2\cdot 2 2} = \vec{4 2}</me>
          but that
          <me>2T_2\vec{1 1} = 2\vec{1\cdot 1 1} = 2\vec{1 1} = \vec{2 2}.</me>
          Therefore, this transformation does not satisfy the second property.
        </p>
        <p>
          For the third transformation, we observe that
          <me>T_3\vec{0 0} = \vec{2(0)+1 0-2(0)} = \vec{1 0}\neq\vec{0 0}.</me>
          Since <m>T_3</m> does not take the zero vector to the zero vector, it cannot be linear.
        </p>
      </solution>
    </example>
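    <p>
      The counterexamples in the example above can be replayed mechanically.  The Python sketch below is only an illustration: the function names <c>T1</c>, <c>T2</c>, <c>T3</c>, and <c>scale</c> are our own labels for the transformations and for scalar multiplication, and each check reproduces the single failure used in the solution.
    </p>

```python
# Each check exhibits one concrete failure of a defining property,
# which is all that is needed to show a transformation is not linear.

def scale(c, u):
    """Return the scalar multiple cu."""
    return [c * ui for ui in u]

def T1(p):
    x, y = p
    return [abs(x), y]

def T2(p):
    x, y = p
    return [x * y, y]

def T3(p):
    x, y = p
    return [2 * x + 1, x - 2 * y]

# T1 fails T(cu) = cT(u) with c = -1 and u = (1, 0).
t1_left, t1_right = T1(scale(-1, [1, 0])), scale(-1, T1([1, 0]))

# T2 fails T(cu) = cT(u) with c = 2 and u = (1, 1).
t2_left, t2_right = T2(scale(2, [1, 1])), scale(2, T2([1, 1]))

# T3 does not send the zero vector to the zero vector.
t3_zero = T3([0, 0])

print(t1_left, t1_right)
print(t2_left, t2_right)
print(t3_zero)
```

    <p>
      In each of the first two printed lines the two vectors differ, and the third line is not the zero vector, matching the computations in the solution.
    </p>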

    <p>
      When deciding whether a transformation <m>T</m> is linear, generally the first thing to do is to check whether <m>T(0)=0</m>; if not, <m>T</m> is automatically not linear. Note however that the non-linear transformations <m>T_1</m> and <m>T_2</m> of the above example do take the zero vector to the zero vector.
    </p>

    <remark type-name="Challenge">
      <p>
        Find an example of a transformation that satisfies the first <xref ref="linear-trans-defn">property of linearity</xref> but not the second.
      </p>
    </remark>

  </subsection>

  <subsection>
    <title>The Standard Coordinate Vectors</title>

    <p>
      In the next subsection, we will present the relationship between linear transformations and matrix transformations.  Before doing so, we need the following important notation.
    </p>

    <bluebox xml:id="defn-standard-coord-vectors" type-name="Notation">
      <title>Standard coordinate vectors</title>
      <idx><h>Standard coordinate vectors</h><h>definition of</h></idx>
      <p>
        <notation><usage>e_1,e_2,\ldots</usage><description>Standard coordinate vectors</description></notation>
        The <term>standard coordinate vectors in <m>\R^n</m></term> are the <m>n</m> vectors
        <me>
          e_1=\vec{1 0 \vdots, 0 0},\quad
          e_2=\vec{0 1 \vdots, 0 0},\quad\ldots,\quad
          e_{n-1}=\vec{0 0 \vdots, 1 0},\quad
          e_n=\vec{0 0 \vdots, 0 1}.
        </me>
        The <m>i</m>th entry of <m>e_i</m> is equal to 1, and the other entries are zero.
      </p>
      <p>
        <em>From now on,</em> for the rest of the book, we will use the symbols <m>\color{red}e_1,e_2,\ldots</m> to denote the standard coordinate vectors.
      </p>
    </bluebox>

    <p>
      There is an ambiguity in this notation: one has to know from context that <m>e_1</m> is meant to have <m>n</m> entries.  That is, the vectors
      <me>\vec{1 0} \quad\text{ and }\quad \vec{1 0 0}</me>
      may both be denoted <m>e_1</m>, depending on whether we are discussing vectors in <m>\R^2</m> or in <m>\R^3</m>.
    </p>

    <p>
      <idx><h>Standard coordinate vectors</h><h>picture of</h></idx>
      The standard coordinate vectors in <m>\R^2</m> and <m>\R^3</m> are pictured below.
      <latex-code>
\begin{tikzpicture}[thin border nodes]
  \draw[grid lines] (-2,-2) grid (2,2);
  \draw[->] (-2,0) -- (2,0);
  \draw[->] (0,-2) -- (0,2);
  \draw[vector, seq1] (0,0) to["$e_1$" below=1pt] (1,0);
  \draw[vector, seq2] (0,0) to["$e_2$"] (0,1);
  \node[font=\normalsize, above] at (0,2.1) {in $\R^2$};
  \point at (0,0);
\end{tikzpicture}
\qquad
\begin{tikzpicture}[myxyz, thin border nodes]
  \node[font=\normalsize, above] at (0,0,2.1) {in $\R^3$};
  \draw[grid lines, transformxy] (-2,-2) grid (2,2);
  \draw[->] (-2,0,0) -- (2,0,0);
  \draw[->] (0,-2,0) -- (0,2,0);
  \draw[->] (0,0,-2) -- (0,0,2);
  \draw[vector, seq1] (0,0,0) to["$e_1$" pos=.8] (1,0,0);
  \draw[vector, seq2] (0,0,0) to["$e_2$" below=1pt] (0,1,0);
  \draw[vector, seq3] (0,0,0) to["$e_3$"] (0,0,1);
  \point at (0,0,0);
\end{tikzpicture}
      </latex-code>
      These are the vectors of length 1 that point in the positive directions of each of the axes.
    </p>

    <fact hide-type="true" xml:id="linear-trans-pick-columns">
      <title>Multiplying a matrix by the standard coordinate vectors</title>
      <idx><h>Standard coordinate vectors</h><h>and matrix columns</h></idx>
      <idx><h>Matrix-vector product</h><h>with standard coordinate vectors</h></idx>
      <p>
        If <m>A</m> is an <m>m\times n</m> matrix with columns <m>v_1,v_2,\ldots,v_n</m>, then <m>\color{red}Ae_i = v_i</m> for each <m>i=1,2,\ldots,n</m>:
        <me>
          \mat{| | ,, |; v_1 v_2 \cdots, v_n; | | ,, |}e_i = v_i.
        </me>
        In other words, multiplying a matrix by <m>e_i</m> simply selects its <m>i</m>th column.
      </p>
    </fact>

    <p>
      For example,
      <me>\def\A{\mat{1 2 3; 4 5 6; 7 8 9}}
      \A\vec{1 0 0} = \vec{1 4 7} \quad
      \A\vec{0 1 0} = \vec{2 5 8} \quad
      \A\vec{0 0 1} = \vec{3 6 9}.
      </me>
    </p>
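    <p>
      The same column-selection behavior can be observed in code.  In this minimal Python sketch, <c>standard_vector</c> and <c>mat_vec</c> are hypothetical helper names introduced for illustration, and the matrix is the <m>3\times 3</m> matrix from the example above.
    </p>

```python
def standard_vector(i, n):
    """Return e_i in R^n as a list (1-indexed, as in the text)."""
    return [1 if j == i else 0 for j in range(1, n + 1)]

def mat_vec(A, x):
    """Return the matrix-vector product Ax, with A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

# Multiplying by e_1, e_2, e_3 in turn picks out the columns of A.
columns = [mat_vec(A, standard_vector(i, 3)) for i in (1, 2, 3)]
print(columns)
```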

    <definition xml:id="linear-trans-identity-mat">
      <idx><h>Identity matrix</h><h>definition of</h></idx>
      <idx><h>Identity matrix</h><h>and standard coordinate vectors</h></idx>
      <idx><h>Standard coordinate vectors</h><h>columns of the identity matrix</h></idx>
      <notation><usage>I_n</usage><description><m>n\times n</m> identity matrix</description></notation>
      <statement>
        <p>
          The <term><m>n\times n</m> identity matrix</term> is the matrix <m>I_n</m> whose columns are the <m>n</m> standard coordinate vectors in <m>\R^n</m>:
          <me>I_n = \mat{1 0 \cdots, 0 0; 0 1 \cdots, 0 0;
          \vdots, \vdots, \ddots, \vdots, \vdots;
          0 0 \cdots, 1 0; 0 0 \cdots, 0 1}.</me>
        </p>
      </statement>
    </definition>

    <p>
      We will see in this <xref ref="linear-trans-matrix-of-id"/> below that the identity matrix is the matrix of the <xref ref="matrix-trans-identity" text="title">identity transformation</xref>.
    </p>

  </subsection>

  <subsection xml:id="linear-trans-is-matrix-trans">
    <title>The Matrix of a Linear Transformation</title>

    <p>
      Now we can prove that every linear transformation is a matrix transformation, and we will show how to compute the matrix.
    </p>

    <theorem xml:id="matrix-of-transformation">
      <title>The matrix of a linear transformation</title>
      <idx><h>Linear transformation</h><h>standard matrix of</h></idx>
      <idx><h>Standard matrix</h><see>Linear transformation</see></idx>
      <statement>
        <p>
          Let <m>T\colon\R^n\to\R^m</m> be a linear transformation.  Let <m>A</m> be the <m>m\times n</m> matrix
          <me>A = \mat{| | ,, |; T(e_1) T(e_2) \cdots, T(e_n); | | ,, | }.</me>
          Then <m>T</m> is the matrix transformation associated with <m>A</m>: that is, <m>T(x) = Ax</m>.
        </p>
      </statement>
      <proof>
        <p>
          We suppose for simplicity that <m>T</m> is a transformation from <m>\R^3</m> to <m>\R^2</m>.  Let <m>A</m> be the matrix given in the statement of the theorem.  Then
          <me>
\begin{split}
  T\vec{x y z} \amp= T\left( x\vec{1 0 0} + y\vec{0 1 0} + z\vec{0 0 1} \right) \\
  \amp= T\bigl( xe_1 + ye_2 + ze_3 \bigr) \\
  \amp= xT(e_1) + yT(e_2) + zT(e_3) \\
  \amp= \mat{| | |; T(e_1) T(e_2) T(e_3); | | |}\vec{x y z}\\
  \amp= A\vec{x y z}.
\end{split}
          </me>
        </p>
      </proof>
    </theorem>

    <p>
      The matrix <m>A</m> in the above theorem is called the <term>standard matrix</term> for <m>T</m>.  The columns of <m>A</m> are the vectors obtained by evaluating <m>T</m> on the <m>n</m> standard coordinate vectors in <m>\R^n</m>.  To summarize part of the theorem:
    </p>

    <bluebox>
      <idx><h>Linear transformation</h><h>are matrix transformations</h></idx>
      <p>Matrix transformations are the same as linear transformations.</p>
    </bluebox>

    <note hide-type="true">
      <title>Dictionary</title>
      <idx><h>Linear transformation</h><h>dictionary</h></idx>
      <idx><h>Matrix transformation</h><h>dictionary</h></idx>
      <p>
        Linear transformations are the same as matrix transformations, which come from matrices.  The correspondence can be summarized in the following dictionary.
        <latex-code>
\parbox{.3\linewidth}{\centering%
  $T\colon\R^n\to\R^m$\\
  Linear transformation}%
\;$\xrightarrow{\phantom{MMM}}$\;
\hbox to .5\linewidth{$m\times n$ matrix
\small$A = \mat{| | ,, |; T(e_1) T(e_2) \cdots, T(e_n); | | ,, | }$\hss}\\[3mm]
\leavevmode
\hbox to .3\linewidth{\hfil
  $\begin{aligned}&amp;T\colon\R^n\to\R^m\\&amp;T(x) = Ax\end{aligned}$\hfil}%
\;$\xleftarrow{\phantom{MMM}}$\;
\hbox to .5\linewidth{$m\times n$ matrix $A$\hfil}
        </latex-code>
      </p>
    </note>

    <example>
      <title>The matrix of a dilation</title>
      <idx><h>Dilation</h></idx>
      <statement>
        <p>Define <m>T\colon\R^2\to\R^2</m> by <m>T(x)=1.5x</m>.  Find the standard matrix <m>A</m> for <m>T</m>.</p>
      </statement>
      <solution>
        <p>
          The columns of <m>A</m> are obtained by evaluating <m>T</m> on the standard coordinate vectors <m>e_1,e_2</m>.
          <me>
\left.\begin{aligned}
    T(e_1) \amp= 1.5e_1 = \vec{1.5 0} \\
    T(e_2) \amp= 1.5e_2 = \vec{0 1.5}
\end{aligned}\right\}
\implies A = \mat{1.5 0; 0 1.5}.
          </me>
          This is the matrix we started with in this <xref ref="matrix-trans-eg-dilation"/>.
        </p>
      </solution>
    </example>

    <example xml:id="linear-trans-rotation-matrix">
      <title>The matrix of a rotation</title>
      <idx><h>Rotation</h><h>counterclockwise by <m>\theta</m></h></idx>
      <statement>
        <p>
          Define <m>T\colon\R^2\to\R^2</m> by
          <me>T(x) = \text{ the vector $x$ rotated counterclockwise by the angle $\theta$}.</me>
          Find the standard matrix for <m>T</m>.
        </p>
      </statement>
      <solution>
        <p>
          The columns of <m>A</m> are obtained by evaluating <m>T</m> on the standard coordinate vectors <m>e_1,e_2</m>.  In order to compute the entries of <m>T(e_1)</m> and <m>T(e_2)</m>, we have to do some trigonometry.
          <latex-code mode="bare">
            \usetikzlibrary{math}
          </latex-code>
          <latex-code>
\begin{tikzpicture}[scale=3.25, thin border nodes]
  \clip (-.5,-.4) rectangle (3.5,1.3);

  \draw[help lines] (0,0) circle[radius=1];
  \draw[->] (-1.3,0) -- (1.3,0);
  \draw[->] (0,-1.3) -- (0,1.3);

  \tikzmath{
    coordinate \c;
    \x = cos(37);
    \y = sin(37);
    \c = (\x,\y);
  }

  \coordinate (Te) at (\c);
  \coordinate (cos) at (\cx,0);
  \point (o) at (0,0);
  \pic[draw] {right angle=(o)--(cos)--(Te)};
  \draw[thick vector] (0,0) -- (1,0)
    node[pos=.9, below=1mm] {$e_1$};
  \draw[thick vector] (0,0) -- (Te)
    node[pos=.7, above left] {$T(e_1)$};
  \draw (.3,0) arc[start angle=0, delta angle=37, radius=.3cm]
    node[midway, right=.2mm, yshift=.5mm] {$\theta$};
  \draw[seq1,thick] (Te) -- (\cx,0) node[midway,right] {$\sin\theta$};
  \draw[seq2,thick] (0,0) -- (\cx,0) node[midway,below=1mm] {$\cos\theta$};

  \begin{scope}[xshift=3cm]
    \draw[help lines] (0,0) circle[radius=1];
    \draw[->] (-1.3,0) -- (1.3,0);
    \draw[->] (0,-1.3) -- (0,1.3);

    \tikzmath{
      coordinate \c;
      \x = -sin(37);
      \y = cos(37);
      \c = (\x,\y);
    }

    \coordinate (Te) at (\c);
    \coordinate (sin) at (\cx, 0);
    \point (o) at (0,0);
    \pic[draw] {right angle=(o)--(sin)--(Te)};
    \draw[thick vector] (0,0) -- (0,1)
      node[pos=.6, right] {$e_2$};
    \draw[thick vector] (0,0) -- (Te)
      node[pos=1.15, inner sep=1pt] {$T(e_2)$};
    \draw (0,.3) arc[start angle=90, delta angle=37, radius=.3cm]
      node[midway, above=.21mm] {$\theta$};
    \draw[seq2,thick] (Te) -- (\cx,0) node[midway,left] {$\cos\theta$};
    \draw[seq1,thick] (0,0) -- (\cx,0) node[midway,below=1mm] {$\sin\theta$};

  \end{scope}

\end{tikzpicture}
          </latex-code>
          We see from the picture that
          <me>
\left.\begin{aligned}
    T(e_1) \amp= \vec{\color{seq2}\cos\theta, \color{seq1}\sin\theta} \\
    T(e_2) \amp= \vec{-\color{seq1}\sin\theta, \color{seq2}\cos\theta}
\end{aligned}\right\}
\implies A =
\mat{\color{seq2}\cos\theta, -\color{seq1}\sin\theta;
     \color{seq1}\sin\theta, \color{seq2}\cos\theta}.
          </me>
        </p>
      </solution>
    </example>

    <p>
      We saw in the above example that the matrix for counterclockwise rotation of the plane by an angle of <m>\theta</m> is
      <me>A = \mat{\cos\theta, -\sin\theta; \sin\theta, \cos\theta}.</me>
    </p>
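    <p>
      As a sanity check on the rotation matrix, one can evaluate it at a specific angle.  The Python sketch below assumes the angle is given in radians; the names <c>rotation_matrix</c> and <c>mat_vec</c> are ours, and rounding is used only to hide floating-point error in <c>math.cos</c> and <c>math.sin</c>.
    </p>

```python
import math

def rotation_matrix(theta):
    """Standard matrix of counterclockwise rotation by theta (in radians)."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def mat_vec(A, x):
    """Return the matrix-vector product Ax, with A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = rotation_matrix(math.pi / 2)

# Round away floating-point error (cos(pi/2) is tiny but not exactly 0).
rounded = [[round(a, 12) for a in row] for row in A]
print(rounded)

# Rotating e_1 by a quarter turn should land (approximately) on e_2.
rotated = mat_vec(A, [1, 0])
print(rotated)
```

    <p>
      With <m>\theta = 90^\circ</m>, the rounded entries agree with the matrix <m>\mat{0 -1; 1 0}</m> of the quarter-turn rotation discussed in the introduction.
    </p>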

    <example xml:id="linear-trans-matrix-formula">
      <title>A transformation defined by a formula</title>
      <statement>
        <p>
          Define <m>T\colon\R^2\to\R^3</m> by the formula
          <me>T\vec{x y} = \vec{3x-y y x}.</me>
          Find the standard matrix for <m>T</m>.
        </p>
      </statement>
      <solution>
        <p>
          We substitute the standard coordinate vectors into the formula defining <m>T</m>:
          <me>
\left.\begin{aligned}
    T(e_1) \amp= T\vec{1 0} = \vec{3(1)-0 0 1} = \vec{3 0 1} \\
    T(e_2) \amp= T\vec{0 1} = \vec{3(0)-1 1 0} = \vec{-1 1 0}
\end{aligned}\right\}
\implies A = \mat{3 -1; 0 1; 1 0}.
          </me>
        </p>
      </solution>
    </example>
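    <p>
      The recipe of evaluating <m>T</m> on the standard coordinate vectors and using the results as columns can be sketched in Python.  The helper names <c>standard_vector</c>, <c>standard_matrix</c>, and <c>mat_vec</c> are hypothetical, and the sketch assumes <c>T</c> is already known to be linear; it does not verify linearity.
    </p>

```python
def standard_vector(i, n):
    """Return e_i in R^n as a list (1-indexed, as in the text)."""
    return [1 if j == i else 0 for j in range(1, n + 1)]

def standard_matrix(T, n):
    """Return the matrix (list of rows) whose columns are T(e_1), ..., T(e_n).

    Assumes T is a linear transformation with domain R^n.
    """
    columns = [T(standard_vector(i, n)) for i in range(1, n + 1)]
    # Transpose the list of columns to obtain a list of rows.
    return [list(row) for row in zip(*columns)]

def mat_vec(A, x):
    """Return the matrix-vector product Ax, with A given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def T(p):
    """The transformation T(x, y) = (3x - y, y, x) from R^2 to R^3."""
    x, y = p
    return [3 * x - y, y, x]

A = standard_matrix(T, 2)
print(A)

# For a linear T, applying T directly and multiplying by A agree.
x = [2, 5]
print(T(x), mat_vec(A, x))
```

    <p>
      The printed matrix has rows <m>(3,-1)</m>, <m>(0,1)</m>, <m>(1,0)</m>, which is the standard matrix found in the example above.
    </p>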

    <example xml:id="linear-trans-in-steps">
      <title>A transformation defined in steps</title>
      <statement>
        <p>
          Let <m>T\colon\R^3\to\R^3</m> be the linear transformation that reflects over the <m>xy</m>-plane and then projects onto the <m>yz</m>-plane.  What is the standard matrix for <m>T</m>?
        </p>
      </statement>
      <solution>
        <p>
          This transformation is described geometrically, in two steps.  To find the columns of <m>A</m>, we need to follow the standard coordinate vectors through each of these steps.
          <latex-code mode="bare">
\def\drawarrow#1{
\begin{tikzpicture}[myxyz, y={(1cm,-.28cm)}, scale=0.8, thin border nodes, baseline]
  \path[clip, resetxy] (-2,-2) rectangle (2,2);

  \begin{scope}[transformxy]
    \fill[blue, opacity=.05] (-1.5,-1.5) rectangle (1.5,1.5);
    \draw[step=1cm, help lines, blue!40!black]
      (-1.5, -1.5) grid (1.5, 1.5);
  \end{scope}

  \node[blue!40!black,right] at (-.5,.8,-.5) {$xy$};

  \begin{scope}[transformyz]
    \fill[green, opacity=.05] (-1.5,-1.5) rectangle (1.5,1.5);
    \draw[step=1cm, help lines, green!40!black]
      (-1.5, -1.5) grid (1.5, 1.5);
  \end{scope}

  \node[green!40!black] at (.75,-1,1) {$yz$};

  #1

\end{tikzpicture}
}
          </latex-code>
          <latex-code>
\drawarrow{\draw[vector] (0,0,0) -- (1,0,0) node[below right, pos=.5] {$e_1$};}
$\xrightarrow{\text{reflect $xy$}}$
\drawarrow{\draw[vector] (0,0,0) -- (1,0,0);}
$\xrightarrow{\text{project $yz$}}$
\drawarrow{\point at (0,0,0);}
          </latex-code>
          Since <m>e_1</m> lies on the <m>xy</m>-plane, reflecting over the <m>xy</m>-plane does not move <m>e_1</m>.  Since <m>e_1</m> is perpendicular to the <m>yz</m>-plane, projecting <m>e_1</m> onto the <m>yz</m>-plane sends it to zero.  Therefore,
          <me>T(e_1) = \vec{0 0 0}.</me>
          <latex-code>
\drawarrow{\draw[vector] (0,0,0) -- (0,1,0) node[below left, pos=.6] {$e_2$};}
$\xrightarrow{\text{reflect $xy$}}$
\drawarrow{\draw[vector] (0,0,0) -- (0,1,0);}
$\xrightarrow{\text{project $yz$}}$
\drawarrow{\draw[vector] (0,0,0) -- (0,1,0);}
          </latex-code>
          Since <m>e_2</m> lies on the <m>xy</m>-plane, reflecting over the <m>xy</m>-plane does not move <m>e_2</m>.  Since <m>e_2</m> lies on the <m>yz</m>-plane, projecting onto the <m>yz</m>-plane does not move <m>e_2</m> either.
          Therefore,
          <me>T(e_2) = e_2 = \vec{0 1 0}.</me>
          <latex-code>
\drawarrow{\draw[vector] (0,0,0) -- (0,0,1) node[left, pos=.4] {$e_3$};}
$\xrightarrow{\text{reflect $xy$}}$
\drawarrow{\draw[vector] (0,0,0) -- (0,0,-1);}
$\xrightarrow{\text{project $yz$}}$
\drawarrow{\draw[vector] (0,0,0) -- (0,0,-1);}
          </latex-code>
          Since <m>e_3</m> is perpendicular to the <m>xy</m>-plane, reflecting over the <m>xy</m>-plane takes <m>e_3</m> to its negative.  Since <m>-e_3</m> lies on the <m>yz</m>-plane, projecting onto the <m>yz</m>-plane does not move it.
          Therefore,
          <me>T(e_3) = -e_3 = \vec{0 0 -1}.</me>
          Now we have computed all three columns of <m>A</m>:
          <me>
\left.\begin{aligned}
  T(e_1) \amp= \vec{0 0 0} \\
  T(e_2) \amp= \vec{0 1 0} \\
  T(e_3) \amp= \vec{0 0 -1} \\
\end{aligned}\right\}
\implies A = \mat{0 0 0; 0 1 0; 0 0 -1}.
          </me>
        </p>
        <figure>
          <caption>Illustration of a transformation defined in steps.  Click and drag the vector on the left.</caption>
          <mathbox source="demos/steps.html" height="500px"/>
        </figure>
      </solution>
    </example>
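    <p>
      Following the standard coordinate vectors through each step can also be done programmatically.  In the Python sketch below, <c>reflect_xy</c> and <c>project_yz</c> are our own coordinate descriptions of the two geometric steps (negate <m>z</m>, then set <m>x</m> to zero), and <c>standard_matrix</c> is a hypothetical helper that collects the vectors <m>T(e_i)</m> as columns.
    </p>

```python
def reflect_xy(p):
    """Reflect over the xy-plane: negate the z-coordinate."""
    x, y, z = p
    return [x, y, -z]

def project_yz(p):
    """Project onto the yz-plane: set the x-coordinate to zero."""
    x, y, z = p
    return [0, y, z]

def T(p):
    """Reflect over the xy-plane first, then project onto the yz-plane."""
    return project_yz(reflect_xy(p))

def standard_matrix(T, n):
    """Return the matrix (list of rows) whose columns are T(e_1), ..., T(e_n)."""
    basis = [[1 if j == i else 0 for j in range(n)] for i in range(n)]
    columns = [T(e) for e in basis]
    # Transpose the list of columns to obtain a list of rows.
    return [list(row) for row in zip(*columns)]

A = standard_matrix(T, 3)
print(A)
```

    <p>
      The three columns produced this way are <m>T(e_1)</m>, <m>T(e_2)</m>, and <m>T(e_3)</m>, so the output can be compared directly with the matrix <m>A</m> computed in the example.
    </p>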

    <p>
      Recall from this <xref ref="matrix-trans-identity"/> that the <em>identity transformation</em> is the transformation <m>\Id_{\R^n}\colon\R^n\to\R^n</m> defined by <m>\Id_{\R^n}(x) = x</m> for every vector <m>x</m>.
    </p>

    <example xml:id="linear-trans-matrix-of-id">
      <title>The standard matrix of the identity transformation</title>
      <idx><h>Identity matrix</h><h>and identity transformation</h></idx>
      <idx><h>Identity transformation</h><h>and identity matrix</h></idx>
      <statement>
        <p>Verify that the identity transformation <m>\Id_{\R^n}\colon\R^n\to\R^n</m> is linear, and compute its standard matrix.
        </p>
      </statement>
      <solution>
        <p>
          We verify the two <xref ref="linear-trans-defn">defining properties</xref> of linear transformations.  Let <m>u,v</m> be vectors in <m>\R^n</m>.  Then
          <me>\Id_{\R^n}(u+v) = u+v = \Id_{\R^n}(u) + \Id_{\R^n}(v).</me>
          If <m>c</m> is a scalar, then
          <me>\Id_{\R^n}(cu) = cu = c\Id_{\R^n}(u).</me>
          Since <m>\Id_{\R^n}</m> satisfies the two defining properties, it is a linear transformation.
        </p>
        <p>
          Now that we know that <m>\Id_{\R^n}</m> is linear, it makes sense to compute its standard matrix.
          For each standard coordinate vector <m>e_i</m>, we have <m>\Id_{\R^n}(e_i) = e_i</m>.  In other words, the columns of the standard matrix of <m>\Id_{\R^n}</m> are the standard coordinate vectors, so the standard matrix is the identity matrix
          <me>I_n = \mat{1 0 \cdots, 0 0; 0 1 \cdots, 0 0;
          \vdots, \vdots, \ddots, \vdots, \vdots;
          0 0 \cdots, 1 0; 0 0 \cdots, 0 1}.</me>
        </p>
      </solution>
    </example>

    <p>
      We computed in this <xref ref="linear-trans-matrix-of-id"/> that the matrix of the identity transformation is the identity matrix: for every <m>x</m> in <m>\R^n</m>,
      <me>x = \Id_{\R^n}(x) = I_nx.</me>
      Therefore, <m>I_nx=x</m> for all vectors <m>x</m>: the product of the identity matrix and a vector is the same vector.
    </p>

  </subsection>


</section>