MATH 136: Linear Algebra 1

Chapter 1: Vectors in Euclidean Space

1.1 Vector Addition and Scalar Multiplication

We begin by defining the fundamental space in which we work throughout the course.

Definition (Euclidean Space \(\mathbb{R}^n\)). For any positive integer \(n\), the set of all elements of the form \((x_1, \ldots, x_n)\) where \(x_i \in \mathbb{R}\) for \(1 \leq i \leq n\) is called n-dimensional Euclidean space and is denoted \(\mathbb{R}^n\). The elements of \(\mathbb{R}^n\) are called points.

In linear algebra we view elements of \(\mathbb{R}^n\) as column vectors \(\vec{x} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}\), on which addition and scalar multiplication are defined componentwise:

\[ \vec{x} + \vec{y} = \begin{pmatrix} x_1 + y_1 \\ \vdots \\ x_n + y_n \end{pmatrix}, \qquad c\vec{x} = \begin{pmatrix} cx_1 \\ \vdots \\ cx_n \end{pmatrix}. \]
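
For readers who like to experiment, these componentwise operations can be tried directly in NumPy (a sketch for illustration; NumPy is not part of the course):

```python
import numpy as np

# Vectors in R^3 represented as NumPy arrays; addition and scalar
# multiplication act componentwise, matching the formulas above.
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

sum_xy = x + y      # (5, 7, 9)
scaled = 2 * x      # (2, 4, 6)
```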

Definition (Linear Combination). Let \(\vec{v}_1, \ldots, \vec{v}_k \in \mathbb{R}^n\). The sum \(c_1\vec{v}_1 + c_2\vec{v}_2 + \cdots + c_k\vec{v}_k\) where \(c_i \in \mathbb{R}\) for \(1 \leq i \leq k\) is called a linear combination of \(\vec{v}_1, \ldots, \vec{v}_k\).

The following theorem establishes the fundamental algebraic properties of \(\mathbb{R}^n\).

Theorem 1.1.1. Let \(\vec{x}, \vec{y}, \vec{w} \in \mathbb{R}^n\) and \(c, d \in \mathbb{R}\). Then:
V1: \(\vec{x} + \vec{y} \in \mathbb{R}^n\);
V2: \((\vec{x} + \vec{y}) + \vec{w} = \vec{x} + (\vec{y} + \vec{w})\);
V3: \(\vec{x} + \vec{y} = \vec{y} + \vec{x}\);
V4: There exists \(\vec{0} \in \mathbb{R}^n\) such that \(\vec{x} + \vec{0} = \vec{x}\);
V5: For each \(\vec{x} \in \mathbb{R}^n\) there exists \(-\vec{x} \in \mathbb{R}^n\) such that \(\vec{x} + (-\vec{x}) = \vec{0}\);
V6: \(c\vec{x} \in \mathbb{R}^n\);
V7: \(c(d\vec{x}) = (cd)\vec{x}\);
V8: \((c + d)\vec{x} = c\vec{x} + d\vec{x}\);
V9: \(c(\vec{x} + \vec{y}) = c\vec{x} + c\vec{y}\);
V10: \(1\vec{x} = \vec{x}\).

Properties V1 and V6 together imply that \(\mathbb{R}^n\) is closed under linear combinations.

Span

Definition (Span). Let \(B = \{\vec{v}_1, \ldots, \vec{v}_k\}\) be a set of vectors in \(\mathbb{R}^n\). Then we define \(\operatorname{Span} B = \{c_1\vec{v}_1 + c_2\vec{v}_2 + \cdots + c_k\vec{v}_k \mid c_1, \ldots, c_k \in \mathbb{R}\}\). We say that \(\operatorname{Span} B\) is spanned by \(B\) and that \(B\) is a spanning set for \(\operatorname{Span} B\).

Definition (Vector Equation). If the set \(S\) is spanned by \(\{\vec{v}_1, \ldots, \vec{v}_k\}\), then a vector equation for \(S\) is \(\vec{x} = c_1\vec{v}_1 + \cdots + c_k\vec{v}_k\), \(c_1, \ldots, c_k \in \mathbb{R}\).

When a spanning set contains redundant vectors, we can simplify.

Theorem 1.1.2. If \(\vec{v}_k\) can be written as a linear combination of \(\vec{v}_1, \ldots, \vec{v}_{k-1}\), then \(\operatorname{Span}\{\vec{v}_1, \ldots, \vec{v}_k\} = \operatorname{Span}\{\vec{v}_1, \ldots, \vec{v}_{k-1}\}\).

Linear Independence

Definition (Linearly Dependent / Independent). A set of vectors \(\{\vec{v}_1, \ldots, \vec{v}_k\}\) in \(\mathbb{R}^n\) is linearly dependent if there exist coefficients \(c_1, \ldots, c_k\), not all zero, such that \(\vec{0} = c_1\vec{v}_1 + \cdots + c_k\vec{v}_k\). The set is linearly independent if the only solution is \(c_1 = c_2 = \cdots = c_k = 0\) (the trivial solution).

Theorem 1.1.3. If a set of vectors \(\{\vec{v}_1, \ldots, \vec{v}_k\}\) contains the zero vector, then it is linearly dependent.
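
A quick way to test a small set for linear independence is to place the vectors as columns of a matrix and check whether the rank equals the number of vectors (rank is formalized in Section 2.2; the sketch below borrows NumPy's numerical rank, and the vectors are illustrative):

```python
import numpy as np

# {v1, v2, v3} is linearly independent exactly when the matrix with
# these vectors as columns has rank 3.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = np.array([1.0, 1.0, 2.0])   # v3 = v1 + v2, so the set is dependent

A = np.column_stack([v1, v2, v3])
independent = np.linalg.matrix_rank(A) == 3   # False here
```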

Basis and Standard Basis

Definition (Basis). If a subset \(S\) of \(\mathbb{R}^n\) can be written as a span of vectors \(\vec{v}_1, \ldots, \vec{v}_k\) where \(\{\vec{v}_1, \ldots, \vec{v}_k\}\) is linearly independent, then \(\{\vec{v}_1, \ldots, \vec{v}_k\}\) is a basis for \(S\).

Definition (Standard Basis). In \(\mathbb{R}^n\), let \(\vec{e}_i\) represent the vector whose \(i\)-th component is 1 and all other components are 0. The set \(\{\vec{e}_1, \ldots, \vec{e}_n\}\) is called the standard basis for \(\mathbb{R}^n\).

Surfaces in Higher Dimensions

Definition (Line in \(\mathbb{R}^n\)). Let \(\vec{v}, \vec{b} \in \mathbb{R}^n\) with \(\vec{v} \neq \vec{0}\). The set with vector equation \(\vec{x} = c_1\vec{v} + \vec{b}\), \(c_1 \in \mathbb{R}\), is a line in \(\mathbb{R}^n\) passing through \(\vec{b}\).

Definition (Plane in \(\mathbb{R}^n\)). Let \(\vec{v}_1, \vec{v}_2, \vec{b} \in \mathbb{R}^n\) with \(\{\vec{v}_1, \vec{v}_2\}\) linearly independent. The set with vector equation \(\vec{x} = c_1\vec{v}_1 + c_2\vec{v}_2 + \vec{b}\), \(c_1, c_2 \in \mathbb{R}\), is a plane in \(\mathbb{R}^n\) passing through \(\vec{b}\).

Definition (\(k\)-plane in \(\mathbb{R}^n\)). Let \(\vec{v}_1, \ldots, \vec{v}_k, \vec{b} \in \mathbb{R}^n\) with \(\{\vec{v}_1, \ldots, \vec{v}_k\}\) linearly independent. The set with vector equation \(\vec{x} = c_1\vec{v}_1 + \cdots + c_k\vec{v}_k + \vec{b}\), \(c_1, \ldots, c_k \in \mathbb{R}\), is a \(k\)-plane in \(\mathbb{R}^n\) passing through \(\vec{b}\).

Definition (Hyperplane in \(\mathbb{R}^n\)). Let \(\vec{v}_1, \ldots, \vec{v}_{n-1}, \vec{b} \in \mathbb{R}^n\) with \(\{\vec{v}_1, \ldots, \vec{v}_{n-1}\}\) linearly independent. The set with vector equation \(\vec{x} = c_1\vec{v}_1 + \cdots + c_{n-1}\vec{v}_{n-1} + \vec{b}\), \(c_i \in \mathbb{R}\), is a hyperplane in \(\mathbb{R}^n\) passing through \(\vec{b}\).

1.2 Subspaces

Not all subsets of \(\mathbb{R}^n\) are closed under addition and scalar multiplication. Those that are enjoy a special structure.

Definition (Subspace of \(\mathbb{R}^n\)). A non-empty subset \(S\) of \(\mathbb{R}^n\) is called a subspace of \(\mathbb{R}^n\) if it satisfies all ten properties V1–V10 of Theorem 1.1.1.

Theorem 1.2.1 (Subspace Test). Let \(S\) be a non-empty subset of \(\mathbb{R}^n\). If \(\vec{x} + \vec{y} \in S\) and \(c\vec{x} \in S\) for all \(\vec{x}, \vec{y} \in S\) and \(c \in \mathbb{R}\), then \(S\) is a subspace of \(\mathbb{R}^n\).

Remark. If \(S\) is non-empty and closed under scalar multiplication, then \(\vec{0} \in S\). Thus, the standard way to check non-emptiness is to verify \(\vec{0} \in S\). If \(\vec{0} \notin S\), then \(S\) is not a subspace.

Theorem 1.2.2. Let \(\vec{v}_1, \ldots, \vec{v}_k \in \mathbb{R}^n\). Then \(S = \operatorname{Span}\{\vec{v}_1, \ldots, \vec{v}_k\}\) is a subspace of \(\mathbb{R}^n\).

1.3 Dot Product

The dot product generalizes the familiar inner product from \(\mathbb{R}^2\) and \(\mathbb{R}^3\) to \(\mathbb{R}^n\).

Theorem 1.3.1. Let \(\vec{x}, \vec{y} \in \mathbb{R}^2\) and let \(\theta\) be the angle between \(\vec{x}\) and \(\vec{y}\). Then \(\vec{x} \cdot \vec{y} = \|\vec{x}\|\|\vec{y}\| \cos \theta\).

Definition (Dot Product). Let \(\vec{x}, \vec{y} \in \mathbb{R}^n\). The dot product of \(\vec{x}\) and \(\vec{y}\) is \(\vec{x} \cdot \vec{y} = x_1 y_1 + \cdots + x_n y_n = \sum_{i=1}^n x_i y_i\).

Theorem 1.3.2. Let \(\vec{x}, \vec{y}, \vec{z} \in \mathbb{R}^n\) and \(s, t \in \mathbb{R}\). Then:
(1) \(\vec{x} \cdot \vec{x} \geq 0\) and \(\vec{x} \cdot \vec{x} = 0\) if and only if \(\vec{x} = \vec{0}\).
(2) \(\vec{x} \cdot \vec{y} = \vec{y} \cdot \vec{x}\).
(3) \(\vec{x} \cdot (s\vec{y} + t\vec{z}) = s(\vec{x} \cdot \vec{y}) + t(\vec{x} \cdot \vec{z})\).
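
Property (3), bilinearity in the second argument, can be checked numerically; the vectors and scalars below are arbitrary illustrations:

```python
import numpy as np

# Check x . (s*y + t*z) = s*(x . y) + t*(x . z) on sample data.
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, -1.0, 0.0])
z = np.array([2.0, 2.0, 2.0])
s, t = 3.0, -2.0

lhs = np.dot(x, s * y + t * z)
rhs = s * np.dot(x, y) + t * np.dot(x, z)   # both equal -18
```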

Length, Unit Vectors, and Angles

Definition (Length / Norm). Let \(\vec{x} \in \mathbb{R}^n\). The length (or norm) of \(\vec{x}\) is \(\|\vec{x}\| = \sqrt{\vec{x} \cdot \vec{x}}\).

Definition (Unit Vector). A vector \(\vec{x} \in \mathbb{R}^n\) such that \(\|\vec{x}\| = 1\) is called a unit vector.

Theorem 1.3.3. Let \(\vec{x}, \vec{y} \in \mathbb{R}^n\) and \(c \in \mathbb{R}\). Then:
(1) \(\|\vec{x}\| \geq 0\) and \(\|\vec{x}\| = 0\) if and only if \(\vec{x} = \vec{0}\).
(2) \(\|c\vec{x}\| = |c|\|\vec{x}\|\).
(3) \(|\vec{x} \cdot \vec{y}| \leq \|\vec{x}\|\|\vec{y}\|\) (Cauchy–Schwarz Inequality).
(4) \(\|\vec{x} + \vec{y}\| \leq \|\vec{x}\| + \|\vec{y}\|\) (Triangle Inequality).
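
The Cauchy–Schwarz and Triangle Inequalities can likewise be verified on sample vectors (a numerical check, not a proof):

```python
import numpy as np

# Verify properties (3) and (4) of Theorem 1.3.3 for one pair of vectors.
x = np.array([3.0, 4.0])
y = np.array([1.0, 2.0])

norm_x = np.linalg.norm(x)            # 5.0
cauchy_schwarz = abs(np.dot(x, y)) <= norm_x * np.linalg.norm(y)
triangle = np.linalg.norm(x + y) <= norm_x + np.linalg.norm(y)
```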

Definition (Angle in \(\mathbb{R}^n\)). Let \(\vec{x}, \vec{y} \in \mathbb{R}^n\). The angle between \(\vec{x}\) and \(\vec{y}\) is the angle \(\theta\) such that \(\vec{x} \cdot \vec{y} = \|\vec{x}\|\|\vec{y}\| \cos \theta\).

Definition (Orthogonal). Two vectors \(\vec{x}, \vec{y} \in \mathbb{R}^n\) are orthogonal if and only if \(\vec{x} \cdot \vec{y} = 0\).

Cross Product

Definition (Cross Product). Let \(\vec{v}, \vec{w} \in \mathbb{R}^3\). The cross product is \(\vec{v} \times \vec{w} = \begin{pmatrix} v_2 w_3 - v_3 w_2 \\ v_3 w_1 - v_1 w_3 \\ v_1 w_2 - v_2 w_1 \end{pmatrix}\).

Theorem 1.3.4. For any \(\vec{v}, \vec{w}, \vec{x} \in \mathbb{R}^3\) and \(c \in \mathbb{R}\):
(1) If \(\vec{n} = \vec{v} \times \vec{w}\), then \(\vec{y} \cdot \vec{n} = 0\) for any \(\vec{y} \in \operatorname{Span}\{\vec{v}, \vec{w}\}\).
(2) \(\vec{v} \times \vec{w} = -\vec{w} \times \vec{v}\).
(3) \(\vec{v} \times \vec{v} = \vec{0}\).
(4) If \(\vec{v} \times \vec{w} = \vec{0}\) then either one of \(\vec{v}, \vec{w}\) is \(\vec{0}\), or \(\vec{w}\) is a scalar multiple of \(\vec{v}\).
(5) \(\vec{v} \times (\vec{w} + \vec{x}) = \vec{v} \times \vec{w} + \vec{v} \times \vec{x}\).
(6) \((c\vec{v}) \times \vec{w} = c(\vec{v} \times \vec{w})\).
(7) \(\|\vec{v} \times \vec{w}\| = \|\vec{v}\|\|\vec{w}\| \sin \theta\) where \(\theta\) is the angle between \(\vec{v}\) and \(\vec{w}\).
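
The defining formula for the cross product, together with the orthogonality property (1), can be checked directly on sample vectors:

```python
import numpy as np

# Cross product via the defining formula, then the orthogonality
# check: v x w is orthogonal to both v and w.
v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

cross = np.array([v[1]*w[2] - v[2]*w[1],
                  v[2]*w[0] - v[0]*w[2],
                  v[0]*w[1] - v[1]*w[0]])   # (-3, 6, -3)

dot_v = np.dot(cross, v)   # 0
dot_w = np.dot(cross, w)   # 0
```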

1.4 Projections

Given vectors \(\vec{u}\) and \(\vec{v}\), we often want to decompose \(\vec{u}\) into a component along \(\vec{v}\) and a component orthogonal to \(\vec{v}\).

Definition (Projection). Let \(\vec{u}, \vec{v} \in \mathbb{R}^n\) with \(\vec{v} \neq \vec{0}\). The projection of \(\vec{u}\) onto \(\vec{v}\) is \(\operatorname{proj}_{\vec{v}} \vec{u} = \frac{\vec{u} \cdot \vec{v}}{\|\vec{v}\|^2} \vec{v}\).

Definition (Perpendicular). Let \(\vec{u}, \vec{v} \in \mathbb{R}^n\) with \(\vec{v} \neq \vec{0}\). The perpendicular of \(\vec{u}\) onto \(\vec{v}\) is \(\operatorname{perp}_{\vec{v}} \vec{u} = \vec{u} - \operatorname{proj}_{\vec{v}} \vec{u}\).

One can verify that \(\operatorname{proj}_{\vec{v}} \vec{u} \cdot \operatorname{perp}_{\vec{v}} \vec{u} = 0\), confirming the decomposition is orthogonal. To project a vector onto a plane through the origin with normal vector \(\vec{n}\), compute its perpendicular with respect to \(\vec{n}\): the projection onto the plane is \(\operatorname{perp}_{\vec{n}} \vec{u}\).
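
The projection and perpendicular formulas translate directly into code; the sample vectors are arbitrary:

```python
import numpy as np

# proj_v(u) and perp_v(u) per the definitions above; their dot
# product should be zero.
u = np.array([2.0, 3.0])
v = np.array([4.0, 0.0])

proj = (np.dot(u, v) / np.dot(v, v)) * v    # (2, 0)
perp = u - proj                             # (0, 3)
orthogonality = np.dot(proj, perp)          # 0.0
```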

Chapter 2: Systems of Linear Equations

2.1 Systems of Linear Equations

Definition (Linear Equation). An equation of the form \(a_1 x_1 + \cdots + a_n x_n = b\) where \(a_1, \ldots, a_n, b\) are constants is called a linear equation. The constants \(a_i\) are the coefficients and \(b\) is the right-hand side.

Definition (System of Linear Equations). A set of \(m\) linear equations in the same variables \(x_1, \ldots, x_n\) is called a system of linear equations.

Definition (Solution). A vector \(\vec{s} \in \mathbb{R}^n\) is a solution of a system of \(m\) linear equations in \(n\) unknowns if all \(m\) equations are satisfied when \(x_i = s_i\) for \(1 \leq i \leq n\).

Definition (Consistent / Inconsistent). A system is consistent if it has at least one solution; otherwise it is inconsistent.

Theorem 2.1.1. If a system of linear equations has two distinct solutions \(\vec{s}\) and \(\vec{t}\), then \(\vec{x} = \vec{s} + c(\vec{s} - \vec{t})\) is a solution for every \(c \in \mathbb{R}\), and distinct values of \(c\) give distinct solutions. Hence, a consistent system has either exactly one solution or infinitely many.

Definition (Solution Set). The set of all solutions of a system of linear equations is called the solution set of the system.

2.2 Solving Systems of Linear Equations

We encode a system compactly as a matrix and use row operations to solve it.

Definition (Augmented Matrix / Coefficient Matrix). The augmented matrix of a system is the rectangular array \([A \mid \vec{b}]\) formed by the coefficients and the right-hand side. The coefficient matrix \(A\) contains only the coefficients.

Definition (Equivalent Systems). Two systems of equations are equivalent if they have the same solution set.

Definition (Elementary Row Operations). The three elementary row operations (EROs) are:

  1. Multiplying a row by a non-zero scalar (\(cR_i\)).
  2. Adding a multiple of one row to another (\(R_i + cR_j\)).
  3. Swapping two rows (\(R_i \leftrightarrow R_j\)).

Definition (Row Equivalent). Two matrices \(A\) and \(B\) are row equivalent if there exists a sequence of EROs transforming \(A\) into \(B\).

Theorem 2.2.1. If the augmented matrices \([A_1 \mid \vec{b}_1]\) and \([A_2 \mid \vec{b}_2]\) are row equivalent, then their associated systems of linear equations are equivalent.

Reduced Row Echelon Form

Definition (Reduced Row Echelon Form). A matrix \(R\) is in RREF if:

  1. All rows with a non-zero entry are above rows of all zeros.
  2. The first non-zero entry in each non-zero row is 1 (a leading one).
  3. Each leading one is to the right of the leading one in any row above it.
  4. A leading one is the only non-zero entry in its column.

Theorem 2.2.2. The RREF of a matrix is unique.

Definition (Free Variable). Let \(R\) be the RREF of the coefficient matrix. If the \(j\)-th column of \(R\) does not contain a leading one, then \(x_j\) is a free variable.
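
The three EROs can be packaged into a short reduction routine. The sketch below is an illustrative implementation (not the course's algorithm verbatim) using exact rational arithmetic so the RREF comes out clean; it returns the RREF together with the pivot columns, whose complement identifies the free variables:

```python
from fractions import Fraction

def rref(rows):
    """Row-reduce a matrix (list of lists) using the three EROs."""
    M = [[Fraction(x) for x in row] for row in rows]
    m, n = len(M), len(M[0])
    pivot_row = 0
    pivots = []
    for col in range(n):
        # Find a row at or below pivot_row with a nonzero entry in col.
        pr = next((r for r in range(pivot_row, m) if M[r][col] != 0), None)
        if pr is None:
            continue
        M[pivot_row], M[pr] = M[pr], M[pivot_row]          # ERO 3: swap rows
        pivot = M[pivot_row][col]
        M[pivot_row] = [x / pivot for x in M[pivot_row]]   # ERO 1: scale to leading one
        for r in range(m):
            if r != pivot_row and M[r][col] != 0:
                f = M[r][col]                               # ERO 2: clear the column
                M[r] = [a - f * b for a, b in zip(M[r], M[pivot_row])]
        pivots.append(col)
        pivot_row += 1
    return M, pivots

R, pivots = rref([[1, 2, -1, 3],
                  [2, 4, 1, 3],
                  [1, 2, 2, 0]])
```

In this example the pivot columns are 0 and 2 (0-indexed), i.e. the leading ones sit in columns 1 and 3, so \(x_2\) is a free variable.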

Rank

Definition (Rank). The rank of a matrix is the number of leading ones in its RREF.

The following theorem is fundamental and referenced throughout the course.

Theorem 2.2.3. Let \(A\) be the \(m \times n\) coefficient matrix of a system of linear equations.
(1) If \(\operatorname{rank} A < \operatorname{rank}[A \mid \vec{b}]\), then the system is inconsistent.
(2) If the system is consistent, then it has \(n - \operatorname{rank} A\) free variables. A consistent system has a unique solution if and only if \(\operatorname{rank} A = n\).
(3) \(\operatorname{rank} A = m\) if and only if \([A \mid \vec{b}]\) is consistent for every \(\vec{b} \in \mathbb{R}^m\).
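
Part (1) can be illustrated on a small inconsistent system (a sketch using NumPy's numerical rank; the matrix is an arbitrary example):

```python
import numpy as np

# rank A < rank [A | b] detects inconsistency (Theorem 2.2.3 (1)).
A = np.array([[1.0, 1.0],
              [2.0, 2.0]])
b = np.array([[1.0],
              [3.0]])

rank_A = np.linalg.matrix_rank(A)                     # 1
rank_aug = np.linalg.matrix_rank(np.hstack([A, b]))   # 2
inconsistent = rank_A < rank_aug                      # True
```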

Homogeneous Systems

Definition (Homogeneous System). A system of linear equations is homogeneous if the right-hand side is all zeros, i.e., it has the form \([A \mid \vec{0}]\).

A homogeneous system is always consistent (the zero vector is always a solution, called the trivial solution).

Theorem 2.2.4. The solution set of a homogeneous system of \(m\) linear equations in \(n\) variables is a subspace of \(\mathbb{R}^n\).

Chapter 3: Matrices and Linear Mappings

3.1 Operations on Matrices

Definition (Matrix). An \(m \times n\) matrix \(A\) is a rectangular array with \(m\) rows and \(n\) columns. We denote the entry in the \(i\)-th row and \(j\)-th column by \(a_{ij}\). The set of all \(m \times n\) matrices with real entries is denoted \(M_{m \times n}(\mathbb{R})\).

Definition (Matrix Addition and Scalar Multiplication). Let \(A, B \in M_{m \times n}(\mathbb{R})\) and \(c \in \mathbb{R}\). Then \((A + B)_{ij} = a_{ij} + b_{ij}\) and \((cA)_{ij} = c \cdot a_{ij}\).

Theorem 3.1.1. Under matrix addition and scalar multiplication, \(M_{m \times n}(\mathbb{R})\) satisfies properties V1–V10 of Theorem 1.1.1, with the zero matrix \(O_{m,n}\) playing the role of \(\vec{0}\).

Transpose

Definition (Transpose). The transpose of an \(m \times n\) matrix \(A\) is the \(n \times m\) matrix \(A^T\) with \((A^T)_{ij} = a_{ji}\).

Theorem 3.1.2. For \(m \times n\) matrices \(A, B\) and scalar \(c\):
(1) \((A^T)^T = A\).
(2) \((A + B)^T = A^T + B^T\).
(3) \((cA)^T = cA^T\).

Matrix-Vector Multiplication

Definition (Matrix-Vector Multiplication, Row View). Let \(A\) be an \(m \times n\) matrix with rows \(\vec{a}_i^T\). For \(\vec{x} \in \mathbb{R}^n\), \(A\vec{x} = \begin{pmatrix} \vec{a}_1 \cdot \vec{x} \\ \vdots \\ \vec{a}_m \cdot \vec{x} \end{pmatrix}\).

Definition (Matrix-Vector Multiplication, Column View). Let \(A\) be an \(m \times n\) matrix with columns \(\vec{a}_1, \ldots, \vec{a}_n\). For \(\vec{x} \in \mathbb{R}^n\), \(A\vec{x} = x_1\vec{a}_1 + \cdots + x_n\vec{a}_n\).

This shows that \(A\vec{x}\) is always a linear combination of the columns of \(A\).
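
Both views compute the same vector, as a quick numerical check shows (the matrix and vector are illustrative):

```python
import numpy as np

# Row view (A @ x) versus column view (x1*a1 + x2*a2).
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([2.0, -1.0])

row_view = A @ x                              # (0, 2, 4)
col_view = x[0] * A[:, 0] + x[1] * A[:, 1]    # same vector
```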

Matrix Multiplication

Definition (Matrix Multiplication). Let \(A\) be \(m \times n\) and \(B = [\vec{b}_1 \cdots \vec{b}_p]\) be \(n \times p\). Then \(AB = [A\vec{b}_1 \cdots A\vec{b}_p]\), an \(m \times p\) matrix with \((AB)_{ij} = \sum_{k=1}^n a_{ik} b_{kj}\).

Theorem 3.1.3. Let \(A, B, C\) be matrices of compatible sizes and \(t \in \mathbb{R}\). Then:
(1) \(A(B + C) = AB + AC\).
(2) \(t(AB) = (tA)B = A(tB)\).
(3) \(A(BC) = (AB)C\).
(4) \((AB)^T = B^T A^T\).

In general \(AB \neq BA\), and matrix multiplication does not satisfy the cancellation law.
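
A standard pair of \(2 \times 2\) matrices demonstrates non-commutativity, together with property (4) of Theorem 3.1.3:

```python
import numpy as np

# Two matrices that do not commute, plus a check of (AB)^T = B^T A^T.
A = np.array([[0, 1],
              [0, 0]])
B = np.array([[0, 0],
              [1, 0]])

AB = A @ B    # [[1, 0], [0, 0]]
BA = B @ A    # [[0, 0], [0, 1]]
commute = np.array_equal(AB, BA)                         # False
transpose_rule = np.array_equal((A @ B).T, B.T @ A.T)    # True
```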

Theorem 3.1.4. If \(A\) and \(B\) are \(m \times n\) matrices such that \(A\vec{x} = B\vec{x}\) for every \(\vec{x} \in \mathbb{R}^n\), then \(A = B\).

Identity Matrix

Definition (Identity Matrix). The \(n \times n\) identity matrix \(I_n\) has \((I)_{ii} = 1\) and \((I)_{ij} = 0\) for \(i \neq j\). Its columns are the standard basis vectors of \(\mathbb{R}^n\).

Theorem 3.1.5. Let \(A\) be an \(m \times n\) matrix. Then \(I_m A = A\) and \(A I_n = A\).

Block Matrices

Definition (Block Matrix). An \(m \times n\) matrix \(A\) can be written as a \(k \times \ell\) block matrix where \(A_{ij}\) are submatrices (blocks) such that blocks in the same row have the same number of rows and blocks in the same column have the same number of columns. Block multiplication follows the same formula as ordinary matrix multiplication.

3.2 Linear Mappings

Theorem 3.2.1. Let \(A\) be \(m \times n\) and \(f(\vec{x}) = A\vec{x}\). Then for all \(\vec{x}, \vec{y} \in \mathbb{R}^n\) and \(b, c \in \mathbb{R}\), \(f(b\vec{x} + c\vec{y}) = bf(\vec{x}) + cf(\vec{y})\).

Definition (Linear Mapping). A function \(L : \mathbb{R}^n \to \mathbb{R}^m\) is a linear mapping if for every \(\vec{x}, \vec{y} \in \mathbb{R}^n\) and \(b, c \in \mathbb{R}\), \(L(b\vec{x} + c\vec{y}) = bL(\vec{x}) + cL(\vec{y})\).

The key consequence is that if \(\vec{x} = c_1\vec{v}_1 + \cdots + c_k\vec{v}_k\), then \(L(\vec{x}) = c_1 L(\vec{v}_1) + \cdots + c_k L(\vec{v}_k)\).

Theorem 3.2.2. Every linear mapping \(L : \mathbb{R}^n \to \mathbb{R}^m\) can be represented as \(L(\vec{x}) = [L]\vec{x}\) where \([L] = [L(\vec{e}_1) \cdots L(\vec{e}_n)]\).

Definition (Standard Matrix). The matrix \([L] = [L(\vec{e}_1) \cdots L(\vec{e}_n)]\) is the standard matrix of the linear mapping \(L\).

Theorem 3.2.3. Let \(R_\theta : \mathbb{R}^2 \to \mathbb{R}^2\) be rotation by angle \(\theta\). Then \([R_\theta] = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\), and the columns of this rotation matrix are orthogonal unit vectors.
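
For \(\theta = \pi/2\), the rotation matrix sends \(\vec{e}_1\) to \(\vec{e}_2\), up to floating-point error:

```python
import numpy as np

# Standard matrix of rotation by 90 degrees, applied to e1.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

rotated_e1 = R @ np.array([1.0, 0.0])   # approximately (0, 1)
```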

3.3 Special Subspaces

Kernel

Definition (Kernel). Let \(L : \mathbb{R}^n \to \mathbb{R}^m\) be linear. The kernel of \(L\) is \(\ker(L) = \{\vec{x} \in \mathbb{R}^n \mid L(\vec{x}) = \vec{0}\}\).

Lemma 3.3.1. If \(L : \mathbb{R}^n \to \mathbb{R}^m\) is linear, then \(L(\vec{0}) = \vec{0}\).

Theorem 3.3.2. Let \(L : \mathbb{R}^n \to \mathbb{R}^m\) be linear. Then \(\ker(L)\) is a subspace of \(\mathbb{R}^n\).

Range

Definition (Range). Let \(L : \mathbb{R}^n \to \mathbb{R}^m\) be linear. The range of \(L\) is \(R(L) = \{L(\vec{x}) \in \mathbb{R}^m \mid \vec{x} \in \mathbb{R}^n\}\).

Theorem 3.3.3. Let \(L : \mathbb{R}^n \to \mathbb{R}^m\) be linear. Then \(R(L)\) is a subspace of \(\mathbb{R}^m\).

Four Fundamental Subspaces

Theorem 3.3.4. Let \(L : \mathbb{R}^n \to \mathbb{R}^m\) be linear with standard matrix \(A = [L]\). Then \(\vec{x} \in \ker(L)\) if and only if \(A\vec{x} = \vec{0}\).

Definition (Nullspace). Let \(A\) be \(m \times n\). The nullspace of \(A\) is \(\operatorname{Null}(A) = \{\vec{x} \in \mathbb{R}^n \mid A\vec{x} = \vec{0}\}\).

Theorem 3.3.5. A consistent system \(A\vec{x} = \vec{b}\) has a unique solution if and only if \(\operatorname{Null}(A) = \{\vec{0}\}\).

Theorem 3.3.6. Let \(L : \mathbb{R}^n \to \mathbb{R}^m\) with standard matrix \(A = [\vec{a}_1 \cdots \vec{a}_n]\). Then \(R(L) = \operatorname{Span}\{\vec{a}_1, \ldots, \vec{a}_n\}\).

Definition (Columnspace). Let \(A = [\vec{a}_1 \cdots \vec{a}_n]\). The columnspace is \(\operatorname{Col}(A) = \operatorname{Span}\{\vec{a}_1, \ldots, \vec{a}_n\} = \{A\vec{x} \in \mathbb{R}^m \mid \vec{x} \in \mathbb{R}^n\}\).

Theorem 3.3.7. Let \(A\) be \(m \times n\). Then \(\operatorname{Col}(A) = \mathbb{R}^m\) if and only if \(\operatorname{rank}(A) = m\).

Definition (Rowspace). The rowspace of an \(m \times n\) matrix \(A\) is \(\operatorname{Row}(A) = \{A^T \vec{x} \in \mathbb{R}^n \mid \vec{x} \in \mathbb{R}^m\}\); equivalently, it is the span of the rows of \(A\), viewed as vectors in \(\mathbb{R}^n\).

Definition (Left Nullspace). The left nullspace of \(A\) is \(\operatorname{Null}(A^T) = \{\vec{x} \in \mathbb{R}^m \mid A^T \vec{x} = \vec{0}\}\).

The nullspace, columnspace, rowspace, and left nullspace are the four fundamental subspaces of a matrix.

Theorem 3.3.8. Let \(A\) be \(m \times n\). If \(\vec{a} \in \operatorname{Row}(A)\) and \(\vec{x} \in \operatorname{Null}(A)\), then \(\vec{a} \cdot \vec{x} = 0\).

Theorem 3.3.9. Let \(A\) be \(m \times n\). If \(\vec{a} \in \operatorname{Col}(A)\) and \(\vec{x} \in \operatorname{Null}(A^T)\), then \(\vec{a} \cdot \vec{x} = 0\).
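
Theorem 3.3.8 can be checked on a concrete matrix: any vector in \(\operatorname{Null}(A)\) is orthogonal to every row of \(A\). The matrix and nullspace vector below are illustrative:

```python
import numpy as np

# x is in Null(A) since A x = 0; each entry of A @ x is the dot
# product of a row of A with x, so all the dot products are zero.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
x = np.array([1.0, 1.0, -1.0])

row_dots = A @ x                       # (0, 0)
in_nullspace = np.allclose(row_dots, 0)
```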

3.4 Operations on Linear Mappings

Definition (Addition and Scalar Multiplication of Mappings). Let \(L, M : \mathbb{R}^n \to \mathbb{R}^m\) be linear and \(c \in \mathbb{R}\). Define \((L + M)(\vec{x}) = L(\vec{x}) + M(\vec{x})\) and \((cL)(\vec{x}) = cL(\vec{x})\).

Theorem 3.4.1. The set \(\mathcal{L}\) of all linear mappings \(L : \mathbb{R}^n \to \mathbb{R}^m\) satisfies properties V1–V10.

Theorem 3.4.2. For linear mappings \(L, M : \mathbb{R}^n \to \mathbb{R}^m\) and \(c \in \mathbb{R}\), \([L + M] = [L] + [M]\) and \([cL] = c[L]\).

Composition

Definition (Composition). Let \(L : \mathbb{R}^n \to \mathbb{R}^m\) and \(M : \mathbb{R}^m \to \mathbb{R}^p\) be linear. Then \((M \circ L)(\vec{x}) = M(L(\vec{x}))\).

Theorem 3.4.3. If \(L : \mathbb{R}^n \to \mathbb{R}^m\) and \(M : \mathbb{R}^m \to \mathbb{R}^p\) are linear, then \(M \circ L\) is linear and \([M \circ L] = [M][L]\).
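
The correspondence between composition and matrix multiplication can be verified numerically; the two standard matrices below are arbitrary examples:

```python
import numpy as np

# Applying L then M to x matches multiplying by [M][L].
L_mat = np.array([[1.0, 2.0],
                  [0.0, 1.0],
                  [1.0, 0.0]])        # L : R^2 -> R^3
M_mat = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0]])   # M : R^3 -> R^2

x = np.array([1.0, 1.0])
step_by_step = M_mat @ (L_mat @ x)    # (4, 1)
composed = (M_mat @ L_mat) @ x        # same vector
```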

Chapter 4: Vector Spaces

4.1 Vector Spaces

Definition (Vector Space). A set \(V\) with operations of addition \(\vec{x} + \vec{y}\) and scalar multiplication \(c\vec{x}\) is a vector space over \(\mathbb{R}\) if for any \(\vec{v}, \vec{x}, \vec{y} \in V\) and \(a, b \in \mathbb{R}\):
V1: \(\vec{x} + \vec{y} \in V\); V2: associativity of addition; V3: commutativity of addition; V4: existence of zero vector \(\vec{0}\); V5: existence of additive inverse \(-\vec{x}\); V6: \(a\vec{x} \in V\); V7: \(a(b\vec{x}) = (ab)\vec{x}\); V8: \((a+b)\vec{x} = a\vec{x} + b\vec{x}\); V9: \(a(\vec{x}+\vec{y}) = a\vec{x} + a\vec{y}\); V10: \(1\vec{x} = \vec{x}\).

Examples. The following are vector spaces: \(\mathbb{R}^n\) and its subspaces; \(M_{m \times n}(\mathbb{R})\); the set of linear mappings \(\mathbb{R}^n \to \mathbb{R}^m\); \(P_n(\mathbb{R})\) (polynomials of degree at most \(n\)); \(C[a,b]\) (continuous functions on \([a,b]\)); the trivial vector space \(\{\vec{0}\}\).

Theorem 4.1.1. Let \(V\) be a vector space. Then:
(1) \(0\vec{x} = \vec{0}\) for all \(\vec{x} \in V\).
(2) \(-\vec{x} = (-1)\vec{x}\) for all \(\vec{x} \in V\).

Subspaces of Vector Spaces

Definition (Subspace). Let \(V\) be a vector space. If \(S \subseteq V\) and \(S\) is a vector space under the same operations as \(V\), then \(S\) is a subspace of \(V\).

Theorem 4.1.2 (Subspace Test). Let \(S\) be a non-empty subset of a vector space \(V\). If \(\vec{x} + \vec{y} \in S\) and \(c\vec{x} \in S\) for all \(\vec{x}, \vec{y} \in S\) and \(c \in \mathbb{R}\), then \(S\) is a subspace of \(V\).

Spanning and Linear Independence in Vector Spaces

Definition (Span in a Vector Space). Let \(B = \{\vec{v}_1, \ldots, \vec{v}_k\}\) be a set of vectors in a vector space \(V\). Then \(\operatorname{Span} B = \{c_1\vec{v}_1 + \cdots + c_k\vec{v}_k \mid c_1, \ldots, c_k \in \mathbb{R}\}\).

Theorem 4.1.3. Let \(B = \{\vec{v}_1, \ldots, \vec{v}_k\}\) be a set of vectors in a vector space \(V\). Then \(\operatorname{Span} B\) is a subspace of \(V\).

Definition (Linear Dependence/Independence in a Vector Space). A set \(\{\vec{v}_1, \ldots, \vec{v}_k\}\) in a vector space \(V\) is linearly dependent if there exist \(c_1, \ldots, c_k\), not all zero, with \(\vec{0} = c_1\vec{v}_1 + \cdots + c_k\vec{v}_k\). It is linearly independent if the only solution is the trivial one.

Theorem 4.1.4. Any set of vectors containing the zero vector is linearly dependent.

4.2 Bases and Dimension

Theorem 4.2.1. If \(\vec{v}_i \in \operatorname{Span}\{\vec{v}_1, \ldots, \vec{v}_{i-1}, \vec{v}_{i+1}, \ldots, \vec{v}_k\}\), then \(\operatorname{Span}\{\vec{v}_1, \ldots, \vec{v}_k\} = \operatorname{Span}\{\vec{v}_1, \ldots, \vec{v}_{i-1}, \vec{v}_{i+1}, \ldots, \vec{v}_k\}\).

Theorem 4.2.2. If \(\vec{v}_i \notin \operatorname{Span}\{\vec{v}_1, \ldots, \vec{v}_{i-1}, \vec{v}_{i+1}, \ldots, \vec{v}_k\}\) for \(1 \leq i \leq k\), then \(\{\vec{v}_1, \ldots, \vec{v}_k\}\) is linearly independent.

Definition (Basis). Let \(V\) be a vector space. The set \(B\) is a basis for \(V\) if \(B\) is a linearly independent spanning set for \(V\).

A basis is both a minimal spanning set and a maximal linearly independent set.

Dimension

Theorem 4.2.3. Let \(B = \{\vec{v}_1, \ldots, \vec{v}_n\}\) be a basis for \(V\), and let \(\{\vec{w}_1, \ldots, \vec{w}_k\}\) be a linearly independent set in \(V\). Then \(k \leq n\).

Theorem 4.2.4. If \(\{\vec{v}_1, \ldots, \vec{v}_n\}\) and \(\{\vec{w}_1, \ldots, \vec{w}_k\}\) are both bases for \(V\), then \(k = n\).

Definition (Dimension). If \(\{\vec{v}_1, \ldots, \vec{v}_n\}\) is a basis for \(V\), then the dimension of \(V\) is \(\dim V = n\). The trivial vector space has dimension 0 (its basis is the empty set). A vector space that has no finite basis is infinite-dimensional.

Theorem 4.2.5. Let \(V\) be an \(n\)-dimensional vector space. Then:
(1) A set of more than \(n\) vectors in \(V\) must be linearly dependent.
(2) A set of fewer than \(n\) vectors in \(V\) cannot span \(V\).
(3) A set of \(n\) vectors in \(V\) is linearly independent if and only if it spans \(V\).

Theorem 4.2.6 (Extension Theorem). Let \(V\) be \(n\)-dimensional and \(\{\vec{v}_1, \ldots, \vec{v}_k\}\) be linearly independent with \(k < n\). Then there exist vectors \(\vec{w}_{k+1}, \ldots, \vec{w}_n\) such that \(\{\vec{v}_1, \ldots, \vec{v}_k, \vec{w}_{k+1}, \ldots, \vec{w}_n\}\) is a basis for \(V\).

Standard dimensions. \(\dim \mathbb{R}^n = n\); \(\dim M_{m \times n}(\mathbb{R}) = mn\); \(\dim P_n(\mathbb{R}) = n + 1\).

4.3 Coordinates

Theorem 4.3.1. Let \(B = \{\vec{v}_1, \ldots, \vec{v}_n\}\) be a basis for \(V\). Then every \(\vec{v} \in V\) can be represented as a unique linear combination of \(\vec{v}_1, \ldots, \vec{v}_n\).

Definition (Coordinate Vector). Let \(V\) be a vector space with basis \(B = \{\vec{v}_1, \ldots, \vec{v}_n\}\). For \(\vec{v} = b_1\vec{v}_1 + \cdots + b_n\vec{v}_n\), the coordinate vector of \(\vec{v}\) with respect to \(B\) is \([\vec{v}]_B = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}\).

Theorem 4.3.2. Let \(B = \{\vec{v}_1, \ldots, \vec{v}_n\}\) be a basis for \(V\). Then for any \(\vec{v}, \vec{w} \in V\) and \(s, t \in \mathbb{R}\), \([s\vec{v} + t\vec{w}]_B = s[\vec{v}]_B + t[\vec{w}]_B\).

Change of Coordinates

Definition (Change of Coordinates Matrix). Let \(B = \{\vec{v}_1, \ldots, \vec{v}_n\}\) and \(C\) both be bases for \(V\). The change of coordinates matrix from \(B\)-coordinates to \(C\)-coordinates is \({}_C P_B = [[\vec{v}_1]_C \cdots [\vec{v}_n]_C]\), and \([\vec{x}]_C = {}_C P_B [\vec{x}]_B\).

Theorem 4.3.3. Let \(B\) and \(C\) be bases for an \(n\)-dimensional vector space \(V\). Then \({}_C P_B \cdot {}_B P_C = I = {}_B P_C \cdot {}_C P_B\).
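
As an illustration, take \(C\) to be the standard basis of \(\mathbb{R}^2\) and \(B = \{(1,1), (1,-1)\}\); then the columns of \({}_C P_B\) are simply the vectors of \(B\) (a sketch with arbitrary coordinates):

```python
import numpy as np

# _C P _B for B = {(1,1), (1,-1)} and C the standard basis.
P = np.array([[1.0, 1.0],
              [1.0, -1.0]])

x_B = np.array([2.0, 3.0])       # coordinates of x relative to B
x_C = P @ x_B                    # (5, -1): standard coordinates

# Going back: _B P _C is the inverse of _C P _B (Theorem 4.3.3),
# so solving P y = x_C recovers the B-coordinates.
x_B_again = np.linalg.solve(P, x_C)
```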

Chapter 5: Inverses and Determinants

5.1 Matrix Inverses

Definition (Left and Right Inverse). Let \(A\) be \(m \times n\). If \(B\) is \(n \times m\) with \(AB = I_m\), then \(B\) is a right inverse of \(A\). If \(CA = I_n\), then \(C\) is a left inverse of \(A\).

Theorem 5.1.1. If \(A\) is \(m \times n\) with \(m > n\), then \(A\) cannot have a right inverse.

Corollary 5.1.2. If \(A\) is \(m \times n\) with \(m < n\), then \(A\) cannot have a left inverse.

Theorem 5.1.3. Let \(A, B, C\) be \(n \times n\) matrices with \(AB = I = CA\). Then \(B = C\).

Definition (Inverse / Invertible). Let \(A\) be \(n \times n\). If \(B\) satisfies \(AB = I = BA\), then \(B = A^{-1}\) is the inverse of \(A\), and \(A\) is invertible.

Theorem 5.1.4. If \(A\) is \(n \times n\) and there exists \(B\) with \(AB = I\), then \(\operatorname{rank} A = n = \operatorname{rank} B\) and \(BA = I\). Hence \(A\) is invertible.

Theorem 5.1.5. Let \(A, B\) be invertible and \(c \neq 0\). Then:
(1) \((cA)^{-1} = \frac{1}{c}A^{-1}\).
(2) \((A^T)^{-1} = (A^{-1})^T\).
(3) \((AB)^{-1} = B^{-1}A^{-1}\).

Theorem 5.1.6. If \(A\) is \(n \times n\) with \(\operatorname{rank} A = n\), then \(A\) is invertible. Moreover, \([A \mid I] \sim [I \mid A^{-1}]\).
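
Numerically, `np.linalg.inv` plays the role of row-reducing \([A \mid I]\) to \([I \mid A^{-1}]\) (a sketch on a \(2 \times 2\) example with determinant 1):

```python
import numpy as np

# A has rank 2, so it is invertible; check A A^{-1} = I.
A = np.array([[2.0, 1.0],
              [5.0, 3.0]])

A_inv = np.linalg.inv(A)        # [[3, -1], [-5, 2]]
identity_check = A @ A_inv      # approximately I
```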

Invertible Matrix Theorem

Theorem 5.1.7 (Invertible Matrix Theorem). For an \(n \times n\) matrix \(A\), the following are equivalent:
(1) \(A\) is invertible.
(2) The RREF of \(A\) is \(I\).
(3) \(\operatorname{rank} A = n\).
(4) \(A\vec{x} = \vec{b}\) is consistent with a unique solution for all \(\vec{b} \in \mathbb{R}^n\).
(5) \(\operatorname{Null}(A) = \{\vec{0}\}\).
(6) The columns of \(A\) form a basis for \(\mathbb{R}^n\).
(7) The rows of \(A\) form a basis for \(\mathbb{R}^n\).
(8) \(A^T\) is invertible.

If \(A\) is invertible, the unique solution to \(A\vec{x} = \vec{b}\) is \(\vec{x} = A^{-1}\vec{b}\).

5.2 Elementary Matrices

Definition (Elementary Matrix). An \(n \times n\) matrix \(E\) is an elementary matrix if it can be obtained from \(I\) by performing exactly one elementary row operation.

Every elementary matrix is invertible; its inverse is the elementary matrix corresponding to the reverse row operation.

Theorem 5.2.1. Let \(E\) be the elementary matrix for \(R_i + cR_j\). Then \(EA\) is the matrix obtained from \(A\) by performing \(R_i + cR_j\) on \(A\).

Theorem 5.2.2. Let \(E\) be the elementary matrix for \(cR_i\). Then \(EA\) is obtained from \(A\) by performing \(cR_i\).

Theorem 5.2.3. Let \(E\) be the elementary matrix for \(R_i \leftrightarrow R_j\). Then \(EA\) is obtained from \(A\) by swapping rows \(i\) and \(j\).
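
Multiplying by an elementary matrix performs its row operation, as a small example shows (here \(E\) encodes \(R_1 \leftrightarrow R_2\)):

```python
import numpy as np

# E is obtained from I by swapping rows 1 and 2, so EA swaps the
# rows of A (Theorem 5.2.3).
E = np.array([[0.0, 1.0],
              [1.0, 0.0]])
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

EA = E @ A    # [[3, 4], [1, 2]]
```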

Corollary 5.2.4. If \(E\) is an \(m \times m\) elementary matrix and \(A\) is \(m \times n\), then \(\operatorname{rank}(EA) = \operatorname{rank} A\).

Theorem 5.2.5. For any \(m \times n\) matrix \(A\), there exist elementary matrices \(E_1, \ldots, E_k\) such that \(E_k \cdots E_1 A = R\) where \(R\) is the RREF of \(A\).

Corollary 5.2.6. If \(A\) is an \(n \times n\) invertible matrix, then both \(A\) and \(A^{-1}\) can be written as products of elementary matrices.

Theorem 5.2.7. If \(E\) is an elementary matrix, then \(E^T\) is also an elementary matrix.

5.3 Determinants

Definition (\(2 \times 2\) Determinant). For \(A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\), \(\det A = ad - bc\).

Definition (Cofactor). Let \(A\) be \(n \times n\) with \(n \geq 2\). Let \(A(i,j)\) be the \((n-1) \times (n-1)\) matrix obtained by deleting the \(i\)-th row and \(j\)-th column. The cofactor of \(a_{ij}\) is \(C_{ij} = (-1)^{i+j} \det A(i,j)\).

Definition (\(n \times n\) Determinant). For an \(n \times n\) matrix \(A\) with \(n \geq 2\), \(\det A = \sum_{i=1}^n a_{i1} C_{i1}\), where \(\det[c] = c\) for \(1 \times 1\) matrices.

Theorem 5.3.1 (Cofactor Expansion). For any \(n \times n\) matrix \(A\), \(\det A = \sum_{k=1}^n a_{ik} C_{ik}\) (expansion along the \(i\)-th row) or \(\det A = \sum_{k=1}^n a_{kj} C_{kj}\) (expansion along the \(j\)-th column).
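
The cofactor expansion translates into a direct recursive routine. The sketch below expands along the first column exactly as in the definition; it runs in \(O(n!)\) time, so it is for illustration only:

```python
# Recursive determinant via cofactor expansion along column 1
# (0-indexed column 0), matching the n x n definition above.
def det(A):
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for i in range(n):
        # A(i, 0): delete row i and column 0.
        minor = [row[1:] for k, row in enumerate(A) if k != i]
        cofactor = (-1) ** i * det(minor)   # sign (-1)^{i+j} with j = 0
        total += A[i][0] * cofactor
    return total

example = det([[1, 2, 3],
               [4, 5, 6],
               [7, 8, 10]])   # -3
```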

Triangular Matrices

Definition (Upper/Lower Triangular). An \(m \times n\) matrix \(U\) is upper triangular if \(u_{ij} = 0\) for \(i > j\). An \(m \times n\) matrix \(L\) is lower triangular if \(l_{ij} = 0\) for \(i < j\).

Theorem 5.3.2. If \(A\) is an \(n \times n\) upper or lower triangular matrix, then \(\det A = a_{11} a_{22} \cdots a_{nn}\).

Determinants and Row Operations

Theorem 5.3.3. If \(B\) is obtained from \(A\) by multiplying one row by \(c\), then \(\det B = c \det A\).

Theorem 5.3.4. If \(B\) is obtained from \(A\) by swapping two rows, then \(\det B = -\det A\).

Corollary 5.3.5. If \(A\) has two identical rows, then \(\det A = 0\).

Theorem 5.3.6. If \(B\) is obtained from \(A\) by adding a multiple of one row to another, then \(\det B = \det A\).

Corollary 5.3.7. If \(E\) is an \(n \times n\) elementary matrix and \(A\) is \(n \times n\), then \(\det(EA) = \det E \cdot \det A\).

Theorem 5.3.8 (Addition to IMT). An \(n \times n\) matrix \(A\) is invertible if and only if \(\det A \neq 0\).

Theorem 5.3.9. If \(A\) and \(B\) are \(n \times n\), then \(\det(AB) = \det A \cdot \det B\).

Corollary 5.3.10. If \(A\) is invertible, then \(\det A^{-1} = \frac{1}{\det A}\).

Theorem 5.3.11. For any \(n \times n\) matrix \(A\), \(\det A = \det A^T\).
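Theorems 5.3.9 and 5.3.11 are easy to check numerically in the \(2 \times 2\) case, where \(\det A = ad - bc\). A small illustrative sketch (all helper names are ad hoc):

```python
def det2(M):
    (a, b), (c, d) = M
    return a * d - b * c

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose2(M):
    return [[M[j][i] for j in range(2)] for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]
assert det2(matmul2(A, B)) == det2(A) * det2(B)   # det(AB) = det A * det B
assert det2(transpose2(A)) == det2(A)             # det A = det A^T
```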

5.4 Determinants and Systems of Equations

Inverse by Cofactors

Lemma 5.4.1. Let \(A\) be \(n \times n\) with cofactors \(C_{ij}\). Then \(\sum_{k=1}^n a_{ik} C_{jk} = 0\) for \(i \neq j\).

Theorem 5.4.2. If \(A\) is invertible, then \((A^{-1})_{ij} = \frac{1}{\det A} C_{ji}\).

Definition (Cofactor Matrix). The cofactor matrix \(\operatorname{cof} A\) has \((\operatorname{cof} A)_{ij} = C_{ij}\).

Definition (Adjugate). The adjugate of \(A\) is \(\operatorname{adj} A = (\operatorname{cof} A)^T\), and \(A^{-1} = \frac{1}{\det A} \operatorname{adj} A\).
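In the \(2 \times 2\) case the adjugate formula recovers the familiar inverse \(\frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}\). A sketch using exact rational arithmetic (the helper name `inverse2` is illustrative):

```python
from fractions import Fraction

def inverse2(A):
    """A^{-1} = (1/det A) adj A for a 2x2 matrix, with exact fractions."""
    (a, b), (c, d) = A
    det = a * d - b * c
    assert det != 0, "only invertible matrices have an inverse"
    # cof A = [[d, -c], [-b, a]]; adj A = (cof A)^T = [[d, -b], [-c, a]]
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[1, 2], [3, 4]]
Ainv = inverse2(A)
# Check A * Ainv = I
I = [[sum(A[i][k] * Ainv[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]
assert I == [[1, 0], [0, 1]]
```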

Cramer’s Rule

Theorem 5.4.3 (Cramer’s Rule). If \(A\) is invertible, the solution of \(A\vec{x} = \vec{b}\) is given by \(x_i = \frac{\det A_i}{\det A}\), where \(A_i\) is the matrix obtained from \(A\) by replacing its \(i\)-th column with \(\vec{b}\).
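Cramer's rule for a \(2 \times 2\) system can be sketched as follows (the helper name `cramer2` is illustrative; for larger systems, row reduction is far more efficient):

```python
def cramer2(A, b):
    """Solve A x = b for an invertible 2x2 matrix A via Cramer's rule."""
    detA = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    assert detA != 0, "Cramer's rule requires an invertible A"
    # A_1: replace column 1 of A with b; A_2: replace column 2 with b
    detA1 = b[0] * A[1][1] - A[0][1] * b[1]
    detA2 = A[0][0] * b[1] - b[0] * A[1][0]
    return [detA1 / detA, detA2 / detA]

# x + 2y = 5 and 3x + 4y = 6 has solution x = -4, y = 9/2
print(cramer2([[1, 2], [3, 4]], [5, 6]))
```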

5.5 Area and Volume

The determinant has a geometric interpretation. For vectors \(\vec{u}, \vec{v} \in \mathbb{R}^2\), the area of the parallelogram they induce is \(|\det[\vec{u}\ \vec{v}]|\). For \(\vec{u}, \vec{v}, \vec{w} \in \mathbb{R}^3\), the volume of the parallelepiped is \(|\det[\vec{u}\ \vec{v}\ \vec{w}]|\).

Remark. More generally, if \(\vec{v}_1, \ldots, \vec{v}_n \in \mathbb{R}^n\), the \(n\)-volume of the parallelotope they induce is \(|\det[\vec{v}_1 \cdots \vec{v}_n]|\).
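These quantities can be computed directly from the \(2 \times 2\) determinant formula and the first-row cofactor expansion of a \(3 \times 3\) determinant. A sketch (function names are illustrative):

```python
def area_parallelogram(u, v):
    """Area induced by u, v in R^2: |det[u v]|."""
    return abs(u[0] * v[1] - u[1] * v[0])

def volume_parallelepiped(u, v, w):
    """Volume induced by u, v, w in R^3: |det[u v w]|,
    expanded along the first row of the matrix with columns u, v, w."""
    return abs(u[0] * (v[1] * w[2] - v[2] * w[1])
             - v[0] * (u[1] * w[2] - u[2] * w[1])
             + w[0] * (u[1] * v[2] - u[2] * v[1]))

print(area_parallelogram([3, 1], [1, 2]))                      # |3*2 - 1*1| = 5
print(volume_parallelepiped([2, 0, 0], [0, 3, 0], [0, 0, 4]))  # 2*3*4 = 24
```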

Chapter 6: Diagonalization

6.1 Matrix of a Linear Mapping and Similar Matrices

Definition (B-Matrix). Let \(B = \{\vec{v}_1, \ldots, \vec{v}_n\}\) be a basis for \(\mathbb{R}^n\) and \(L : \mathbb{R}^n \to \mathbb{R}^n\) a linear operator. The \(B\)-matrix of \(L\) is \([L]_B = [[L(\vec{v}_1)]_B \cdots [L(\vec{v}_n)]_B]\), satisfying \([L(\vec{x})]_B = [L]_B [\vec{x}]_B\).

If \(P = [\vec{v}_1 \cdots \vec{v}_n]\) is the change of coordinates matrix from \(B\) to the standard basis, then \([L]_B = P^{-1}[L]P\), where \([L]\) denotes the standard matrix of \(L\).

Definition (Diagonal Matrix). An \(n \times n\) matrix \(D\) is diagonal if \(d_{ij} = 0\) for \(i \neq j\). We write \(D = \operatorname{diag}(d_{11}, \ldots, d_{nn})\).

Similar Matrices

Theorem 6.1.1. Let \(A\) and \(B\) be \(n \times n\) with \(P^{-1}AP = B\) for some invertible \(P\). Then:
(1) \(\operatorname{rank} A = \operatorname{rank} B\).
(2) \(\det A = \det B\).
(3) \(\operatorname{tr} A = \operatorname{tr} B\), where \(\operatorname{tr} A = \sum_{i=1}^n a_{ii}\).

Definition (Similar Matrices). Matrices \(A\) and \(B\) are similar if there exists an invertible \(P\) with \(P^{-1}AP = B\).

6.2 Eigenvalues and Eigenvectors

If \(A\) is diagonalizable with \(P^{-1}AP = D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)\), then the columns \(\vec{v}_i\) of \(P\) satisfy \(A\vec{v}_i = \lambda_i \vec{v}_i\) with \(\vec{v}_i \neq \vec{0}\).

Definition (Eigenvalue, Eigenvector, Eigenpair). Let \(A\) be \(n \times n\). If \(A\vec{v} = \lambda\vec{v}\) for some \(\vec{v} \neq \vec{0}\), then \(\lambda\) is an eigenvalue, \(\vec{v}\) is an eigenvector, and \((\lambda, \vec{v})\) is an eigenpair.

Definition (Eigenvalues/Eigenvectors of a Linear Operator). Let \(L : \mathbb{R}^n \to \mathbb{R}^n\) be a linear operator. If \(L(\vec{v}) = \lambda\vec{v}\) for \(\vec{v} \neq \vec{0}\), then \(\lambda\) is an eigenvalue and \(\vec{v}\) an eigenvector of \(L\).

To find eigenvalues, we solve \((A - \lambda I)\vec{v} = \vec{0}\); for non-trivial solutions we need \(\det(A - \lambda I) = 0\).

Definition (Characteristic Polynomial). The characteristic polynomial of an \(n \times n\) matrix \(A\) is \(C(\lambda) = \det(A - \lambda I)\).

Theorem 6.2.1. A scalar \(\lambda\) is an eigenvalue of \(A\) if and only if \(C(\lambda) = 0\).
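For a \(2 \times 2\) matrix, \(C(\lambda) = \det(A - \lambda I) = \lambda^2 - (\operatorname{tr} A)\lambda + \det A\), so Theorem 6.2.1 reduces to the quadratic formula. A sketch that returns only the real roots, in line with this course working over \(\mathbb{R}\) (the function name is illustrative):

```python
import math

def eigenvalues_2x2(A):
    """Real eigenvalues of a 2x2 matrix from C(lambda) = lambda^2 - tr*lambda + det."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = tr * tr - 4 * det
    if disc < 0:
        return []  # both roots complex: no real eigenvalues
    r = math.sqrt(disc)
    return sorted([(tr - r) / 2, (tr + r) / 2])

print(eigenvalues_2x2([[1, 2], [-1, 4]]))   # [2.0, 3.0]
print(eigenvalues_2x2([[0, -1], [1, 0]]))   # [] (rotation by 90 degrees)
```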

Definition (Eigenspace). The eigenspace of the eigenvalue \(\lambda\) is \(E_\lambda = \operatorname{Null}(A - \lambda I)\).

Definition (Algebraic and Geometric Multiplicity). If \(C(\lambda) = (\lambda - \lambda_1)^k C_1(\lambda)\) with \(C_1(\lambda_1) \neq 0\), then the algebraic multiplicity is \(a_{\lambda_1} = k\). The geometric multiplicity is \(g_{\lambda_1} = \dim(E_{\lambda_1})\).

Lemma 6.2.2. If \(A\) and \(B\) are similar, they have the same characteristic polynomial and hence the same eigenvalues.

Theorem 6.2.3. For any eigenvalue \(\lambda_1\) of \(A\), \(1 \leq g_{\lambda_1} \leq a_{\lambda_1}\).

6.3 Diagonalization

Definition (Diagonalizable). An \(n \times n\) matrix \(A\) is diagonalizable if \(A\) is similar to a diagonal matrix \(D\). If \(P^{-1}AP = D\), then \(P\) diagonalizes \(A\).

Lemma 6.3.1. Let \(A\) have eigenpairs \((\lambda_1, \vec{v}_1), \ldots, (\lambda_k, \vec{v}_k)\) with \(\lambda_i \neq \lambda_j\) for \(i \neq j\). Then \(\{\vec{v}_1, \ldots, \vec{v}_k\}\) is linearly independent.

Theorem 6.3.2. Let \(A\) have distinct eigenvalues \(\lambda_1, \ldots, \lambda_k\), and let \(B_i\) be a basis for \(E_{\lambda_i}\). Then \(B_1 \cup \cdots \cup B_k\) is a linearly independent set.

Corollary 6.3.3. Let \(A\) be an \(n \times n\) matrix whose characteristic polynomial factors completely over \(\mathbb{R}\), with distinct eigenvalues \(\lambda_1, \ldots, \lambda_k\). Then \(A\) is diagonalizable if and only if \(g_{\lambda_i} = a_{\lambda_i}\) for all \(1 \leq i \leq k\).

Corollary 6.3.4. An \(n \times n\) matrix with \(n\) distinct eigenvalues is diagonalizable.

The diagonalization algorithm:
(1) Factor the characteristic polynomial \(C(\lambda)\).
(2) If any root is complex, then \(A\) is not diagonalizable over \(\mathbb{R}\).
(3) Find a basis for each eigenspace \(E_\lambda\).
(4) If \(g_\lambda < a_\lambda\) for any eigenvalue, then \(A\) is not diagonalizable. Otherwise, take the columns of \(P\) to be the basis eigenvectors and set \(D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)\), listing each eigenvalue in the same position as its corresponding eigenvector.

6.4 Powers of Matrices

Theorem 6.4.1. If \(P^{-1}AP = D\) for a diagonal matrix \(D\), then \(A^k = P D^k P^{-1}\) for any positive integer \(k\).

Since \(D^k = \operatorname{diag}(d_1^k, \ldots, d_n^k)\), this makes computing large powers of diagonalizable matrices very efficient.

Example. For \(A = \begin{pmatrix} 1 & 2 \\ -1 & 4 \end{pmatrix}\), the eigenvalues are \(\lambda_1 = 2\) and \(\lambda_2 = 3\), giving \(P = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}\) and \(D = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}\). Then \(A^{1000} = P \begin{pmatrix} 2^{1000} & 0 \\ 0 & 3^{1000} \end{pmatrix} P^{-1} = \begin{pmatrix} 2^{1001} - 3^{1000} & -2^{1001} + 2 \cdot 3^{1000} \\ 2^{1000} - 3^{1000} & -2^{1000} + 2 \cdot 3^{1000} \end{pmatrix}\).
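Theorem 6.4.1 and the closed form above can be checked for a small power such as \(k = 5\), where repeated multiplication is still feasible (the helper `matmul` is an ad hoc multiplication routine):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2], [-1, 4]]
P = [[2, 1], [1, 1]]
Pinv = [[1, -1], [-1, 2]]   # det P = 1, so P^{-1} has integer entries

# A^5 by repeated multiplication
k = 5
Ak = A
for _ in range(k - 1):
    Ak = matmul(Ak, A)

# A^5 via Theorem 6.4.1: P D^5 P^{-1}
Dk = [[2 ** k, 0], [0, 3 ** k]]
assert matmul(matmul(P, Dk), Pinv) == Ak

# The closed form from the example, evaluated at k = 5
assert Ak == [[2 ** 6 - 3 ** 5, -2 ** 6 + 2 * 3 ** 5],
              [2 ** 5 - 3 ** 5, -2 ** 5 + 2 * 3 ** 5]]
```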
