Learning #31 – Linear Algebra for Scientists: Matrices, Eigenvectors, Geometric Transformations and PCA

Most scientists encounter linear algebra as a compulsory course before they need it. The result is that many learn the mechanics (row reduction, matrix multiplication) without the geometry. This post tries to do the opposite: start from geometric meaning and derive the algebra as the natural language for describing it.

1. Matrices as Linear Maps

A matrix A of size m×n defines a linear function f: ℝⁿ → ℝ^m by f(x) = Ax. Linear means two properties hold:

f(x + y) = f(x) + f(y)
f(αx) = α f(x)

Geometrically, a linear map sends the origin to itself and sends straight lines to straight lines, except that a line whose direction lies in the null space of A collapses to a single point (for square A this can happen only when det A = 0). The columns of A tell you exactly where the standard basis vectors go: column j of A is f(e_j).
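The two linearity properties and the column rule can be checked numerically. The following is a minimal sketch using NumPy; the matrix A and the vectors x, y are arbitrary illustrative values, not taken from any particular application.

```python
import numpy as np

# Hypothetical example matrix: a 2x3 matrix, i.e. a map from R^3 to R^2.
A = np.array([[1.0, 2.0,  0.0],
              [0.0, 1.0, -1.0]])

def f(x):
    """The linear map defined by A."""
    return A @ x

x = np.array([1.0, -2.0, 3.0])
y = np.array([0.5, 4.0, -1.0])
alpha = 2.5

# The two defining properties of linearity.
additive = np.allclose(f(x + y), f(x) + f(y))
homogeneous = np.allclose(f(alpha * x), alpha * f(x))

# Column j of A is the image of the j-th standard basis vector e_j.
E = np.eye(3)
columns_match = all(np.array_equal(f(E[:, j]), A[:, j]) for j in range(3))

print(additive, homogeneous, columns_match)
```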

Composition and Basis Change

Composition:   (A ∘ B) x = A(Bx)   →   matrix product AB

Change of basis from B to C:
  [v]_C = M_{C←B} [v]_B   where M_{C←B} = C⁻¹ B

Similarity transform: A in basis C   →   A' = P⁻¹ A P
(P = change-of-basis matrix, columns = new basis vectors in old coords)

For orthonormal bases: P⁻¹ = Pᵀ  (rotation/reflection matrices)

This is why the choice of coordinate system matters so much in physics: expressing a tensor in its principal axes (the basis of eigenvectors) makes its action diagonal and interpretable. The inertia tensor of a rigid body becomes I₁, I₂, I₃ along the principal axes; the stress tensor at a point becomes three principal stresses without shear.
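As an illustration of the formulas above, here is a small NumPy sketch with a hypothetical symmetric matrix A and a basis P chosen to be its eigenvectors, so the similarity transform comes out diagonal. The specific numbers are assumptions made for the example.

```python
import numpy as np

# Hypothetical example: A acts in the standard (old) basis.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Columns of P are the new basis vectors written in old coordinates.
P = np.array([[1.0,  1.0],
              [1.0, -1.0]])

# Similarity transform: the same map expressed in the new basis.
A_new = np.linalg.inv(P) @ A @ P

# Coordinates convert as v_old = P @ v_new, so both descriptions agree.
v_new = np.array([2.0, -1.0])
v_old = P @ v_new
same_action = np.allclose(A @ v_old, P @ (A_new @ v_new))

# Normalising the columns of P gives an orthonormal basis: inverse = transpose.
Q = P / np.sqrt(2.0)
inverse_is_transpose = np.allclose(np.linalg.inv(Q), Q.T)

# Composition: applying B and then A is the single matrix A @ B.
B = np.array([[0.0, -1.0],
              [1.0,  0.0]])   # rotation by 90 degrees
x = np.array([1.0, 3.0])
composition_ok = np.allclose(A @ (B @ x), (A @ B) @ x)

print(A_new)   # diagonal in the eigenvector basis, up to rounding
```

Because the columns of P happen to be eigenvectors of A, the transformed matrix is diagonal; for a generic P it would simply be another dense representation of the same map.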

2. Determinants as Signed Volume Scaling

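The determinant of a square matrix A equals the signed volume of the parallelotope spanned by its column vectors. The 2×2 and 3×3 cases have closed forms, written here for A = [a b; c d] and A = [a b c; d e f; g h i] (rows separated by semicolons):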

Determinant — Geometric and Algebraic Forms

2×2:  det(A) = ad − bc
       (signed area of parallelogram spanned by columns)

3×3:  det(A) = a(ei−fh) − b(di−fg) + c(dh−eg)
       (Sarrus rule / cofactor expansion along row 1)

Properties:
  det(AB) = det(A) det(B)
  det(Aᵀ) = det(A)         (transpose preserves volume)
  det(A⁻¹) = 1/det(A)
  det(αA) = αⁿ det(A)      (A is n×n; scaling one row by α scales det by α)
  det(A) = 0  ⇔  A is singular  ⇔  columns are linearly dependent

In the interactive matrix transforms visualiser, the determinant determines how a shape’s area changes under the transformation: a unit square becomes a parallelogram with area |det A|. A negative determinant indicates a reflection (orientation reversal). When det = 0, the entire plane collapses onto a line or a point.
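The area interpretation can be verified directly by transforming the unit square and measuring the result. This sketch uses the shoelace formula for the signed polygon area; the helper name polygon_area and the example matrices are illustrative choices, not part of the visualiser.

```python
import numpy as np

def polygon_area(vertices):
    """Signed area of a polygon (shoelace formula); rows are (x, y) vertices."""
    x, y = vertices[:, 0], vertices[:, 1]
    return 0.5 * np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)

# Unit square, vertices listed counter-clockwise (signed area +1).
unit_square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])

A = np.array([[2.0, 1.0],
              [0.5, 3.0]])            # det = 2*3 - 1*0.5 = 5.5
reflection = np.array([[1.0,  0.0],
                       [0.0, -1.0]])  # det = -1 (orientation reversed)

# Each row v of unit_square is mapped to A @ v.
area_A = polygon_area(unit_square @ A.T)
area_reflection = polygon_area(unit_square @ reflection.T)

print(np.linalg.det(A), area_A)
print(np.linalg.det(reflection), area_reflection)

# Two of the listed properties, checked numerically.
alpha, n = 3.0, 2
product_rule = np.isclose(np.linalg.det(A @ reflection),
                          np.linalg.det(A) * np.linalg.det(reflection))
scaling_rule = np.isclose(np.linalg.det(alpha * A),
                          alpha**n * np.linalg.det(A))
```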

3. Eigenvectors and Eigenvalues

An eigenvector of a square matrix A is a non-zero vector v satisfying Av = λv: the transformation only scales v along its own line (reversing its direction if λ < 0) and does not rotate it off that line. The scalar λ is the corresponding eigenvalue.

Characteristic Polynomial and Diagonalisation

Eigenvalue equation:   Av = λv   ⇔   (A − λI)v = 0
Characteristic poly:   det(A − λI) = 0

For 2×2: λ² − tr(A)λ + det(A) = 0
  λ₁,₂ = [tr(A) ± √(tr(A)² − 4 det(A))] / 2

Diagonalisation (if A has n independent eigenvectors):
  A = P D P⁻¹
  D = diag(λ₁, …, λₙ),   P = [v₁ | v₂ | … | vₙ]

Powers:  Aᵏ = P Dᵏ P⁻¹   (cheap: just raise each λᵢ to the power k)
Exponential: eᴬᵗ = P eᴰᵗ P⁻¹,  eᴰᵗ = diag(e^(λ₁t), …, e^(λₙt))  (useful for linear ODE systems)
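The formulas in this block can be exercised on a small example. The sketch below assumes a hypothetical 2×2 matrix with real, distinct eigenvalues, compares the trace/determinant formula with numpy.linalg.eig, and cross-checks the power and exponential identities (the exponential against a truncated Taylor series).

```python
import numpy as np

# Hypothetical example with tr(A) = 7 and det(A) = 10.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Closed-form 2x2 eigenvalues from trace and determinant.
tr, det = np.trace(A), np.linalg.det(A)
disc = tr**2 - 4.0 * det
lam_formula = np.array([(tr + np.sqrt(disc)) / 2.0,
                        (tr - np.sqrt(disc)) / 2.0])

# Numerical eigendecomposition: columns of P are eigenvectors.
eigvals, P = np.linalg.eig(A)
P_inv = np.linalg.inv(P)

# Powers: raise each eigenvalue to the k-th power.
k = 5
A_k = P @ np.diag(eigvals**k) @ P_inv

# Exponential: exponentiate each eigenvalue times t.
t = 0.5
exp_At = P @ np.diag(np.exp(eigvals * t)) @ P_inv

# Cross-check exp(At) against the truncated series sum_j (At)^j / j!.
series = np.zeros_like(A)
term = np.eye(2)
for j in range(1, 40):
    series += term
    term = term @ (A * t) / j

print(np.sort(lam_formula), np.sort(eigvals))
```

The closed-form route is only convenient for 2×2 matrices; for larger or defective matrices a library routine is the safer choice.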

Eigenvalues govern the long-run behaviour of linear dynamical systems x_{n+1} = A x_n: the iterates grow without bound (for generic initial conditions) if any |λ| > 1 and contract to zero if all |λ| < 1. For continuous systems dx/dt = Ax, asymptotic stability requires all eigenvalues to have negative real parts.
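A short numerical experiment makes the two stability criteria concrete. The matrices and the helper names (spectral_radius, iterate, is_stable_continuous) below are illustrative assumptions for this sketch.

```python
import numpy as np

def spectral_radius(A):
    """Largest eigenvalue magnitude of A."""
    return np.max(np.abs(np.linalg.eigvals(A)))

def iterate(A, x0, steps):
    """Apply x <- A x repeatedly and return the final state."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = A @ x
    return x

def is_stable_continuous(A):
    """dx/dt = A x is asymptotically stable if every Re(lambda) < 0."""
    return bool(np.all(np.linalg.eigvals(A).real < 0.0))

contracting = np.array([[0.5, 0.2],
                        [0.1, 0.4]])   # eigenvalues 0.6 and 0.3
growing = np.array([[1.1, 0.0],
                    [0.3, 0.5]])       # eigenvalues 1.1 and 0.5

x0 = [1.0, 1.0]
final_contracting = iterate(contracting, x0, 50)
final_growing = iterate(growing, x0, 50)

# Damped oscillator x'' + 0.2 x' + x = 0 written as a first-order system.
damped_oscillator = np.array([[ 0.0,  1.0],
                              [-1.0, -0.2]])

print(spectral_radius(contracting), np.linalg.norm(final_contracting))
print(spectral_radius(growing), np.linalg.norm(final_growing))
print(is_stable_continuous(damped_oscillator))
```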

4. The Spectral Theorem and Its Applications

The spectral theorem is the central result of linear algebra for physics:

Spectral Theorem (Real Symmetric Case)

Let A = Aᵀ (real symmetric, n×n).
Then:
  1. All eigenvalues of A are real.
  2. Eigenvectors for distinct eigenvalues are orthogonal.
  3. A is orthogonally diagonalisable:   A = Q Λ Qᵀ
     Q orthogonal (QᵀQ = I),  Λ = diag(λ₁, …, λₙ)

For Hermitian matrices (A = A†, complex):
  Same conclusions hold over ℂ.
  Eigenvalues real  ⇒  quantum observables give real measurement values.
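The three conclusions of the theorem can be checked on random matrices. This sketch relies on numpy.linalg.eigh, which is intended for symmetric and Hermitian input; the random matrices are generated only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Real symmetric case: symmetrise a random matrix.
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2.0

eigvals, Q = np.linalg.eigh(A)   # eigenvalues in ascending order

eigenvalues_real = np.isrealobj(eigvals)
Q_orthogonal = np.allclose(Q.T @ Q, np.eye(4))
reconstructs = np.allclose(Q @ np.diag(eigvals) @ Q.T, A)

# Hermitian case: same conclusions with the conjugate transpose.
C = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
H = (C + C.conj().T) / 2.0

w, U = np.linalg.eigh(H)

hermitian_eigenvalues_real = np.isrealobj(w)
U_unitary = np.allclose(U.conj().T @ U, np.eye(3))
hermitian_reconstructs = np.allclose(U @ np.diag(w) @ U.conj().T, H)

print(eigvals)
print(w)
```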

The spectral theorem has direct, foundational interpretations in multiple fields:

Quantum mechanics: every observable (energy, momentum, position) is a Hermitian operator. Its eigenvalues are the possible measurement outcomes; its eigenvectors are the quantum states with definite values of the observable (e.g., energy eigenstates for the Hamiltonian).
Normal modes: the mass-weighted stiffness matrix K̃ = M^(−½) K M^(−½) is symmetric. Its eigenvectors are the normal mode shapes (in mass-weighted coordinates); its eigenvalues give ω_i². Diagonalisation decouples N coupled ODEs into N independent harmonic oscillators.
Finite element analysis: the global stiffness matrix K is symmetric and, once rigid-body motions are constrained, positive-definite. Together with the mass matrix, its eigenvalues determine the natural frequencies; solving Ku = f is well-posed when K has no zero eigenvalues (structure fully constrained).
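The normal-mode recipe can be sketched for the simplest non-trivial case: two masses connected to each other and to two walls by three identical springs. The masses and spring constant below are assumed values chosen for illustration.

```python
import numpy as np

# Hypothetical system: wall - spring - m1 - spring - m2 - spring - wall.
k = 1.0
m = np.array([1.0, 2.0])

K = k * np.array([[ 2.0, -1.0],
                  [-1.0,  2.0]])   # stiffness matrix
M = np.diag(m)                     # mass matrix

# Mass-weighted stiffness matrix, symmetric by construction.
M_inv_sqrt = np.diag(1.0 / np.sqrt(m))
K_tilde = M_inv_sqrt @ K @ M_inv_sqrt

# Eigenvalues are the squared angular frequencies.
omega2, Q = np.linalg.eigh(K_tilde)
omega = np.sqrt(omega2)

# Convert mode shapes back to physical displacements: u = M^(-1/2) q.
modes = M_inv_sqrt @ Q

# Each physical mode satisfies K u = omega^2 M u.
generalised_ok = np.allclose(K @ modes, M @ modes @ np.diag(omega2))

# In the eigenvector basis the system is decoupled (diagonal).
decoupled = np.allclose(Q.T @ K_tilde @ Q, np.diag(omega2))

print(omega)
print(modes)
```

The conversion back through M^(−½) matters whenever the masses differ: the columns of Q are orthogonal, but the physical mode shapes generally are not.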

5. Singular Value Decomposition (SVD)

Eigenvalue decomposition requires a square matrix. SVD generalises it to any m×n matrix and is more numerically stable:

SVD and the Pseudoinverse

A = U Σ Vᵀ    (any real m×n matrix)

U  m×m orthogonal  (left singular vectors = columns)
Σ  m×n diagonal   (singular values σ₁ ≥ σ₂ ≥ … ≥ 0)
V  n×n orthogonal  (right singular vectors = columns)

Relationship to eigenvalues:
  AᵀA = V ΣᵀΣ Vᵀ,  singular values σᵢ = √(eigenvalues of AᵀA)

Truncated SVD (rank-k approximation):
  A ≈ Uₖ Σₖ Vₖᵀ   (best rank-k approximation, Eckart-Young theorem)

Moore-Penrose pseudoinverse:
  A⁺ = V Σ⁺ Uᵀ   where Σ⁺ = diag(1/σ₁, …, 1/σᵣ, 0, …)
Least-squares solution: x* = A⁺ b  (minimises ‖Ax−b‖²)
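As a sketch of the pseudoinverse route to least squares, the example below fits a straight line to a handful of made-up data points. The data values are hypothetical, and the design matrix has full column rank, so every singular value can be inverted; a rank-deficient problem would need the small singular values zeroed instead.

```python
import numpy as np

# Hypothetical measurements b at times t; model b ≈ c0 + c1 * t.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
b = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Design matrix: a column of ones (intercept) and a column of t (slope).
A = np.column_stack([np.ones_like(t), t])

# Thin SVD: U is 5x2, s holds the 2 singular values, Vt is 2x2.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Pseudoinverse A+ = V diag(1/sigma) U^T (valid here: full column rank).
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T
x_star = A_pinv @ b

# Reference solutions from NumPy's built-in routines.
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]
A_pinv_ref = np.linalg.pinv(A)

# At the minimiser the residual is orthogonal to the columns of A.
residual = A @ x_star - b
normal_equations_hold = np.allclose(A.T @ residual, 0.0)

print(x_star)   # [intercept, slope]
```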

SVD is the workhorse of numerical linear algebra: it solves least-squares problems (data fitting, tomographic reconstruction), computes low-rank approximations (image compression, latent semantic analysis), and provides the condition number κ(A) = σ_max/σ_min which quantifies how sensitive Ax = b is to perturbations.
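Two of the uses named here, low-rank approximation and the condition number, are easy to demonstrate. The sketch below uses a random 6×4 matrix as a stand-in for real data and checks the Eckart-Young error formulas numerically.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))   # stand-in for a data matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k truncation: keep the k largest singular triplets.
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Eckart-Young: the approximation errors are set by the discarded sigmas.
err_spectral = np.linalg.norm(A - A_k, 2)       # equals sigma_{k+1}
err_frobenius = np.linalg.norm(A - A_k, "fro")  # sqrt of sum of discarded sigma^2

# Condition number from the extreme singular values.
kappa = s[0] / s[-1]

print(np.linalg.matrix_rank(A_k), err_spectral, s[k])
print(kappa, np.linalg.cond(A))
```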

6. Principal Component Analysis (PCA)

PCA finds the directions of maximum variance in a dataset. Given n data points in ℝᵈ (rows of matrix X, mean-centred), PCA diagonalises the sample covariance matrix:

PCA via Covariance Eigendecomposition

Sample covariance:   C = (1/(n−1)) Xᵀ X    (d×d, symmetric PSD)
Eigendecomposition:  C = Q Λ Qᵀ
  λ₁ ≥ λ₂ ≥ … ≥ λ_d ≥ 0   (principal variances)
  q₁, q₂, …, q_d             (principal components)

Projection onto first k PCs:
  Xₖ = X Qₖ   (n×k, low-dimensional representation)

Variance retained by k components:
  R_k = (λ₁ + … + λₖ) / (λ₁ + … + λ_d)

Connection to SVD:
  If X = U Σ Vᵀ then eigenvectors of C = V,  λᵢ = σᵢ²/(n−1)
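The whole PCA recipe fits in a few lines of NumPy. The sketch below generates synthetic data (200 points in three dimensions that mostly vary within a two-dimensional subspace; all values are made up for the example) and confirms that the covariance and SVD routes agree.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: 2 latent factors mixed into 3 observed variables plus noise.
n, d = 200, 3
latent = rng.standard_normal((n, 2))
mixing = np.array([[3.0, 0.0],
                   [1.0, 0.5],
                   [0.0, 0.2]])
X = latent @ mixing.T + 0.05 * rng.standard_normal((n, d))
X = X - X.mean(axis=0)            # mean-centre each column

# Route 1: eigendecomposition of the sample covariance matrix.
C = X.T @ X / (n - 1)
eigvals, Q = np.linalg.eigh(C)    # ascending order
order = np.argsort(eigvals)[::-1] # re-sort to descending variance
eigvals, Q = eigvals[order], Q[:, order]

# Project onto the first k principal components.
k = 2
X_k = X @ Q[:, :k]
retained = eigvals[:k].sum() / eigvals.sum()

# Route 2: SVD of the centred data matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
variances_from_svd = s**2 / (n - 1)

# Right singular vectors equal the eigenvectors up to a sign per column.
alignment = np.abs(Vt @ Q)

print(eigvals, variances_from_svd)
print(retained)
```

In practice the SVD route is usually preferred because it avoids forming XᵀX, which squares the condition number of the problem.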

PCA appears throughout science: it is used to identify the dominant modes of climate variability (EOF analysis), to compress gene expression profiles (bioinformatics), to separate signal sources in EEG/MEG (when combined with independent component analysis), and to initialise neural network training by whitening the input feature space.

Interactive Visualisations

The matrix transforms simulation lets you build geometric intuition for all the concepts above. Use the 2×2 sliders to construct rotations (det = 1), reflections (det = −1), shears (det = 1, repeated eigenvalue 1 with a single eigenvector direction), scalings, and projections (det = 0). The eigenvector overlay shows the invariant directions when real eigenvectors exist; the unit circle overlay shows where the circle maps under the transformation (the semi-axis lengths of the resulting ellipse are the singular values).
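The same experiments can be reproduced offline. The sketch below defines one example matrix per transformation type (the particular angle and entries are assumptions, not the visualiser's actual presets), then compares the extreme radii of the transformed unit circle with the singular values.

```python
import numpy as np

theta = np.pi / 6   # hypothetical rotation angle
presets = {
    "rotation":   np.array([[np.cos(theta), -np.sin(theta)],
                            [np.sin(theta),  np.cos(theta)]]),
    "reflection": np.array([[1.0, 0.0], [0.0, -1.0]]),
    "shear":      np.array([[1.0, 1.0], [0.0, 1.0]]),
    "scaling":    np.array([[2.0, 0.0], [0.0, 0.5]]),
    "projection": np.array([[1.0, 0.0], [0.0, 0.0]]),
}

# Sample the unit circle densely; columns are points (cos phi, sin phi).
phi = np.linspace(0.0, 2.0 * np.pi, 3600, endpoint=False)
circle = np.vstack([np.cos(phi), np.sin(phi)])

summary = {}
for name, A in presets.items():
    radii = np.linalg.norm(A @ circle, axis=0)   # distance of each image point
    summary[name] = {
        "det": np.linalg.det(A),
        "singular_values": np.linalg.svd(A, compute_uv=False),
        "semi_axes": np.array([radii.max(), radii.min()]),
    }
    print(name, summary[name])
```

The semi-axes here are estimated from sampled points, so they match the singular values only to sampling accuracy.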

📐 Mathematics

Matrix Transforms Visualiser

2×2 matrix sliders, unit grid / basis vectors / eigenvectors / unit circle layers, 8 presets, continuous animation, trace / det / λ₁/λ₂ / matrix-type panel.

📊 Probability

Linear Regression (OLS)

Click-to-add scatter plot, ordinary least-squares line, slope / intercept / R² / Pearson r / SSE, residuals, 5 data presets, undo/clear.

Why linear algebra everywhere? Because the real world is rarely linear, but it is often approximately linear locally. Linearisation (Taylor expansion around an equilibrium, Jacobian matrix of a dynamical system) reduces a smooth non-linear problem to a linear one at small amplitude. The eigenvalues of that linear approximation determine local stability and the characteristic timescales; the eigenvectors determine the normal-mode shapes.
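To make the linearisation argument concrete, here is a sketch for a damped pendulum, θ'' = −sin θ − c θ', chosen as an assumed example system. The Jacobian is estimated by central differences at the two equilibria, and its eigenvalues classify each one.

```python
import numpy as np

C_DAMP = 0.5   # hypothetical damping coefficient

def pendulum(state):
    """Right-hand side of the damped pendulum as a first-order system."""
    theta, omega = state
    return np.array([omega, -np.sin(theta) - C_DAMP * omega])

def numerical_jacobian(f, x0, eps=1e-6):
    """Central-difference estimate of the Jacobian of f at x0."""
    x0 = np.asarray(x0, dtype=float)
    n = x0.size
    J = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        J[:, j] = (f(x0 + step) - f(x0 - step)) / (2.0 * eps)
    return J

# Equilibria: hanging down (theta = 0) and balanced upright (theta = pi).
J_down = numerical_jacobian(pendulum, [0.0, 0.0])
J_up = numerical_jacobian(pendulum, [np.pi, 0.0])

eig_down = np.linalg.eigvals(J_down)
eig_up = np.linalg.eigvals(J_up)

# Stable if every eigenvalue has a negative real part.
down_is_stable = bool(np.all(eig_down.real < 0.0))
up_is_stable = bool(np.all(eig_up.real < 0.0))

print(eig_down, down_is_stable)
print(eig_up, up_is_stable)
```

The hanging equilibrium has a complex-conjugate pair with negative real part (a decaying oscillation), while the upright one has one positive eigenvalue and is therefore unstable.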
