Spectral Theorem and Symmetric Matrices

Definition

A real symmetric matrix $A \in \mathbb{R}^{n \times n}$ is always orthogonally diagonalisable:

$$A = Q \Lambda Q^\top$$

where $Q$ is orthogonal ($Q^\top Q = I$, columns are orthonormal eigenvectors) and $\Lambda$ is a diagonal matrix containing the real eigenvalues. This is the spectral theorem for real symmetric matrices.
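A minimal numpy check of the decomposition (matrix and seed are illustrative); `np.linalg.eigh` returns eigenvalues in ascending order and eigenvectors as columns of `Q`:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                        # symmetrise a random matrix

eigvals, Q = np.linalg.eigh(A)           # eigh: for symmetric/Hermitian input
Lam = np.diag(eigvals)

assert np.allclose(A, Q @ Lam @ Q.T)     # A = Q Λ Q^T
assert np.allclose(Q.T @ Q, np.eye(4))   # columns of Q are orthonormal
```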

Intuition

Symmetric matrices arise whenever a quantity is “self-adjoint” — e.g., covariance matrices, Hessians, graph Laplacians. The spectral theorem says these matrices are always nicely diagonalisable: their eigenvectors are mutually perpendicular and span the whole space, and all eigenvalues are real. There’s no rotation; the matrix simply stretches along orthogonal axes.

Formal Description

Spectral decomposition:

$$A = \sum_{i=1}^{n} \lambda_i \, q_i q_i^\top$$

The outer products $q_i q_i^\top$ are rank-1 orthogonal projection matrices. $A$ is a weighted sum of projections onto its eigendirections, with eigenvalues as weights.
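The sum-of-projections view can be checked directly (illustrative sketch): rebuilding $A$ from its rank-1 pieces recovers it exactly, and each outer product is idempotent.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3))
A = (M + M.T) / 2

eigvals, Q = np.linalg.eigh(A)

# Rebuild A as a weighted sum of rank-1 projections q_i q_i^T.
A_rebuilt = sum(lam * np.outer(q, q) for lam, q in zip(eigvals, Q.T))
assert np.allclose(A, A_rebuilt)

# Each outer product is an orthogonal projection: P^2 = P and P^T = P.
P0 = np.outer(Q[:, 0], Q[:, 0])
assert np.allclose(P0 @ P0, P0)
assert np.allclose(P0, P0.T)
```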

Positive semidefinite (PSD) matrices:

$A$ is PSD if $x^\top A x \ge 0$ for all $x$. Equivalent conditions:

  • All eigenvalues $\lambda_i \ge 0$
  • $A = B^\top B$ for some matrix $B$ (Cholesky-like factorisation)
  • All principal minors are non-negative
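A sketch of the eigenvalue criterion in practice (the helper name and tolerance are illustrative, not a library API); a small negative tolerance absorbs floating-point noise on rank-deficient inputs:

```python
import numpy as np

def is_psd(A, tol=1e-10):
    """Illustrative PSD check: symmetry, then non-negative eigenvalues."""
    if not np.allclose(A, A.T):
        return False
    return np.linalg.eigvalsh(A).min() >= -tol

B = np.random.default_rng(2).standard_normal((5, 3))
assert is_psd(B @ B.T)         # Gram matrices are always PSD
assert not is_psd(-np.eye(3))  # negative definite fails the test
```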

Positive definite (PD): $x^\top A x > 0$ for all $x \ne 0$; all eigenvalues $\lambda_i > 0$; $A$ is invertible.

Covariance matrices $\Sigma = \frac{1}{n} X^\top X$ (centred data $X$) are always PSD: for any $v$, $v^\top \Sigma v = \frac{1}{n} \lVert X v \rVert^2 \ge 0$.

Cholesky decomposition: every PD matrix factors as $A = L L^\top$ where $L$ is lower-triangular with positive diagonal. More numerically stable (and cheaper) than eigendecomposition for solving linear systems.
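A sketch of solving $Ax = b$ via Cholesky: factor once, then do two triangular solves. Here `np.linalg.solve` is used for both solves for self-containedness; it does not exploit triangularity (`scipy.linalg.solve_triangular` would).

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4))
A = B @ B.T + 4 * np.eye(4)      # make A positive definite
b = rng.standard_normal(4)

L = np.linalg.cholesky(A)        # A = L L^T, L lower-triangular
assert np.allclose(L @ L.T, A)

# Solve A x = b in two stages: L y = b, then L^T x = y.
y = np.linalg.solve(L, b)
x = np.linalg.solve(L.T, y)
assert np.allclose(A @ x, b)
```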

Functions of symmetric matrices: via the spectral decomposition, matrix functions are well-defined:

$$f(A) = Q \, f(\Lambda) \, Q^\top, \qquad f(\Lambda) = \mathrm{diag}\big(f(\lambda_1), \dots, f(\lambda_n)\big)$$

Examples: $A^{1/2} = Q \Lambda^{1/2} Q^\top$ (matrix square root), $\exp(A) = Q \exp(\Lambda) Q^\top$.
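A minimal sketch of $f(A) = Q f(\Lambda) Q^\top$: apply the scalar function to the eigenvalues only, then rotate back (matrix and seed are illustrative; $A$ is made PD so the square root is real):

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((3, 3))
A = M @ M.T + np.eye(3)          # PD, so A^{1/2} is real

eigvals, Q = np.linalg.eigh(A)

# f(A) = Q f(Λ) Q^T: f acts elementwise on the eigenvalues.
sqrt_A = Q @ np.diag(np.sqrt(eigvals)) @ Q.T
exp_A = Q @ np.diag(np.exp(eigvals)) @ Q.T

assert np.allclose(sqrt_A @ sqrt_A, A)   # the square root squares back to A
assert np.allclose(exp_A, exp_A.T)       # f(A) inherits symmetry
```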

Rayleigh quotient:

$$R(x) = \frac{x^\top A x}{x^\top x}$$

$\lambda_{\min} \le R(x) \le \lambda_{\max}$; the maximum is attained at the leading eigenvector (used in power iteration and PCA).
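A sketch of power iteration converging to the leading eigenvector, at which the Rayleigh quotient attains $\lambda_{\max}$ ($A$ is taken PSD so the largest eigenvalue also has the largest magnitude; seed and iteration count are illustrative):

```python
import numpy as np

def rayleigh(A, x):
    # R(x) = x^T A x / x^T x
    return (x @ A @ x) / (x @ x)

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4))
A = M @ M.T                       # PSD: top eigenvalue has top magnitude

x = rng.standard_normal(4)
for _ in range(500):              # power iteration: x -> A x, renormalise
    x = A @ x
    x /= np.linalg.norm(x)

lam_max = np.linalg.eigvalsh(A)[-1]   # eigvalsh sorts ascending
assert np.isclose(rayleigh(A, x), lam_max)
```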

Applications

| Concept | Role of spectral theorem |
| --- | --- |
| PCA | Eigendecomposition of covariance matrix; eigenvectors = principal directions |
| Kernel methods | Kernel matrix is PSD; Mercer's theorem guarantees spectral expansion |
| Quadratic forms in optimization | Hessian is symmetric; PD Hessian ↔ strictly convex, unique minimum |
| Graph Laplacian | $L = D - A$ is symmetric PSD; eigenvalues encode connectivity |
| Gaussian distributions | Covariance $\Sigma$ must be PSD for $\mathcal{N}(\mu, \Sigma)$ to be valid |
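The PCA row above can be sketched end-to-end with `eigh` (data, scale factors, and seed are illustrative): centre the data, eigendecompose the sample covariance, and project onto the top eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((200, 3)) @ np.diag([3.0, 1.0, 0.3])  # anisotropic data
Xc = X - X.mean(axis=0)                  # centre the data

cov = (Xc.T @ Xc) / (len(Xc) - 1)        # sample covariance (symmetric PSD)
eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues

# Principal directions = eigenvectors; take the top two, largest first.
top2 = eigvecs[:, -2:][:, ::-1]
scores = Xc @ top2                       # project onto the top-2 subspace
assert scores.shape == (200, 2)
```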

Trade-offs

  • The spectral theorem only applies to symmetric (or normal) matrices; general square matrices require Jordan normal form, not a clean eigendecomposition.
  • Numerical eigendecomposition of symmetric matrices is well-conditioned and reliable; use numpy.linalg.eigh (not eig) for symmetric matrices — it’s faster and guarantees real eigenvalues.
  • PSD testing in practice: add a small $\epsilon I$ to the diagonal (Tikhonov regularisation) to handle numerical near-singularity.
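The jitter trick can be sketched as follows (the helper name and $\epsilon$ value are illustrative): plain Cholesky fails on a singular PSD matrix, but succeeds after adding $\epsilon I$.

```python
import numpy as np

def safe_cholesky(A, eps=1e-8):
    """Illustrative: Cholesky with a small diagonal jitter for near-PSD input."""
    return np.linalg.cholesky(A + eps * np.eye(len(A)))

v = np.arange(3.0)
A = np.outer(v, v)                  # rank-1: PSD but singular

try:
    np.linalg.cholesky(A)           # strict PD required -> raises
    raised = False
except np.linalg.LinAlgError:
    raised = True
assert raised

L = safe_cholesky(A)                # succeeds after regularisation
assert np.allclose(L @ L.T, A, atol=1e-6)
```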