Spectral Theorem and Symmetric Matrices
Definition
A real symmetric matrix $A \in \mathbb{R}^{n \times n}$, $A = A^\top$, is always orthogonally diagonalisable:
$$A = Q \Lambda Q^\top$$
where $Q$ is orthogonal ($Q^\top Q = I$, columns are orthonormal eigenvectors) and $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$ contains real eigenvalues. This is the spectral theorem for real symmetric matrices.
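The decomposition is easy to verify numerically. A minimal sketch with numpy, using a small symmetric matrix whose values are purely illustrative:

```python
import numpy as np

# Small symmetric matrix (values chosen purely for illustration).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# eigh is specialised for symmetric input: it returns real eigenvalues
# in ascending order and orthonormal eigenvectors as the columns of Q.
eigvals, Q = np.linalg.eigh(A)
Lam = np.diag(eigvals)

assert np.allclose(Q.T @ Q, np.eye(2))   # Q is orthogonal
assert np.allclose(Q @ Lam @ Q.T, A)     # A = Q Λ Qᵀ
```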
Intuition
Symmetric matrices arise whenever a quantity is “self-adjoint” — e.g., covariance matrices, Hessians, graph Laplacians. The spectral theorem says these matrices are always nicely diagonalisable: their eigenvectors are mutually perpendicular and span the whole space, and all eigenvalues are real. Viewed in its eigenbasis, the matrix performs no rotation; it simply stretches along orthogonal axes.
Formal Description
Spectral decomposition:
$$A = \sum_{i=1}^{n} \lambda_i\, q_i q_i^\top$$
The outer products $q_i q_i^\top$ are rank-1 orthogonal projection matrices. $A$ is a weighted sum of projections onto its eigendirections, with eigenvalues as weights.
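The rank-1 expansion can be checked directly, again with assumed illustrative values:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
eigvals, Q = np.linalg.eigh(A)

# Rebuild A as a weighted sum of rank-1 projections q_i q_iᵀ.
A_rebuilt = sum(lam * np.outer(q, q) for lam, q in zip(eigvals, Q.T))
assert np.allclose(A_rebuilt, A)

# Each outer product is an orthogonal projection: idempotent and symmetric.
P = np.outer(Q[:, 0], Q[:, 0])
assert np.allclose(P @ P, P) and np.allclose(P, P.T)
```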
Positive semidefinite (PSD) matrices:
$A$ is PSD if $x^\top A x \ge 0$ for all $x \in \mathbb{R}^n$. Equivalent conditions:
- All eigenvalues $\lambda_i \ge 0$
- $A = B^\top B$ for some matrix $B$ (Cholesky-like factorisation)
- All principal minors are non-negative
Positive definite (PD): $x^\top A x > 0$ for all $x \ne 0$; all eigenvalues $\lambda_i > 0$; $A$ is invertible.
Covariance matrices $\Sigma = \frac{1}{n} X^\top X$ (for centred data $X$) are always PSD: for any $v$, $v^\top \Sigma v = \frac{1}{n} \|X v\|^2 \ge 0$.
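A quick numerical check of the covariance argument, on synthetic (assumed) data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X = X - X.mean(axis=0)            # centre the columns
Sigma = X.T @ X / len(X)          # sample covariance

# Quadratic forms vᵀΣv = ‖Xv‖²/n are never negative.
v = rng.normal(size=3)
assert v @ Sigma @ v >= 0
assert np.isclose(v @ Sigma @ v, np.linalg.norm(X @ v) ** 2 / len(X))
```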
Cholesky decomposition: every PD matrix factors as $A = L L^\top$ where $L$ is lower-triangular with positive diagonal. More numerically stable than eigendecomposition for solving linear systems.
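A sketch of solving a linear system via Cholesky, with an illustrative PD matrix:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])        # symmetric positive definite

L = np.linalg.cholesky(A)         # lower-triangular, positive diagonal
assert np.allclose(L @ L.T, A)

# Solve A x = b with two triangular solves: L y = b, then Lᵀ x = y.
b = np.array([1.0, 2.0])
y = np.linalg.solve(L, b)
x = np.linalg.solve(L.T, y)
assert np.allclose(A @ x, b)
```

In practice `scipy.linalg.cho_factor`/`cho_solve` exploit the triangular structure; `np.linalg.solve` is used here only to keep the sketch numpy-only.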
Functions of symmetric matrices: via spectral decomposition, matrix functions are well-defined:
$$f(A) = Q\, f(\Lambda)\, Q^\top$$
where $f$ is applied elementwise to the diagonal of $\Lambda$. Examples: $A^{1/2} = Q \Lambda^{1/2} Q^\top$ (matrix square root, for PSD $A$), $\exp(A) = Q \exp(\Lambda) Q^\top$.
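Both examples reduce to applying the scalar function to the eigenvalues; a minimal sketch with assumed values:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])        # symmetric, eigenvalues > 0
eigvals, Q = np.linalg.eigh(A)

# Apply the scalar function on the spectrum, then reassemble.
sqrtA = Q @ np.diag(np.sqrt(eigvals)) @ Q.T
expA = Q @ np.diag(np.exp(eigvals)) @ Q.T

assert np.allclose(sqrtA @ sqrtA, A)     # (A^{1/2})² = A
assert np.allclose(expA, expA.T)         # exp(A) is symmetric too
```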
Rayleigh quotient:
$$R(x) = \frac{x^\top A x}{x^\top x}, \qquad \max_{x \ne 0} R(x) = \lambda_{\max}$$
The maximum is attained at the leading eigenvector (used in power iteration and PCA).
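A sketch of power iteration recovering the leading eigenvalue via the Rayleigh quotient (matrix values and iteration count are illustrative):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Power iteration: repeated multiplication pulls any starting vector
# toward the leading eigenvector; normalise to avoid overflow.
x = np.ones(2)
for _ in range(100):
    x = A @ x
    x /= np.linalg.norm(x)

rayleigh = x @ A @ x / (x @ x)           # Rayleigh quotient at the iterate
assert np.isclose(rayleigh, np.linalg.eigvalsh(A).max())
```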
Applications
| Concept | Role of spectral theorem |
|---|---|
| PCA | Eigendecomposition of covariance matrix; eigenvectors = principal directions |
| Kernel methods | Kernel matrix is PSD; Mercer’s theorem guarantees spectral expansion |
| Quadratic forms in optimization | Hessian is symmetric; PD Hessian ↔ strictly convex, unique minimum |
| Graph Laplacian | $L = D - W$ is symmetric PSD; eigenvalues encode connectivity |
| Gaussian distributions | Covariance $\Sigma$ must be PSD for $\mathcal{N}(\mu, \Sigma)$ to be valid |
Trade-offs
- The spectral theorem only applies to symmetric (or normal) matrices; general square matrices require Jordan normal form, not a clean eigendecomposition.
- Numerical eigendecomposition of symmetric matrices is well-conditioned and reliable; use `numpy.linalg.eigh` (not `eig`) for symmetric matrices: it is faster and guarantees real eigenvalues.
- PSD testing in practice: add a small jitter $\epsilon I$ (Tikhonov regularisation) to handle numerical near-singularity.
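One way to sketch the jitter trick, using a deliberately rank-deficient covariance (fewer samples than dimensions; all values assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))      # n < d, so Σ is rank-deficient
Sigma = X.T @ X / len(X)          # exact zero eigenvalues may round negative

# Tolerance-based PSD check via the smallest eigenvalue.
def is_psd(A, eps=1e-10):
    return np.linalg.eigvalsh(A).min() >= -eps

assert is_psd(Sigma)

# A small jitter εI shifts every eigenvalue up by ε, so Cholesky
# succeeds on the numerically near-singular matrix.
eps = 1e-8
L = np.linalg.cholesky(Sigma + eps * np.eye(10))
assert np.allclose(L @ L.T, Sigma + eps * np.eye(10))
```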
Links
- Eigenvalues and Eigenvectors
- Singular Value Decomposition — SVD of symmetric PSD coincides with eigendecomposition
- Orthogonal Projections
- Convex Optimization — PD Hessian ↔ strictly convex function
- Multivariate Gaussian (covariance must be PSD)