Norms
Definition
A norm is a function mapping vectors to non-negative values, measuring the "size" of a vector or its distance from the origin. A function $f$ is a norm if it satisfies:
- $f(x) = 0 \Rightarrow x = 0$ (definiteness)
- $f(x + y) \le f(x) + f(y)$ (triangle inequality)
- $f(\alpha x) = |\alpha|\, f(x)$ for all $\alpha \in \mathbb{R}$ (absolute homogeneity)
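As a quick sanity check, the axioms can be verified numerically for the Euclidean norm on arbitrary vectors (a sketch using NumPy; `np.linalg.norm` computes the $L^2$ norm by default):

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=3)
alpha = -2.5

norm = np.linalg.norm  # Euclidean (L2) norm by default

# Non-negativity and definiteness
assert norm(x) >= 0 and norm(np.zeros(3)) == 0

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
assert norm(x + y) <= norm(x) + norm(y)

# Absolute homogeneity: ||alpha * x|| = |alpha| * ||x||
assert np.isclose(norm(alpha * x), abs(alpha) * norm(x))
```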
Intuition
Different norms measure "size" with different geometries. The $L^2$ norm measures straight-line distance; the $L^1$ norm measures Manhattan (city-block) distance and is robust to outliers; the $L^\infty$ norm is controlled by the single largest component. Choosing the right norm shapes the geometry of optimisation: $L^1$ encourages sparsity, $L^2$ encourages smoothness.
Formal Description
The $L^p$ norm is defined as
$$\|x\|_p = \left(\sum_i |x_i|^p\right)^{1/p}, \quad p \ge 1.$$
$L^2$ norm (Euclidean norm). $\|x\|_2 = \sqrt{\sum_i x_i^2}$, often written simply $\|x\|$. The squared $L^2$ norm, $\|x\|_2^2 = x^\top x$, is often preferred computationally since its derivatives depend only on individual elements ($\partial \|x\|_2^2 / \partial x_i = 2x_i$), whereas the derivative of $\|x\|_2$ itself depends on the entire vector.
$L^1$ norm. $\|x\|_1 = \sum_i |x_i|$. Grows at the same rate everywhere; useful when distinguishing exact zeros from small nonzero values.
$L^\infty$ norm (max norm). Equals the absolute value of the largest-magnitude element: $\|x\|_\infty = \max_i |x_i|$.
Frobenius norm (for matrices): $\|A\|_F = \sqrt{\sum_{i,j} A_{i,j}^2}$, the matrix analogue of the $L^2$ norm.
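The formulas above can be computed directly and checked against NumPy's built-in, where the `ord` argument of `np.linalg.norm` selects the norm (a sketch):

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0])

l2 = np.sqrt(np.sum(x**2))   # Euclidean norm -> 5.0
l1 = np.sum(np.abs(x))       # Manhattan norm -> 7.0
linf = np.max(np.abs(x))     # max norm       -> 4.0

assert np.isclose(l2, np.linalg.norm(x, ord=2))
assert np.isclose(l1, np.linalg.norm(x, ord=1))
assert np.isclose(linf, np.linalg.norm(x, ord=np.inf))

# Frobenius norm: sum of squares over all matrix entries
A = np.array([[1.0, 2.0], [3.0, 4.0]])
frob = np.sqrt(np.sum(A**2))
assert np.isclose(frob, np.linalg.norm(A, ord="fro"))
```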
The dot product can be expressed in terms of norms and the angle $\theta$ between the vectors: $x^\top y = \|x\|_2\, \|y\|_2 \cos\theta$.
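Rearranging that identity recovers the angle between two vectors, as in this small sketch:

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

# cos(theta) = x.y / (||x|| ||y||)
cos_theta = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
theta = np.arccos(cos_theta)  # angle in radians

assert np.isclose(theta, np.pi / 4)  # x and y are 45 degrees apart
```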
The "$L^0$ norm" (count of nonzero entries) is not a true norm: scaling a vector by a nonzero constant does not change its nonzero count, violating absolute homogeneity. The $L^1$ norm is the standard convex surrogate.
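The failure of homogeneity is easy to demonstrate (a sketch; `np.count_nonzero` plays the role of the "$L^0$ norm"):

```python
import numpy as np

x = np.array([0.0, 2.0, 0.0, -1.5])

l0 = np.count_nonzero(x)  # "L0 norm": number of nonzero entries -> 2

# Homogeneity fails: scaling by 10 leaves the count unchanged,
# so ||10x||_0 != |10| * ||x||_0
assert np.count_nonzero(10 * x) == l0
```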
Applications
- $L^2$ regularisation (ridge / weight decay) penalises large weights and yields smooth solutions.
- $L^1$ regularisation (LASSO) induces sparsity and performs automatic feature selection.
- The $L^\infty$ norm arises in minimax problems and robustness guarantees.
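A minimal sketch of how these penalties enter a loss, with hypothetical weights `w` and strength `lam` (the $L^2$ gradient is the familiar weight-decay term; the $L^1$ subgradient drives sparsity):

```python
import numpy as np

w = np.array([0.5, -1.0, 2.0])  # hypothetical model weights
lam = 0.1                       # regularisation strength

# L2 (ridge) penalty and its gradient: smooth everywhere
l2_penalty = lam * np.sum(w**2)
l2_grad = 2 * lam * w           # "weight decay" term added to the loss gradient

# L1 (LASSO) penalty and a subgradient: not differentiable at 0
l1_penalty = lam * np.sum(np.abs(w))
l1_subgrad = lam * np.sign(w)   # any value in [-lam, lam] is valid where w_i == 0
```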
Trade-offs
$L^2$ is differentiable everywhere and computationally convenient; $L^1$ is non-differentiable at zero, requiring subgradient or proximal methods. $L^p$ "norms" with $0 < p < 1$ promote stronger sparsity but are non-convex, making optimisation much harder.
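The proximal operator of the $L^1$ penalty is soft thresholding, which sets small components exactly to zero; this is how proximal methods sidestep the non-differentiability at the origin. A sketch:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

x = np.array([0.3, -2.0, 0.05, 1.0])
shrunk = soft_threshold(x, 0.5)

# Entries with |x_i| <= 0.5 become exactly zero; the rest shrink by 0.5
assert np.allclose(shrunk, [0.0, -1.5, 0.0, 0.5])
```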