Norms

Definition

A norm is a function f mapping vectors to non-negative real values, measuring the “size” of a vector or its distance from the origin. A function f is a norm if it satisfies:

  • f(x) = 0 ⟹ x = 0 (definiteness)
  • f(x + y) ≤ f(x) + f(y) (triangle inequality)
  • f(αx) = |α| f(x) for all scalars α (absolute homogeneity)

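As a quick sanity check, the three axioms can be verified numerically for the Euclidean norm (a minimal NumPy sketch; the random vectors and the scalar are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=5), rng.normal(size=5)
alpha = -3.0

norm = np.linalg.norm  # Euclidean (L2) norm by default

# Definiteness: zero only for the zero vector.
assert norm(np.zeros(5)) == 0.0 and norm(x) > 0.0
# Triangle inequality.
assert norm(x + y) <= norm(x) + norm(y)
# Absolute homogeneity: scaling the vector scales the norm by |alpha|.
assert np.isclose(norm(alpha * x), abs(alpha) * norm(x))
```

A single random draw is of course not a proof, but it catches sign and scaling mistakes quickly.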
Intuition

Different norms measure “size” with different geometries. The L² norm measures straight-line (Euclidean) distance; the L¹ norm measures Manhattan (city-block) distance and is robust to outliers; the L∞ norm is controlled by the single largest component. Choosing the right norm shapes the geometry of optimisation: L¹ encourages sparsity, L² encourages smoothness.
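The different geometries are easy to see on a concrete vector (NumPy sketch; the vector is an arbitrary example):

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0])

l2 = np.linalg.norm(x)                # straight-line distance: sqrt(9 + 16)
l1 = np.linalg.norm(x, ord=1)         # city-block distance: 3 + 4
linf = np.linalg.norm(x, ord=np.inf)  # magnitude of the largest component

print(l2, l1, linf)  # → 5.0 7.0 4.0
```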

Formal Description

For p ≥ 1, the Lᵖ norm is defined as

  ‖x‖_p = (∑_i |x_i|^p)^(1/p)

L² norm (Euclidean norm). ‖x‖₂ = √(∑_i x_i²), often written simply ‖x‖. The squared L² norm, ‖x‖₂² = xᵀx, is often preferred computationally, since each of its partial derivatives, ∂‖x‖₂²/∂x_i = 2x_i, depends only on the corresponding element of x, whereas the derivatives of ‖x‖₂ itself depend on the entire vector.
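The computational convenience can be checked directly: the gradient of the squared L² norm is 2x, element-wise, while the gradient of the L² norm, x/‖x‖₂, couples every element (a small NumPy sketch; the vector is an arbitrary example):

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0])

grad_sq = 2 * x                   # ∇(xᵀx): each entry depends only on x_i
grad_l2 = x / np.linalg.norm(x)   # ∇‖x‖₂: each entry depends on the whole vector

# Verify ∇(xᵀx) against a central finite difference.
eps = 1e-6
for i in range(x.size):
    e = np.zeros_like(x); e[i] = eps
    fd = (np.linalg.norm(x + e)**2 - np.linalg.norm(x - e)**2) / (2 * eps)
    assert np.isclose(fd, grad_sq[i], atol=1e-5)
```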

L¹ norm. Grows at the same rate everywhere; useful when distinguishing exact zeros from small nonzero values:

  ‖x‖₁ = ∑_i |x_i|

L∞ norm (max norm). Equals the absolute value of the element with the largest magnitude:

  ‖x‖∞ = max_i |x_i|

Frobenius norm (for matrices). The analogue of the L² norm for a matrix A:

  ‖A‖_F = √(∑_{i,j} A_{i,j}²)
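Equivalently, the Frobenius norm is the L² norm of the flattened matrix (NumPy sketch; the matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

fro = np.linalg.norm(A, ord='fro')                 # sqrt(1 + 4 + 9 + 16)
assert np.isclose(fro, np.sqrt((A**2).sum()))      # matches the definition
assert np.isclose(fro, np.linalg.norm(A.ravel()))  # L2 norm of flattened entries
```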

The dot product can be expressed in terms of the L² norm and the angle θ between the two vectors:

  xᵀy = ‖x‖₂ ‖y‖₂ cos θ
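Rearranging this identity gives the standard way to recover the angle between two vectors (NumPy sketch; the vectors are arbitrary examples):

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

cos_theta = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))  # clip guards against rounding error
print(np.degrees(theta))  # ≈ 45.0
```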

The “L⁰ norm” (the count of nonzero entries) is not a true norm: scaling a vector by a nonzero constant does not change its nonzero count, violating absolute homogeneity.
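The failure of homogeneity is easy to demonstrate (a small NumPy sketch; the vector is an arbitrary example):

```python
import numpy as np

x = np.array([0.0, 2.0, 0.0, -5.0])

l0 = np.count_nonzero(x)                  # 2 nonzero entries
assert np.count_nonzero(10.0 * x) == l0   # scaling leaves the count unchanged...
assert 10.0 * l0 != l0                    # ...but homogeneity would require it to scale by 10
```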

Applications

  • L² regularisation (ridge / weight decay) penalises large weights and yields smooth, dense solutions.
  • L¹ regularisation (LASSO) induces sparsity and performs automatic feature selection.
  • The L∞ norm arises in minimax problems and robustness guarantees.
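The sparsity contrast is already visible in one dimension: minimising (w − a)² + λ|w| has a closed-form soft-thresholding solution that snaps small coefficients exactly to zero, while the L² penalty only shrinks them (a minimal sketch; a and λ are illustrative values):

```python
import numpy as np

def ridge_1d(a, lam):
    # argmin_w (w - a)**2 + lam * w**2: shrinks toward zero, never exactly zero
    return a / (1 + lam)

def lasso_1d(a, lam):
    # argmin_w (w - a)**2 + lam * abs(w): soft-thresholding at lam/2, exact zeros
    return np.sign(a) * max(abs(a) - lam / 2, 0.0)

a, lam = 0.3, 1.0
print(ridge_1d(a, lam))  # 0.15: small but still nonzero
print(lasso_1d(a, lam))  # 0.0: snapped to exactly zero
```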

Trade-offs

The L² norm is differentiable everywhere and computationally convenient; the L¹ norm is non-differentiable at zero, requiring subgradient or proximal methods. Lᵖ “norms” with p < 1 promote stronger sparsity but are non-convex, making optimisation much harder.
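The proximal operator of the L¹ norm, used inside proximal gradient methods such as ISTA, is element-wise soft-thresholding (a sketch, with t standing in for the penalty-times-step-size parameter):

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t * ||.||_1: element-wise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

v = np.array([0.05, -0.4, 1.2])
print(prox_l1(v, 0.1))  # entries below the threshold become exactly zero
```

This is how L¹-regularised solvers produce exact zeros despite the non-differentiability at the origin.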