Lagrangian and Constrained Optimization
Definition
The Lagrangian method converts a constrained optimization problem into an unconstrained one by incorporating constraints into the objective via Lagrange multipliers. The KKT conditions characterise optima of constrained problems.
Intuition
A constraint restricts the feasible region. The Lagrange multiplier $\lambda$ measures the sensitivity of the optimal value to relaxing that constraint: if $\lambda$ is large, tightening the constraint costs a lot; if $\lambda = 0$, the constraint is inactive at the optimum. The method works because at a constrained optimum, the gradient of the objective is parallel to the gradient of the active constraint.
Formal Description
Equality Constraints (Classical Lagrange Multipliers)
Problem: minimise $f(x)$ subject to $h_j(x) = 0$, $j = 1, \dots, p$.
Lagrangian:
$$\mathcal{L}(x, \lambda) = f(x) + \sum_{j=1}^{p} \lambda_j h_j(x)$$
Necessary condition (Lagrange condition): at a local minimum $x^*$ (assuming the $\nabla h_j(x^*)$ are linearly independent):
$$\nabla f(x^*) + \sum_{j=1}^{p} \lambda_j^* \nabla h_j(x^*) = 0$$
Geometrically: $\nabla f(x^*)$ lies in the span of $\{\nabla h_j(x^*)\}$.
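For a quadratic objective with linear equality constraints, the Lagrange condition plus feasibility form one linear system that can be solved directly. A minimal sketch (the specific problem, minimising $x^2 + y^2$ subject to $x + y = 1$, is a hypothetical worked example, not from the text):

```python
# Hypothetical example: minimise f(x, y) = x^2 + y^2 subject to x + y = 1.
# The Lagrange condition gives 2x + lam = 0, 2y + lam = 0, plus feasibility
# x + y = 1: a 3x3 linear system in (x, y, lam).

def solve_3x3(A, b):
    """Gaussian elimination with partial pivoting for a 3x3 system."""
    n = 3
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Stationarity + feasibility as one linear system in (x, y, lam):
A = [[2.0, 0.0, 1.0],   # 2x + lam = 0
     [0.0, 2.0, 1.0],   # 2y + lam = 0
     [1.0, 1.0, 0.0]]   # x + y     = 1
b = [0.0, 0.0, 1.0]
x, y, lam = solve_3x3(A, b)
print(x, y, lam)  # x = y = 0.5, lam = -1.0
```

By symmetry the minimiser splits the constraint evenly, and the multiplier $\lambda^* = -1$ records how fast the optimal value $p^* = 1/2$ changes as the constraint is relaxed.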
Inequality Constraints (KKT Conditions)
Problem: minimise $f(x)$ s.t. $g_i(x) \le 0$, $i = 1, \dots, m$; $h_j(x) = 0$, $j = 1, \dots, p$.
Lagrangian:
$$\mathcal{L}(x, \lambda, \nu) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x) + \sum_{j=1}^{p} \nu_j h_j(x)$$
KKT conditions (necessary for local optimum under constraint qualification; sufficient for convex problems):
| Condition | Equation |
|---|---|
| Stationarity | $\nabla f(x^*) + \sum_i \lambda_i \nabla g_i(x^*) + \sum_j \nu_j \nabla h_j(x^*) = 0$ |
| Primal feasibility | $g_i(x^*) \le 0$, $h_j(x^*) = 0$ |
| Dual feasibility | $\lambda_i \ge 0$ |
| Complementary slackness | $\lambda_i \, g_i(x^*) = 0$ for all $i$ |
Complementary slackness means: either the constraint is active ($g_i(x^*) = 0$) or the multiplier is zero ($\lambda_i = 0$). This classifies constraints as active (binding) or inactive.
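The four KKT conditions can be checked mechanically for a candidate point. A minimal sketch on a hypothetical one-dimensional problem (not from the text): minimise $(x-2)^2$ subject to $x - 1 \le 0$, where the unconstrained minimum $x = 2$ is infeasible, so the constraint is active at $x^* = 1$:

```python
# Hypothetical example: minimise f(x) = (x - 2)^2 subject to g(x) = x - 1 <= 0.
# At x* = 1 the constraint is active, and stationarity f'(x*) + lam * g'(x*) = 0
# gives lam* = 2.

def kkt_satisfied(x, lam, tol=1e-8):
    df = 2.0 * (x - 2.0)      # f'(x)
    dg = 1.0                  # g'(x)
    g = x - 1.0               # g(x)
    stationarity = abs(df + lam * dg) < tol
    primal = g <= tol          # primal feasibility
    dual = lam >= -tol         # dual feasibility
    comp_slack = abs(lam * g) < tol  # complementary slackness
    return stationarity and primal and dual and comp_slack

print(kkt_satisfied(1.0, 2.0))   # True: active constraint, positive multiplier
print(kkt_satisfied(2.0, 0.0))   # False: x = 2 violates primal feasibility
```

The positive multiplier $\lambda^* = 2$ signals a binding constraint; an inactive constraint would force $\lambda^* = 0$ by complementary slackness.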
Dual Problem
The Lagrange dual function:
$$q(\lambda, \nu) = \inf_x \mathcal{L}(x, \lambda, \nu)$$
Dual problem:
$$\max_{\lambda \ge 0,\; \nu} \; q(\lambda, \nu)$$
- Weak duality: $d^* \le p^*$ always holds, where $d^*$ and $p^*$ are the dual and primal optimal values.
- Strong duality: $d^* = p^*$ holds for convex problems satisfying Slater's condition (a strictly feasible point exists).
The dual function is always concave (a pointwise infimum of functions affine in $(\lambda, \nu)$), so the dual problem is a concave maximisation and is often easier to solve, even when the primal is non-convex.
Sensitivity Interpretation
At the optimum, $\partial p^* / \partial b_j = -\nu_j^*$ (under the convention $h_j(x) - b_j = 0$), where $b_j$ is the RHS of equality constraint $j$. Lagrange multipliers are shadow prices — the marginal value of relaxing a constraint.
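The shadow-price relation can be verified numerically by perturbing the constraint and differencing the optimal value. A minimal sketch on a hypothetical problem (not from the text), reusing the closed-form solution of minimising $x^2 + y^2$ subject to $x + y = b$:

```python
# Hypothetical example: minimise x^2 + y^2 subject to x + y = b.
# In closed form x = y = b/2, so p*(b) = b^2 / 2, and with the Lagrangian
# f + nu * (x + y - b), stationarity gives nu* = -b.
# Sensitivity check: dp*/db should equal -nu*.

def p_star(b):
    """Optimal value of the perturbed problem."""
    return b ** 2 / 2.0

def nu_star(b):
    """Optimal equality multiplier for RHS b."""
    return -b

b, eps = 1.0, 1e-6
dp_db = (p_star(b + eps) - p_star(b - eps)) / (2.0 * eps)  # central difference
print(dp_db)        # ~ 1.0
print(-nu_star(b))  # 1.0: shadow price matches the marginal value
```

The finite-difference slope of $p^*(b)$ agrees with $-\nu^*$, which is exactly the shadow-price reading of the multiplier.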
Applications
Support Vector Machine (SVM): the primal problem is a QP; dual formulation:
$$\max_{\alpha} \; \sum_{i} \alpha_i - \frac{1}{2} \sum_{i,k} \alpha_i \alpha_k y_i y_k \, x_i^\top x_k \quad \text{s.t. } 0 \le \alpha_i \le C, \; \sum_i \alpha_i y_i = 0$$
The kernel trick enters naturally in the dual, replacing $x_i^\top x_k$ with $K(x_i, x_k)$.
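To make the kernel substitution concrete, here is a minimal sketch that evaluates the SVM dual objective for given multipliers (the toy data, the $\alpha$ values, and the RBF kernel choice are all hypothetical illustrations, not from the text; no optimisation over $\alpha$ is performed):

```python
import math

# Hypothetical illustration: evaluating the SVM dual objective
#   W(alpha) = sum_i alpha_i - 1/2 sum_{i,k} alpha_i alpha_k y_i y_k K(x_i, x_k)
# with a Gaussian (RBF) kernel substituted for the inner product x_i . x_k.

def rbf_kernel(a, b, gamma=0.5):
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

def dual_objective(alpha, X, y, kernel):
    n = len(X)
    linear = sum(alpha)
    quad = sum(alpha[i] * alpha[k] * y[i] * y[k] * kernel(X[i], X[k])
               for i in range(n) for k in range(n))
    return linear - 0.5 * quad

X = [(0.0, 0.0), (1.0, 1.0)]   # two toy points
y = [1, -1]
alpha = [0.4, 0.4]             # feasible: alpha_i >= 0, sum_i alpha_i y_i = 0
print(dual_objective(alpha, X, y, rbf_kernel))
```

The data enter only through $K(x_i, x_k)$, which is why swapping the inner product for any positive-definite kernel requires no other change to the dual.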
Lasso/Constrained regression: minimise $\|y - X\beta\|_2^2$ s.t. $\|\beta\|_1 \le t$; the Lagrangian gives the penalised form $\|y - X\beta\|_2^2 + \lambda \|\beta\|_1$.
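In the simplest one-dimensional case the penalised form has a closed-form solution. A minimal sketch (this hypothetical scalar reduction is an illustration, not from the text):

```python
# Hypothetical 1-D case of the penalised lasso form:
#   minimise 1/2 (y - beta)^2 + lam * |beta|
# has the closed-form soft-thresholding solution, showing how the
# multiplier lam trades data fit against the l1 constraint.

def soft_threshold(y, lam):
    """argmin_beta 1/2 (y - beta)^2 + lam * |beta|."""
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0          # small responses are shrunk exactly to zero

print(soft_threshold(3.0, 1.0))   # 2.0: shrunk toward zero
print(soft_threshold(0.5, 1.0))   # 0.0: sparsity from the l1 penalty
```

Larger $\lambda$ (equivalently, a tighter $\ell_1$ budget $t$) shrinks more coefficients exactly to zero, which is the source of the lasso's sparsity.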
Portfolio optimisation: maximise expected return subject to a variance constraint.
Trade-offs
- KKT conditions identify candidates for optima but do not guarantee finding a global optimum for non-convex problems.
- The dual can be much lower-dimensional than the primal (e.g., the kernel SVM dual has $n$ variables, one per training example, while the primal operates in the feature space of dimension $d$).