Lagrangian and Constrained Optimization

Definition

The Lagrangian method converts a constrained optimization problem into an unconstrained one by incorporating constraints into the objective via Lagrange multipliers. The KKT conditions characterise optima of constrained problems.

Intuition

A constraint restricts the feasible region. The Lagrange multiplier $\lambda$ measures the sensitivity of the optimal value to relaxing that constraint: if $\lambda$ is large, tightening the constraint costs a lot; if $\lambda = 0$, the constraint is inactive at the optimum. The method works because at a constrained optimum, the gradient of the objective is parallel to the gradient of the active constraint.

Formal Description

Equality Constraints (Classical Lagrange Multipliers)

Problem: minimise $f(x)$ subject to $h_i(x) = 0$, $i = 1, \dots, m$.

Lagrangian:

$$\mathcal{L}(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i h_i(x)$$

Necessary condition (Lagrange condition): at a local minimum $x^*$ (under a constraint qualification, e.g. the $\nabla h_i(x^*)$ linearly independent), there exists $\lambda^*$ such that

$$\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla h_i(x^*) = 0, \qquad h_i(x^*) = 0.$$

Geometrically: $\nabla f(x^*)$ lies in the span of $\{\nabla h_i(x^*)\}_{i=1}^{m}$.
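
As a concrete sketch of the Lagrange condition, the following pure-Python example (a hypothetical toy problem: minimise $x^2 + y^2$ subject to $x + y = 1$) finds the stationary point of the Lagrangian by Newton's method on the system $\nabla_{x,\lambda} \mathcal{L} = 0$:

```python
# Toy equality-constrained problem (hypothetical example):
#   minimise f(x, y) = x**2 + y**2   subject to   h(x, y) = x + y - 1 = 0
# Newton's method on the stationarity system F(x, y, lam) = 0, where
#   F = (dL/dx, dL/dy, h)  and  L = f + lam * h.

def F(v):
    x, y, lam = v
    return [2 * x + lam, 2 * y + lam, x + y - 1]

def J(v):
    # Jacobian of F; constant here because f is quadratic and h is linear
    return [[2, 0, 1],
            [0, 2, 1],
            [1, 1, 0]]

def solve3(A, b):
    # Gaussian elimination with partial pivoting for a 3x3 linear system
    n = 3
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= m * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def newton(v0, steps=5):
    v = list(v0)
    for _ in range(steps):
        d = solve3(J(v), [-fi for fi in F(v)])
        v = [vi + di for vi, di in zip(v, d)]
    return v

x, y, lam = newton([0.0, 0.0, 0.0])
# x = y = 0.5, lam = -1:  grad f(x*) = (1, 1) = -lam * grad h, i.e. parallel
```

Because $f$ is quadratic and $h$ is linear, the stationarity system is linear and Newton converges in a single step; for general problems the same iteration is the core of sequential quadratic programming.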

Inequality Constraints (KKT Conditions)

Problem: minimise $f(x)$ s.t. $g_i(x) \le 0$, $i = 1, \dots, m$; $h_j(x) = 0$, $j = 1, \dots, p$.

Lagrangian:

$$\mathcal{L}(x, \mu, \lambda) = f(x) + \sum_{i=1}^{m} \mu_i g_i(x) + \sum_{j=1}^{p} \lambda_j h_j(x)$$

KKT conditions (necessary for local optimum under constraint qualification; sufficient for convex problems):

  • Stationarity: $\nabla_x \mathcal{L}(x^*, \mu^*, \lambda^*) = \nabla f(x^*) + \sum_i \mu_i^* \nabla g_i(x^*) + \sum_j \lambda_j^* \nabla h_j(x^*) = 0$
  • Primal feasibility: $g_i(x^*) \le 0$, $h_j(x^*) = 0$
  • Dual feasibility: $\mu_i^* \ge 0$
  • Complementary slackness: $\mu_i^* \, g_i(x^*) = 0$ for all $i$

Complementary slackness means: either the constraint is active ($g_i(x^*) = 0$) or the multiplier is zero ($\mu_i^* = 0$). This classifies constraints as active (binding) or inactive.
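
Complementary slackness can be checked mechanically. A minimal sketch on a hypothetical one-dimensional problem (minimise $(x-2)^2$ subject to $x \le b$): for $b = 1$ the constraint is active with $\mu^* = 2$; for $b = 3$ it is inactive with $\mu^* = 0$.

```python
# Hypothetical toy problem:  minimise f(x) = (x - 2)**2  s.t.  g(x) = x - b <= 0.

def solve(b):
    x = min(2.0, b)              # unconstrained minimiser x = 2, clipped to feasibility
    mu = max(0.0, 2 * (2 - x))   # from stationarity 2(x - 2) + mu = 0 when active
    return x, mu

def kkt_check(x, mu, b, tol=1e-9):
    grad_f = 2 * (x - 2)
    grad_g = 1.0
    g = x - b
    return {
        "stationarity": abs(grad_f + mu * grad_g) < tol,
        "primal_feasibility": g <= tol,
        "dual_feasibility": mu >= -tol,
        "complementary_slackness": abs(mu * g) < tol,
        "active": abs(g) < tol,   # classification, not itself a KKT condition
    }

x1, mu1 = solve(1.0)   # active:   x* = 1.0, mu* = 2.0
x3, mu3 = solve(3.0)   # inactive: x* = 2.0, mu* = 0.0
```

In both cases all four KKT conditions hold; only the "active" flag differs, illustrating the either/or structure of complementary slackness.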

Dual Problem

The Lagrange dual function:

$$q(\mu, \lambda) = \inf_{x} \mathcal{L}(x, \mu, \lambda)$$

Dual problem:

$$\max_{\mu \ge 0, \, \lambda} \; q(\mu, \lambda)$$

  • Weak duality: $q(\mu, \lambda) \le f(x^*)$ for all $\mu \ge 0$, $\lambda$; this always holds.
  • Strong duality: $q(\mu^*, \lambda^*) = f(x^*)$; holds for convex problems satisfying Slater’s condition (a feasible interior point exists).

The dual function is always concave (a pointwise infimum of functions affine in $(\mu, \lambda)$), so the dual problem is a concave maximisation and often easier to solve.
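
A small numerical illustration of both dualities, on an assumed toy problem (minimise $x^2$ subject to $x \ge 1$, so $f^* = 1$): the dual function $q(\mu) = \mu - \mu^2/4$ never exceeds $f^*$, and its maximum attains $f^*$ because the problem is convex and Slater's condition holds.

```python
# Hypothetical convex problem:  minimise x**2  s.t.  1 - x <= 0,
# with optimal value f* = 1 at x = 1.  L(x, mu) = x**2 + mu*(1 - x);
# the inner minimisation gives x = mu/2, hence q(mu) = mu - mu**2/4.

def q(mu):
    x = mu / 2.0                 # argmin_x L(x, mu)
    return x**2 + mu * (1 - x)

f_star = 1.0
grid = [i * 0.01 for i in range(801)]    # mu in [0, 8]
values = [q(mu) for mu in grid]          # every value <= f*  (weak duality)
best = max(values)                       # attained at mu* = 2, where q(2) = f*
```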

Sensitivity Interpretation

At the optimum, $\partial f^* / \partial b_i = -\lambda_i^*$, where $b_i$ is the RHS of equality constraint $h_i(x) = b_i$ (the sign depends on the convention used in $\mathcal{L}$). Lagrange multipliers are shadow prices: the marginal value of relaxing a constraint.
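
The shadow-price interpretation can be checked by finite differences. A sketch on a hypothetical problem (minimise $x^2 + y^2$ subject to $x + y = b$, which has the closed form $f^*(b) = b^2/2$ and $\lambda^* = -b$ under the convention $\mathcal{L} = f + \lambda(x + y - b)$):

```python
# Hypothetical toy problem:  minimise x**2 + y**2  s.t.  x + y = b.
# Closed form: x = y = b/2, f*(b) = b**2/2, lam* = -b
# (with the convention L = f + lam*(x + y - b)).

def solve(b):
    x = y = b / 2.0
    lam = -b
    return x**2 + y**2, lam

b, eps = 1.0, 1e-6
f_star, lam = solve(b)
fd = (solve(b + eps)[0] - solve(b - eps)[0]) / (2 * eps)  # central difference df*/db
# fd matches -lam: relaxing the constraint by db shifts the optimal value by -lam*db
```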

Applications

Support Vector Machine (SVM): the primal problem is a QP; the dual formulation is

$$\max_{\alpha} \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, x_i^\top x_j \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \;\; \sum_{i=1}^{n} \alpha_i y_i = 0.$$

The kernel trick enters naturally in the dual, replacing $x_i^\top x_j$ with $K(x_i, x_j)$.

Lasso/Constrained regression: $\min_\beta \|y - X\beta\|_2^2$ s.t. $\|\beta\|_1 \le t$; the Lagrangian gives the penalised form $\min_\beta \|y - X\beta\|_2^2 + \lambda \|\beta\|_1$.
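
The constrained/penalised equivalence is easiest to see in one dimension. A sketch on a hypothetical scalar case with a single observation $y$: the penalised solution is soft-thresholding, the constrained solution with the matching radius $t$ is a projection onto $[-t, t]$, and the two coincide.

```python
# Hypothetical scalar case with one observation y:
#   constrained:  minimise (1/2)*(y - b)**2  s.t.  abs(b) <= t
#   penalised:    minimise (1/2)*(y - b)**2 + lam*abs(b)

def soft_threshold(y, lam):
    # closed-form minimiser of the penalised problem
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0

def project(y, t):
    # minimiser of the constrained problem: projection onto [-t, t]
    return max(-t, min(t, y))

y, lam = 3.0, 1.0
b_pen = soft_threshold(y, lam)   # penalised solution: 2.0
t = abs(b_pen)                   # matching constraint radius
b_con = project(y, t)            # same solution: 2.0
```

The correspondence $(t, \lambda)$ is data-dependent: each penalty level $\lambda$ picks out the constraint level $t = |\beta^*(\lambda)|_1$ for which the two problems share a solution.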

Portfolio optimisation: maximise expected return subject to a variance constraint.

Trade-offs

  • KKT conditions identify candidates for optima but do not guarantee finding a global optimum for non-convex problems.
  • The dual can be much lower-dimensional than the primal (e.g., the soft-margin SVM dual has $n$ variables, one per training point, while the primal has $d + n + 1$: weights, slacks, and bias; the dual is advantageous when $d \gg n$).