Lagrangian and Constrained Optimization
Definition
The Lagrangian method converts a constrained optimization problem into an unconstrained one by incorporating constraints into the objective via Lagrange multipliers. The KKT conditions characterise optima of constrained problems.
Intuition
A constraint restricts the feasible region. The Lagrange multiplier $\lambda$ measures the sensitivity of the optimal value to relaxing that constraint: if $\lambda$ is large, tightening the constraint costs a lot; if $\lambda = 0$, the constraint is inactive at the optimum. The method works because at a constrained optimum, the gradient of the objective is parallel to the gradient of the active constraint.
Formal Description
Equality Constraints (Classical Lagrange Multipliers)
Problem: minimise $f(x)$ subject to $h_j(x) = 0$, $j = 1, \dots, p$.
Lagrangian:
$$\mathcal{L}(x, \lambda) = f(x) + \sum_{j=1}^{p} \lambda_j h_j(x)$$
Necessary condition (Lagrange condition): at a local minimum $x^*$ (assuming the $\nabla h_j(x^*)$ are linearly independent):
$$\nabla f(x^*) + \sum_{j=1}^{p} \lambda_j^* \nabla h_j(x^*) = 0$$
Geometrically: $\nabla f(x^*)$ lies in the span of $\{\nabla h_j(x^*)\}$.
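For a quadratic objective with linear equality constraints, the Lagrange condition plus feasibility form one linear system that can be solved directly. A minimal sketch (the specific problem, minimising $x^2 + y^2$ subject to $x + y = 1$, is a hypothetical worked example, not from the text):

```python
# Hypothetical example: minimise f(x, y) = x^2 + y^2 subject to x + y = 1.
# The Lagrange condition gives 2x + lam = 0, 2y + lam = 0, plus feasibility
# x + y = 1: a 3x3 linear system in (x, y, lam).

def solve_3x3(A, b):
    """Gaussian elimination with partial pivoting for a 3x3 system."""
    n = 3
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Stationarity + feasibility as one linear system in (x, y, lam):
A = [[2.0, 0.0, 1.0],   # 2x + lam = 0
     [0.0, 2.0, 1.0],   # 2y + lam = 0
     [1.0, 1.0, 0.0]]   # x + y     = 1
b = [0.0, 0.0, 1.0]
x, y, lam = solve_3x3(A, b)
print(x, y, lam)  # x = y = 0.5, lam = -1.0
```

By symmetry the minimiser splits the constraint evenly, and the multiplier $\lambda^* = -1$ records how fast the optimal value $p^* = 1/2$ changes as the constraint is relaxed.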
Inequality Constraints (KKT Conditions)
Problem: minimise $f(x)$ s.t. $g_i(x) \le 0$, $i = 1, \dots, m$; $h_j(x) = 0$, $j = 1, \dots, p$.
Lagrangian:
$$\mathcal{L}(x, \lambda, \nu) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x) + \sum_{j=1}^{p} \nu_j h_j(x)$$
KKT conditions (necessary for local optimum under constraint qualification; sufficient for convex problems):
| Condition | Equation |
|---|---|
| Stationarity | $\nabla f(x^*) + \sum_i \lambda_i \nabla g_i(x^*) + \sum_j \nu_j \nabla h_j(x^*) = 0$ |
| Primal feasibility | $g_i(x^*) \le 0$, $h_j(x^*) = 0$ |
| Dual feasibility | $\lambda_i \ge 0$ |
| Complementary slackness | $\lambda_i \, g_i(x^*) = 0$ for all $i$ |
Complementary slackness means: either the constraint is active ($g_i(x^*) = 0$) or the multiplier is zero ($\lambda_i = 0$). This classifies constraints as active (binding) or inactive.
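The four KKT conditions can be checked mechanically for a candidate point. A minimal sketch on a hypothetical one-dimensional problem (not from the text): minimise $(x-2)^2$ subject to $x - 1 \le 0$, where the unconstrained minimum $x = 2$ is infeasible, so the constraint is active at $x^* = 1$:

```python
# Hypothetical example: minimise f(x) = (x - 2)^2 subject to g(x) = x - 1 <= 0.
# At x* = 1 the constraint is active, and stationarity f'(x*) + lam * g'(x*) = 0
# gives lam* = 2.

def kkt_satisfied(x, lam, tol=1e-8):
    df = 2.0 * (x - 2.0)      # f'(x)
    dg = 1.0                  # g'(x)
    g = x - 1.0               # g(x)
    stationarity = abs(df + lam * dg) < tol
    primal = g <= tol          # primal feasibility
    dual = lam >= -tol         # dual feasibility
    comp_slack = abs(lam * g) < tol  # complementary slackness
    return stationarity and primal and dual and comp_slack

print(kkt_satisfied(1.0, 2.0))   # True: active constraint, positive multiplier
print(kkt_satisfied(2.0, 0.0))   # False: x = 2 violates primal feasibility
```

The positive multiplier $\lambda^* = 2$ signals a binding constraint; an inactive constraint would force $\lambda^* = 0$ by complementary slackness.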
Dual Problem
The Lagrange dual function:
$$q(\lambda, \nu) = \inf_x \mathcal{L}(x, \lambda, \nu)$$
Dual problem:
$$\max_{\lambda \ge 0,\; \nu} \; q(\lambda, \nu)$$
- Weak duality: $d^* \le p^*$ always holds, where $d^*$ and $p^*$ are the dual and primal optimal values.
- Strong duality: $d^* = p^*$ holds for convex problems satisfying Slater's condition (a strictly feasible point exists).
The dual function is always concave (a pointwise infimum of functions affine in $(\lambda, \nu)$), so the dual problem is a concave maximisation and is often easier to solve, even when the primal is non-convex.
Sensitivity Interpretation
At the optimum, $\partial p^* / \partial b_j = -\nu_j^*$ (under the convention $h_j(x) - b_j = 0$), where $b_j$ is the RHS of equality constraint $j$. Lagrange multipliers are shadow prices — the marginal value of relaxing a constraint.
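The shadow-price relation can be verified numerically by perturbing the constraint and differencing the optimal value. A minimal sketch on a hypothetical problem (not from the text), reusing the closed-form solution of minimising $x^2 + y^2$ subject to $x + y = b$:

```python
# Hypothetical example: minimise x^2 + y^2 subject to x + y = b.
# In closed form x = y = b/2, so p*(b) = b^2 / 2, and with the Lagrangian
# f + nu * (x + y - b), stationarity gives nu* = -b.
# Sensitivity check: dp*/db should equal -nu*.

def p_star(b):
    """Optimal value of the perturbed problem."""
    return b ** 2 / 2.0

def nu_star(b):
    """Optimal equality multiplier for RHS b."""
    return -b

b, eps = 1.0, 1e-6
dp_db = (p_star(b + eps) - p_star(b - eps)) / (2.0 * eps)  # central difference
print(dp_db)        # ~ 1.0
print(-nu_star(b))  # 1.0: shadow price matches the marginal value
```

The finite-difference slope of $p^*(b)$ agrees with $-\nu^*$, which is exactly the shadow-price reading of the multiplier.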
Applications
Support Vector Machine (SVM): the primal problem is a QP; dual formulation:
$$\max_{\alpha} \; \sum_{i} \alpha_i - \frac{1}{2} \sum_{i,k} \alpha_i \alpha_k y_i y_k \, x_i^\top x_k \quad \text{s.t. } 0 \le \alpha_i \le C, \; \sum_i \alpha_i y_i = 0$$
The kernel trick enters naturally in the dual, replacing $x_i^\top x_k$ with $K(x_i, x_k)$.
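To make the kernel substitution concrete, here is a minimal sketch that evaluates the SVM dual objective for given multipliers (the toy data, the $\alpha$ values, and the RBF kernel choice are all hypothetical illustrations, not from the text; no optimisation over $\alpha$ is performed):

```python
import math

# Hypothetical illustration: evaluating the SVM dual objective
#   W(alpha) = sum_i alpha_i - 1/2 sum_{i,k} alpha_i alpha_k y_i y_k K(x_i, x_k)
# with a Gaussian (RBF) kernel substituted for the inner product x_i . x_k.

def rbf_kernel(a, b, gamma=0.5):
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

def dual_objective(alpha, X, y, kernel):
    n = len(X)
    linear = sum(alpha)
    quad = sum(alpha[i] * alpha[k] * y[i] * y[k] * kernel(X[i], X[k])
               for i in range(n) for k in range(n))
    return linear - 0.5 * quad

X = [(0.0, 0.0), (1.0, 1.0)]   # two toy points
y = [1, -1]
alpha = [0.4, 0.4]             # feasible: alpha_i >= 0, sum_i alpha_i y_i = 0
print(dual_objective(alpha, X, y, rbf_kernel))
```

The data enter only through $K(x_i, x_k)$, which is why swapping the inner product for any positive-definite kernel requires no other change to the dual.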
Lasso/Constrained regression: minimise $\|y - X\beta\|_2^2$ s.t. $\|\beta\|_1 \le t$; the Lagrangian gives the penalised form $\|y - X\beta\|_2^2 + \lambda \|\beta\|_1$.
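In the simplest one-dimensional case the penalised form has a closed-form solution. A minimal sketch (this hypothetical scalar reduction is an illustration, not from the text):

```python
# Hypothetical 1-D case of the penalised lasso form:
#   minimise 1/2 (y - beta)^2 + lam * |beta|
# has the closed-form soft-thresholding solution, showing how the
# multiplier lam trades data fit against the l1 constraint.

def soft_threshold(y, lam):
    """argmin_beta 1/2 (y - beta)^2 + lam * |beta|."""
    if y > lam:
        return y - lam
    if y < -lam:
        return y + lam
    return 0.0          # small responses are shrunk exactly to zero

print(soft_threshold(3.0, 1.0))   # 2.0: shrunk toward zero
print(soft_threshold(0.5, 1.0))   # 0.0: sparsity from the l1 penalty
```

Larger $\lambda$ (equivalently, a tighter $\ell_1$ budget $t$) shrinks more coefficients exactly to zero, which is the source of the lasso's sparsity.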
Portfolio optimisation: maximise expected return subject to a variance constraint.
Trade-offs
- KKT conditions identify candidates for optima but do not guarantee finding a global optimum for non-convex problems.
- The dual can be much lower-dimensional than the primal (e.g., the kernel SVM dual has $n$ variables, one per training example, while the primal operates in the feature space of dimension $d$).