Orthogonalization

Definition

A design principle for ML development where each intervention targets exactly one failure mode, keeping the effects of different controls independent.

Intuition

Like orthogonal dials on a mixing board: turning one should not affect the others. If fixing bias also changes variance, you lose diagnostic clarity. Early stopping is an example of a non-orthogonal tool, because it entangles optimization and regularization.

Formal Description

The ML development chain has four distinct failure modes, each with its own orthogonal set of fixes:

  • Step 1: does not fit the training set well → increase model capacity, tune the optimizer, add features
  • Step 2: does not generalize to the dev set → get more data, add regularization, reduce model complexity
  • Step 3: does not generalize to the test set → use a larger/better dev set (you have likely overfit the dev set)
  • Step 4: does not work in the real world → revisit the dev/test distribution, re-examine the metric

Applying a fix from step 2 to a step 1 problem is wasteful and can mask the real issue.
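The step-by-step triage above can be sketched as a simple decision rule. This is a minimal sketch; the function name, the `gap` threshold, and the error values are illustrative assumptions, not part of the original framework:

```python
def diagnose(train_err, dev_err, test_err, target_err, gap=0.02):
    """Map observed errors to the first failing step of the chain.

    target_err is the acceptable error (e.g. human-level performance);
    gap is an illustrative threshold for a meaningful jump between stages.
    """
    if train_err - target_err > gap:
        return "step 1: underfitting -> capacity, optimizer, features"
    if dev_err - train_err > gap:
        return "step 2: train->dev gap -> more data, regularization"
    if test_err - dev_err > gap:
        return "step 3: dev->test gap -> larger/better dev set"
    return "step 4: re-examine metric and dev/test distribution"
```

The point of the rule is that it checks the steps in order and returns the *first* failure, so a step 2 fix is never suggested while step 1 is still failing.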

Early stopping is non-orthogonal: stopping early limits how well the model fits the training set (step 1) while simultaneously acting as a regularizer on generalization (step 2), making both harder to diagnose independently.
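A minimal numeric sketch of this entanglement, assuming a one-parameter linear model y ≈ w·x fit by gradient descent (the toy data and learning rate are made up for illustration): stopping earlier leaves both a larger training error (step 1) and a smaller weight, i.e. stronger implicit regularization (step 2); one knob moves both.

```python
def fit(steps, lr=0.05):
    """Gradient descent on mean squared error for the model y_hat = w * x."""
    xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 7.0]  # toy data (illustrative)
    w = 0.0
    for _ in range(steps):
        grad = 2 * sum(x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    train_err = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return w, train_err

w_early, err_early = fit(steps=3)    # stop early
w_late, err_late = fit(steps=100)    # train to (near) convergence
```

Comparing the two runs shows the entanglement: `err_early > err_late` (the early stop hurt the training fit) and `|w_early| < |w_late|` (the early stop also kept the weight smaller, which is exactly what an explicit regularizer would do).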

Applications

Structuring the ML iteration loop; debugging stalled training; deciding which knob to turn next.

Trade-offs

  • Orthogonality is an ideal — real interventions always have some side effects
  • Useful as a heuristic, not a rigid rule