Supervised Learning

Definition

Learning a parameterized mapping from labeled examples by minimizing empirical risk.

Intuition

Find parameters that best explain the training labels; generalization is the key challenge. The model must learn structure that transfers beyond the training set — not just memorize labels.

Formal Description

Setup: dataset D = {(x_i, y_i)}_{i=1}^N drawn i.i.d. from an unknown distribution p(x, y); hypothesis class H = {f_θ : θ ∈ Θ}.

Empirical risk minimization:

    θ* = argmin_θ (1/N) Σ_{i=1}^N ℓ(f_θ(x_i), y_i) + λ R(θ)

where ℓ is the task loss and R is an optional regularizer weighted by λ ≥ 0.
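The minimization above can be sketched concretely. A minimal numpy sketch, assuming logistic regression (sigmoid output, BCE loss) with an L2 regularizer trained by plain gradient descent on synthetic data — all names and constants here are illustrative, not a prescribed recipe:

```python
import numpy as np

# ERM sketch: logistic regression with L2 regularization (illustrative setup).
rng = np.random.default_rng(0)
N, d = 200, 3
X = rng.normal(size=(N, d))
true_w = np.array([1.5, -2.0, 0.5])               # synthetic "ground truth"
y = (X @ true_w + 0.1 * rng.normal(size=N) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def empirical_risk(w, lam=1e-2):
    # (1/N) sum_i BCE(f_w(x_i), y_i) + lam * ||w||^2
    p = sigmoid(X @ w)
    bce = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    return bce + lam * np.dot(w, w)

# Gradient descent on the regularized empirical risk.
w = np.zeros(d)
lam, lr = 1e-2, 0.5
for _ in range(500):
    p = sigmoid(X @ w)
    grad = X.T @ (p - y) / N + 2 * lam * w        # grad of BCE + L2 term
    w -= lr * grad

train_acc = np.mean((sigmoid(X @ w) > 0.5) == y)
```

The loop drives the regularized empirical risk down from its value at w = 0 (which is log 2 for BCE); the learned weights recover the sign pattern of the generating vector.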

Task types:

  Task                         Output head              Loss
  Binary classification        sigmoid                  BCE (binary cross-entropy)
  Multi-class classification   softmax                  CE (cross-entropy)
  Regression                   linear                   MSE / MAE
  Structured output            sequence, bounding box   task-specific
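The head/loss pairings in the table can be written out directly. A hand-rolled numpy sketch (not a library API) of the three standard cases:

```python
import numpy as np

def sigmoid(z):                      # binary classification head
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, y):                       # binary cross-entropy
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def softmax(z):                      # multi-class head (stable: shift by max)
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def ce(p, y_idx):                    # cross-entropy with integer labels
    return -np.mean(np.log(p[np.arange(len(y_idx)), y_idx]))

def mse(y_hat, y):                   # regression
    return np.mean((y_hat - y) ** 2)

# A confident correct prediction has low loss; a confident wrong one, high loss.
p = sigmoid(np.array([4.0, -4.0]))            # ≈ [0.982, 0.018]
low = bce(p, np.array([1.0, 0.0]))            # labels match the predictions
high = bce(p, np.array([0.0, 1.0]))           # labels flipped
```

The same asymmetry holds for cross-entropy: putting the true label on the highest-probability class yields a smaller loss than putting it on the lowest.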

Bias-variance decomposition (for squared error, with irreducible noise variance σ²):

    E[(y − f̂(x))²] = Bias[f̂(x)]² + Var[f̂(x)] + σ²

High bias → underfitting (model too simple); high variance → overfitting (model too complex). Regularization, more data, and architectural choices all shift this tradeoff.

Applications

  • Image classification (ImageNet, medical imaging)
  • Speech recognition
  • Machine translation
  • Fraud detection
  • Medical diagnosis

Trade-offs

  • Requires labeled data — expensive to collect and annotate
  • Assumes train and test distributions match; distribution shift degrades performance
  • Capacity vs. generalization tradeoff: larger models can overfit with little data
  • More data generally helps but with diminishing returns; data quality often matters more than quantity
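The distribution-shift trade-off above can be made concrete with a toy experiment: train a classifier on one distribution, then evaluate on both an i.i.d. test set and a shifted copy. A sketch assuming a nearest-centroid classifier on a 2-D Gaussian mixture (a hypothetical setup chosen only to expose the effect):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, shift=0.0):
    # Two Gaussian classes centered at (-2, -2) and (+2, +2), optionally
    # translated by `shift` in both coordinates (covariate shift).
    y = rng.integers(0, 2, n)
    x = rng.normal(size=(n, 2)) + np.where(y[:, None] == 1, 2.0, -2.0) + shift
    return x, y

X_train, y_train = make_data(500)
centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def accuracy(X, y):
    # Assign each point to its nearest training centroid.
    d = np.linalg.norm(X[:, None, :] - centroids[None], axis=-1)
    return np.mean(d.argmin(axis=1) == y)

X_iid, y_iid = make_data(500)                  # matches training distribution
X_shift, y_shift = make_data(500, shift=3.0)   # translated test distribution

acc_iid = accuracy(X_iid, y_iid)
acc_shift = accuracy(X_shift, y_shift)
```

On matched data the classes are well separated and accuracy is near-perfect; after the translation, one class crosses the frozen decision boundary and accuracy collapses toward chance, even though the labeling rule itself never changed.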