Error Analysis
Definition
A systematic process of manually inspecting misclassified dev examples to identify and quantify error categories, enabling prioritized model improvement.
Intuition
Raw error rate tells you how bad the model is, but not why; error analysis reveals the largest tractable sources of error so effort goes to the highest-impact fixes.
Formal Description
Minimal workflow:
- Sample ~100 misclassified dev examples
- Define error categories (e.g., blurry image, label noise, unusual lighting, occlusion)
- Count each category
- Compute each category's ceiling: if category X accounts for p% of errors, fixing it completely yields at most a p% relative reduction in error
- Prioritize by expected impact per unit of implementation effort
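The tally-and-ceiling steps above can be sketched in a few lines. The counts, category names, and the 10% dev error rate below are illustrative assumptions, not figures from the source; each inspected example is assigned one primary category so the shares sum to 100%.

```python
from collections import Counter

# Hypothetical tally from inspecting ~100 misclassified dev examples;
# one primary category per example to avoid double-counting.
tally = Counter({
    "blurry image": 43,
    "unusual lighting": 22,
    "occlusion": 20,
    "mislabeled": 15,
})

total = sum(tally.values())
dev_error_rate = 0.10  # overall dev error rate, assumed for illustration

for category, count in tally.most_common():
    share = count / total                 # fraction of errors in this category
    ceiling = share * dev_error_rate      # max absolute error reduction if fully fixed
    print(f"{category}: {share:.0%} of errors, "
          f"error floor if fixed: {dev_error_rate - ceiling:.1%}")
```

Sorting by `most_common()` surfaces the largest error sources first; the printed floor is the best-case dev error if that one category were eliminated entirely, which is exactly the ceiling argument from the workflow.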
Handling mislabeled examples: count them as their own category; fixing labels is only worthwhile if they account for a significant fraction of total errors; apply any corrections to the dev and test sets together (never selectively to one), so both sets continue to come from the same distribution.
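The "significant fraction" judgment can be made concrete by splitting the dev error into the part attributable to label noise and the part from everything else. All numbers and the decision threshold below are illustrative assumptions, not values from the source.

```python
# Assumed figures: 10% overall dev error, 15 of ~100 inspected errors mislabeled.
dev_error_rate = 0.10
mislabel_share = 0.15

error_from_mislabels = dev_error_rate * mislabel_share    # absolute error due to label noise
error_from_other = dev_error_rate - error_from_mislabels  # error from all other causes

# Hypothetical rule of thumb: fix labels once label noise is a sizable slice
# of what remains, e.g. at least 30% of the error from other causes.
worth_fixing = error_from_mislabels >= 0.3 * error_from_other
print(f"from mislabels: {error_from_mislabels:.1%}, "
      f"from other causes: {error_from_other:.1%}, fix labels now: {worth_fixing}")
```

As the other categories get fixed and `error_from_other` shrinks, the same mislabel count becomes a growing fraction of the remaining error, which is when relabeling starts to pay off.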
Applications
Image classification, NLP tasks, speech recognition — anywhere you have interpretable dev errors.
Trade-offs
- Manual inspection is time-consuming, so for very large error sets you analyze a sample rather than every error
- Human judgment introduces bias into how categories are defined and assigned
- Categories should be mutually exclusive to avoid double-counting