Training Data Index

Notes on constructing high-quality training datasets, from labeling through class balancing to versioning.

Notes

  • Data Labeling — annotation workflows, weak supervision, and quality control for constructing ground-truth datasets
  • Class Imbalance and Augmentation — oversampling, undersampling, SMOTE, and augmentation strategies for skewed label distributions
  • Dataset Versioning — DVC-based lineage, data contracts, and reproducible data snapshots