Training Data Index
Constructing high-quality training datasets.
Notes
- Data Labeling — annotation workflows, weak supervision, and quality control for constructing ground-truth datasets
- Class Imbalance and Augmentation — oversampling (including SMOTE), undersampling, and augmentation strategies for skewed label distributions
- Dataset Versioning — DVC-based lineage, data contracts, and reproducible data snapshots