Visual Inspection
Problem
Automatically detect defects, anomalies, or quality issues in physical products or environments from camera images or video, replacing or augmenting manual visual inspection on a production line, in a warehouse, or in field operations. The system must reach human-level accuracy at machine speed. The cost of a false negative (a defective product shipped to a customer) typically far exceeds the cost of a false positive (an acceptable product rejected), which is why recall is the primary metric.
Users / Stakeholders
| Role | Decision |
|---|---|
| Quality control inspector | Review flagged items; approve borderline cases |
| Production line manager | Stop/continue production; escalate to maintenance |
| Quality engineer | Defect root cause analysis; process improvement |
| Customer quality team | Defect rate reporting; warranty claim analysis |
Domain Context
- Real-time constraint: Products on a moving line pass the camera at fixed intervals. Inference must complete before the product exits the inspection zone (typically 50–200ms budget).
- Limited labelled defects: Defects are rare by definition. Collecting labelled defect images requires running the line until defects occur. Semi-supervised and anomaly detection approaches are valuable.
- Defect taxonomy: Cracks, scratches, chips, contamination, incorrect assembly, missing components. Each defect type may require a different model or threshold.
- Lighting and sensor variation: Image quality depends on lighting, camera calibration, lens contamination. Distribution shift from sensor degradation is a significant production risk.
- Edge deployment: Cameras and inference must often run at the edge (no cloud round-trip). Typical platforms: NVIDIA Jetson, Intel hardware running the OpenVINO toolkit, Google Coral. Model compression (quantization, pruning) is usually required.
- ISO 9001 / IATF 16949: Quality management standards require documented inspection procedures. ML-based inspection requires validation records showing equivalence to manual inspection.
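The real-time constraint above can be turned into a number for a given line. The following is a minimal sketch, assuming a simple model in which the product must be imaged and classified while it is still inside the inspection zone. The function name, the belt speed, the zone length, and the fixed overheads for capture and reject signalling are all hypothetical values chosen for illustration, not figures from a real line.

```python
def inference_budget_ms(belt_speed_m_per_s: float,
                        zone_length_m: float,
                        capture_ms: float = 5.0,
                        actuation_ms: float = 20.0) -> float:
    """Time left for preprocessing + inference while the product is in the zone.

    capture_ms and actuation_ms are assumed fixed overheads (image capture
    and signalling the reject gate); real values must be measured.
    """
    if belt_speed_m_per_s <= 0:
        raise ValueError("belt speed must be positive")
    time_in_zone_ms = zone_length_m / belt_speed_m_per_s * 1000.0
    return time_in_zone_ms - capture_ms - actuation_ms


# Hypothetical line: 2 m/s belt, 0.2 m inspection zone.
budget = inference_budget_ms(belt_speed_m_per_s=2.0, zone_length_m=0.2)
print(f"inference budget: {budget:.0f} ms")
```

A negative result would mean the line is too fast for the chosen zone length, regardless of how fast the model runs.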
Inputs and Outputs
Input:
- images: High-resolution camera frames (1MP–12MP), greyscale or RGB
- trigger: Line encoder pulse triggers capture
- metadata: product_id, batch_id, station_id, timestamp, shift
- reference: Golden sample image for comparison (optional)

Output:
- decision: PASS / FAIL / MARGINAL (human review)
- defect_class: CRACK / SCRATCH / CONTAMINATION / MISSING_COMPONENT / ...
- defect_location: Bounding box or segmentation mask on image
- confidence: P(defect) ∈ [0, 1]
- audit_record: Image + annotation stored for traceability
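The input and output fields listed above could be captured in typed records so that every inspection decision carries its metadata. This is a sketch only: the class names, the ISO 8601 timestamp format, the `(x, y, width, height)` bounding-box convention, and the `image_path` field standing in for the audit record are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Decision(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    MARGINAL = "MARGINAL"  # routed to human review


@dataclass(frozen=True)
class InspectionMetadata:
    product_id: str
    batch_id: str
    station_id: str
    timestamp: str  # assumed ISO 8601 string
    shift: str


@dataclass(frozen=True)
class InspectionResult:
    metadata: InspectionMetadata
    decision: Decision
    confidence: float                      # P(defect) in [0, 1]
    defect_class: Optional[str] = None     # e.g. "CRACK"; None when no defect
    # Assumed convention: (x, y, width, height) in pixels.
    defect_bbox: Optional[Tuple[int, int, int, int]] = None
    image_path: Optional[str] = None       # hypothetical pointer to the stored image

    def __post_init__(self) -> None:
        # Reject impossible probabilities before they reach the audit log.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


meta = InspectionMetadata("P-001", "B-42", "ST-3", "2024-01-01T08:00:00Z", "A")
result = InspectionResult(meta, Decision.MARGINAL, confidence=0.42,
                          defect_class="SCRATCH", defect_bbox=(120, 80, 32, 16))
print(result.decision.value, result.confidence)
```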
Decision or Workflow Role
Camera captures product image (triggered by conveyor encoder)
↓
Image preprocessing: normalisation, crop to ROI, white balance correction
↓
Inference on edge device (<50ms budget)
↓
PASS: Product continues down line
FAIL: Pneumatic reject gate activates
MARGINAL: Image queued for human review within 30s
↓
Review outcome logged → retraining dataset
↓
Weekly: retrain cycle with new confirmed defect examples
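The PASS / FAIL / MARGINAL branch in the workflow above can be sketched as a two-threshold rule on P(defect). The function name and both threshold values are hypothetical; in practice they would be tuned on validation data against the recall and false positive targets.

```python
def route(p_defect: float,
          pass_below: float = 0.05,
          fail_above: float = 0.90) -> str:
    """Map P(defect) to a line action using two assumed thresholds."""
    if not 0.0 <= p_defect <= 1.0:
        raise ValueError("p_defect must be in [0, 1]")
    if p_defect >= fail_above:
        return "FAIL"        # reject gate activates
    if p_defect < pass_below:
        return "PASS"        # product continues down the line
    return "MARGINAL"        # queued for human review


for p in (0.01, 0.50, 0.95):
    print(p, route(p))
```

The PASS threshold is deliberately far lower than the FAIL threshold: because a missed defect costs more than a false reject, uncertain items go to a human rather than down the line.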
Modeling / System Options
| Approach | Strength | Weakness | When to use |
|---|---|---|---|
| CNN classification (ResNet, EfficientNet) | High accuracy with sufficient labelled data | Requires many defect examples | Mature product line with rich defect history |
| Anomaly detection (PatchCore, PaDiM) | No defect labels required; detects novel defects | Higher FPR; harder to classify defect type | Few labelled defects; novel defect types |
| Object detection (YOLO) | Locates and classifies multiple defects in one pass | Requires bounding box annotations | Multiple defect types in one image |
| Semantic segmentation (U-Net) | Pixel-level defect mapping | Annotation cost; slower inference | Precise area measurement required |
| Traditional CV (template matching, edge detection) | No training needed; fast; interpretable | Fragile to variation; limited to simple defects | Simple go/no-go checks; quick deployment |
Recommended: PatchCore or PaDiM for initial deployment (no defect labels needed). Transition to fine-tuned EfficientNet-B4 as defect dataset grows. YOLO for multi-defect detection on complex assemblies.
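To illustrate why memory-bank anomaly detectors need no defect labels, here is a deliberately simplified nearest-neighbour sketch. It is not PatchCore: it omits the pretrained feature extractor, coreset subsampling, and score re-weighting, and the 2-D feature vectors are toy values. The `MemoryBank` class and its methods are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


class MemoryBank:
    """Stores feature vectors from defect-free images only."""

    def __init__(self) -> None:
        self._bank: List[Tuple[float, ...]] = []

    def fit(self, normal_features: Sequence[Vector]) -> None:
        if not normal_features:
            raise ValueError("need at least one normal feature vector")
        self._bank = [tuple(v) for v in normal_features]

    def score(self, patch_features: Sequence[Vector]) -> float:
        """Image score = largest distance from any patch to its nearest normal feature."""
        if not self._bank:
            raise RuntimeError("call fit() first")
        if not patch_features:
            raise ValueError("need at least one patch feature vector")
        return max(min(math.dist(p, m) for m in self._bank)
                   for p in patch_features)


bank = MemoryBank()
bank.fit([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])          # toy "normal" features
normal_score = bank.score([(0.1, 0.0), (0.9, 0.1)])     # patches close to the bank
defect_score = bank.score([(0.1, 0.0), (5.0, 5.0)])     # one patch far from the bank
print(normal_score < defect_score)
```

A single out-of-distribution patch is enough to raise the image score, which is also why this family of methods tends toward a higher false positive rate.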
Deployment Constraints
- Inference latency: < 50ms at the edge. Use TensorRT or OpenVINO for optimisation.
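- Edge hardware (indicative per-frame latency; actual figures depend on model size and input resolution): NVIDIA Jetson AGX Orin (50–100ms), Intel NUC + OpenVINO (30–80ms), Google Coral (~5ms for small models). Configurations at the upper end of these ranges exceed the 50ms target and need a smaller or more heavily optimised model.
- Model size: Quantise to INT8 for edge. Accuracy drop is typically <1% vs FP32, but should be verified on held-out defect images before release.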
- Connectivity: Edge inference with cloud sync for retraining data. Must function offline.
- Traceability: Every inspection decision must be stored with product ID for quality audits. Retention: typically 3–10 years.
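The arithmetic behind INT8 quantisation can be shown in a few lines. This is a sketch of symmetric per-tensor quantisation on a toy weight list, meant only to show where the rounding error comes from. It does not represent how TensorRT or OpenVINO perform calibration, and the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric quantisation: map [-max_abs, max_abs] onto [-127, 127]."""
    if not weights:
        raise ValueError("weights must not be empty")
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the INT8 range used here.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate FP values from the integer codes."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.003, 1.0]        # toy values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, f"max abs error = {max_error:.4f}")
```

The per-weight error is bounded by half a quantisation step (`scale / 2`), so tensors with a few large outliers lose the most precision on their small values.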
Risks and Failure Modes
| Risk | Description | Mitigation |
|---|---|---|
| Lighting change | Sensor/bulb degradation changes image distribution → model fails | Automated lighting calibration; distribution shift monitoring |
| Novel defect type | New defect not in training data → classified as PASS | Anomaly detection as safety layer; low-confidence → MARGINAL |
| Edge device failure | Inference device crashes → line runs without inspection | Hardware watchdog; alert operators and fall back to manual inspection |
| Overfit to training camera | Model doesn’t generalise to new cameras after equipment upgrade | Domain adaptation; camera-specific fine-tuning |
| Class imbalance | Very few defect examples → model biased toward PASS | Oversampling; synthetic defect augmentation |
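Distribution shift monitoring for the lighting risk above can start with something very simple. The sketch below tracks the rolling mean of per-frame brightness against a baseline and flags drift with a z-test on the mean. The class name, window size, threshold, and the choice of mean brightness as the monitored statistic are all assumptions; a production system would likely watch several image statistics.

```python
import statistics
from collections import deque
from typing import Deque, Sequence


class BrightnessDriftMonitor:
    """Flags drift when the rolling mean brightness leaves the baseline range."""

    def __init__(self, baseline: Sequence[float],
                 window: int = 50, z_threshold: float = 3.0) -> None:
        if len(baseline) < 2:
            raise ValueError("baseline needs at least two samples")
        self.mu = statistics.fmean(baseline)
        self.sigma = statistics.stdev(baseline)
        if self.sigma == 0:
            raise ValueError("baseline has zero variance")
        self.window_size = window
        self.z_threshold = z_threshold
        self._recent: Deque[float] = deque(maxlen=window)

    def update(self, frame_mean_brightness: float) -> bool:
        """Add one frame; return True once the rolling mean has drifted."""
        self._recent.append(frame_mean_brightness)
        if len(self._recent) < self.window_size:
            return False  # not enough frames yet
        rolling = statistics.fmean(self._recent)
        # z-score of the rolling mean, using the standard error of the mean.
        std_err = self.sigma / self.window_size ** 0.5
        return abs(rolling - self.mu) / std_err > self.z_threshold


baseline = [100 + (i % 5) - 2 for i in range(100)]      # toy values around 100
monitor = BrightnessDriftMonitor(baseline, window=10)
stable = [monitor.update(100.0) for _ in range(10)]     # lighting unchanged
drifted = [monitor.update(90.0) for _ in range(10)]     # simulated dimming
print(any(stable), drifted[-1])
```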
Success Metrics
| Metric | Target | Notes |
|---|---|---|
| Defect detection recall | > 99% | Miss rate drives customer escapes |
| False positive rate | < 0.5% | False rejects reduce yield (direct economic cost) |
| Inference latency P99 | < 50ms | Line speed constraint |
| Customer escape rate | < 10 PPM | Defective parts per million (PPM) reaching the customer |
| Inspector workload reduction | > 80% | Operational efficiency |
| Model uptime | > 99.5% | Reliability SLA |
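The first four metrics in the table can be computed directly from inspection counts and latency logs. The helper names and all input numbers below are hypothetical, and the P99 uses the nearest-rank definition, which is one of several common percentile conventions.

```python
from typing import Sequence


def recall(tp: int, fn: int) -> float:
    """Share of true defects that were caught."""
    return tp / (tp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of good parts that were wrongly rejected."""
    return fp / (fp + tn)


def escape_ppm(escaped_defects: int, shipped_units: int) -> float:
    """Defective parts per million reaching the customer."""
    return escaped_defects * 1_000_000 / shipped_units


def p99(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 99th percentile (integer arithmetic avoids float rounding)."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    rank = (99 * len(ordered) + 99) // 100   # ceil(0.99 * n)
    return ordered[rank - 1]


# Hypothetical validation counts and latency samples.
print(f"recall = {recall(tp=995, fn=5):.3f}")
print(f"FPR    = {false_positive_rate(fp=40, tn=9960):.4f}")
print(f"PPM    = {escape_ppm(escaped_defects=5, shipped_units=1_000_000):.1f}")
print(f"P99    = {p99(list(range(1, 101)))} ms")
```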
References
- Bergmann, P. et al. (2020). Uninformed Students: Student-Teacher Anomaly Detection with Discriminative Latent Embeddings. CVPR.
- Roth, K. et al. (2022). Towards Total Recall in Industrial Anomaly Detection. CVPR. (PatchCore)
Links
Modeling
- Deep Learning — CNN architectures, transfer learning
- Unsupervised Learning — anomaly detection