Deployment and Serving Index

Notes on making trained models available in production: how to serve, compress, and roll them out.

Notes

  • Serving Patterns — batch, online REST/gRPC, streaming, and edge inference patterns with latency/throughput trade-offs
  • Model Compression — quantization, pruning, knowledge distillation, and ONNX export for efficient deployment
  • Rollout Strategies — canary, blue-green, shadow mode, and feature-flag-controlled progressive rollouts

3 items under this folder.