System Patterns

Reusable production patterns for ML/AI systems in code.

How do I build this type of ML/AI system component?

Each note covers the design, code structure, key configuration, and integration points for a specific production pattern. Synthesises 05_ml_engineering/ and 06_ai_engineering/ engineering knowledge into runnable implementations.

Notes

Experiment & Data Management

Model Serving

  • Model Serving with FastAPI — REST inference API with Pydantic, background workers, health checks
  • vLLM Serving — high-throughput LLM serving with PagedAttention and OpenAI-compatible API

Training & Fine-tuning

RAG & Agents

LLM Tooling

Monitoring & Quality

Infrastructure