System Patterns
Reusable production patterns for ML/AI systems, expressed in code.
Guiding question: how do I build this type of ML/AI system component?
Each note covers the design, code structure, key configuration, and integration points for a specific production pattern, synthesising engineering knowledge from 05_ml_engineering/ and 06_ai_engineering/ into runnable implementations.
Notes
Experiment & Data Management
- MLflow Experiment Tracking — run logging, artifact storage, model registry with named aliases (sketched after this list)
- DVC Dataset Versioning — data versioning, pipeline caching, remote storage
- Feature Store Pattern — offline/online feature computation, point-in-time correctness
- Training Pipeline Pattern — orchestrated data → features → training → registry workflow
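A minimal sketch of the tracking pattern, assuming MLflow >= 2.3 for named aliases; the experiment name, parameter values, and registry state are illustrative:

```python
import mlflow
from mlflow import MlflowClient
from pathlib import Path

mlflow.set_experiment("churn-model")  # illustrative experiment name

with mlflow.start_run():
    # Log hyperparameters and evaluation metrics for this run.
    mlflow.log_params({"n_estimators": 200, "max_depth": 8})
    mlflow.log_metric("val_auc", 0.91)

    # Log a small artifact file alongside the run.
    Path("feature_importance.txt").write_text("age: 0.31\ntenure: 0.22\n")
    mlflow.log_artifact("feature_importance.txt")

# Promote a registered model version via a named alias.
# Assumes a model "churn-model" with version 1 already exists in the registry.
client = MlflowClient()
client.set_registered_model_alias("churn-model", "champion", version=1)
```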
Model Serving
- Model Serving with FastAPI — REST inference API with Pydantic validation, background workers, health checks (see the sketch below)
- vLLM Serving — high-throughput LLM serving with PagedAttention and OpenAI-compatible API
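The serving pattern in miniature, with the model call stubbed out; route names and schemas are illustrative, and a real service would load the model from a registry at startup:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="inference-api")

class PredictRequest(BaseModel):
    features: list[float]

class PredictResponse(BaseModel):
    score: float

def model_predict(features: list[float]) -> float:
    # Stand-in for a real model; replace with a registry-loaded predictor.
    return sum(features) / max(len(features), 1)

@app.get("/health")
def health() -> dict[str, str]:
    # Liveness probe for orchestrators such as Kubernetes.
    return {"status": "ok"}

@app.post("/predict", response_model=PredictResponse)
def predict(req: PredictRequest) -> PredictResponse:
    return PredictResponse(score=model_predict(req.features))
```

Run with `uvicorn module_name:app` and check `/health` before routing traffic to the service.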
Training & Fine-tuning
- Distributed Training with Accelerate — multi-GPU/TPU training with Hugging Face Accelerate
- Deep Learning Training Patterns (PyTorch) — MLP, CNN, LSTM training loops with early stopping, validation, and model serialisation
- PEFT LoRA Fine-tuning — parameter-efficient fine-tuning with LoRA/QLoRA (minimal example after this list)
- TRL Preference Training — DPO/RLHF/GRPO preference optimisation
- Quantization Deployment Pattern — AWQ, GPTQ, GGUF quantization for inference
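The core of the LoRA pattern is a config plus a wrapper around the base model; the checkpoint and target modules below are illustrative, and target module names vary by architecture:

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

# Any causal LM works; GPT-2 keeps the example small.
base = AutoModelForCausalLM.from_pretrained("gpt2")

config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling applied to the update
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of base weights
```

The wrapped model then drops into an ordinary Trainer or Accelerate loop unchanged.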
RAG & Agents
- RAG Pipeline Pattern — indexing pipeline, query pipeline, hybrid retrieval
- Vector Database Retrieval — FAISS, pgvector, Qdrant comparison and implementation
- Chroma Vector Store — Chroma setup, embedding, CRUD, metadata filtering (sketched below)
- Agentic Loop Pattern — ReAct agent, tool use, multi-agent with LangGraph/CrewAI
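A small end-to-end taste of the vector store pattern using Chroma's default embedding function; the documents, ids, and metadata are illustrative:

```python
import chromadb

# In-memory client; use chromadb.PersistentClient(path=...) to persist.
client = chromadb.Client()
collection = client.create_collection("docs")

# Chroma embeds the documents with its default embedding function.
collection.add(
    ids=["a1", "a2"],
    documents=[
        "Feature stores keep offline and online features consistent.",
        "PagedAttention raises LLM serving throughput.",
    ],
    metadatas=[{"topic": "features"}, {"topic": "serving"}],
)

# Query with a metadata filter to narrow the candidate set.
results = collection.query(
    query_texts=["how do I serve an LLM fast?"],
    n_results=1,
    where={"topic": "serving"},
)
print(results["documents"])
```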
LLM Tooling
- Instructor Structured Outputs — validated JSON extraction from LLMs with Pydantic (example after this list)
- DSPy Prompt Optimization — automatic prompt optimisation with DSPy signatures
- LLM Evaluation Pipeline — LM Eval Harness, LLM-as-judge, custom benchmarks
- LangSmith LLM Observability — tracing, dataset management, prompt versioning
- LlamaGuard Content Moderation — safety classification for LLM inputs and outputs
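The shape of the Instructor pattern, assuming instructor >= 1.0 (which provides `from_openai`) and an OPENAI_API_KEY in the environment; the schema and model name are illustrative:

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

class Ticket(BaseModel):
    # Fields the LLM's JSON output must validate against.
    product: str
    severity: int

# Wraps the client so responses are parsed into the Pydantic model,
# with automatic retries on validation failure.
client = instructor.from_openai(OpenAI())

ticket = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    response_model=Ticket,
    messages=[{
        "role": "user",
        "content": "The export button crashes the app, this is urgent.",
    }],
)
print(ticket.product, ticket.severity)
```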
Monitoring & Quality
- Drift Monitoring with Evidently — feature/prediction drift detection, data quality reports (sketched below)
- Model Monitoring System — operational metrics, alerting, retraining triggers
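A drift check in a few lines, using the Evidently 0.4-era report API (newer releases have reorganised these imports); the two toy DataFrames stand in for reference and production data:

```python
import pandas as pd
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

reference = pd.DataFrame({"age": [25, 32, 40, 51], "income": [30, 45, 60, 80]})
current = pd.DataFrame({"age": [22, 29, 35, 78], "income": [28, 50, 65, 200]})

# Compare current data against the reference window, column by column.
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)
report.save_html("drift_report.html")  # share or archive the rendered report
```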
Infrastructure
- Docker ML Pipeline — containerised training and serving images
- CD for ML — automated testing, model evaluation, and deployment pipelines
- MCP Server Implementation — Model Context Protocol server for LLM tool integration (sketched below)
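A minimal MCP server sketch using the official Python SDK's FastMCP helper; the server name and tool are illustrative:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("pattern-notes")  # illustrative server name

@mcp.tool()
def word_count(text: str) -> int:
    """Count whitespace-separated words in a text."""
    return len(text.split())

if __name__ == "__main__":
    # Serves over stdio by default, the transport most MCP clients expect.
    mcp.run()
```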