06 — AI Engineering
Engineering discipline for building, evaluating, and operating applications on top of foundation models, spanning model selection and evaluation, fine-tuning, inference optimization, and production feedback loops.
Guiding question: “How do we build reliable, useful systems with foundation models?”
This layer does NOT cover: general ML infrastructure and pipelines (→ 05_ml_engineering), classical model selection and statistical trade-offs (→ 03_modeling), or Transformer mathematical derivations (→ 01_foundations/06_deep_learning_theory).
Sublayers
01 — Foundation Models
Model families, scaling laws, alignment (RLHF/DPO/GRPO), tokenization.
02 — Evaluation
LLM evaluation taxonomy, benchmarks (MMLU/HumanEval), AI-as-judge, LM Eval Harness.
03 — Prompt Engineering
CoT, few-shot, structured outputs (Instructor/Guidance), prompt injection defense.
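The "structured outputs" topic above reduces to one core move: validate the model's reply against an expected schema instead of accepting free text. A dependency-free sketch (libraries like Instructor wrap this pattern around Pydantic; the `Answer` schema and field names here are illustrative):

```python
import json
from dataclasses import dataclass

@dataclass
class Answer:
    label: str
    confidence: float

def parse_structured(raw: str) -> Answer:
    """Validate a model's JSON reply against the expected schema;
    raise on schema drift instead of silently accepting free text."""
    data = json.loads(raw)
    if not isinstance(data.get("label"), str):
        raise ValueError("label must be a string")
    conf = data.get("confidence")
    if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        raise ValueError("confidence must be a number in [0, 1]")
    return Answer(label=data["label"], confidence=float(conf))

# A well-formed reply parses; a malformed one fails loudly.
answer = parse_structured('{"label": "positive", "confidence": 0.93}')
```

A validation failure is the signal to retry the model call with the error message appended, which is the loop structured-output libraries automate.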
04 — RAG & Agents
RAG architectures, vector stores (Chroma/FAISS), agentic loop, function calling, multi-agent (CrewAI), DSPy.
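Under every vector store in the list above sits the same retrieval step: rank documents by cosine similarity between a query embedding and stored embeddings. A minimal sketch with toy 3-dimensional vectors standing in for real embeddings (document names and vectors are illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "index": doc id -> embedding (real systems use 768+ dims).
docs = {
    "doc1": [0.9, 0.1, 0.0],
    "doc2": [0.1, 0.8, 0.3],
    "doc3": [0.0, 0.2, 0.9],
}

def retrieve(query_vec, k=2):
    """Return the k doc ids most similar to the query embedding."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d]),
                    reverse=True)
    return ranked[:k]

# A query pointing in doc1's direction retrieves doc1 first.
top = retrieve([1.0, 0.0, 0.1])
```

Chroma and FAISS replace the linear scan with approximate nearest-neighbor indexes, but the ranking criterion is the same.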
05 — Fine-tuning
LoRA/QLoRA (Axolotl/LLaMA-Factory), RLHF/DPO/GRPO (TRL), fine-tuning strategy.
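The LoRA idea above can be stated in a few lines: freeze the base weight W and learn a low-rank update, so the forward pass becomes Wx + (alpha/r)·B(Ax). A dependency-free sketch with tiny matrices (shapes and values are illustrative; real implementations live in PEFT/TRL):

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Frozen base weight W (2x2) and a rank-1 update B @ A.
W = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[0.5, 0.5]]      # r x d_in  (trainable)
B = [[1.0], [-1.0]]   # d_out x r (trainable)
alpha, r = 2.0, 1     # LoRA scaling: alpha / r

def lora_forward(x):
    """y = W x + (alpha / r) * B (A x); only A and B are trained."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]

y = lora_forward([1.0, 1.0])  # -> [3.0, -1.0]
```

With r much smaller than the weight dimensions, the trainable parameter count drops by orders of magnitude, which is what makes single-GPU fine-tuning of large models practical; QLoRA additionally quantizes the frozen W.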
06 — Dataset Engineering
Instruction data design, synthetic data generation (Self-Instruct, Constitutional AI).
07 — Inference Optimization
Quantization (AWQ/GPTQ/GGUF/bitsandbytes), Flash Attention, KV cache, vLLM/llama.cpp.
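The KV cache named above is the key serving optimization for autoregressive decoding: each step appends one key/value pair and attends over the accumulated cache instead of recomputing attention for the whole prefix. A toy single-head sketch (dimensions and values are illustrative; vLLM's PagedAttention manages the same cache in fixed-size blocks):

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

class KVCache:
    """Append-only cache of per-token keys and values, so each decode
    step is O(sequence length) instead of recomputing the full prefix."""
    def __init__(self):
        self.keys, self.values = [], []

    def step(self, q, k, v):
        self.keys.append(k)
        self.values.append(v)
        d = len(q)
        # Scaled dot-product attention over all cached keys.
        scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d)
                  for key in self.keys]
        weights = softmax(scores)
        return [sum(w * val[i] for w, val in zip(weights, self.values))
                for i in range(len(v))]

cache = KVCache()
out1 = cache.step(q=[1.0, 0.0], k=[1.0, 0.0], v=[1.0, 2.0])
out2 = cache.step(q=[0.0, 1.0], k=[0.0, 1.0], v=[3.0, 4.0])
```

The cache's memory footprint (layers × heads × sequence length × head dim) is why quantized KV caches and paging schemes matter at high batch sizes.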
08 — Architecture & Feedback
AI application architecture, LLM observability (LangSmith), safety (LlamaGuard), data flywheel.
Relationship to Other Layers
- 05_ml_engineering — production ML infrastructure that this layer builds on and extends with foundation-model-specific concerns
- 04_software_engineering — general software patterns applied to LLM system design
- 08_implementations — concrete implementations synthesizing this layer’s concepts into working code