Fine-tuning Index
Adapting model weights to specific tasks or domains.
Notes
- Fine-tuning Strategies — When to fine-tune rather than prompt, how to choose between full fine-tuning and PEFT, and dataset size considerations.
- PEFT and LoRA — Parameter-efficient fine-tuning methods, LoRA rank selection, QLoRA, Axolotl, and SFTTrainer tips.
- RL Fine-tuning — RLHF, DPO, GRPO with TRL; reward modelling and preference training pipelines.
Navigation
← Prev: RAG and Agents | Next: Dataset Engineering →