Fine-tuning Index

Adapting model weights to specific tasks or domains.

Notes

Fine-tuning Strategies — When and how to fine-tune instead of prompting, full fine-tuning vs PEFT, and dataset size considerations.
PEFT and LoRA — Parameter-efficient fine-tuning methods, LoRA rank selection, QLoRA, Axolotl, and SFTTrainer tips.
RL Fine-tuning — RLHF, DPO, GRPO with TRL; reward modelling and preference training pipelines.