LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA
Notus is a collection of LLMs fine-tuned using SFT, DPO, SFT+DPO, and/or...