Best 35 RLHF Open Source Projects

Reproduce Alpaca

A collection of all kinds of LLM fine-tuning scripts

Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning f...

A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer h...

Apply RLHF directly to ChatGLM to raise or lower the probability of target outputs | Modify ChatGLM output with o...