Best 33 RLHF Open Source Projects

pykoi: Active learning in one unified interface

🛰️ Fine-tuning ChatGLM on real medical dialogue data with LoRA, P-Tuning V2, Freeze, RLHF, etc...

A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, a...

Chain-of-Hindsight, A Scalable RLHF Method

Code accompanying the paper Pretraining Language Models with Human Prefe...

The open source implementation of ChatGPT, Alpaca, Vicuna and RLHF Pipe...

LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA

A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer ha...

Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement L...

A recipe to train reward models for RLHF.

Okapi: Instruction-tuned Large Language Models in Multiple Languages wit...

Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction

Python client library for improving your LLM app accuracy

Reproduce Alpaca

A collection of fine-tuning scripts for all kinds of LLMs