OpenAssistant is a chat-based assistant that understands tasks, can inte...
Unified Efficient Fine-Tuning of 100+ LLMs
Phase two of the Chinese LLaMA-2 & Alpaca-2 large language model project + 64K ultra-long context models (Chinese LLaMA-...
Official release of InternLM2 7B and 20B base and chat models. 200K cont...
Argilla is a collaboration platform for AI engineers and domain experts ...
A Doctor for your data
A curated list of reinforcement learning with human feedback resources (...
ms-swift: Use PEFT or full-parameter training to fine-tune 200+ LLMs or 15+ MLLMs
An Easy-to-use, Scalable and High-performance RLHF Framework (Support 70...
An automatic evaluator for instruction-following language models. Human-...
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences fo...
Aligning Large Language Models with Human: A Survey
A library with extensible implementations of DPO, KTO, PPO, ORPO, and ot...
Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedba...