Awesome Human Preference Datasets for LLM 🧑❤️🤖

A curated list of open source Human Preference datasets for LLM instruction-tuning, RLHF and evaluation.

For general NLP datasets and text corpora, check out this awesome list.

Datasets

OpenAI WebGPT Comparisons

  • 20k comparisons where each example comprises a question, a pair of model answers, and human-rated preference scores for each answer.
  • RLHF dataset used to train the OpenAI WebGPT reward model.
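
A minimal loading sketch using the HuggingFace `datasets` library; the hub ID and field names below follow the dataset card, so verify them against the current schema:

```python
from datasets import load_dataset

# WebGPT comparisons ship as a single train split.
ds = load_dataset("openai/webgpt_comparisons", split="train")

ex = ds[0]
print(ex["question"]["full_text"])    # the question
print(ex["answer_0"], ex["score_0"])  # first answer and its human preference score
print(ex["answer_1"], ex["score_1"])  # second answer and its score

# Scores sum to zero; ties (0, 0) carry no preference signal, so skip them
# when building (chosen, rejected) pairs for reward-model training.
if ex["score_0"] != ex["score_1"]:
    chosen = ex["answer_0"] if ex["score_0"] > ex["score_1"] else ex["answer_1"]
```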

OpenAI Summarization
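
This entry presumably refers to the human comparison data from Learning to Summarize from Human Feedback. A hedged loading sketch, assuming the hub release openai/summarize_from_feedback and its comparisons configuration:

```python
from datasets import load_dataset

# Pairs of model-written summaries with a human choice between them.
ds = load_dataset("openai/summarize_from_feedback", "comparisons", split="train")

ex = ds[0]
post = ex["info"]["post"]    # text being summarized (Reddit posts; CNN/DM items use "article")
summaries = ex["summaries"]  # two candidate summaries
chosen = summaries[ex["choice"]]["text"]  # "choice" indexes the preferred summary
```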

Anthropic Helpfulness and Harmlessness Dataset (HH-RLHF)

  • 170k human preference comparisons in total, comprising the human preference data collected for Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback and the human-generated red-teaming data from Red Teaming Language Models to Reduce Harms, divided into three sub-datasets (see the loading sketch after this list):
    • A base dataset using a context-distilled 52B model, with 44k helpfulness comparisons and 42k red-teaming (harmlessness) comparisons.
    • An RS dataset of 52k helpfulness comparisons and 2k red-teaming comparisons collected with rejection-sampling models, where rejection sampling used a preference model trained on the base dataset.
    • An iterated online dataset of 22k helpfulness comparisons collected from RLHF models that were updated weekly over five weeks.
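
A minimal sketch of loading the sub-datasets with the HuggingFace `datasets` library (directory names per the dataset card):

```python
from datasets import load_dataset

# Each sub-dataset lives in its own directory; every example is a
# (chosen, rejected) pair of complete human-assistant dialogues.
helpful_base = load_dataset("Anthropic/hh-rlhf", data_dir="helpful-base")
harmless_base = load_dataset("Anthropic/hh-rlhf", data_dir="harmless-base")
helpful_rs = load_dataset("Anthropic/hh-rlhf", data_dir="helpful-rejection-sampled")
helpful_online = load_dataset("Anthropic/hh-rlhf", data_dir="helpful-online")

ex = helpful_base["train"][0]
print(ex["chosen"])    # dialogue ending in the preferred assistant reply
print(ex["rejected"])  # same dialogue ending in the dispreferred reply
```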

OpenAssistant Conversations Dataset (OASST1)

  • A human-generated, human-annotated assistant-style conversation corpus consisting of 161k messages in 35 languages, annotated with 461k quality ratings, resulting in 10k+ fully annotated conversation trees.
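
Messages are stored flat, so the conversation trees have to be rebuilt from parent links. A sketch, assuming the OpenAssistant/oasst1 hub release and its message_id/parent_id fields:

```python
from collections import defaultdict
from datasets import load_dataset

ds = load_dataset("OpenAssistant/oasst1", split="train")

# Index children by parent_id; messages with no parent are tree roots.
children = defaultdict(list)
roots = []
for msg in ds:
    if msg["parent_id"] is None:
        roots.append(msg)
    else:
        children[msg["parent_id"]].append(msg)

def walk(msg, depth=0):
    """Print one conversation tree, alternating prompter/assistant turns."""
    print("  " * depth + f"{msg['role']}: {msg['text'][:60]}")
    for child in children[msg["message_id"]]:
        walk(child, depth + 1)

walk(roots[0])
```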

Stanford Human Preferences Dataset (SHP)

  • 385k collective human preferences over responses to questions/instructions in 18 domains, for training RLHF reward models and NLG evaluation models. Data collected from Reddit.
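
A minimal sketch of turning SHP records into (chosen, rejected) pairs, assuming the stanfordnlp/SHP hub release (field names per its dataset card):

```python
from datasets import load_dataset

ds = load_dataset("stanfordnlp/SHP", split="train")

ex = ds[0]
post = ex["history"]  # the Reddit question/instruction
# labels == 1 means human_ref_A was preferred over human_ref_B.
if ex["labels"] == 1:
    chosen, rejected = ex["human_ref_A"], ex["human_ref_B"]
else:
    chosen, rejected = ex["human_ref_B"], ex["human_ref_A"]
print(ex["domain"], ex["score_A"], ex["score_B"])  # subreddit and raw scores
```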

Reddit ELI5

  • 270k examples of questions, answers and scores collected from 3 Q&A subreddits.

Human ChatGPT Comparison Corpus (HC3)

  • 60k human answers and 27k ChatGPT answers to around 24k questions.
  • A sibling dataset is available in Chinese.
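
A loading sketch, assuming the Hello-SimpleAI/HC3 hub release; the "all" configuration merges the per-source subsets, and the dataset's loading script may require trust_remote_code=True on recent `datasets` versions:

```python
from datasets import load_dataset

ds = load_dataset("Hello-SimpleAI/HC3", "all", split="train")

ex = ds[0]
print(ex["question"])
print(ex["human_answers"][0])    # a question can have several human answers
print(ex["chatgpt_answers"][0])  # and several ChatGPT answers
```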

HuggingFace H4 StackExchange Preference Dataset

  • 10 million questions (each with at least two answers) and their answers (scored by vote count) from Stack Exchange.
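
Given the size, streaming avoids downloading the full dataset. A sketch assuming the HuggingFaceH4/stack-exchange-preferences hub release, where each answer carries a pm_score derived from its vote count:

```python
from datasets import load_dataset

# ~10M questions, so stream rather than materializing the whole dataset.
ds = load_dataset("HuggingFaceH4/stack-exchange-preferences",
                  split="train", streaming=True)

ex = next(iter(ds))
print(ex["question"])  # note: bodies contain HTML markup
# Rank answers by pm_score to build pairwise (chosen, rejected) comparisons.
answers = sorted(ex["answers"], key=lambda a: a["pm_score"], reverse=True)
chosen, rejected = answers[0]["text"], answers[-1]["text"]
```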

ShareGPT.com

  • 90k (as of April 2023) user-uploaded ChatGPT interactions.
  • To access the data using ShareGPT's API, see the documentation here. Note that the ShareGPT API is currently disabled ("due to excess traffic").
  • Precompiled datasets are available on HuggingFace.

Alpaca

  • 52k instructions and demonstrations generated by OpenAI's text-davinci-003 engine for self-instruct training.
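
A minimal loading sketch, assuming the tatsu-lab/alpaca mirror on the HuggingFace hub:

```python
from datasets import load_dataset

ds = load_dataset("tatsu-lab/alpaca", split="train")

ex = ds[0]
print(ex["instruction"])  # the task description
print(ex["input"])        # optional context; empty string when unused
print(ex["output"])       # the text-davinci-003 completion
print(ex["text"])         # the example rendered in the Alpaca prompt template
```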

GPT4All

  • 1M prompt-response pairs collected using the GPT-3.5-Turbo API in March 2023. GitHub repo.
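
A hedged loading sketch; the hub ID nomic-ai/gpt4all_prompt_generations and the field names are taken from the project's dataset card and may have changed:

```python
from datasets import load_dataset

ds = load_dataset("nomic-ai/gpt4all_prompt_generations", split="train")

ex = ds[0]
print(ex["prompt"])    # the prompt sent to GPT-3.5-Turbo
print(ex["response"])  # the collected completion
```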

Databricks Dolly Dataset

  • 15k instruction-following records generated by Databricks employees in categories including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization.
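
A minimal loading sketch, assuming the databricks/databricks-dolly-15k hub release:

```python
from datasets import load_dataset

ds = load_dataset("databricks/databricks-dolly-15k", split="train")

ex = ds[0]
print(ex["category"])     # one of the behavioral categories listed above
print(ex["instruction"])  # the employee-written instruction
print(ex["context"])      # reference text for closed QA / extraction; may be empty
print(ex["response"])     # the employee-written answer
```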

HH_golden

  • 42k harmlessness examples with the same prompts and "rejected" responses as the Harmless portion of the Anthropic HH datasets, but with the "chosen" responses rewritten using GPT-4 to yield more harmless answers. A comparison of responses before and after rewriting can be found here. Empirically, compared with the original Harmless dataset, training on this dataset improves harmlessness metrics for various alignment methods such as RLHF and DPO.
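
A hedged loading sketch; the hub ID Unified-Language-Model-Alignment/Anthropic_HH_Golden is an assumption based on the project's release page, so verify it before use:

```python
from datasets import load_dataset

# Same (chosen, rejected) schema as Anthropic HH, with "chosen"
# replaced by the GPT-4-rewritten, more harmless response.
ds = load_dataset("Unified-Language-Model-Alignment/Anthropic_HH_Golden",
                  split="train")

ex = ds[0]
print(ex["chosen"])    # GPT-4-rewritten response
print(ex["rejected"])  # original rejected response from Anthropic HH
```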