Jeopardy Clue Dataset

A dataset containing 473,000 Jeopardy! clues (1984–2023).

Project README

jeopardy_clue_dataset

This dataset contains Jeopardy! clues from Season 1 through Season 39 (July 2023). It does not contain every clue that has appeared on the show. The data source prefers not to be credited.

There are 473,067 clues in total. Most of them are in combined_season1-39.tsv, which is approximately 68 MB.

There are also individual files for each season (located in the seasons folder). These files are small enough that you should be able to open them with Microsoft Excel or Google Sheets.

  • Seasons 1–11 average 8,821 clues each.
  • Seasons 12–38 average 13,260 clues each.

The kids_teen.tsv file contains only clues that appeared in Kids and Teen Tournament matches. These clues are also in the combined dataset; the separate file is included for convenience.

Clues from special matches outside the daily syndicated program are in extra_matches.tsv. This file has 4,750 clues, none of which appear in the combined dataset.
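Because the extra-match clues are not part of the combined file, an analysis covering everything has to read both files. The sketch below is one possible way to keep track of where each row came from, which matters because the round values are coded differently in extra_matches.tsv. It assumes the rows have already been loaded as dictionaries keyed by column label; the function names and the "source" key are hypothetical and not part of the dataset.

```python
from itertools import chain


def tag_rows(rows, source):
    """Yield a copy of each row with an added 'source' key.

    'source' is a hypothetical label introduced here for illustration;
    it is not a column in the dataset. The input rows are left unchanged.
    """
    for row in rows:
        yield {**row, "source": source}


def all_clues(combined_rows, extra_rows):
    """Return rows from both files in one list, each tagged with its origin."""
    return list(
        chain(
            tag_rows(combined_rows, "combined"),
            tag_rows(extra_rows, "extra_matches"),
        )
    )
```

Filtering on the "source" key afterwards makes it possible to interpret the round column separately for each file.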

I've done my best to clean the data and filter out clues that depend on images, video, or audio.


Column Information

Label               Description
round               1 for Single Jeopardy, 2 for Double Jeopardy, or 3 for Final Jeopardy. (Note: These values are different in extra_matches.tsv to account for Triple Jeopardy.)
clue_value          The clue's value on the board before any Daily Double wagering.
daily_double_value  If the clue is a Daily Double, the amount wagered; otherwise zero.
category            The category name, i.e. the top row of the board.
comments            The host's comments about a category.
answer              The prompt given to contestants.
question            The correct response.
air_date            The calendar date on which the episode first aired.
notes               Miscellaneous information about the clue, e.g. whether it is from a special tournament match.
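As an illustration of how these columns fit together, the following sketch picks out Daily Doubles using the rule that daily_double_value is zero for every other clue, then counts them per round. It assumes each row is a dictionary keyed by the labels above and that the numeric columns hold plain integers stored as text, which is how a TSV reader would typically deliver them. The function names are hypothetical.

```python
from collections import Counter


def daily_doubles(rows):
    """Return only the rows whose daily_double_value is greater than zero."""
    # Assumption: the value is a plain integer string such as "0" or "1500".
    return [row for row in rows if int(row["daily_double_value"]) > 0]


def count_by_round(rows):
    """Count rows per value of the 'round' column."""
    return Counter(row["round"] for row in rows)
```

Calling count_by_round(daily_doubles(rows)) gives the number of Daily Doubles in each round. Note that a Daily Double's wager in daily_double_value can differ from the board value in clue_value.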

Other Data

A file with contestant scoring data can be found in the other_data folder. There are columns for each contestant's score after the Single, Double, and Final Jeopardy rounds. Most but not all episodes from combined_season1-39.tsv are included.


FAQ

How do I download the dataset?

If you're new to GitHub and aren't sure what's going on, click the green Code button near the top of the page, then click Download ZIP.

What is a .TSV file?

The data is written in plain text and organized like a spreadsheet with a TAB character between each cell. You can open the files with applications like Microsoft Excel or Google Sheets.
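If you would rather read the files in code, Python's standard csv module can split on the TAB character. This is a minimal sketch, assuming that the first line of each file is a header row matching the labels under Column Information, that the files are UTF-8 encoded, and that quotation marks in clues are literal text rather than CSV-style quoting. The function name read_tsv is hypothetical.

```python
import csv


def read_tsv(path):
    """Return the rows of a tab-separated file as a list of dicts keyed by header."""
    with open(path, newline="", encoding="utf-8") as f:
        # delimiter="\t" splits cells on TAB characters.
        # QUOTE_NONE treats quotation marks as ordinary characters
        # (assumption: the files do not use CSV-style quoting).
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)
```

For example, read_tsv("combined_season1-39.tsv") would return one dictionary per clue, so rows[0]["category"] is the category of the first clue in the file. All values come back as strings and need converting before doing arithmetic.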


All data is property of Jeopardy Productions, Inc. and protected under law. I am not affiliated with the show. Please don't use the data to make a public-facing web site, app, or any other product.

Open Source Agenda is not affiliated with "Jeopardy Clue Dataset" Project. README Source: jwolle1/jeopardy_clue_dataset