Starter Kit for TDC 2023 (LLM Edition)

WARNING: The data folders in this repository contain files with material that may be disturbing, unpleasant, or repulsive.

This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition. To learn more about the competition, please see the competition website. Starter kits for individual tracks are in the trojan_detection and red_teaming folders. Please see the README in those folders for instructions on downloading data, running baselines, and generating submissions.

Post-competition evaluations: To evaluate methods on the held-out data and behavior classifiers, see the Local Evaluation section in the README file for each track. These scores can be compared with the official leaderboard scores.

Citation

If you find this useful in your research, please consider citing:

@inproceedings{tdc2023,
  title={TDC 2023 (LLM Edition): The Trojan Detection Challenge},
  author={Mantas Mazeika and Andy Zou and Norman Mu and Long Phan and Zifan Wang and Chunru Yu and Adam Khoja and Fengqing Jiang and Aidan O'Gara and Ellie Sakhaee and Zhen Xiang and Arezoo Rajabi and Dan Hendrycks and Radha Poovendran and Bo Li and David Forsyth},
  booktitle={NeurIPS Competition Track},
  year={2023}
}

README Source: centerforaisafety/tdc2023-starter-kit
License: MIT