A benchmark dataset for evaluating dialog system and natural language generation metrics.