The Natural Language for Visual Reasoning corpora use the task of determining whether a sentence is true about a visual input, like an image. This task focuses on reasoning about sets of objects, comparisons, and spatial relations. This includes two datasets: NLVR, with synthetically generated images, and NLVR2, which includes natural photographs.
See the webpage for examples and the leaderboards here: http://lic.nlp.cornell.edu/nlvr/
If you have questions, please use the Issues page, or email us directly: [email protected]
Following Microsoft COCO (http://cocodataset.org/#termsofuse), we have licensed the NLVR dataset (synthetically-generated images, structured representations, and annotations) under CC-BY-4.0 (https://creativecommons.org/licenses/by/4.0/).
We have licensed the annotations of the NLVR2 images (sentences and binary labels) under CC-BY-4.0 (https://creativecommons.org/licenses/by/4.0/). We do not license the NLVR2 images as we do not hold the copyright to them.