Enriching MS-COCO with Chinese sentences and tags for cross-lingual multimedia tasks

COCO-CN

COCO-CN is a bilingual image description dataset enriching MS-COCO with manually written Chinese sentences and tags. The new dataset can be used for multiple tasks including image tagging, captioning and retrieval, all in a cross-lingual setting.

| Chinese sentences | COCO-CN train | COCO-CN val | COCO-CN test |
| --- | --- | --- | --- |
| human written | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| human translation | :x: | :x: | :white_check_mark: |
| machine translation (baidu) | :white_check_mark: | :white_check_mark: | :white_check_mark: |

[Figure: COCO-CN annotation examples]
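
The availability table above can be encoded as a small lookup, which may be handy when a script needs to decide which Chinese sentence source to load for a given split. This is only a sketch: the `AVAILABILITY` dictionary and the `available_sources` helper are hypothetical names for illustration, not part of the COCO-CN release.

```python
# Hypothetical encoding of the availability table; names are illustrative only.
AVAILABILITY = {
    "human written": {"train": True, "val": True, "test": True},
    "human translation": {"train": False, "val": False, "test": True},
    "machine translation (baidu)": {"train": True, "val": True, "test": True},
}


def available_sources(split):
    """Return the sentence sources the table marks as available for a split."""
    if split not in ("train", "val", "test"):
        raise ValueError(f"unknown split: {split!r}")
    return [name for name, splits in AVAILABILITY.items() if splits[split]]


if __name__ == "__main__":
    for split in ("train", "val", "test"):
        print(split, available_sources(split))
```

According to the table, human translations exist only for the test split, so any evaluation that relies on them would be limited to the test images.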

Progress

  • version 201805: 20,341 images (training / validation / test: 18,341 / 1,000 / 1,000), associated with 22,218 manually written Chinese sentences and 5,000 manually translated sentences. Data is freely available upon request. Please submit your request via Google Form.
  • Precomputed image features: ResNeXt-101
  • COCO-CN-Results-Viewer: A lightweight tool to inspect the results of different image captioning systems on the COCO-CN test set, developed by Emiel van Miltenburg at Tilburg University.
  • NUS-WIDE100: An extra test set.
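
As a quick sanity check on the version 201805 figures, the split sizes and sentence counts quoted above can be cross-checked with a few lines of arithmetic. The constant names are hypothetical, and the per-image averages are derived here rather than stated by the authors. The last ratio also assumes that all manually translated sentences belong to the test split, as the availability table suggests.

```python
# Figures quoted for version 201805; variable names are illustrative only.
SPLIT_SIZES = {"train": 18_341, "val": 1_000, "test": 1_000}
TOTAL_IMAGES = 20_341
WRITTEN_SENTENCES = 22_218
TRANSLATED_SENTENCES = 5_000

# The three splits should account for every image.
assert sum(SPLIT_SIZES.values()) == TOTAL_IMAGES

# Average number of manually written Chinese sentences per image.
written_per_image = WRITTEN_SENTENCES / TOTAL_IMAGES

# Assumption: manual translations cover only the test split.
translated_per_test_image = TRANSLATED_SENTENCES / SPLIT_SIZES["test"]

if __name__ == "__main__":
    print(f"written sentences per image: {written_per_image:.2f}")
    print(f"translated sentences per test image: {translated_per_test_image:.1f}")
```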

Citation

If you find COCO-CN useful, please consider citing the following paper:

README source: li-xirong/coco-cn