COER

Chinese Open Entity-Relation Knowledge Base

COER is a scalable corpus of entities and relations that currently contains more than 100,000,000 relation triples, in which the relations are open and arbitrary. It is designed to address the shortage of corpora for Chinese information extraction. The corpus is built automatically by an unsupervised open extractor from diverse and heterogeneous web text, including encyclopedia and news articles. The source texts cover military, sports, entertainment, economics, and other domains, which keeps the knowledge base open-domain.

The extracted triples are stored in a series of XML files. Each relation item consists of the original text, an entity pair, a relation phrase, and the shortest dependency path. Each “Entity_pair” unit contains two argument entries, and each “relation_phrase” unit contains several mention entries. The entries carry rich attributes. The overall organization of the content can be represented as a tree.
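
As an illustration of the tree organization described above, here is a minimal Python sketch that reads one relation item and rebuilds a triple from it. Only the unit names “Entity_pair” and “relation_phrase” come from this README. Every other tag, every attribute, and the sample sentence are assumptions made for the example and may not match the actual COER files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. Only "Entity_pair" and "relation_phrase" are named in
# the README; all other tag names and attributes are assumptions.
SAMPLE = """
<relations>
  <relation_item>
    <text>Beijing is the capital of China.</text>
    <Entity_pair>
      <argument type="LOC">Beijing</argument>
      <argument type="LOC">China</argument>
    </Entity_pair>
    <relation_phrase>
      <mention pos="v">is</mention>
      <mention pos="n">capital</mention>
    </relation_phrase>
    <dependency_path>Beijing-is-capital-China</dependency_path>
  </relation_item>
</relations>
"""


def iter_triples(xml_text):
    """Yield (argument 1, relation phrase, argument 2) for each relation item."""
    root = ET.fromstring(xml_text)
    for item in root.iter("relation_item"):
        # Each entity pair is assumed to hold exactly two argument entries.
        args = [a.text for a in item.find("Entity_pair").findall("argument")]
        # A relation phrase is assumed to be the ordered list of its mentions.
        mentions = [m.text for m in item.find("relation_phrase").findall("mention")]
        yield args[0], " ".join(mentions), args[1]


triples = list(iter_triples(SAMPLE))
print(triples)
```

For files as large as the corpus size suggests, streaming with `ET.iterparse` rather than loading a whole file at once would likely be the safer choice.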

Citation:

Jia S, Li M, Xiang Y. Chinese Open Relation Extraction and Knowledge Base Establishment[J]. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 2018, 17(3): 15.

Please email us to get more information. Contact: [email protected]

README Source: TJUNLP/COER