A collection of corpora for named entity recognition (NER) and entity re...
NLTK Data
Data repository for pretrained NLP models and NLP corpora.
WeChat official account corpus
A collaborative catalog of NLP resources for Indic languages
Fuzzing resources for feeding various fuzzers with input. 🔧
Links to Russian corpora + Python functions for loading and parsing
Official source for Spanish Language Models and resources made @ BSC-TEM...
A web-based engine for creating and annotating textual corpora
Open Korean NLP Dataset Curation for the Users All Around the Globe
CrossNER: Evaluating Cross-Domain Named Entity Recognition (AAAI-2021)
Automatic categorization of documents consists in assigning a category ...
Unannotated Spanish 3 Billion Words Corpora
A curated list of resources dedicated to Natural Language Processing (NL...
An R package for dynamic exploration of text collections