Named entity extraction from Portuguese web text
My master's dissertation on named entity extraction from Portuguese web text, at FEUP (Faculty of Engineering of the University of Porto).
The work applies well-established tools (OpenNLP, Stanford CoreNLP, spaCy and NLTK) to entity extraction in Portuguese, focusing on the news section of SIGARRA, the University of Porto information system, and all its subdomains.
Author: André Ricardo Oliveira Pires
Supervisor: Sérgio Nunes
Co-supervisor: José Devezas
In collaboration with: FEUP InfoLab and INESC TEC
For more information on the development process, guidelines for each tool, results obtained, resources created (trained NER models and annotated dataset) and more, see the wiki.