Entity Extraction Text Processor
Version 2.7 incorporates a number of changes and improvements, including some breaking changes.
regex.Mgrs now adds GeoJSON to extracted coordinates
regex.NaiveParagraph to naively annotate paragraphs based on multiple new lines
triage.TokenFrequencySummarisation to use a token frequency approach to document summarisation
CsvFolderReader collection reader to add line numbers and reprocess files that are modified
For a complete list of changes, see the Git commit log.
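To illustrate what a token frequency approach to summarisation can look like, here is a minimal sketch. It is not Baleen's implementation of triage.TokenFrequencySummarisation: the class name, the method names, the sentence-splitting regex and the choice to score each sentence by the average document-wide frequency of its tokens are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical illustration only; none of these names come from Baleen.
final class TokenFrequencySummarySketch {

  /** Lower-cases the text and splits it on anything that is not a letter or digit. */
  static List<String> tokenise(String text) {
    List<String> tokens = new ArrayList<>();
    for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
      if (!token.isEmpty()) {
        tokens.add(token);
      }
    }
    return tokens;
  }

  /** Returns up to maxSentences of the highest-scoring sentences, in document order. */
  static List<String> summarise(String document, int maxSentences) {
    // Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    String[] sentences = document.trim().split("(?<=[.!?])\\s+");

    // Count how often each token occurs across the whole document.
    Map<String, Integer> frequency = new HashMap<>();
    for (String token : tokenise(document)) {
      frequency.merge(token, 1, Integer::sum);
    }

    // Score a sentence by the mean frequency of its tokens, so that
    // long sentences are not favoured simply for containing more tokens.
    double[] scores = new double[sentences.length];
    for (int i = 0; i < sentences.length; i++) {
      List<String> tokens = tokenise(sentences[i]);
      double total = 0;
      for (String token : tokens) {
        total += frequency.getOrDefault(token, 0);
      }
      scores[i] = tokens.isEmpty() ? 0 : total / tokens.size();
    }

    // Pick the best-scoring sentence indices, then restore document order.
    return IntStream.range(0, sentences.length)
        .boxed()
        .sorted(Comparator.comparingDouble((Integer i) -> -scores[i]))
        .limit(maxSentences)
        .sorted()
        .map(i -> sentences[i])
        .collect(Collectors.toList());
  }
}
```

Calling TokenFrequencySummarySketch.summarise(text, 2) returns at most two sentences from text. A production summariser would typically also remove stop words before counting, which this sketch leaves out for brevity.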
This release provides new functionality including new annotators, new consumers, additional functionality for event extraction, relationship extraction and document triage, support for horizontal and vertical scaling and more.
Of note, two new consumers, analysis.mongo and analysis.elasticsearch, have been added which allow exploitation of Baleen's output within Jonah.
For full details, see What's New in Baleen 2.6.0.
The majority of changes should be backwards compatible with Baleen 2.4.0. However, the Elasticsearch consumer has been upgraded from version 2 to 5 (tested on 5.6.4). This is likely to be a breaking change and will require Elasticsearch servers to be upgraded. The ElasticsearchRest consumer should still work with Elasticsearch 2.
This release contains a large number of changes, improvements and new features, including new annotators, an updated type system, self-ordering pipelines, structure extraction, templating, and a whole lot more!
For full details, see What's New in Baleen 2.4.0. For upgrade instructions, see Upgrading Between Versions.
The following is a summary of the new features and changes in Baleen 2.3.0. There may be additional changes - refer to the diff and commit log for full details.
Since the previous release, the following changes have been made.
Please be aware that some aspects of this release may not be backwards compatible with previous versions. Refer to the wiki for information on upgrading between versions.
The following is a summary of the new features and changes in Baleen 2.2.0. There may be additional changes and features. Please refer to the diff and commit logs for full details.
New core features
New collection readers and improvements to existing collection readers
New annotators and improvements to existing annotators
New consumers and improvements to existing consumers
New jobs
New resources
Bug fixes, improved unit testing, updated dependencies and reductions to technical debt
Please be aware that some aspects of this release may not be backwards compatible with previous versions.
This version includes the following improvements:
This is the initial open source release of Baleen, v2.0.0.
For more information, please refer to the README.md.