Improves documentation and adds more tutorials and examples.
Adds support for layout parsing via PubLayNet models.
0.3.8
1 year ago
Adds documentation for previously undocumented parts of the codebase.
Updates the document classifier and table extractor interfaces.
0.3.7
1 year ago
Minor updates to setup.py and README.
Updates missing requirements.
0.3.6
1 year ago
Adds an experimental DocumentClassifier module that lets users classify documents into 16 categories, such as invoice, newspaper, resume, and research paper.
Minor refactoring.
0.3.5
1 year ago
Minor improvements to codebase.
Improves the table parser: adds support for a CSV formatter.
Adds Pipeline Config.
Improves TextOcrPipeline: provides pipeline run info and includes refactoring.
Adds alternate methods to load and run pipelines from ocrpy config (YAML) files.
0.3.4
1 year ago
Code formatting and refactoring.
Adds a high-level text parser interface via TextParser.
0.3.3
1 year ago
Adds missing docstrings to parsers and readers.
Minor improvements to parsers and readers: consistent attribute names and methods.
Adds shared file utils.
Overall refactoring of the codebase.
0.3.2
1 year ago
Adds core pipeline support.
Adds support for batch processing of files in a directory or cloud bucket (GCS and S3).