Tesseract Versions

Tesseract Open Source OCR Engine (main repository)

5.0.0-rc3

2 years ago

This is the third release candidate of Tesseract 5.0.0.

  • Improve training messages
  • Add RowAttributes getter to PageIterator

See also list of all changes.

4.1.3

2 years ago

This is a new stable release of Tesseract 4.1.

  • Fix broken autoconf build (issue #3642)

See also list of all changes.

4.1.2

2 years ago

This is a new stable release of Tesseract 4.1.

Note: The autoconf build is broken (see issue #3642), so please use 4.1.3.

  • Allow line images with larger width for training
  • Bug fixes
  • Build updates and fixes

See also list of all changes.

5.0.0-rc2

2 years ago

This is the second release candidate of Tesseract 5.0.0.

  • Fix regression for OCR with more than one model file
  • Bug fixes
  • Optimizations

See also list of all changes.

5.0.0-rc1

2 years ago

This is the first release candidate of Tesseract 5.0.0.

  • Enable fast float32 LSTM by default
  • Switch to NFC normalisation everywhere
  • Remove banner message
  • Disable music staff detection and removal
  • Add new command line option --loglevel
  • Bug fixes

See also list of all changes.
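
As a rough illustration of what NFC normalisation means for text data in general, the sketch below uses Python's standard `unicodedata` module. It is not Tesseract's own code; the helper name `to_nfc` is hypothetical and only shows how a decomposed character sequence is mapped to its composed form.

```python
import unicodedata

# Two representations of the same visible character "é":
decomposed = "e\u0301"  # 'e' followed by a combining acute accent (two code points)
composed = "\u00e9"     # precomposed 'é' (one code point)


def to_nfc(text: str) -> str:
    """Return the NFC (canonical composition) form of `text`."""
    return unicodedata.normalize("NFC", text)


# The raw strings differ, but their NFC forms are identical.
print(decomposed == composed)          # compares the raw code point sequences
print(to_nfc(decomposed) == composed)  # compares after normalisation
```

Text that is compared against OCR output or used as training ground truth is easier to match reliably when both sides use the same normalisation form.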

5.0.0-beta-20210916

2 years ago

This is a new pre-release of Tesseract 5.0.0.

  • Bug fixes
  • Extend URI support for Tesseract with libcurl
  • Rename processed TIFF output file and add page number if needed

See also list of all changes.

5.0.0-beta-20210815

2 years ago

This is a new pre-release of Tesseract 5.0.0.

  • Bug fixes
  • Modernize more code
  • More options for binarization
  • Improved support for ARM NEON
  • No longer depends on Abseil for unit tests
  • Support float for model training and text recognition (faster, requires less RAM)

See also list of all changes.
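
The note about float support requiring less RAM can be illustrated with a small sketch. It only compares the storage needed for single and double precision values using Python's standard `array` module; the weight count and the helper name `weights_size_bytes` are hypothetical, and nothing here reflects how Tesseract actually stores its models.

```python
from array import array


def weights_size_bytes(count: int, typecode: str) -> int:
    """Bytes needed to store `count` values of the given array typecode."""
    return count * array(typecode).itemsize


n = 1_000_000  # hypothetical number of network weights

single = weights_size_bytes(n, "f")  # C float, 4 bytes on common platforms
double = weights_size_bytes(n, "d")  # C double, 8 bytes on common platforms

print(f"float:  {single} bytes")
print(f"double: {double} bytes")
```

Halving the size of each value also means more values fit into a cache line or SIMD register, which is one general reason single precision arithmetic tends to be faster.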

5.0.0-alpha-20210401

3 years ago

This is a new pre-release of Tesseract 5.0.0.

  • Replaced all remaining STRING with std::string
  • Replaced many uses of GenericVector with std::vector
  • Replaced all malloc / free with C++ code
  • Modernized and formatted code

See also list of all changes.

5.0.0-alpha-20201231

3 years ago

This is a new pre-release of Tesseract 5.0.0.

This release makes massive changes to the public API, which is a big step towards a final 5.0.0. All unit tests pass, but because of the extent of those changes, more practical experience is needed.

  • The public API no longer uses the proprietary data types GenericVector and STRING
  • pdf.ttf is no longer needed because it is now integrated into the code

See also list of all changes.

5.0.0-alpha-20201224

3 years ago

This is a new pre-release of Tesseract 5.0.0.

It is considered production-ready for end users, but it is not yet stable because more incompatible API changes are planned.

  • Improved performance (also on ARM / ARM64)
  • Improved unit tests
  • Many fixes
  • Faster flat build with automake
  • Support for the latest macOS (including the new M1 processor)

See also list of all changes.