A Gtk/Qt front-end to tesseract-ocr.
Read and extract text and other content from PDFs in C# (port of PDFBox)
OCR engine for all the languages
Document Layout Analysis resource repositories for development with PdfPig.
Conversions between various OCR formats
Text Overlay plugin for Mirador 3
Convert between Tesseract hOCR and ALTO XML using XSL stylesheets
Ergonomic line-by-line transcription of scanned text.
Text-to-tibble