This repo develops an implementation of the Docstrum algorithm presented by O’Gorman (1993).
The source code is built on top of the work by Chadoliver. The original code is available at https://github.com/chadoliver/cosc428-structor.
This project aims to segment a document image into meaningful components. It targets historical machine-printed and handwritten document images.
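For readers new to Docstrum, the core idea in O'Gorman (1993) is to take the centroid of each connected component, pair it with its k nearest neighbours, and read the page skew and spacing off the resulting angles and distances. The following is a minimal sketch of that step only, not the code in this repo: the function names `docstrum_pairs` and `estimate_skew`, the default `k=5`, and the 1-degree bin width are assumptions chosen for illustration, and centroids are assumed to be already extracted (for example from connected components).

```python
import numpy as np


def docstrum_pairs(centroids, k=5):
    """Return (distances, angles) for every point paired with its k nearest neighbours.

    centroids: sequence of (x, y) component centres (hypothetical input format).
    Angles are in degrees, folded into [-90, 90) because the direction of a
    pair does not matter. With image coordinates (y pointing down) the sign
    of the angle is flipped relative to the usual mathematical convention.
    """
    pts = np.asarray(centroids, dtype=float)
    diff = pts[None, :, :] - pts[:, None, :]   # diff[i, j] = pts[j] - pts[i]
    dist = np.hypot(diff[..., 0], diff[..., 1])
    np.fill_diagonal(dist, np.inf)             # a point is not its own neighbour
    k = min(k, len(pts) - 1)
    nn = np.argsort(dist, axis=1)[:, :k]       # indices of the k nearest neighbours
    rows = np.arange(len(pts))[:, None]
    d = dist[rows, nn].ravel()
    dx = diff[rows, nn, 0].ravel()
    dy = diff[rows, nn, 1].ravel()
    ang = np.degrees(np.arctan2(dy, dx))
    ang = (ang + 90.0) % 180.0 - 90.0          # fold opposite directions together
    return d, ang


def estimate_skew(angles, bin_width=1.0):
    """Return the centre of the most populated angle bin as a rough skew estimate."""
    bins = np.arange(-90.0, 90.0 + bin_width, bin_width)
    hist, edges = np.histogram(angles, bins=bins)
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1])
```

Calling `docstrum_pairs(centroids)` and then `estimate_skew(angles)` gives a coarse text-line orientation. A histogram of the returned distances, restricted to pairs near that orientation, would similarly suggest the within-line character spacing.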
Requirements:
numpy
cv2 (OpenCV's Python bindings, installed from the opencv-python package)
O'Gorman, L., 1993. The document spectrum for page layout analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11), pp.1162-1173.
@article{o1993document,
title={The document spectrum for page layout analysis},
author={O'Gorman, Lawrence},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
volume={15},
number={11},
pages={1162--1173},
year={1993},
publisher={IEEE}
}
find . -name '.DS_Store' -type f -delete