This is a simple document reader for machine-printed text, with a function for detecting IBANs. It was a student project and is far from perfect; however, on text that is not blurry the results are acceptable. The program consists mainly of two parts: a character extractor and a classifier. Characters are extracted either by connected component analysis on a binary image derived from the input or by extracting maximally stable extremal regions (MSER) from the input. Classification is done by a small convolutional neural network.
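To illustrate the connected component route described above, here is a minimal sketch in plain Python. It is not the code used in this repository: the function name `connected_components`, the `min_pixels` parameter, the choice of 4-connectivity, and the toy input are all assumptions made for illustration. It labels foreground regions in an already binarized image and returns one bounding box per region, which is roughly what a character extractor would hand to a classifier.

```python
from collections import deque


def connected_components(binary, min_pixels=1):
    """Return bounding boxes (top, left, bottom, right) of 4-connected
    foreground regions in a binary image given as a list of rows.

    Hypothetical helper for illustration; not part of this repository.
    """
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []

    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue

            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            count = 0

            while queue:
                y, x = queue.popleft()
                count += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)

                # Visit the four direct neighbours (assumed 4-connectivity).
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))

            # Drop tiny regions, which are usually noise rather than characters.
            if count >= min_pixels:
                boxes.append((top, left, bottom, right))

    # Order boxes left to right, the usual reading order within a line.
    boxes.sort(key=lambda box: box[1])
    return boxes


# Toy binary image: 1 is foreground (ink), 0 is background.
image = [
    [0, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
]

all_boxes = connected_components(image)
large_boxes = connected_components(image, min_pixels=2)
print(all_boxes)
print(large_boxes)
```

Raising `min_pixels` is one simple way to suppress speckle noise before classification; a real pipeline would more likely rely on an image processing library for labeling and on additional size or aspect ratio filters.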
The code is written in Python 3.6 and requires the following packages:
To use the program, clone this repository, navigate to it in a terminal, and run the interface:
$ python interface.py
Documentation is available on this repository's GitHub page.
For the theoretical background of the algorithms used and for results, see the report. For a visualization of the individual steps, take a look at this notebook.
Roman Remme, Lucas-Raphael Mueller, Lucas Moeller