OCR for handwritten math expressions.
H2L is built to recognize handwritten mathematical equations using neural networks. For now, we combine a convolutional net with other traditional techniques to do the job.
Everything begins at the function heuristicGenerate in evaluate.py.
- Crop the corresponding area of found text with crop_image.
- Grayscale and binarize with binarizeNd.
- Segment all the lines using line_segmenter.segment.
- Construct the equation from line images using build_equation.
  - Segment characters in a line with heuristicSegmenter.segmenter.
  - Check whether a character is a superscript or subscript.
  - Predict the characters from the segmented character images with characterRecognizer.recognizer.
  - Return the resulting equation from the line image.
- Translate the equation strings into a .tex file and call pdflatex to compile.
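
The last steps of the pipeline (the superscript/subscript check and the translation to .tex) can be illustrated with a minimal sketch. This is not the code in evaluate.py: the Glyph structure, the function names, and the vertical-center heuristic with its 0.25 tolerance are all assumptions made for illustration. It only shows one plausible way to turn recognized characters with bounding boxes into a LaTeX string and hand it to pdflatex.

```python
import shutil
import subprocess
from dataclasses import dataclass
from itertools import groupby
from pathlib import Path


@dataclass
class Glyph:
    """A recognized character and its vertical extent (hypothetical structure)."""
    char: str
    top: int     # y of the upper edge; y grows downward as in image coordinates
    bottom: int  # y of the lower edge


def script_kind(glyph, base, tol=0.25):
    """Classify glyph as 'super', 'sub' or 'normal' relative to a base glyph.

    Assumed heuristic: compare vertical centers, normalized by the base height.
    """
    base_height = max(base.bottom - base.top, 1)
    offset = ((glyph.top + glyph.bottom) - (base.top + base.bottom)) / 2 / base_height
    if offset < -tol:
        return "super"
    if offset > tol:
        return "sub"
    return "normal"


def glyphs_to_latex(glyphs):
    """Join glyphs left to right, grouping raised or lowered runs into ^{} / _{}."""
    tagged, base = [], None
    for glyph in glyphs:
        kind = "normal" if base is None else script_kind(glyph, base)
        if kind == "normal":
            base = glyph  # only normal glyphs serve as the reference for later ones
        tagged.append((kind, glyph.char))
    wrappers = {"super": "^{%s}", "sub": "_{%s}", "normal": "%s"}
    parts = []
    for kind, group in groupby(tagged, key=lambda item: item[0]):
        parts.append(wrappers[kind] % "".join(char for _, char in group))
    return "".join(parts)


def tex_source(equations):
    """Wrap each equation string in display math inside a minimal document."""
    body = "\n".join("\\[" + equation + "\\]" for equation in equations)
    return "\\documentclass{article}\n\\begin{document}\n" + body + "\n\\end{document}\n"


def compile_equations(equations, tex_path):
    """Write the .tex file and run pdflatex if it is installed; True on success."""
    tex_path = Path(tex_path)
    tex_path.write_text(tex_source(equations), encoding="utf-8")
    if shutil.which("pdflatex") is None:
        return False
    result = subprocess.run(
        ["pdflatex", "-interaction=nonstopmode", tex_path.name],
        cwd=tex_path.parent,
        capture_output=True,
    )
    return result.returncode == 0
```

Consecutive raised or lowered glyphs are grouped into a single ^{...} or _{...} so that pdflatex does not stop on a double superscript. A real implementation would probably also need to handle fractions, roots, and nested scripts, which this sketch ignores.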
Originally, the code contained several neural nets performing different tasks, but most of them performed poorly, so we now use only one of them, for recognition. At first I tried similar techniques for segmentation, but sadly the results were a disaster, so all segmentation is implemented with heuristic techniques.
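
To give an idea of what heuristic segmentation can look like, here is a minimal sketch based on projection profiles, a common classical technique. It is an assumption for illustration, not necessarily what line_segmenter or heuristicSegmenter do: the function names, the fixed threshold of 128, and the plain list-of-rows image format are all hypothetical.

```python
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 for ink, 0 for background."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


def runs_of_ink(profile):
    """Return (start, end) pairs, end exclusive, of consecutive non-zero entries."""
    runs, start = [], None
    for index, value in enumerate(profile):
        if value and start is None:
            start = index            # a run of ink begins here
        elif not value and start is not None:
            runs.append((start, index))  # the run ended just before this index
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(binary):
    """Split an image into text lines using the ink count per row."""
    row_profile = [sum(row) for row in binary]
    return [binary[top:bottom] for top, bottom in runs_of_ink(row_profile)]


def segment_characters(line):
    """Split one line image into characters using the ink count per column."""
    column_profile = [sum(column) for column in zip(*line)]
    return [
        [row[left:right] for row in line]
        for left, right in runs_of_ink(column_profile)
    ]
```

Blank rows separate lines and blank columns separate characters. This breaks down when characters touch or overlap horizontally, and a symbol drawn as several disconnected strokes gets split apart, which is part of why a better character segmenter is still on the to-do list.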
Code for training the neural networks is in the trainer directory, trained models are in the models directory, and the configuration for learning parameters is in the configuration directory. Use train.py to start the training process.
For now, we only use a convolutional net for character recognition: either a modified version of ResNet-50 or a simple CNN.
The code is glued together, and since the project is still in an experimental phase, the repository contains a lot of deprecated code that has not been removed yet. This code will be cleaned up in the future.
Recognizing the characters is still a huge problem. Although many papers have been published on image recognition, the performance they describe differs from what we see in real life. So the main task here is to tackle the recognition performance issue.
- JZP @JZPHome (https://github.com/jzphome)
- [ ] A better character segmenter
- [ ] Better character recognizer
- [ ] Character segmentation check for the o shape
- [ ] Add more tests.