
trivialfis/H2L


H2L

OCR for handwritten math expression.

Introduction

H2L recognizes handwritten mathematical equations using neural networks. For now, it combines a convolutional net for character recognition with traditional heuristic techniques for segmentation.

Internal structure

The entry point is the function heuristicGenerate in evaluate.py.

Steps for generating PDF file (internal):

  1. Crop the area containing the detected text with crop_image.
  2. Convert to grayscale and binarize with binarizeNd.
  3. Segment the image into lines using line_segmenter.segment.
  4. Construct an equation from each line image using build_equation:
    1. Segment the characters in a line with heuristicSegmenter.segmenter.
    2. Check whether each character is a superscript or a subscript.
    3. Predict each character from its segmented image with characterRecognizer.recognizer.
    4. Return the resulting equation for the line image.
  5. Translate the equation strings into a .tex file and call pdflatex to compile it.
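
The outer steps of this pipeline can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: the names `binarize`, `segment_lines`, `to_tex`, and `compile_pdf` are hypothetical, and the fixed threshold and projection-based line splitting are assumed stand-ins for whatever binarizeNd and line_segmenter.segment actually do. Images are plain nested lists so the example runs without extra dependencies.

```python
import subprocess
from pathlib import Path


def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 = ink, 0 = background.

    Assumes dark ink on a light background; the fixed threshold is an
    illustrative choice, not necessarily what binarizeNd uses.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment_lines(binary):
    """Split a binary image into text lines using a horizontal projection.

    Consecutive rows that contain ink form one line; rows without ink
    separate lines. Returns (start, end) row ranges, end exclusive.
    """
    lines, start = [], None
    for y, row in enumerate(binary):
        has_ink = any(row)
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            lines.append((start, y))
            start = None
    if start is not None:
        lines.append((start, len(binary)))
    return lines


def to_tex(equations):
    """Wrap recognized equation strings in a minimal LaTeX document."""
    body = "\n".join(f"\\[ {eq} \\]" for eq in equations)
    return (
        "\\documentclass{article}\n"
        "\\begin{document}\n"
        f"{body}\n"
        "\\end{document}\n"
    )


def compile_pdf(tex_source, out_dir="."):
    """Write equations.tex (hypothetical file name) and run pdflatex on it.

    Requires a TeX installation that provides pdflatex on the PATH.
    """
    tex_path = Path(out_dir) / "equations.tex"
    tex_path.write_text(tex_source)
    subprocess.run(
        [
            "pdflatex",
            "-interaction=nonstopmode",
            f"-output-directory={out_dir}",
            str(tex_path),
        ],
        check=True,
    )
```

Projection-based splitting only works when lines are separated by at least one fully blank row, which is one reason a real segmenter needs more heuristics than this.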

Training the neural networks:

Originally, the code contained several neural nets performing different tasks, but because most of them performed poorly, only one is still used, for character recognition. I first tried similar techniques for segmentation as well, but the results were very poor, so all segmentation is implemented with heuristic techniques.
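
To illustrate what a heuristic (non-learned) approach to segmentation can look like, here is a minimal sketch. It is an assumption for illustration only and does not reproduce heuristicSegmenter.segmenter: the names `segment_characters`, `vertical_extent`, and `script_position` are hypothetical, and the half-line rule for superscripts and subscripts is a deliberately crude stand-in.

```python
def segment_characters(line):
    """Split a binary line image (1 = ink) into characters by columns.

    Columns without ink are treated as gaps between characters.
    Returns (left, right) column ranges, right exclusive.
    """
    width = len(line[0]) if line else 0
    boxes, start = [], None
    for x in range(width):
        has_ink = any(row[x] for row in line)
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            boxes.append((start, x))
            start = None
    if start is not None:
        boxes.append((start, width))
    return boxes


def vertical_extent(line, left, right):
    """Return the (top, bottom) rows of ink in columns [left, right).

    bottom is exclusive; the column range must contain some ink.
    """
    rows = [y for y, row in enumerate(line) if any(row[left:right])]
    return rows[0], rows[-1] + 1


def script_position(top, bottom, line_height):
    """Classify a character as 'super', 'sub', or 'normal'.

    Assumed rule: ink entirely in the upper half of the line is a
    superscript, ink entirely in the lower half is a subscript.
    """
    mid = line_height / 2
    if bottom <= mid:
        return "super"
    if top >= mid:
        return "sub"
    return "normal"
```

A rule this simple misfires on small symbols such as a minus sign or a period, and column gaps fail when characters touch or overlap horizontally.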

Code for training the neural networks is in the trainer directory, trained models are in the models directory, and configuration files for the learning parameters are in the configuration directory. Use train.py to start the training process.

Used model

For now, only a convolutional net is used for character recognition: either a modified version of ResNet-50 or a simple CNN.

Used dataset

Misc

The code is glued together, and because the project is still in an experimental phase, the repository contains a lot of deprecated code that has not yet been removed. This code will be cleaned up in the future.

What to do next

Character recognition remains a major problem. Although many papers have been published on image recognition, the performance they report differs from what is achieved in practice. The main task is therefore to improve recognition performance.

Collaborators

To-do list for fis: [0/4]

  • [ ] A better character segmenter
  • [ ] A better character recognizer
  • [ ] Character segmentation check for o shape
  • [ ] Add more tests
