Handwritten_Recognition

This is a handwriting recognition API. Currently it recognizes text from an image of a single line. It uses a Gated CNN as the encoder and a multi-head attention model as the decoder.

The following could be done in the future to improve it:

Add a language model to improve the recognized text, especially for German.
Add a bounding box detector for text lines, so that the API can handle paragraphs of text.

Installation

conda create -n venv python=3.7

conda activate venv

pip install -r requirements.txt

Model Architecture

Reshape every input image, with padding, to shape (1, 128, 1024), i.e. (channel, height, width).
The model uses a CNN and a Gated CNN as the encoder to extract sequence features from the input image, turning the (1024, 128, 1) image into a sequence of shape (1, 128, 128).
A multi-head attention model serves as the feature decoder. It is adapted from ALBERT, which achieves state-of-the-art results on many language tasks with a much smaller model than BERT.
Since the model is trained with CTCLoss, the predicted logits need to be decoded with ctc_decode. PyTorch does not currently provide a ctc_decode, so this API uses the one from TensorFlow. Without decoding, the prediction contains repeated characters.
The model is very small: the checkpoint is only 2.6 MB, so it runs fast on most machines.
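
To illustrate why undecoded CTC output contains repeated characters, here is a minimal plain-Python sketch of greedy CTC decoding. It is not the TensorFlow ctc_decode that this API calls. The function name `ctc_greedy_decode`, the toy alphabet, and the choice of index 0 as the blank are assumptions made only for this example.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, blank=0):
    """Collapse a per-frame argmax sequence into label ids.

    Greedy CTC decoding has two steps: merge runs of identical
    consecutive ids, then drop the blank id.
    """
    return [label for label, _ in groupby(frame_ids) if label != blank]


# Hypothetical alphabet for illustration; index 0 is assumed to be the blank.
alphabet = ["<blank>", "h", "e", "l", "o"]

# Per-frame argmax over the logits (one id per time step).
frames = [1, 1, 0, 2, 2, 0, 3, 3, 0, 3, 4, 4]

# Reading the frames without decoding keeps every repeated character.
raw = "".join(alphabet[i] for i in frames if i != 0)

# Decoding merges the repeats; the blank between the two "l" runs keeps both.
decoded = "".join(alphabet[i] for i in ctc_greedy_decode(frames))

print(raw)
print(decoded)
```

The blank symbol is what lets CTC represent genuine double letters: two runs of the same id separated by a blank survive as two characters, while an unbroken run collapses to one.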

Training

The model is trained only on the IAM handwriting dataset, with 8000 line images for training and 1000 line images for validation. It was trained for 800 epochs with a batch size of 150. Data augmentation is used during training to improve the model.
Since the model is trained on limited English data, it may not work well for other languages or different scenarios. A language model is therefore needed for German text.
Idea: as a next step, ALBERT could be used to train a language model.
To retrain the model, run the script:
```
python train_handwritten.py
```

API Example:

Initialize a detector model from outside the repo folder:

from handwritten_recognition import HandWritten

detector = HandWritten()

Run detection:

text = detector.detect(img_path="handwritten_recognition/images/r06-137-00.png")
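
Since the detector handles one line image per call, running it over several images means looping. The sketch below shows one way to do that. The helper `detect_folder` and its `pattern` parameter are hypothetical and not part of this repo. It only assumes that the detector object exposes `detect(img_path=...)` as shown above.

```python
from pathlib import Path


def detect_folder(detector, folder, pattern="*.png"):
    """Run detector.detect on every matching image in a folder.

    Hypothetical helper: `detector` is any object with a
    detect(img_path=...) method, such as the HandWritten instance above.
    Returns a dict mapping file name to recognized text, ordered by name.
    """
    results = {}
    for path in sorted(Path(folder).glob(pattern)):
        results[path.name] = detector.detect(img_path=str(path))
    return results


# Example usage, assuming the package is importable as shown above:
# from handwritten_recognition import HandWritten
# texts = detect_folder(HandWritten(), "handwritten_recognition/images")
```

Creating the detector once and reusing it for every image avoids reloading the checkpoint on each call.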

Result Examples From the IAM Test Set

result: 'The doorman turned his athentian'

result: 'to the next red - eyed emerger from the'

result: 'darke ; and we went an tvgether to'

result: 'the future would depend . Outwardly she was calm , but'

result: 'her heart was beating fast , an d the palms of her'

result: 'ohn autocracy . Her squeaking , quermlons'

result: 'aecents were heard without intr -'

guanjianyu/pytorch_hand_written_recognition
