Software Dependencies

Python 2.7
Tensorflow 0.12
- Needs to be this specific version if you do not want to modify any code
- Tensorflow made changes to the name of the function that initializes variable tensors in v0.12
- Tensorflow made changes yet again to the function that initializes variable tensors in v1.0
- So if you run on another version of Tensorflow, you will need to make changes to the code.
OpenCV 3.1.0
Numpy
Matplotlib
Pandas
Pillow
h5py
argparse

And for making use of the real-time camera app, you will need a computer with a webcam plugged in.

Running the Realtime Multi-Digit Recognition App

If you just want to try out the app then you can just run the following:

python live_predictor.py

This will launch an application that displays the captured video feed from the default webcam on the computer, and overlays the predicted bounding boxes and digits on the screen in real time.

Train from Scratch

If you want to use the existing workflow and existing deep learning architectures included to train a new model from scratch, then you can follow the following steps.

1. Data

The data used for training the models in this project are not included. The data can be downloaded from the following links.

The scripts assume that the contents of these tar files are extracted within a subdirectory "data". This should result in the three subdirectories:

data/train
data/extra
data/test

To process the data to be used by the neural network, simply run the following in the command line:

python create_data_pickles.py -i "data"

This will create several pickle files within the data subdirectory that will be used by the neural network directly.

2. Training

python trainer.py -m a -e 10 -b 32 -a 0.001 -o "A_01"

argument	what it is	possible values
-m	model architecture	"a", "b", "c", or "d"
-e	number of epochs	any integer value
-b	batch-size	any integer value
-a	alpha learning rate	a float between 0-1
-o	output name *	any string that could legally be used as a directory name.

* the output name will be used as a place where all snapshots, evaluation files, and visualisations for the model will be stored, in a subdirectory results/output_name

3. Evaluate on Test Dataset

python tester.py

This will print out the Per Digit Accuracy, Whole Number Accuracy, and Intersect of Union score on the test dataset. It will also save images of the following things in the imgs subdirectory:

Grid of predictions on test set
Grid of worst performing bounding box predictions
Grid of incorrect digit predictions

4. Real Time Prediction

python live_predictor.py

Will launch an application that displays the captured video feed from the default webcam on the computer, and overlays the predicted bounding boxes and digits on the screen in real time.

NOTE: You may need to change some of the code to get it to load up the checkpoint files for your trained model. By default it is looking for a model called "A_02" using the model_a architecture, which is the pre-trained model that is included with this repo.

Pre-Trained Models

Only the The checkpoint files for the best performing model (model A) are packaged here in the interest of keeping file size to a minimum. These are located in results/A_02/checkpoints.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
file_support		file_support
image_processing		image_processing
nnet		nnet
process_data		process_data
results		results
support		support
README.md		README.md
capstone_project_description.pdf		capstone_project_description.pdf
create_data_pickles.py		create_data_pickles.py
create_data_pickles.sh		create_data_pickles.sh
evals.py		evals.py
explore_data.py		explore_data.py
graphs.py		graphs.py
live_predictor.py		live_predictor.py
main.py		main.py
misc_support.py		misc_support.py
report.pdf		report.pdf
settings.py		settings.py
tester.py		tester.py
trainer.py		trainer.py
vis.py		vis.py

wandering-sage/real_time_multi_digit_recognition

Folders and files

Latest commit

History

Repository files navigation

Software Dependencies

Running the Realtime Multi-Digit Recognition App

Train from Scratch

1. Data

2. Training

3. Evaluate on Test Dataset

4. Real Time Prediction

Pre-Trained Models

About

Resources

Stars

Watchers

Forks

Languages