Project I for Pattern Recognition @ RIT
Both classifiers are found in the classification directory and can be run using the following
usage: kd_tree.py [-h] cmd f [c [c ...]]
Classify isolated handwritten math symbols with a k-d tree implementation of
the 1-NN algorithm.
positional arguments:
cmd Whether to test or train the model
f The filename for the csv file that contains the dataset.
c The class(es) that should be processed by the classifier
optional arguments:
-h, --help show this help message and exit
usage: hmm.py [-h] cmd f [c [c ...]]
Classify isolated handwritten math symbols with an HMM.
positional arguments:
cmd Whether to test or train the model
f The filename for the csv file that contains the dataset.
c The class(es) that should be processed by the classifier
optional arguments:
-h, --help show this help message and exit
For instance, the kd_tree
implementation can be trained
python kd_tree.py train path/to/training_set.csv
which will save a model at classification/models/kd_tree.model
and then
tested
python kd_tree.py test path/to/testing_set.csv
which will load the model from this location.
Pre-trained models have been provided so you need only run the test command to see our models in action.
usage: segmenter.py [-h] cmd type
Segmentation algorithm that uses Adaboost trained on a combination of
geometric, multi-scale shape context, and classifier features.
positional arguments:
cmd Whether to test or train the model
type Determine what size of data to train on
optional arguments:
-h, --help show this help message and exit
usage: parser.py [-h] cmd type
Parsing algorithm that uses an MST based aproach to perform structural
analysis on groupings of math symbols.
positional arguments:
cmd Whether to test, test_segmenter, or train the model
type Determine what size of data to train on
optional arguments:
-h, --help show this help message and exit
It is worth noting that the test_segmenter
option exists to allow you to test
parsing on segmentation output instead of ground truth. Though this is a requirement,
the pretrained models do not support the functionality as the segmenter was trained in full
before the method allowing this functionality was added. If you would like to test this, you
must retrain the segmenter.
The classifiers currently expects the trainingSymbols/
and trainingJunk/
to reside under
dataset/
. To change this behavior, edit the location attribute of the dataset_meta dictionary in classification/config.py
.
The segmenter and parser expect the dataset at whatever is specified in the base level config.py. Change this to point to your dataset.
This project uses the following libraries.
- sklearn
- hmmlearn
- progressbar2
- numpy
- scipy
- networkx
- PIL
- BeautifulSoup