This repository contains the skeleton code and dataset files that you need in order to complete the coursework.
The data/ directory contains the character datasets.
The primary datasets are:
- train_full.txt
- train_sub.txt
- train_noisy.txt
- validation.txt
Some simpler datasets that you may use to help with implementation or debugging:
- toy.txt
- simple1.txt
- simple2.txt
The official test set is test.txt.
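As a starting point for working with these files, here is a minimal loading sketch. It assumes each line holds comma-separated integer attributes followed by a class label in the last column; this README does not state the file format, so check a file such as data/toy.txt before relying on it. The names parse_rows and load_dataset are hypothetical helpers, not part of the skeleton code.

```python
def parse_rows(lines):
    """Split each non-empty line into integer features and a trailing label.

    Assumption: values are comma-separated and the class label is the
    last column. Adjust if the actual files are laid out differently.
    """
    features, labels = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        *values, label = line.split(",")
        features.append([int(v) for v in values])
        labels.append(label.strip())
    return features, labels


def load_dataset(path):
    """Read a dataset file from disk and parse it with parse_rows."""
    with open(path) as f:
        return parse_rows(f)
```

Under the same assumption, a call such as `x, y = load_dataset("data/toy.txt")` would give a list of feature rows and a parallel list of labels.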
- classification.py
  - Contains the code for the DecisionTreeClassifier class.
- eval.py
  - Contains the code for the Evaluator class.
- main.py
  - Contains the implementation of the classes in classification.py and eval.py.
- pruning.py
  - Contains the code to prune the decision tree.