This repository contains the skeleton code and dataset files that you need in order to complete the coursework.
The data/ directory contains the datasets you need for the coursework.
The primary datasets are:
- train_full.txt
- train_sub.txt
- train_noisy.txt
- validation.txt
Some simpler datasets are also provided to help you with implementation or debugging:
- toy.txt
- simple1.txt
- simple2.txt
The official test set is test.txt. Please use this dataset sparingly and purely to report the results of evaluation. Do not use this to optimise your classifier (use validation.txt for this instead).
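The split between validation.txt and test.txt described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the threshold "classifier" is a hypothetical stand-in for your decision tree, and the in-memory lists are made-up data rather than the contents of the real files. The point is only the workflow: candidate settings are compared on the validation split, and the test split is touched once, at the end, to report a result.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the gold labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def predict_with_threshold(xs, threshold):
    """Hypothetical stand-in classifier: "A" below the threshold, else "B"."""
    return ["A" if x < threshold else "B" for x in xs]


def select_threshold(candidates, val_x, val_y):
    """Pick the candidate with the best accuracy on the validation split."""
    return max(
        candidates,
        key=lambda t: accuracy(predict_with_threshold(val_x, t), val_y),
    )


# Made-up data standing in for validation.txt and test.txt.
val_x, val_y = [1, 2, 3, 6, 7, 8], ["A", "A", "A", "B", "B", "B"]
test_x, test_y = [0, 4, 5, 9], ["A", "A", "B", "B"]

# Model selection uses the validation split only.
best = select_threshold([2, 5, 8], val_x, val_y)

# The test split is used once, purely to report the final result.
test_accuracy = accuracy(predict_with_threshold(test_x, best), test_y)
print(best, test_accuracy)
```

The same pattern would apply to any setting you tune for the decision tree: compare candidates on validation.txt and leave test.txt for the final report.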
- classification.py
  - Contains the skeleton code for the DecisionTreeClassifier class. Your task is to implement the train() and predict() methods.
- eval.py
  - Contains the skeleton code for the Evaluator class. Your task is to implement the confusion_matrix(), accuracy(), precision(), recall(), and f1_score() methods.
- example_main.py
  - Contains an example of how the evaluation script on LabTS might use the classes and invoke the methods defined in classification.py and eval.py.
- The code in this repository does not use exception handling, as it was considered too much effort.
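As a rough illustration of how the metrics named for eval.py relate to one another, here is a minimal sketch using plain Python lists. It is not the skeleton's API: the standalone functions, their signatures, and the convention that rows index the gold label and columns the predicted label are all assumptions, so check the docstrings in eval.py for what the Evaluator class actually expects.

```python
def confusion_matrix(gold, predicted, class_labels=None):
    """Count (gold, predicted) pairs. Assumed layout: rows = gold, columns = predicted."""
    if class_labels is None:
        class_labels = sorted(set(gold) | set(predicted))
    index = {label: i for i, label in enumerate(class_labels)}
    matrix = [[0] * len(class_labels) for _ in class_labels]
    for g, p in zip(gold, predicted):
        matrix[index[g]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Diagonal sum (correct predictions) over the total count."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def precision(matrix):
    """Per-class precision: diagonal entry over its column sum."""
    result = []
    for j in range(len(matrix)):
        column_sum = sum(row[j] for row in matrix)
        result.append(matrix[j][j] / column_sum if column_sum else 0.0)
    return result


def recall(matrix):
    """Per-class recall: diagonal entry over its row sum."""
    result = []
    for i, row in enumerate(matrix):
        row_sum = sum(row)
        result.append(row[i] / row_sum if row_sum else 0.0)
    return result


def f1_score(matrix):
    """Per-class F1: harmonic mean of precision and recall."""
    result = []
    for p, r in zip(precision(matrix), recall(matrix)):
        result.append(2 * p * r / (p + r) if (p + r) else 0.0)
    return result


# Made-up labels, for illustration only.
gold = ["A", "A", "B", "C"]
predicted = ["A", "B", "B", "C"]
matrix = confusion_matrix(gold, predicted)
print(matrix)            # [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
print(accuracy(matrix))  # 0.75
```

Returning 0.0 when a denominator is zero is one possible design choice for classes that never occur or are never predicted; the coursework may specify a different convention.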