Skip to content

Machine learning coursework for decision trees implemented from scratch

Notifications You must be signed in to change notification settings

EthanSK/Decision-Trees

Repository files navigation

CO395 Introduction to Machine Learning: Coursework 1 (Decision Trees)

Introduction

This repository contains the skeleton code and dataset files that you need in order to complete the coursework.

Data

The data/ directory contains the datasets you need for the coursework.

The primary datasets are:

  • train_full.txt
  • train_sub.txt
  • train_noisy.txt
  • validation.txt

Some simpler datasets that you may use to help you with implementation or debugging:

  • toy.txt
  • simple1.txt
  • simple2.txt

The official test set is test.txt. Please use this dataset sparingly and purely to report the results of evaluation. Do not use this to optimise your classifier (use validation.txt for this instead).

Codes

  • classification.py

    • Contains the skeleton code for the DecisionTreeClassifier class. Your task is to implement the train() and predict() methods.
  • eval.py

    • Contains the skeleton code for the Evaluator class. Your task is to implement the confusion_matrix(), accuracy(), precision(), recall(), and f1_score() methods.
  • example_main.py

    • Contains an example of how the evaluation script on LabTS might use the classes and invoke the methods defined in classification.py and eval.py.

Instructions

  • We're not using exception handling because that's too much effort.

About

Machine learning coursework for decision trees implemented from scratch

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •  

Languages