pascalc/decision_trees

--- Assignment 1 ---
Initial entropy of the datasets
+---------+----------------+
| Dataset | Entropy        |
+---------+----------------+
| Monk-1  | 1.0            |
+---------+----------------+
| Monk-2  | 0.957117428265 |
+---------+----------------+
| Monk-3  | 0.999806132805 |
+---------+----------------+
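The entropy figures above can be reproduced with a few lines of Python. This is a minimal sketch, not the repository's own code: the function name `entropy` and the list-of-labels input are assumptions made for illustration. It also shows why a value of 1.0, as reported for Monk-1, corresponds to a perfectly balanced two-class set.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum -p * log2(p) over the classes that actually occur.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical toy data: four positive and four negative samples.
balanced = [True] * 4 + [False] * 4
print(entropy(balanced))  # 1.0
```

The closer a dataset's entropy is to 1.0, the closer its two classes are to an even split.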

--- Assignment 2 ---
Selecting the root of the decision tree: information gain of each attribute
+---------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Dataset | a1                | a2                | a3                | a4                | a5                | a6                |
+---------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Monk-1  | 0.0752725556083   | 0.00583842996291  | 0.0047075666173   | 0.0263116965077   | 0.287030749716    | 0.000757855715864 |
+---------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Monk-2  | 0.00375617737751  | 0.00245849866608  | 0.00105614771589  | 0.0156642472926   | 0.0172771769379   | 0.00624762223688  |
+---------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+
| Monk-3  | 0.00712086839607  | 0.293736173508    | 0.000831114044534 | 0.00289181728865  | 0.25591172462     | 0.0070770260741   |
+---------+-------------------+-------------------+-------------------+-------------------+-------------------+-------------------+

--- Assignment 3 ---
Performance of the decision trees (fraction of samples classified correctly)
+---------+----------+----------------+
| Dataset | Training | Test           |
+---------+----------+----------------+
| Monk-1  | 1.0      | 0.828703703704 |
+---------+----------+----------------+
| Monk-2  | 1.0      | 0.69212962963  |
+---------+----------+----------------+
| Monk-3  | 1.0      | 0.944444444444 |
+---------+----------+----------------+
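The scores above read as accuracies: a training value of 1.0 means every training sample is classified correctly, while the lower test values suggest the fully grown trees overfit. A minimal sketch of such a measure follows. The `classify` callable and the `(attributes_dict, label)` sample layout are assumptions for illustration, not the repository's API.

```python
def accuracy(classify, samples):
    """Fraction of samples whose predicted label equals the true label.

    `classify` maps an attributes dict to a predicted label, and
    `samples` is a list of (attributes_dict, label) pairs (assumed layout).
    """
    if not samples:
        raise ValueError("samples must not be empty")
    correct = sum(1 for attrs, label in samples if classify(attrs) == label)
    return correct / len(samples)


# Hypothetical toy classifier: predict True exactly when a1 == 1.
samples = [
    ({"a1": 1}, True),
    ({"a1": 1}, False),
    ({"a1": 2}, False),
    ({"a1": 2}, False),
]
print(accuracy(lambda attrs: attrs["a1"] == 1, samples))  # 0.75
```

The reported test scores are consistent with counts out of 432 test samples, for example 358/432 ≈ 0.8287 for Monk-1.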

--- Assignment 4 ---
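Selecting the best fraction to divide training and validation sets for pruning
Test accuracy after pruning for each fraction; Benchmark is the unpruned test accuracy from Assignment 3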
+---------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
| Dataset | 0.3            | 0.4            | 0.5            | 0.6            | 0.7            | 0.8            | Benchmark      |
+---------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
| Monk-1  | 0.74537037037  | 0.777777777778 | 0.861111111111 | 0.824074074074 | 0.861111111111 | 0.8125         | 0.828703703704 |
+---------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
| Monk-2  | 0.671296296296 | 0.671296296296 | 0.671296296296 | 0.671296296296 | 0.671296296296 | 0.6875         | 0.69212962963  |
+---------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+
| Monk-3  | 0.87962962963  | 0.916666666667 | 0.930555555556 | 0.953703703704 | 0.925925925926 | 0.87037037037  | 0.944444444444 |
+---------+----------------+----------------+----------------+----------------+----------------+----------------+----------------+

About

An investigation of Decision Tree Learning in Python
