Data Mining Project
This project was divided into two tasks:
- Supervised Data Mining. Dataset: EireJet.csv. Code: code1_nagarathna_sali.py.
- Unsupervised Data Mining. Dataset: EireStay.csv. Code: Code 2_Nagarathna_Sali.py. t-SNE visualizations: VIZ1_Nagarathna_Sali.html, VIZ2_NagarathnaSali.html, VIZ3_1_Nagarathna_Sali.html, VIZ3_2_Nagarathna_sali.html and VIZ5_Nagarathna_sali.html.
For the written analysis of both tasks, see Report_Nagarathna_sali.pdf.
Below is a detailed description of both tasks and the datasets used.
- Supervised Data Mining Task: The task is to construct a classification model in Python that predicts passengers’ satisfaction with EireJet Airlines. The classification models compared are random forest, AdaBoost, and gradient boosting; the best-performing model is then selected for EireJet to use in the real world.
About the dataset: Each row in EireJet.csv corresponds to a passenger who travelled with EireJet Airlines. Relevant information about this dataset can be found in Appendix 1.
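The model comparison described for the supervised task can be sketched roughly as follows. This is an illustrative sketch and not the code in code1_nagarathna_sali.py: it uses synthetic data from `make_classification` in place of EireJet.csv (whose columns are not listed here), and the helper name `compare_models`, the hyperparameters, and the choice of 5-fold cross-validated accuracy as the selection metric are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import cross_val_score


def compare_models(X, y, cv=5, seed=0):
    """Return the mean cross-validated accuracy of each candidate model.

    Hypothetical helper: the model settings below are illustrative
    defaults, not the settings used in the project.
    """
    models = {
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "adaboost": AdaBoostClassifier(n_estimators=50, random_state=seed),
        "gradient_boosting": GradientBoostingClassifier(random_state=seed),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in models.items()
    }


# Synthetic stand-in for the passenger data (assumption: binary target).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

scores = compare_models(X, y)
best_model = max(scores, key=scores.get)  # name of the highest-scoring model

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("best:", best_model)
```

With the real data, `X` and `y` would come from EireJet.csv after encoding any categorical columns; accuracy may not be the most suitable metric if the satisfaction classes are imbalanced.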
- Unsupervised Data Mining Task: The task is to find natural clusters in the given dataset using t-SNE and clustering algorithms in Python. The goal is to interpret these clusters to see whether appropriate labels can be assigned to them.
About the dataset: Each row in EireStay.csv corresponds to a booking at EireStay Resort. Relevant information about this dataset can be found in Appendix 2.
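A minimal sketch of the t-SNE plus clustering workflow described for the unsupervised task is shown below. It is not the code in Code 2_Nagarathna_Sali.py: the data are synthetic blobs standing in for EireStay.csv, the helper name `embed_and_cluster` is hypothetical, and using k-means with three clusters on the 2-D embedding is an assumption (the project may have used a different clustering algorithm, or clustered the original features instead).

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def embed_and_cluster(X, n_clusters=3, perplexity=30.0, seed=0):
    """Project X to 2-D with t-SNE, then cluster the embedding with k-means.

    Hypothetical helper; n_clusters and perplexity are illustrative values.
    """
    # Standardize first so no single feature dominates the distances.
    X_scaled = StandardScaler().fit_transform(X)
    embedding = TSNE(
        n_components=2, perplexity=perplexity, random_state=seed
    ).fit_transform(X_scaled)
    labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(embedding)
    return embedding, labels


# Synthetic stand-in for the booking data (assumption: numeric features only).
X, _ = make_blobs(n_samples=150, centers=3, n_features=6, random_state=0)

embedding, labels = embed_and_cluster(X)
print(embedding.shape, sorted(set(labels)))
```

To interpret the clusters, one option is to group the original rows by `labels` and compare per-cluster feature averages. Note that t-SNE distorts global distances, so clusters found on the embedding are best treated as hypotheses to check against the original features.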