Unsupervised-Learning-Middle-Project

Middle project in Prof. Louzoun's Unsupervised Learning course.

Abstract

Three data sets were analyzed using five unsupervised learning methods. The first data set describes online shoppers' purchasing intention. The second covers a decade (1999-2008) of clinical care for patients with diabetes at 130 US hospitals. The third contains click-stream information from an online store offering clothing for pregnant women. For each data set, the goals were to cluster the data, visualize the clustering results, compute how well each clustering method fits the external classification, determine which clustering algorithm performs best, and explain the differences between the algorithms. Of the five algorithms tested, Hierarchical Complete with four clusters gave the best results for the online shoppers' intention and e-shop clothing data sets. For the clinical data, however, K Means with three clusters gave the best results.
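The comparison described above can be sketched as follows. This is only an illustrative sketch, not the code in this repository: it uses synthetic data from `make_blobs` instead of the three data sets, includes only two of the five methods, and assumes adjusted mutual information as the measure of fit to the external classification (the project may use a different measure).

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_mutual_info_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a prepared data set (assumption: the real data
# sets are loaded and preprocessed elsewhere in the project).
X, external_labels = make_blobs(n_samples=300, centers=3, random_state=0)
X = StandardScaler().fit_transform(X)

# Two of the methods named in the abstract, with the cluster counts it reports.
models = {
    "K Means (k=3)": KMeans(n_clusters=3, n_init=10, random_state=0),
    "Hierarchical Complete (k=4)": AgglomerativeClustering(
        n_clusters=4, linkage="complete"
    ),
}

# Score each clustering against the external classification.
# Adjusted mutual information is an assumed choice of score here.
scores = {}
for name, model in models.items():
    predicted = model.fit_predict(X)
    scores[name] = adjusted_mutual_info_score(external_labels, predicted)

# Print the methods from best to worst fit.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: AMI = {score:.3f}")
```

A score near 1 means the clusters closely match the external classes, while a score near 0 means the agreement is no better than chance.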

Data Sets

The data are too large to upload. They can be found here:

To run the code, download the data sets and place them in a directory named 'dataset'.

Python Modules

The main modules used in this project are:

Sklearn
Matplotlib
Skfuzzy
Numpy
Pandas
Scipy
Yellowbrick
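Before running the project, it may help to confirm that the modules listed above are importable. The snippet below is a small convenience sketch and not part of the repository; the `REQUIRED` mapping and the `missing_modules` helper are hypothetical names, and the mapping pairs each import name with the package name usually used to install it with pip.

```python
from importlib.util import find_spec

# Import name -> usual pip package name (hypothetical helper table).
REQUIRED = {
    "sklearn": "scikit-learn",
    "matplotlib": "matplotlib",
    "skfuzzy": "scikit-fuzzy",
    "numpy": "numpy",
    "pandas": "pandas",
    "scipy": "scipy",
    "yellowbrick": "yellowbrick",
}


def missing_modules(required=REQUIRED):
    """Return the pip names of modules that cannot be found for import."""
    return [pip_name for import_name, pip_name in required.items()
            if find_spec(import_name) is None]


missing = missing_modules()
if missing:
    print("Missing packages; try: pip install " + " ".join(missing))
else:
    print("All listed modules are importable.")
```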

