Simple k-means clustering algorithm python implementation for educational purposes.

notini/python_k-means

python_k-means

About

dataGenerator.py and normalizer.py contain utility classes: the first returns random scatter points based on the provided parameters, and the second normalizes data (if desired).

kMeans.py contains a class implementing the algorithm, as well as additional visualization (plotting) methods.

example.py contains an executable example of the algorithm, as explained below. To run it, type python example.py from the project folder.

Usage

You can initialize the data on your own, or use the class provided in dataGenerator.py. The data must be an array where each entry is a Cartesian point in the format [a, b]. For example, a data array of three points would be [[25,40], [12,28], [76,90]]. If you choose to use the provided class:

from dataGenerator import dataGenerator

dGen = dataGenerator()
data = dGen.generateCartesianPoints([0,10000], 500)

This generates 500 Cartesian points with coordinate values ranging from 0 to 10000. Normalization is not required, but it is common practice in data mining algorithms. If you choose to use it:

from normalizer import normalizer

norm = normalizer()
normalized_data = norm.normalize_data(data, 0, 100)
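
To make the normalization step concrete, here is a minimal sketch of min-max scaling in plain Python. It assumes that normalize_data(data, 0, 100) rescales values linearly into the range 0 to 100 and that each axis is scaled independently; the repository's normalizer may work differently. The function name min_max_normalize and its parameters are hypothetical and are not part of this project.

```python
def min_max_normalize(points, new_min=0.0, new_max=100.0):
    """Rescale each coordinate axis of `points` linearly into [new_min, new_max].

    Hypothetical helper for illustration; not the repository's normalizer.
    """
    if not points:
        return []
    # Transpose the points so that each tuple holds all values of one axis.
    columns = list(zip(*points))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    scaled = []
    for point in points:
        row = []
        for value, low, high in zip(point, lows, highs):
            if high == low:
                # A constant axis has no spread; map it to the lower bound.
                row.append(new_min)
            else:
                ratio = (value - low) / (high - low)
                row.append(new_min + ratio * (new_max - new_min))
        scaled.append(row)
    return scaled


# The three-point example array from the Usage section.
sample = [[25, 40], [12, 28], [76, 90]]
normalized_sample = min_max_normalize(sample, 0, 100)
```

After scaling, the smallest value on each axis maps to 0 and the largest to 100, so both axes contribute on a comparable scale when distances are computed.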

Now, to cluster your data:

from kMeans import k_means

kMeans = k_means()
clusters = kMeans.clusterize(normalized_data, 5)

This groups your data into 5 clusters. To use a different number of clusters, pass a different value as the second parameter. If you choose not to normalize the data, simply pass the original unnormalized array as the first parameter.
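
For readers curious about what a clustering call like the one above does, the following is a minimal sketch of the standard k-means procedure (Lloyd's algorithm) in plain Python. It is not the code in kMeans.py: the names simple_k_means and squared_distance, the seed and max_iterations parameters, the random choice of initial centroids, and the return format (a list of clusters, each a list of points) are all assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def simple_k_means(points, k, max_iterations=100, seed=None):
    """Group `points` into k clusters with Lloyd's algorithm.

    Hypothetical sketch; returns a list of k lists of points.
    """
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # No point changed cluster, so the result is stable.
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # An empty cluster keeps its previous centroid.
                centroids[i] = [sum(axis) / len(members) for axis in zip(*members)]
    clusters = [[] for _ in range(k)]
    for p, a in zip(points, assignments):
        clusters[a].append(p)
    return clusters


# Two well-separated groups of three points each.
toy_points = [[1, 1], [1, 2], [2, 1], [50, 50], [51, 50], [50, 51]]
toy_clusters = simple_k_means(toy_points, 2, seed=0)
```

On this toy input the two groups are far enough apart that the sketch separates them regardless of which points are picked as initial centroids. On real data, the result of k-means generally depends on the initialization.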

If you wish to visualize the clustered data:

kMeans.plotCluster(clusters, data)

If you have any questions or suggestions, feel free to contact me.
