Fast Subspace Clustering

A GPU Implementation of subKMeans^[1]

This is a CPU and GPU implementation of the KDD 2017 paper "Towards an Optimal Subspace for K-Means"^[1]. The CPU implementation (mode=cpu) is written in NumPy. There are two GPU implementations:

Using only PyCUDA and scikit-cuda API (mode=gpu)
Using PyCUDA with custom kernels optimized for this algorithm (mode=gpu_custom)
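
For orientation, here is a minimal NumPy sketch of the subKMeans iteration as described in the paper^[1]. It is not the code in src/: the function name `subkmeans_sketch`, its parameters, the initialization, and the eigenvalue tolerance are assumptions made for illustration, and it is written for Python 3 with a recent NumPy rather than the Python 2 setup this repository was tested on.

```python
import numpy as np


def subkmeans_sketch(X, k, n_iter=50, seed=0):
    """Illustrative subKMeans loop (hypothetical helper, not the repo's API)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    X = X - X.mean(axis=0)            # center data so the dataset mean is zero
    S_D = X.T @ X                     # scatter matrix of the whole dataset

    # Random orthogonal rotation and an initial clustered-space dimension of d/2.
    V, _ = np.linalg.qr(rng.standard_normal((d, d)))
    m = max(1, d // 2)

    centers = X[rng.choice(n, size=k, replace=False)]
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # Assignment step: distances are measured only in the first m
        # rotated dimensions (the clustered subspace).
        P = V[:, :m]
        Xp, Cp = X @ P, centers @ P
        dists = ((Xp[:, None, :] - Cp[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        changed = int((new_labels != labels).sum())
        labels = new_labels
        if changed == 0:              # no point changed cluster: converged
            break

        # Update step: recompute centroids in the full space and accumulate
        # the within-cluster scatter matrices.
        S = np.zeros((d, d))
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:     # keep the old center for an empty cluster
                continue
            centers[j] = members.mean(axis=0)
            diff = members - centers[j]
            S += diff.T @ diff

        # Rotation step: eigenvectors of (sum of cluster scatters - S_D),
        # sorted by ascending eigenvalue (eigh already returns them that way).
        # m is the number of clearly negative eigenvalues (tolerance assumed).
        eigvals, V = np.linalg.eigh(S - S_D)
        m = max(1, int((eigvals < -1e-10).sum()))

    return labels, V, m
```

Calling `subkmeans_sketch(X, k=3)` on an (n, d) array returns the cluster labels, the rotation matrix V, and the estimated clustered-space dimension m, which corresponds to the `m` value reported in the results.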

Dependencies

NumPy
PyCUDA: pip install pycuda
scikit-cuda: Install from source as described here. We tested using commit #249538c.
Matplotlib (for plots)
scikit-learn (for computing NMI score)

Note: We tested this implementation only on Python 2. There are some issues with the GPU version on Python 3.

Usage

Change to the src/ directory and run:

python main.py -d=<dataset_name> -k=<number_of_clusters> -mode=<mode>

For help: python main.py -h

Three modes are available: cpu, gpu, gpu_custom
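
As an illustration of the flag syntax above, the following sketch shows how flags of this shape can be parsed with Python's standard `argparse` module. The actual parser in main.py may be written differently; the `dest` names, help strings, and the choice to make every flag required are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the -d, -k and -mode flags."""
    parser = argparse.ArgumentParser(
        description="subKMeans runner (illustrative sketch)")
    parser.add_argument("-d", dest="dataset", required=True,
                        help="dataset name, e.g. wine")
    parser.add_argument("-k", dest="k", type=int, required=True,
                        help="number of clusters")
    parser.add_argument("-mode", dest="mode", required=True,
                        choices=["cpu", "gpu", "gpu_custom"],
                        help="which implementation to run")
    return parser


# argparse accepts the -flag=value form used in the usage line.
args = build_parser().parse_args(["-d=wine", "-k=3", "-mode=cpu"])
```

Restricting `-mode` with `choices` makes an unsupported mode fail at parse time instead of later in the run.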

Example Usage

python main.py -d=wine -k=3 -mode=cpu

Sample Output

[i] Itr 1: 24 points changed
[i] Itr 2: 7 points changed
[i] Itr 3: 7 points changed
[i] Itr 4: 2 points changed
[i] Itr 5: 1 points changed
[i] Itr 6: 0 points changed

[i] Results
[*] m: 2
[*] NMI: 0.87590
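
The NMI value compares the predicted clusters with the ground-truth classes and is invariant to how the cluster ids are numbered. The dependency list suggests it is computed with scikit-learn (presumably `sklearn.metrics.normalized_mutual_info_score`); the pure-Python sketch below only illustrates the quantity. The function name `nmi` is hypothetical, and normalizing by the arithmetic mean of the two entropies is an assumption, since other normalizations exist and give slightly different values.

```python
from collections import Counter
from math import log


def nmi(labels_true, labels_pred):
    """Normalized mutual information with arithmetic-mean normalization."""
    n = len(labels_true)
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    count_joint = Counter(zip(labels_true, labels_pred))

    # Entropy of each labeling.
    h_true = -sum(c / n * log(c / n) for c in count_true.values())
    h_pred = -sum(c / n * log(c / n) for c in count_pred.values())

    # Mutual information from the joint and marginal label counts.
    mi = sum(
        c / n * log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in count_joint.items()
    )

    denom = (h_true + h_pred) / 2
    # Both labelings are a single cluster: treat them as identical.
    return mi / denom if denom > 0 else 1.0
```

Two labelings that agree up to a renaming of cluster ids score 1, and statistically independent labelings score 0.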

References

[1] Mautz et al. Towards an Optimal Subspace for K-Means. KDD 2017.
