GitHub - ajauhri/knn-exps

Project Description

This is an implementation and comparative study of nearest-neighbour (NN) algorithms. NN algorithms are relevant to machine learning applications because high-dimensional datasets are believed to have low-dimensional intrinsic structure. Specifically, we look at cover trees [1] and locality-sensitive hashing (LSH) [2] on different datasets and compare their performance. Datasets such as user ratings, user tags on stackoverflow.com and, if possible, image classification data will be considered. From this study I aim to infer how the performance of an NN algorithm correlates with the structure of the data, using certain performance metrics.
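As a rough illustration of the hashing idea in [2], the sketch below builds a single LSH table from hash functions of the form h(v) = floor((a . v + b) / w), where a is drawn from a Gaussian (a 2-stable distribution) and b is uniform in [0, w). It is not the code in the lsh directory: the names make_pstable_hash and build_table, the bucket width w = 4.0, and the random test data are all assumptions chosen for the example.

```python
import numpy as np


def make_pstable_hash(dim, w, rng):
    """Return one hash function h(v) = floor((a . v + b) / w)."""
    a = rng.standard_normal(dim)  # Gaussian entries: 2-stable, suits Euclidean distance
    b = rng.uniform(0.0, w)       # random offset in [0, w)

    def h(v):
        return int(np.floor((np.dot(a, v) + b) / w))

    return h


def build_table(points, hashes):
    """Bucket point indices by the tuple of their hash values."""
    table = {}
    for idx, p in enumerate(points):
        key = tuple(h(p) for h in hashes)
        table.setdefault(key, []).append(idx)
    return table


# Hypothetical data: 200 random points in 16 dimensions.
rng = np.random.RandomState(0)
points = rng.standard_normal((200, 16))
hashes = [make_pstable_hash(16, w=4.0, rng=rng) for _ in range(4)]
table = build_table(points, hashes)

# Candidate neighbours of a query are the points sharing its bucket.
query = points[0]
query_key = tuple(h(query) for h in hashes)
candidates = table.get(query_key, [])
```

Only the candidates in the query's bucket need an exact distance computation. A practical scheme would typically use several such tables to raise the chance that a true near neighbour shares a bucket with the query.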

Dependencies

psutil
numpy v1.8.1
vlfeat
guppy for memory profiling

Testing

After cloning the repository, download the datasets using wget. The commands further below read these files from the data/ directory.

wget https://dl.dropboxusercontent.com/u/10119973/15-853/a.png 
wget https://dl.dropboxusercontent.com/u/10119973/15-853/mnist.data 
wget https://dl.dropboxusercontent.com/u/10119973/15-853/netflix.mat

To run image patching use:

./img_patching.py data/a.png

The approximated image is saved as res_img.eps, and the root-mean-square (RMS) error is written to rms.txt in the format <rms_error>, <# of patches>.
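For reference, the RMS error between an image and its approximation can be computed as in the sketch below. This is a minimal example and not taken from img_patching.py: the function name rms_error, the toy 2x2 arrays, the patch count, and the exact string formatting of the output line are assumptions.

```python
import numpy as np


def rms_error(original, approximation):
    """Root-mean-square error between two arrays of the same shape."""
    original = np.asarray(original, dtype=np.float64)
    approximation = np.asarray(approximation, dtype=np.float64)
    if original.shape != approximation.shape:
        raise ValueError("images must have the same shape")
    return float(np.sqrt(np.mean((original - approximation) ** 2)))


# Toy stand-ins for an image and its patched approximation.
original = np.array([[0.0, 1.0], [2.0, 3.0]])
approx = np.array([[0.0, 1.0], [2.0, 5.0]])

err = rms_error(original, approx)
num_patches = 4  # hypothetical patch count

# One line in the "<rms_error>, <# of patches>" layout described above.
line = "{}, {}".format(err, num_patches)
```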

To run LSH and cover trees with sparse data use:

./main.py -t data/netflix.mat -i 1000 -j 10 -n -r 30

To run LSH and cover trees with dense data use:

./main.py -t data/mnist.data -i 1000 -j 10 -g -r 1800

To collect memory or timing results, uncomment the two functions collect_timings and do_profiling in main.py.
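The contents of collect_timings and do_profiling are not shown here, so the following is only a generic sketch of how wall-clock time and peak memory for a single call might be measured. It uses the standard-library time and tracemalloc modules as stand-ins for psutil and guppy, and the helper name profile_call is hypothetical.

```python
import time
import tracemalloc


def profile_call(fn, *args, **kwargs):
    """Run fn once; return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    return result, elapsed, peak


# Example workload: sorting a reversed range stands in for a NN query batch.
result, elapsed, peak = profile_call(sorted, range(100000, 0, -1))
```

tracemalloc only tracks allocations made through Python's allocator while tracing is active, so the figures will differ from process-level numbers such as those psutil reports.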

References

[1] Alina Beygelzimer, Sham Kakade, and John Langford. Cover trees for nearest neighbor. In Proceedings of the 23rd international conference on Machine learning, pages 97–104. ACM, 2006.

[2] Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Proceedings of the twentieth annual symposium on Computational geometry, pages 253–262. ACM, 2004.

Folders and files

cover_tree
data
extras
lsh
naive
.gitignore
README.md
__init__.py
img_patching.py
img_patching_with_angles.py
main.py
