greeness/streaming_lsh

 
 


Streaming LSH

Description

A project for clustering text streams using locality-sensitive hashing (LSH) in Python.
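The idea behind LSH-based stream clustering can be illustrated with a minimal sketch. This is not this project's API: the names `token_projection`, `signature`, and `SignatureClusterer` are hypothetical, and the 8-bit signature length is an arbitrary choice. The sketch assumes sign-random-projection signatures in the style of Charikar (2002), with an MD5 hash standing in for the random hyperplane components so that no fixed vocabulary is needed as new tokens arrive.

```python
import hashlib
from collections import defaultdict

NUM_BITS = 8  # signature length; an illustrative choice, not a project default


def token_projection(token, bit):
    """Deterministic pseudo-random +1/-1 component for a (token, bit) pair.

    Hashing replaces a stored random hyperplane (an assumption of this sketch).
    """
    digest = hashlib.md5(f"{bit}:{token}".encode("utf-8")).digest()
    return 1 if digest[0] & 1 else -1


def signature(text, num_bits=NUM_BITS):
    """Return a bit-string signature of the text's bag-of-words vector."""
    counts = defaultdict(int)
    for token in text.lower().split():
        counts[token] += 1
    bits = []
    for bit in range(num_bits):
        # Sign of the dot product between the term-count vector
        # and the pseudo-random hyperplane for this bit.
        dot = sum(weight * token_projection(token, bit)
                  for token, weight in counts.items())
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


class SignatureClusterer:
    """Groups documents from a stream by identical signature (hypothetical helper)."""

    def __init__(self, num_bits=NUM_BITS):
        self.num_bits = num_bits
        self.clusters = defaultdict(list)

    def add(self, doc_id, text):
        key = signature(text, self.num_bits)
        self.clusters[key].append(doc_id)
        return key


clusterer = SignatureClusterer()
clusterer.add("doc1", "storm hits the coast")
clusterer.add("doc2", "Storm hits the coast")
```

Documents with similar term vectors tend to agree on most signature bits, so each incoming document can be assigned with one signature computation instead of a comparison against every existing cluster. Grouping by exact signature match, as done here, is the crudest variant; a fuller approach would also look up signatures that differ in only a few bits.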

Author

Krishna Y. Kamath

Demo

For a demo of how to use this class, see the streamingLSHClusteringDemo() method in the Python module demo/StreamingLSHClusteringDemo.py.

Dependencies

Applications Using Streaming LSH

Applications currently using Streaming LSH:

Note: If you want your application listed here, contact the author.

References

  • Aristides Gionis, Piotr Indyk, and Rajeev Motwani. 1999. Similarity Search in High Dimensions via Hashing. In Proceedings of the 25th International Conference on Very Large Data Bases (VLDB '99), Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez, Stanley B. Zdonik, and Michael L. Brodie (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 518-529.
  • Moses S. Charikar. 2002. Similarity estimation techniques from rounding algorithms. In Proceedings of the thirty-fourth annual ACM symposium on Theory of computing (STOC '02). ACM, New York, NY, USA, 380-388. DOI=10.1145/509907.509965
  • Deepak Ravichandran, Patrick Pantel, and Eduard Hovy. 2005. Randomized algorithms and NLP: using locality sensitive hash function for high speed noun clustering. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics (ACL '05). Association for Computational Linguistics, Stroudsburg, PA, USA, 622-629. DOI=10.3115/1219840.1219917
  • F. Morchen, K. Brinker, and C. Neubauer. 2007. Any-time clustering of high frequency news streams. In The Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining: Data Mining Case Studies Workshop (DMCS), August 2007.
  • Benjamin Van Durme and Ashwin Lall. 2010. Online generation of locality sensitive hash signatures. In Proceedings of the ACL 2010 Conference Short Papers (ACLShort '10). Association for Computational Linguistics, Stroudsburg, PA, USA, 231-235.
