rockhowse/lda

 
 

microscopes-lda

A Python package for finding unobserved structure in unstructured data.

This package contains an implementation of the nonparametric (HDP) latent Dirichlet allocation (LDA) model described by Teh et al. in "Hierarchical Dirichlet Processes" (Journal of the American Statistical Association 101: pp. 1566–1581). Unlike the original LDA model, nonparametric LDA does not require the user to select the number of topics. Instead, the number of topics is inferred from the data using a hierarchical Dirichlet process prior.
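To give a feel for how a Dirichlet process prior leaves the number of clusters open, here is a minimal sketch of the Chinese restaurant process, written in plain Python. It is an illustration only and does not use or reflect the microscopes-lda API; the function name `sample_crp` and its parameters are hypothetical.

```python
import random


def sample_crp(n_customers, alpha, seed=0):
    """Seat customers by a Chinese restaurant process and return table sizes.

    Illustrative helper only; it is not part of microscopes-lda.
    """
    rng = random.Random(seed)
    table_sizes = []
    for _ in range(n_customers):
        # An existing table is chosen with weight equal to its size;
        # a new table is opened with weight alpha.
        weights = table_sizes + [alpha]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(table_sizes):
            table_sizes.append(1)      # open a new table (a new cluster)
        else:
            table_sizes[choice] += 1   # join an existing table
    return table_sizes


# The number of occupied tables is not fixed in advance.
for n in (10, 100, 1000):
    print(n, len(sample_crp(n, alpha=1.0)))
```

Larger values of `alpha` tend to open more tables, which is the sense in which the prior, rather than the user, governs how many clusters appear.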

The current kernel follows the sampling scheme described in Section 5.1 of that paper, "Posterior sampling in the Chinese restaurant franchise." In the future, we may support the other kernels described in the paper.
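The Chinese restaurant franchise metaphor can be sketched as a two-level seating process: each document is a restaurant whose words sit at tables, and each table serves one dish (topic) drawn from a menu shared by all documents. The snippet below simulates only the prior seating under that metaphor. It is not the package's kernel, and it omits the word likelihood terms that the posterior sampler in Section 5.1 uses to weight each choice. The names `sample_franchise`, `alpha`, and `gamma` are assumptions chosen for illustration.

```python
import random


def sample_franchise(doc_lengths, alpha, gamma, seed=0):
    """Simulate prior seating in a Chinese restaurant franchise.

    Returns (docs, dish_table_counts): docs[j][i] is the topic of word i in
    document j, and dish_table_counts[k] is the number of tables, across all
    documents, that serve topic k. Illustrative only, not the library code.
    """
    rng = random.Random(seed)
    dish_table_counts = []
    docs = []
    for length in doc_lengths:
        table_sizes = []    # customers seated at each table of this document
        table_dishes = []   # topic served at each table of this document
        word_topics = []
        for _ in range(length):
            # Choose a table: existing tables by size, a new one by alpha.
            t = rng.choices(range(len(table_sizes) + 1),
                            weights=table_sizes + [alpha])[0]
            if t == len(table_sizes):
                # A new table picks its dish from the shared menu:
                # existing dishes by table count, a new dish by gamma.
                k = rng.choices(range(len(dish_table_counts) + 1),
                                weights=dish_table_counts + [gamma])[0]
                if k == len(dish_table_counts):
                    dish_table_counts.append(0)
                dish_table_counts[k] += 1
                table_sizes.append(0)
                table_dishes.append(k)
            table_sizes[t] += 1
            word_topics.append(table_dishes[t])
        docs.append(word_topics)
    return docs, dish_table_counts


docs, dish_table_counts = sample_franchise([20, 30, 10], alpha=1.0, gamma=1.0)
print("topics in use:", len(dish_table_counts))
```

Because the menu is shared, a topic opened in one document can be reused by tables in later documents, which is how topics end up shared across the corpus.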

Numerical computation is implemented in C++ for efficiency.

Installation

OS X and Linux builds of microscopes-lda are released to Anaconda.org. Installing them requires Conda. To install the current release, run:

$ conda install -c datamicroscopes -c distributions microscopes-lda

About

Latent Dirichlet allocation (LDA) for datamicroscopes

Languages

  • C++ 50.5%
  • Python 42.8%
  • CMake 4.1%
  • Makefile 1.3%
  • Shell 1.3%