A wrapper on Mathieu Blondel's PLSA package to run on different types of data.

sperez8/microbPLSA

microbPLSA

Why microbPLSA?

Big data needs big analyses and big visualization. Probabilistic Latent Semantic Analysis (PLSA) was originally developed as an indexing tool to organize large collections of text documents based on word occurrences. PLSA is a dimension reduction technique that finds patterns in a dataset by probabilistically determining the 'topics' driving the word-document structure. For example, the frequent co-occurrence of the words 'hollywood', 'love', and 'celebrity' in a collection of magazines could be detected as being strongly associated with a single topic. Different visualizations, such as parallel plots, are used to explore the relationship between topics and the word-document structure.
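To make the idea of 'topics' concrete, here is a minimal sketch of PLSA fitted by expectation-maximization on a small document-by-word count matrix. It only illustrates the general technique: the function name `plsa`, its parameters, and the toy matrix are assumptions made for this example, and it is not the API of Mathieu Blondel's package or of microbPLSA.

```python
import numpy as np


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Illustrative PLSA fit by EM (hypothetical helper, not the package API).

    counts is a (documents x words) matrix of occurrence counts.
    Returns P(z), P(d|z) and P(w|z) as NumPy arrays.
    """
    rng = np.random.RandomState(seed)
    n_docs, n_words = counts.shape

    # Random starting point; each column of P(d|z) and P(w|z) sums to 1.
    p_z = np.full(n_topics, 1.0 / n_topics)
    p_d_z = rng.rand(n_docs, n_topics)
    p_d_z /= p_d_z.sum(axis=0)
    p_w_z = rng.rand(n_words, n_topics)
    p_w_z /= p_w_z.sum(axis=0)

    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z) * P(d|z) * P(w|z).
        joint = p_z[None, None, :] * p_d_z[:, None, :] * p_w_z[None, :, :]
        post = joint / np.maximum(joint.sum(axis=2, keepdims=True), 1e-12)

        # M-step: re-estimate parameters from counts weighted by P(z|d,w).
        weighted = counts[:, :, None] * post      # (docs, words, topics)
        p_d_z = weighted.sum(axis=1)              # (docs, topics)
        p_w_z = weighted.sum(axis=0)              # (words, topics)
        p_z = p_d_z.sum(axis=0)                   # unnormalized topic mass
        p_d_z /= np.maximum(p_z, 1e-12)
        p_w_z /= np.maximum(p_z, 1e-12)
        p_z /= p_z.sum()

    return p_z, p_d_z, p_w_z


# Toy data: rows are documents, columns are words.
counts = np.array([[5, 4, 0, 0],
                   [4, 5, 0, 0],
                   [0, 0, 5, 4],
                   [0, 0, 4, 5]])
p_z, p_d_z, p_w_z = plsa(counts, n_topics=2)
print(np.round(p_w_z, 3))  # P(w|z): one column per topic
```

Each column of `p_w_z` is a probability distribution over words for one topic, which is the kind of quantity a parallel plot or similar visualization would then display. The dense three-dimensional posterior array keeps the sketch short but would not scale to large datasets.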

What is microbPLSA?

MicrobPLSA extends Mathieu Blondel's PLSA Python package by adding some analysis modules and automating different visualization techniques.

Packages:

  • numpy
  • scipy
  • matplotlib

Note: microbPLSA was developed with Python 2.7.
