microbPLSA

Why microbPLSA?

Big data needs big analyses and big visualization. Probabilistic Latent Semantic Analysis (PLSA) was originally developed as an indexing tool to organize large collections of text documents based on word occurrences. PLSA is a dimension reduction technique that finds patterns in a dataset by probabilistically inferring the 'topics' driving the word-document structure. For example, the frequent co-occurrence of the words 'hollywood', 'love', and 'celebrity' in a collection of magazines could be detected as strongly associated with a single topic. Different visualizations, such as parallel plots, can be used to explore the relationship between topics and the word-document structure.
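To make the idea of probabilistically inferring topics concrete, here is a minimal sketch of PLSA fitted by expectation-maximization using only numpy. It is an illustration written for this README, not the API of microbPLSA or of the PLSA package it builds on. The function name `plsa`, its parameters, and the toy count matrix (including the three science words) are all assumptions made up for the example.

```python
import numpy as np


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Hypothetical helper: fit PLSA by EM on a (documents x words) count matrix.

    Returns P(z) with shape (topics,), P(d|z) with shape (docs, topics)
    and P(w|z) with shape (words, topics).
    """
    counts = np.asarray(counts, dtype=float)
    n_docs, n_words = counts.shape
    rng = np.random.RandomState(seed)

    # Random start, normalised so that each quantity is a valid distribution.
    p_z = rng.rand(n_topics)
    p_z /= p_z.sum()
    p_d_z = rng.rand(n_docs, n_topics)
    p_d_z /= p_d_z.sum(axis=0)
    p_w_z = rng.rand(n_words, n_topics)
    p_w_z /= p_w_z.sum(axis=0)

    for _ in range(n_iter):
        # E-step: P(z | d, w) is proportional to P(z) P(d|z) P(w|z).
        joint = p_z[None, None, :] * p_d_z[:, None, :] * p_w_z[None, :, :]
        post = joint / np.maximum(joint.sum(axis=2, keepdims=True), 1e-12)

        # M-step: re-estimate everything from the expected counts
        # n(d, w) * P(z | d, w).
        weighted = counts[:, :, None] * post
        p_d_z = weighted.sum(axis=1)      # (docs, topics)
        p_w_z = weighted.sum(axis=0)      # (words, topics)
        topic_mass = p_d_z.sum(axis=0)    # (topics,)
        p_d_z /= np.maximum(topic_mass, 1e-12)
        p_w_z /= np.maximum(topic_mass, 1e-12)
        p_z = topic_mass / topic_mass.sum()

    return p_z, p_d_z, p_w_z


# Toy data (made up): rows are documents, columns are word counts.
vocab = ['hollywood', 'love', 'celebrity', 'genome', 'bacteria', 'soil']
counts = np.array([
    [4, 3, 5, 0, 0, 0],
    [3, 5, 2, 0, 0, 0],
    [0, 0, 0, 5, 4, 3],
    [0, 0, 0, 2, 6, 4],
])
p_z, p_d_z, p_w_z = plsa(counts, n_topics=2)

# List the words of each topic, most probable first.
for z in range(p_w_z.shape[1]):
    ranked = [vocab[i] for i in np.argsort(p_w_z[:, z])[::-1]]
    print("topic %d: %s" % (z, ", ".join(ranked)))
```

Each column of `p_w_z` is a distribution over words for one topic, so sorting a column gives the words most associated with that topic. The dense three-dimensional array keeps the sketch short but will not scale to large datasets.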

What is microbPLSA?

MicrobPLSA extends Mathieu Blondel's PLSA Python package by adding analysis modules and automating several visualization techniques.

Packages:

numpy
scipy
matplotlib

Note: microbPLSA was developed with Python 2.7.

Repository: sperez8/microbPLSA

Folders and files

Latest commit

History

Repository files navigation

microbPLSA

Why microbPLSA?

What is microbPLSA?

About

Resources

Stars

Watchers

Forks

Languages