
PSPN

Python implementation of Poisson Sum-Product Networks (PSPNs). It provides routines for inference and learning.

Overview

Paper reference here

Requirements

  • numpy
  • sklearn
  • scipy
  • numba
  • matplotlib
  • joblib
  • networkx
  • pandas
  • sympy
  • statsmodels
  • mpmath
  • h2o (required only for some experiments)
  • gensim (required only for some experiments)
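All of these are available from PyPI; a possible one-line install (note that sklearn is published on PyPI as scikit-learn; add h2o and gensim only if you run those experiments):

    pip install numpy scikit-learn scipy numba matplotlib joblib networkx pandas sympy statsmodels mpmath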

Usage

Please look in the experiments folder; there you will find the code and data for the experiments in the paper. There are usually two files per experiment: one that computes the results and one that plots them.

To learn a structure:

    from algo.learnspn import LearnSPN

    spn = LearnSPN(alpha=0.001, min_instances_slice=50).fit_structure(train_data)

To learn with a joblib cache, so that repeated runs can reuse cached intermediate results:

    from joblib import Memory

    from algo.learnspn import LearnSPN

    memory = Memory(cachedir="/tmp", verbose=0, compress=9)
    spn = LearnSPN(alpha=0.001, min_instances_slice=50, cache=memory).fit_structure(train_data)

To do inference:

    spn.eval(test_data)
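Putting the pieces together, a minimal end-to-end sketch. The synthetic Poisson count matrix is only a placeholder (PSPNs model count data), and the individual= keyword follows the "eval" description further down:

    import numpy

    from algo.learnspn import LearnSPN

    # Placeholder count data: 200 rows x 5 features of non-negative integers.
    rng = numpy.random.RandomState(0)
    counts = rng.poisson(lam=3.0, size=(200, 5))
    train_data, test_data = counts[:150], counts[150:]

    # Learn the structure, then score held-out rows.
    spn = LearnSPN(alpha=0.001, min_instances_slice=50).fit_structure(train_data)
    ll_per_row = spn.eval(test_data, individual=True)  # assumed: one LL per row
    print(ll_per_row)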

Good starting points:

  • /spyn/experiments/MI/nipsGraph.py
  • /spyn/experiments/dependencytypes/plotDifferentDistributionTypes.py

More details:

The learning algorithm is in algo/learnspn.py; the structure-learning method is called "fit_structure".

Inference algorithms are located in spn/linked/spn.py (a sketch follows the list):

  • To compute the log-likelihood, use the method "eval". Pass individual=True to get the LL per data row, or individual=False to get the LL for the whole dataset.
  • To compute perplexity, use the method "perplexity".
  • To do Max-Product (MPE) inference, use the method "complete" and pass instances with None in the positions where you want the MPE.
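Continuing the sketch above (spn, test_data, and the numpy import carry over; the exact call shapes are assumptions based on the descriptions here):

    # Log-likelihood per row and for the whole dataset.
    ll_rows = spn.eval(test_data, individual=True)
    ll_all = spn.eval(test_data, individual=False)

    # Perplexity of the test set.
    ppl = spn.perplexity(test_data)

    # Max-Product / MPE completion: None marks the entries to fill in.
    # dtype=object lets the array hold both counts and None.
    query = numpy.array([[2, None, 0, None, 1]], dtype=object)
    completed = spn.complete(query)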

The same file also exposes information-theoretic and statistical quantities. Each method takes the feature name(s) you want to compute and the array of all feature names (a sketch follows the list):

  • "computeMI": mutual information I(X,Y) = Sum_{x,y} P(x,y) * (log2 P(x,y) - (log2 P(x) + log2 P(y)))
  • "computeEntropy": entropy H(X) = -Sum_x P(x) * log2 P(x)
  • "computeEntropy2": joint entropy H(X,Y) = -Sum_{x,y} P(x,y) * log2 P(x,y)
  • "computeExpectation": expectation E(X) = Sum_x x * P(x)
  • "computeExpectation2": expectation E(X,Y) = Sum_{x,y} (x * y) * P(x,y)
  • "computeCov": covariance Cov(X,Y) = E(X,Y) - E(X) * E(Y)
  • "computeDistance": distance d(X,Y) = H(X) + H(Y) - 2.0 * I(X,Y)
  • "computeNormalizedDistance": normalized distance D(X,Y) = d(X,Y) / H(X,Y)
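A sketch of these calls, continuing from above. The feature names are hypothetical, and the argument order (queried name(s) first, then the full name array) is an assumption; check spn/linked/spn.py for the exact signatures:

    feature_names = ["A", "B", "C", "D", "E"]  # hypothetical column names

    h_a = spn.computeEntropy("A", feature_names)              # H(A)
    h_ab = spn.computeEntropy2("A", "B", feature_names)       # H(A, B)
    mi_ab = spn.computeMI("A", "B", feature_names)            # I(A, B)

    e_a = spn.computeExpectation("A", feature_names)          # E(A)
    e_ab = spn.computeExpectation2("A", "B", feature_names)   # E(A, B)
    cov_ab = spn.computeCov("A", "B", feature_names)          # Cov(A, B)

    d_ab = spn.computeDistance("A", "B", feature_names)            # H(A)+H(B)-2*I(A,B)
    nd_ab = spn.computeNormalizedDistance("A", "B", feature_names) # d(A,B)/H(A,B)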

Model introspection (a sketch follows the list):

  • To get the number of nodes in the SPN, use "size"; this gives an idea of how big/deep the model is.
  • To get the SPN in a text representation, use "to_text" and pass the feature names as an array of strings.
  • To get the SPN in a graph representation, use "to_graph"; this returns a networkx DiGraph, and you can use "save_pdf_graph" to plot the graph.
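Finally, a sketch of the introspection helpers, reusing spn and feature_names from above; whether "size" is a method or an attribute, and the exact save_pdf_graph signature, are assumptions:

    # Rough model size: number of nodes in the SPN.
    print(spn.size())

    # Text representation, given the feature names.
    print(spn.to_text(feature_names))

    # Graph representation: a networkx DiGraph.
    g = spn.to_graph()

    # Render the graph to a PDF; the exact arguments save_pdf_graph expects
    # are an assumption -- check spn/linked/spn.py.
    spn.save_pdf_graph("spn.pdf")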
