SAVIGP

This code implements the inference framework for Gaussian process (GP) models proposed in [1]. The framework performs inference for GP models with arbitrary likelihood functions and scales to large datasets.

SAVIGP stands for Scalable Automated Variational Inference for Gaussian Process models.

Experiments

Example setups for different likelihood functions are available in the GP/experiment_setup.py file.

To replicate the experiments reported in the paper, uncomment the corresponding line in GP/experiment_run.py. For example, the following line runs the experiment on the Boston dataset:

ExperimentRunner.boston_experiment()

Experiments can also be run concurrently, as shown in that file; a rough sketch is given below.
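As a hypothetical sketch only (the mechanism actually used in GP/experiment_run.py may differ, the import path is assumed, and abalone_experiment is a placeholder name), several experiments could be launched in parallel with Python's multiprocessing module:

from multiprocessing import Process
from experiment_run import ExperimentRunner  # assumed import path (GP/experiment_run.py)

# Hypothetical list of experiments to run in parallel; method names other
# than boston_experiment are placeholders and may not exist in the code.
experiments = [
    ExperimentRunner.boston_experiment,
    ExperimentRunner.abalone_experiment,
]

# Start each experiment in its own process, then wait for all to finish.
processes = [Process(target=run) for run in experiments]
for p in processes:
    p.start()
for p in processes:
    p.join()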

Output

After an experiment finishes, its results are saved in a directory called results, located one level above the code directory. Each experiment's results are stored in a separate subdirectory containing the following files:

File            Content
test_.csv       Results of the predictions on the test data.
train_.csv      The training data used. The model can be configured not to save these files when the training dataset is large.
model.dump      An image of the model. The image is updated after each optimisation iteration and can later be used to initialise the model.
opt.dump        The last state of the optimiser. It can be used to continue the optimisation from the last iteration.
config_.csv     The configuration of the model.
*.log           The log file.
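If the dump files are plain Python pickles (an assumption; check how the code writes them before relying on this), a saved model image and optimiser state could be loaded back along these lines, for example to warm-start a new run:

import pickle

# Hypothetical paths: each experiment has its own subdirectory under the
# results folder one level above the code. The assumption that model.dump
# and opt.dump are plain pickle files should be verified against the code
# that writes them.
with open('../results/boston_example/model.dump', 'rb') as f:
    saved_model = pickle.load(f)

with open('../results/boston_example/opt.dump', 'rb') as f:
    saved_opt_state = pickle.load(f)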

Visualization

To plot the generated results, uncomment the last line of GP/experiment_run.py:

ExperimentRunner.plot()

If the results folder contains the results of more than one experiment, this method plots the average over those results. The type of plot depends on the likelihood function: for example, bar charts are generated for classification and box plots for regression models. The likelihood type is read from the config_.csv file. The plot function also exports the data underlying the graphs (e.g., the heights of the bars and the sizes of the error bars) into a separate folder called graph_data, one level above the code folder. These data can be used to regenerate the plots with other tools; the plots in the paper were produced with the R script in graphs_R/graphs.R. A rough example of reading the exported data back is sketched below.
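As an illustration only (the file name and column layout below are assumptions; inspect the exported files for the exact format), the exported graph data can be read back with pandas and re-plotted with any tool:

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical file name; the actual CSV files in graph_data depend on the
# experiment and the likelihood type.
df = pd.read_csv('../graph_data/boston_example.csv')
df.plot(kind='box')
plt.show()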

Likelihood functions

The code comes with the following pre-defined likelihood functions:

Likelihood class                 Problem type
Likelihood.UnivariateGaussian    Gaussian process regression with a single output (Normal likelihood).
Likelihood.MultivariateGaussian  Gaussian process regression with multi-dimensional output (Normal likelihood).
Likelihood.LogGaussianCox        Log Gaussian Cox process; can be used, for example, to predict the rate of incidents.
Likelihood.LogisticLL            Logistic likelihood; can be used for binary classification.
Likelihood.SoftmaxLL             Softmax likelihood; can be used for multi-class classification.
Likelihood.WarpLL                Likelihood corresponding to warped Gaussian processes.
Likelihood.CogLLL                Likelihood corresponding to Gaussian process regression networks.

To define a new likelihood function, extend the Likelihood.likelihood class and implement the required methods; see the class documentation for details. A minimal sketch is given below.
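The sketch below only illustrates the rough shape of a subclass; the base-class name, import path, and method names used here are assumptions, and the authoritative list of methods to implement is the class documentation mentioned above.

import numpy as np
from likelihood import Likelihood  # assumed base-class name; see likelihood.py

class LaplaceLL(Likelihood):
    """Hypothetical Laplace-noise likelihood, shown only to illustrate the
    shape of a subclass. Implement whatever methods the base class actually
    requires; ll_F_Y, get_params and set_params below are assumptions."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def ll_F_Y(self, F, Y):
        # log p(y | f) under Laplace noise, evaluated at latent samples F
        return -np.abs(Y - F) / self.scale - np.log(2.0 * self.scale)

    def get_params(self):
        # parameters are optimised in log space for positivity
        return np.array([np.log(self.scale)])

    def set_params(self, params):
        self.scale = np.exp(params[0])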

Example

Below is a short example using the Boston dataset:

import logging
from ExtRBF import ExtRBF
from model_learn import ModelLearn
from data_transformation import MeanTransformation
from likelihood import UnivariateGaussian
from data_source import DataSource
import numpy as np

# defining model type. It can be "mix1", "mix2", or "full"
method = "full"

# number of inducing points
num_inducing = 30

# loading data
data = DataSource.boston_data()

d = data[0]
Xtrain = d['train_X']
Ytrain = d['train_Y']
Xtest = d['test_X']
Ytest = d['test_Y']

# a name used for the folders and files created when exporting results
name = 'boston'

# defining the likelihood function
cond_ll = UnivariateGaussian(np.array(1.0))

# number of samples used for approximating the likelihood and its gradients
num_samples = 2000

# defining the kernel
kernels = [ExtRBF(Xtrain.shape[1], variance=1.0, lengthscale=np.array((1.,)), ARD=False)]

ModelLearn.run_model(Xtest,
                     Xtrain,
                     Ytest,
                     Ytrain,
                     cond_ll,
                     kernels,
                     method,
                     name,
                     d['id'],
                     num_inducing,
                     num_samples,
                     # ratio of inducing points to training points
                     # (float division, since Python 2 is assumed)
                     float(num_inducing) / Xtrain.shape[0],

                     # optimise hyper-parameters (hyp), posterior parameters (mog), and likelihood parameters (ll)
                     ['hyp', 'mog', 'll'],

                     # Transform data before training
                     MeanTransformation,

                     # place inducing points on training data. If False, they will be placed using clustering
                     True,
                     
                     # level of logging
                     logging.DEBUG,
                     
                     # do not export training data into csv files
                     False,
                     
                     # add a small latent noise to the kernel for the stability of numerical computations
                     latent_noise=0.001,

                     # for how many iterations each set of parameters will be optimised
                     opt_per_iter={'mog': 25, 'hyp': 25, 'll': 25},

                     # total number of global optimisations
                     max_iter=200,

                     # number of threads
                     n_threads=1,

                     # size of each partition of data
                     partition_size=3000)

The code above shows how to configure the model. Two options significantly affect speed and memory usage: n_threads and partition_size. The dataset is divided into partitions of size partition_size, and the calculations on each partition are performed on a separate thread, with at most n_threads threads running at once.
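For example, under these settings a dataset larger than partition_size is split as in the small calculation below (the dataset size is hypothetical):

import math

n_train = 100000          # hypothetical number of training points
partition_size = 3000
n_partitions = int(math.ceil(float(n_train) / partition_size))   # 34 partitions

# With n_threads = 4, at most four of these partitions are processed in
# parallel at any time; memory usage grows with partition_size and n_threads.
print(n_partitions)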

Dependencies

The following packages are required:

  • Python 2.7 (2.7.6)
  • Scipy (0.15.1)
  • Numpy (1.9.1)
  • GPy (0.6.0)
  • pandas (0.16.0)
  • scikit-learn (0.14.1)
  • matplotlib (1.3.1)

The following packages are required for the tests:

  • DerApproximator (0.52)
  • texttable (0.8.2)

The numbers in parentheses indicate the tested versions.

References

[1] A. Dezfouli and E. V. Bonilla. Scalable Inference for Gaussian Process Models with Black-Box Likelihoods. In Advances in Neural Information Processing Systems (NIPS), Montreal, December 2015.
