SDCIT: Self-Discrepancy Conditional Independence Test

Overview

sdcit is a Python package for conditional independence testing that implements SDCIT (Lee and Honavar, 2017). The algorithm relies on a notion of closeness among observations, defined by a kernel function, and on conditional permutation, which yields a pseudo-null sample.

The algorithm depends on Blossom-V (Kolmogorov, 2009), which is freely available for research purposes. Commercial use requires purchasing a commercial license for Blossom-V.
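To make the idea of conditional permutation concrete, here is a minimal sketch. It is not the package's implementation: it assumes a simple greedy nearest-neighbour pairing on the conditioning variable Z, in place of the minimum-cost perfect matching that a solver such as Blossom-V computes. The function name `greedy_conditional_permutation` is hypothetical.

```python
import numpy as np


def greedy_conditional_permutation(Z, rng):
    """Pair each sample with a nearby sample in Z and return the induced permutation.

    Illustrative only: a greedy nearest-neighbour pairing stands in for a
    minimum-cost perfect matching.
    """
    n = len(Z)
    # Pairwise squared Euclidean distances between the conditioning values.
    D = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)

    perm = np.arange(n)
    unpaired = np.ones(n, dtype=bool)
    for i in rng.permutation(n):  # visit samples in random order
        if not unpaired[i]:
            continue
        unpaired[i] = False
        if not unpaired.any():
            break  # odd n: the last sample keeps its own index
        candidates = np.flatnonzero(unpaired)
        j = candidates[np.argmin(D[i, candidates])]  # closest unpaired sample
        perm[i], perm[j] = j, i  # the two samples exchange places
        unpaired[j] = False
    return perm


rng = np.random.default_rng(0)
Z = rng.standard_normal((200, 2))
Y = Z + rng.standard_normal((200, 2))
perm = greedy_conditional_permutation(Z, rng)
Y_pseudo_null = Y[perm]  # Y values exchanged between samples with similar Z
```

Because Y is exchanged only between samples whose Z values are close, the permuted sample approximately preserves the relationship between Y and Z while breaking any remaining link between X and Y.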

Installation

Python 3.6 or later is required. Other required packages are listed in requirements.txt. The following script downloads the sdcit code together with Blossom-V and installs it using the python3 and pip found on your path.

cd ~/Downloads
git clone https://github.com/sanghack81/SDCIT
cd SDCIT
pip install -r requirements.txt
pip install 'gpflow<2.0' 'tensorflow<2.0'
./setup.sh
python3 setup.py build_ext --inplace
pip install -e .
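After running the script, a quick sanity check can confirm that the interpreter meets the version requirement and that the package can be found. This is a minimal sketch using only the standard library; the helper name `sdcit_environment_ok` is hypothetical, and the module name `sdcit` is taken from the imports in the examples.

```python
import importlib.util
import sys


def sdcit_environment_ok(min_version=(3, 6), module_name="sdcit"):
    """Return True if the interpreter is recent enough and the module can be found."""
    recent_enough = sys.version_info >= min_version
    importable = importlib.util.find_spec(module_name) is not None
    return recent_enough and importable


print("sdcit ready:", sdcit_environment_ok())
```

The check only locates the package; it does not verify that the compiled extension built by setup.py loads correctly.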

Examples

We provide three simple examples, in which the kernel matrices are computed using the median heuristic.

import numpy as np
from sdcit.sdcit_mod import SDCIT
from sdcit.utils import rbf_kernel_median

np.random.seed(0)

N = 200
# Three mutually independent random variables
X = np.random.randn(N, 2)
Y = np.random.randn(N, 2)
Z = np.random.randn(N, 2)
Kx, Ky, Kz = rbf_kernel_median(X, Y, Z)  # median heuristic
test_statistic, p_value = SDCIT(Kx, Ky, Kz)
print('p_value: {:.4f}'.format(p_value))

# X and Y are conditionally dependent given Z (common effect)
# X --> Z <-- Y
Z = X + Y + np.random.randn(N, 2)
Kx, Ky, Kz = rbf_kernel_median(X, Y, Z)  # median heuristic
test_statistic, p_value = SDCIT(Kx, Ky, Kz)
print('p_value: {:.4f}'.format(p_value))

# X and Y are conditionally independent given Z (common cause)
# X <-- Z --> Y
Z = np.random.randn(N, 2)
X = Z + np.random.randn(N, 2)
Y = Z + np.random.randn(N, 2)
Kx, Ky, Kz = rbf_kernel_median(X, Y, Z)  # median heuristic
test_statistic, p_value = SDCIT(Kx, Ky, Kz)
print('p_value: {:.4f}'.format(p_value))
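The median heuristic used in the examples above sets the kernel bandwidth from the median pairwise distance in the data. The following is a minimal sketch of one common convention, written with NumPy only. It is not the package's `rbf_kernel_median`, whose exact bandwidth convention may differ; the helper name `rbf_kernel_with_median_bandwidth` is hypothetical.

```python
import numpy as np


def rbf_kernel_with_median_bandwidth(data):
    """RBF kernel matrix with sigma^2 set to the median squared pairwise distance.

    Hypothetical helper, not part of sdcit. Uses K = exp(-d^2 / (2 * sigma^2)).
    """
    # Squared Euclidean distances between all pairs of rows.
    sq_dists = np.sum((data[:, None, :] - data[None, :, :]) ** 2, axis=-1)
    # Median over distinct pairs (strict upper triangle), excluding the zero diagonal.
    upper = np.triu_indices(len(data), k=1)
    sigma_sq = np.median(sq_dists[upper])
    return np.exp(-sq_dists / (2.0 * sigma_sq))


np.random.seed(0)
X = np.random.randn(200, 2)
Kx = rbf_kernel_with_median_bandwidth(X)
print(Kx.shape)  # one row and one column per observation
```

Tying the bandwidth to the data's own scale avoids hand-tuning a kernel parameter for each variable.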

References

Sanghack Lee and Vasant Honavar. Self-Discrepancy Conditional Independence Test. In Proceedings of the 33rd Conference on Uncertainty in Artificial Intelligence. 2017.

Gary Doran, Krikamol Muandet, Kun Zhang, and Bernhard Schölkopf. A Permutation-Based Kernel Conditional Independence Test. In Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence. 2014.

Vladimir Kolmogorov. Blossom V: A new implementation of a minimum cost perfect matching algorithm. In Mathematical Programming Computation (MPC), July 2009, 1(1):43-67.