Paper (IEEE Xplore)
The code is mostly self-explanatory via file, variable and function names; more complex lines are commented.
It is designed to require minimal setup overhead, reusing Keras and Sacred integration as much as possible.
Installing Python 3.7.9 on Ubuntu 20.04.2 LTS:
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python3.7
Installing CUDA 10.0:
wget https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda_10.0.130_410.48_linux
sudo bash cuda_10.0.130_410.48_linux --override
echo 'export PATH=/usr/local/cuda-10.0/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
Installing cuDNN 7.6.5:
wget http://people.cs.uchicago.edu/~kauffman/nvidia/cudnn/cudnn-10.0-linux-x64-v7.6.5.32.tgz
# if link is broken, login and download from nvidia:
# https://developer.nvidia.com/compute/machine-learning/cudnn/secure/7.6.5.32/Production/10.0_20191031/cudnn-10.0-linux-x64-v7.6.5.32.tgz
tar -xvzf cudnn-10.0-linux-x64-v7.6.5.32.tgz
sudo cp -r cuda/include/* /usr/local/cuda-10.0/include/
sudo cp -r cuda/lib64/* /usr/local/cuda-10.0/lib64/
Installing Python packages with pip:
python3.7 -m pip install h5py==2.10.0 ipython==7.16.1 keras==2.2.4 matplotlib==3.3.2 numpy==1.19.2 pillow==8.1.0 pywavelets==1.1.1 sacred==0.8.2 scikit-learn==0.23.2 scipy==1.5.2 tensorflow-gpu==1.14.0 tqdm==4.56.0
Reproduction should be as easy as executing this in the root folder (after installing all dependencies):
python3.7 -m IPython experiments/mnistrotated.py with groupwtacrnn nospatial seed=123
In general:
python3.7 -m IPython experiments/dataset.py with algorithm optional_config seed=number
where dataset is either:
    mnistrotated : the Rotated MNIST video set, artificially generated by rotating digits and cropping the top left corner;
    cifar10scanned : the Scanned CIFAR-10 video set, artificially generated by sliding a window;
    necanimal : the NEC Animal natural video set, recorded by placing animal figures on a turning table;
    coil100 : the COIL-100 natural video set, recorded by placing objects on a turning table;
algorithm is either:
    wtacnn : Winner-Take-All (WTA) Time Distributed CNN Autoencoder;
    wtacrnn : Winner-Take-All (WTA) Recurrent CNN Autoencoder;
    randominitcnn : Glorot Initialized Time Distributed CNN;
    randominitcrnn : Glorot Initialized Recurrent CNN;
    denoisingcnn : Denoising Time Distributed CNN Autoencoder;
    denoisingcrnn : Denoising Recurrent CNN Autoencoder;
    vgg19 : ImageNet Pretrained Time Distributed VGG19;
    groupwtacnn : Group k-Sparse Time Distributed CNN Autoencoder;
    groupwtacrnn : Group k-Sparse Recurrent CNN Autoencoder;
and optional_config is either empty (both spatial and lifetime sparsity enabled by default), or:
    nospatial : disable spatial sparsity;
    nolifetime : disable lifetime sparsity.
seed=123 was used in all of our experiments, and should yield numbers very similar to those in the table of our paper.
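The Rotated MNIST generation idea (rotate a digit frame by frame, then crop the top left corner) can be sketched roughly as follows. This is a simplified, dependency-free illustration using 90-degree steps via np.rot90; the actual generator in datasets/mnistrotated.py uses finer rotation angles with interpolation, and the frame count and crop size here are hypothetical:

```python
import numpy as np

def make_rotated_video(image, num_frames=4, crop=16):
    # Sketch only: rotate the image by an increasing angle per frame
    # (here multiples of 90 degrees) and crop the top left corner.
    frames = [np.rot90(image, k=t)[:crop, :crop] for t in range(num_frames)]
    return np.stack(frames)  # shape: (num_frames, crop, crop)

# toy example on a random 28x28 "digit"
video = make_rotated_video(np.random.rand(28, 28))
print(video.shape)  # (4, 16, 16)
```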
algorithms/
keraswtacnn.py : base class, the original WTA autoencoder baseline method
keraswtacrnn.py : subclass, WTA with recurrent connections
kerasrandominitcnn.py : subclass, no pretraining baseline method
kerasrandominitcrnn.py : subclass, no pretraining with recurrent connections
kerasdenoisingcnn.py : subclass, input dropout autoencoder baseline method
kerasdenoisingcrnn.py : subclass, input dropout with recurrent connections
kerasvgg19.py : subclass, imagenet pretraining baseline method
kerasgroupwtacnn.py : subclass, our group k-sparse autoencoder
kerasgroupwtacrnn.py : subclass, our group k-sparse autoencoder with recurrent connections
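The spatial and lifetime sparsity constraints used by the WTA-style models above can be sketched in plain NumPy as follows. This is a simplified illustration of the general WTA idea, not the repository's actual Keras/TensorFlow implementation in utils/ops.py:

```python
import numpy as np

def spatial_sparsity(acts):
    # acts: (batch, height, width, channels); keep only the single largest
    # activation in each feature map of each sample (ties are all kept).
    flat = acts.reshape(acts.shape[0], -1, acts.shape[-1])  # (B, H*W, C)
    peak = flat.max(axis=1, keepdims=True)                  # per-map maximum
    return (flat * (flat == peak)).reshape(acts.shape)

def lifetime_sparsity(acts, k):
    # For each channel, keep only the k samples in the batch with the
    # largest peak activation; the other samples' maps are zeroed.
    peaks = acts.reshape(acts.shape[0], -1, acts.shape[-1]).max(axis=1)  # (B, C)
    thresh = np.sort(peaks, axis=0)[-k]                                  # (C,)
    return acts * (peaks >= thresh)[:, None, None, :]
```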
datasets/
mnistrotated.py : base class, loads Rotated MNIST data set and generates given number of labeled samples
cifar10scanned.py : subclass, same but for Scanned CIFAR-10
coil100.py : subclass, same but for COIL-100
necanimal.py : subclass, same but for NEC Animal
experiments/
mnistrotated.py : config file for hyperparameters, loads Rotated MNIST data set and an algorithm, conducts experiment
cifar10scanned.py : same, but for Scanned CIFAR-10
coil100.py : same, but for COIL-100
necanimal.py : same, but for NEC Animal
results/ : experimental results are saved to this directory by the Sacred package
utils/
layers.py : custom Keras layer classes, including
    ConvMinimalRNN2D : the convolutional minimal recurrent layer
ops.py : custom Keras/TensorFlow operations, including
    n_p : p-norm computation
    group_norms : grouped p-norm computation
    ksparse : top-k masking activation function
    group_ksparse : our grouped top-k masking activation function
pil.py : backwards-compatible functions for saving all kinds of figures
plot.py : functions for saving video frame figures
preprocessing.py : functions for ZCA whitening
utils.py : additional utilities, including
    VideoSequence : a Keras Sequence subclass generating random videos
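The grouped top-k masking idea can be sketched as follows: channels are partitioned into groups, each group is scored by a p-norm (p=2 here), and only the k highest-scoring groups per sample survive. This is a rough NumPy illustration; the repository's actual group_ksparse in utils/ops.py operates on Keras/TensorFlow tensors and may differ in detail:

```python
import numpy as np

def group_ksparse(acts, groups, k):
    # acts: (batch, positions, channels); groups: list of channel-index lists.
    out = np.zeros_like(acts)
    for b in range(acts.shape[0]):
        # score each channel group by the 2-norm of its activations
        scores = np.array([np.linalg.norm(acts[b][:, g]) for g in groups])
        for i in np.argsort(scores)[-k:]:      # keep only the top-k groups
            out[b][:, groups[i]] = acts[b][:, groups[i]]
    return out
```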
@inproceedings{milacski2019group,
title={Group k-sparse temporal convolutional neural networks: unsupervised pretraining for video classification},
author={Milacski, Zolt{\'a}n {\'A} and P{\'o}czos, Barnab{\'a}s and L{\H{o}}rincz, Andr{\'a}s},
booktitle={2019 International Joint Conference on Neural Networks (IJCNN)},
pages={1--10},
year={2019},
organization={IEEE}
}
In case of any questions, feel free to create an issue here on GitHub, or mail me at srph25@gmail.com.