
=======================================================================

Requirements

Latest Python2 (python2.7.*)
numpy + scipy
Theano
HDF5

How to Install

Official Installation Guide For Related Packages
    
    Numpy & Scipy:
        http://docs.scipy.org/doc/numpy/user/install.html
    Theano:
        http://deeplearning.net/software/theano/install.html
    HDF5:
        https://hdfgroup.org/HDF5/

How to Config Theano

Theano is the backbone of this project. To configure Theano, see theano-config for more detailed help. You need to write the configuration to ~/.theanorc. The following Theano configuration is recommended.

For CPU users:

[global]
floatX = float32
device = cpu
mode = FAST_RUN
warn_float64 = warn

For GPU users (here the device can be any other GPU, e.g. gpu1):

[global]
floatX = float32
device = gpu0
mode = FAST_RUN
warn_float64 = warn
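
The two recommended configurations differ only in the device value, so the file can also be generated from a small helper. The following is a minimal sketch of that idea; the function names theanorc_text and write_theanorc are hypothetical and not part of this project.

```python
import os


def theanorc_text(device="cpu"):
    """Return the recommended [global] section for the given device.

    `device` is "cpu" or a GPU name such as "gpu0".
    Hypothetical helper, not part of this project.
    """
    lines = [
        "[global]",
        "floatX = float32",
        "device = %s" % device,
        "mode = FAST_RUN",
        "warn_float64 = warn",
    ]
    return "\n".join(lines) + "\n"


def write_theanorc(device="cpu", path=None):
    """Write the configuration to `path` (default: ~/.theanorc)."""
    if path is None:
        path = os.path.expanduser("~/.theanorc")
    with open(path, "w") as f:
        f.write(theanorc_text(device))
    return path
```

Calling write_theanorc("gpu0") overwrites any existing ~/.theanorc, so back up a customized file first.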

=======================================================================

Library

The library has 4 main components:

iterator: data handler

layer: network layers. To construct a network, you need 3 kinds of layers:

interface layer: declare input, mask, output
middle layer: construct the main network layers (a list of layers)
cost layer: construct the network cost

model: network model

optimizer: optimizer to optimize the model

Data Format

The data_file is a folder path containing one hdf5 file per video:

v_ApplyEyeMakeup_g08_c01.h5
v_ApplyEyeMakeup_g08_c02.h5
v_ApplyEyeMakeup_g08_c03.h5
v_ApplyEyeMakeup_g08_c04.h5
v_ApplyEyeMakeup_g08_c05.h5

Each hdf5 file stores all the frame features of one video row by row, i.e., as a matrix of size (#frames, #featureDim).
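
As an illustration of this layout, one file could be read with h5py roughly as follows. This is a sketch under assumptions: the name of the dataset inside each file is not stated here, so the helper (the hypothetical load_video_features) falls back to the first dataset it finds.

```python
import h5py


def load_video_features(h5_path, dataset_name=None):
    """Load the (#frames, #featureDim) matrix stored in one video file.

    The dataset name inside the file is not documented here, so by
    default the first dataset (in sorted order) is used. This is an
    assumption, not the project's own loader.
    """
    with h5py.File(h5_path, "r") as f:
        if dataset_name is None:
            dataset_name = sorted(f.keys())[0]
        features = f[dataset_name][...]  # read the whole dataset into memory
    if features.ndim != 2:
        raise ValueError("expected a 2-D matrix, got shape %r" % (features.shape,))
    return features
```

The first dimension of the returned array is the number of frames of that video.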

The train_framenum.txt file contains the number of frames for each video.

The train_filenames.txt file contains the video filenames relative to the root video directory:

v_ApplyEyeMakeup_g08_c01
v_ApplyEyeMakeup_g08_c02
v_ApplyEyeMakeup_g08_c03
v_ApplyEyeMakeup_g08_c04
v_ApplyEyeMakeup_g08_c05
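
Since the listed names match the hdf5 file names without the extension, each entry can be mapped to its feature file by appending .h5. A minimal sketch, where video_h5_paths is a hypothetical helper and the one-name-per-line layout is taken from the listing above:

```python
import os


def video_h5_paths(filenames_txt, data_dir):
    """Map each name listed in a *_filenames.txt file to its .h5 file.

    Hypothetical helper: assumes one video name per line and that the
    feature file is `<data_dir>/<name>.h5`.
    """
    with open(filenames_txt) as f:
        names = [line.strip() for line in f if line.strip()]
    return [os.path.join(data_dir, name + ".h5") for name in names]
```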

The train_labels.txt file for single-label datasets looks like

0
7
43

and for multi-label datasets:

0,0,0,0,0,0,0,1,0,0,0,0
0,0,0,0,0,0,0,1,0,0,0,0
0,0,0,0,0,0,1,1,0,0,0,0
0,0,0,0,0,0,0,0,0,0,0,1

The same format is required for the validation and test files too.
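
The two label layouts can be told apart by the comma separator. The following sketch parses either one; parse_label_line and read_labels are hypothetical names, not functions of this library.

```python
def parse_label_line(line):
    """Parse one line of a labels file.

    A single-label line ("43") becomes an int class index; a
    multi-label line ("0,0,1,...") becomes a list of 0/1 indicators.
    """
    fields = [int(x) for x in line.strip().split(",")]
    return fields[0] if len(fields) == 1 else fields


def read_labels(lines):
    """Parse an iterable of label lines, skipping blank ones."""
    return [parse_label_line(line) for line in lines if line.strip()]


single = read_labels(["0\n", "7\n", "43\n"])
multi = read_labels(["0,0,0,0,0,0,1,1,0,0,0,0\n"])
```

Here single is a list of class indices and multi is a list of indicator vectors, one entry per video in both cases.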

=======================================================================

Network Architectures

LSTM: LSTM

ALSTM: Attention LSTM

convLSTM: Convolutional LSTM

convALSTM: Convolutional Attention LSTM

motion ALSTM: Attention LSTM with motion-based attention

motion convALSTM: Convolutional Attention LSTM with motion-based attention

Example to run the scripts:

THEANO_FLAGS='floatX=float32,device=gpu0,mode=FAST_RUN,nvcc.fastmath=True' python evaluate_ucf101_rgb_LSTM.py
THEANO_FLAGS='floatX=float32,device=gpu1,mode=FAST_RUN,nvcc.fastmath=True' python evaluate_ucf101_flow_ALSTM.py
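
The same flags can be set from Python when launching several evaluations in a row. This is only a sketch: theano_flags and run_script are hypothetical helpers, and the flag values are copied from the commands above.

```python
import os
import subprocess


def theano_flags(pairs):
    """Join (name, value) pairs into a THEANO_FLAGS string."""
    return ",".join("%s=%s" % (name, value) for name, value in pairs)


def run_script(script, device="gpu0"):
    """Run `script` with the flags used in the commands above."""
    env = dict(os.environ)
    env["THEANO_FLAGS"] = theano_flags([
        ("floatX", "float32"),
        ("device", device),
        ("mode", "FAST_RUN"),
        ("nvcc.fastmath", "True"),
    ])
    # Returns the exit code of the child process.
    return subprocess.call(["python", script], env=env)
```

For example, run_script("evaluate_ucf101_flow_ALSTM.py", device="gpu1") corresponds to the second command above.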

=======================================================================

Feature Extraction

We use the extract_rgbcnn.py and extract_flowcnn.py scripts to extract feature maps (e.g. pool5 features) for RGB and flow input, and the corresponding _fc scripts (e.g. extract_rgbcnn_fc.py) to extract fc features.

Caffe network definition files and models:

rgb

prototxt: ucf101_action_rgb_vgg_16_deploy_features_fc7.prototxt, ucf101_action_rgb_vgg_16_deploy_features_pool5.prototxt

model: ucf101_action_rgb_vgg_16_split1.caffemodel

flow (single flow)

prototxt: ucf101_action_singleflow_vgg_16_deploy_features_fc7.prototxt, ucf101_action_singleflow_vgg_16_deploy_features_pool5.prototxt

model: ucf101_action_singleflow_vgg_16_split1.caffemodel

Example to run the scripts:

python extract_rgbcnn.py --model_def ucf101_action_rgb_vgg_16_deploy_features_pool5.prototxt --model ucf101_action_rgb_vgg_16_split1.caffemodel --gpu_id 0

xiaoyu5301/VideoLSTM
