Multi-Stream CNNs for Video Analysis

We use a spatial stream and a temporal stream, built on VGG-16 and CNN-M respectively, to model video information. LSTMs are stacked on top of the CNNs to model long-term dependencies between video frames. (A sketch of how the two streams' scores can be fused at test time follows the paper list below.) For more information, see these papers:

  • Two-Stream Convolutional Networks for Action Recognition in Videos

  • Fusing Multi-Stream Deep Networks for Video Classification

  • Modeling Spatial-Temporal Clues in a Hybrid Deep Learning Framework for Video Classification

  • Towards Good Practices for Very Deep Two-Stream ConvNets
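
The README does not show how the two streams are combined at test time, but the first paper above fuses them by averaging (or weighting) the per-class softmax scores of the two networks. A minimal late-fusion sketch; the function name and the `w_spatial` parameter are illustrative, not part of this repo:

```python
import numpy as np

def fuse_scores(spatial_probs, temporal_probs, w_spatial=0.5):
    """Weighted average of per-class softmax scores from the two streams."""
    return w_spatial * spatial_probs + (1.0 - w_spatial) * temporal_probs

# Example: per-video class probabilities predicted by each stream
spatial_probs = np.array([0.7, 0.2, 0.1])
temporal_probs = np.array([0.4, 0.5, 0.1])
print(fuse_scores(spatial_probs, temporal_probs).argmax())  # fused prediction
```

Weighting the temporal stream higher than the spatial one is a common choice in the two-stream literature, since motion tends to carry more of the action signal.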


Here are the steps to run the project on a video dataset:

  1. Get the video data, remove broken videos and negative instances, and create a pickle file of the dataset by running the scripts in the utility_scripts folder (see the dataset sketch below).

  2. Temporal stream (in the temporal folder):

     • Run temporal_vid2img to create the optical-flow frames and related files (see the optical-flow sketch below).

     • Run temporal_stream_cnn to start the temporal stream training (see the CNN-M sketch below).

  3. Spatial stream (in the spatial folder):

     • Run spatial_vid2img to create the static frames and related files.

     • Download the vgg16_weights.h5 file from here and put it in the spatial folder.

     • Run spatial_stream_cnn to start the spatial stream training (see the VGG-16 sketch below).

  4. Temporal stream LSTM: will update the code soon (see the LSTM sketch below for the general construction).

  5. Spatial stream LSTM: will update the code soon.
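
Step 1 lives in the utility_scripts folder; the exact scripts are not reproduced here. A minimal sketch of the general shape, assuming one sub-folder per class and using OpenCV to spot broken videos — the function name and pickle layout are illustrative, not the repo's actual format:

```python
import os
import pickle

import cv2  # OpenCV, used here only to detect unreadable videos

def build_dataset_pickle(video_root, out_path="dataset.pkl"):
    """Walk video_root/<label>/<clip>, drop broken videos, pickle (path, label) pairs."""
    samples = []
    for label in sorted(os.listdir(video_root)):
        class_dir = os.path.join(video_root, label)
        if not os.path.isdir(class_dir):
            continue
        for name in os.listdir(class_dir):
            path = os.path.join(class_dir, name)
            cap = cv2.VideoCapture(path)
            ok, _ = cap.read()  # a broken video yields no first frame
            cap.release()
            if ok:
                samples.append((path, label))
    with open(out_path, "wb") as f:
        pickle.dump(samples, f)
    return samples
```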
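
temporal_vid2img turns each video into optical-flow frames. A hedged sketch of one common recipe, using OpenCV's Farneback dense flow and saving the x/y displacement channels as grayscale images; the repo's exact flow algorithm, parameters, and file layout may differ:

```python
import cv2
import numpy as np

def extract_flow_frames(video_path, out_prefix, max_pairs=10):
    """Save dense optical flow between consecutive frames as grayscale images."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        return
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    for i in range(max_pairs):
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        # Rescale each displacement channel to [0, 255] so it can be saved as an image
        for c, axis in enumerate("xy"):
            chan = cv2.normalize(flow[..., c], None, 0, 255, cv2.NORM_MINMAX)
            cv2.imwrite("%s_%03d_%s.jpg" % (out_prefix, i, axis), chan.astype(np.uint8))
        prev_gray = gray
    cap.release()
```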
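
temporal_stream_cnn trains the temporal network on stacks of flow frames. A CNN-M-style Keras sketch, assuming 10 flow pairs stacked into a 20-channel input; the layer sizes follow Chatfield et al.'s CNN-M (with the 2048-unit fc7 of the CNN-M-2048 variant) only approximately, and the repo's actual model may differ:

```python
from tensorflow.keras import layers, models

def cnn_m_temporal(num_classes, flow_stack=20, size=224):
    """CNN-M-style ConvNet over a 20-channel stack of optical-flow frames."""
    model = models.Sequential([
        layers.Input((size, size, flow_stack)),
        layers.Conv2D(96, 7, strides=2, activation="relu"),
        layers.MaxPooling2D(3, strides=2),
        layers.Conv2D(256, 5, strides=2, activation="relu"),
        layers.MaxPooling2D(3, strides=2),
        layers.Conv2D(512, 3, padding="same", activation="relu"),
        layers.Conv2D(512, 3, padding="same", activation="relu"),
        layers.Conv2D(512, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(3, strides=2),
        layers.Flatten(),
        layers.Dense(4096, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(2048, activation="relu"),  # fc7 as in the CNN-M-2048 variant
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="sgd", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```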
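
The spatial stream fine-tunes VGG-16 on single static frames. The repo loads the downloaded vgg16_weights.h5 file by hand; as a stand-in, this sketch pulls ImageNet weights through keras.applications, freezes the convolutional base, and trains a fresh classifier head:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

def vgg16_spatial(num_classes, size=224):
    """Spatial stream: frozen VGG-16 base plus a new classifier head."""
    base = VGG16(weights="imagenet", include_top=False,
                 input_shape=(size, size, 3))
    base.trainable = False  # fine-tune only the new head first
    model = models.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(4096, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="sgd", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```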
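
The LSTM code for steps 4 and 5 is not in the repo yet. Per the description at the top of this README, the usual construction is to run the trained CNN over the frames of a video, collect the penultimate-layer features into a sequence, and train an LSTM on those sequences. A generic sketch; the sequence length and feature size are illustrative:

```python
from tensorflow.keras import layers, models

def stream_lstm(num_classes, seq_len=25, feat_dim=4096):
    """LSTM over a sequence of per-frame CNN features (one model per stream)."""
    model = models.Sequential([
        layers.Input((seq_len, feat_dim)),
        layers.LSTM(512),  # summarize the frame sequence into one vector
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```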

