This repository contains the source code of the CVPR 2017 submission "Hierarchical Boundary-Aware Neural Encoder for Video Captioning".
The code belongs to the original authors of the paper. Please cite the work if you intend to use it.
- Theano 0.9.0
- Keras 1.1.0, configured to use Theano as its backend

Note: be sure to have `"image_dim_ordering": "th"` and `"backend": "theano"` in your `keras.json` file.
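For reference, a `keras.json` with these two settings might look like the following sketch (the `epsilon` and `floatx` fields are the usual Keras 1.x defaults and may differ on your machine):

```json
{
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano"
}
```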
This code comes with support for the Montreal Video Annotation Dataset (M-VAD) and for the MPII Movie Description dataset (MPII-MD). Before running a pre-trained model or training your own, you must follow the instructions for the dataset you intend to use.
Request access and download the dataset from the MILA website. Then create a folder `datasets/M-VAD` in the root of the project, and prepare three subfolders inside it:

- `datasets/M-VAD/videos`: put here all the videos, organized by movie as in the repository from MILA (for instance, you should have `datasets/M-VAD/videos/21_JUMP_STREET/video/21_JUMP_STREET_DVS20.avi`).
- `datasets/M-VAD/annotations`: create three subfolders here, `train`, `test` and `val`, and put in each of them the .srt files corresponding to training (download), test (download) and validation (download), respectively.
- `datasets/M-VAD/features`: leave this folder empty.
Then, compute the C3D and ResNet features by typing in a Python console:

```python
from datasets import MVAD

dataset = MVAD()
dataset.compute_c3d_descriptors()
dataset.compute_resnet_descriptors()
```
Request access and download the dataset from the MPI website. Then create a folder `datasets/MPII-MD` in the root of the project, and prepare three subfolders inside it:

- `datasets/MPII-MD/jpgAllFrames`: unpack here the package with the jpeg frames as provided by MPI. For instance, you should have `datasets/MPII-MD/jpgAllFrames/0001_American_Beauty/0001_American_Beauty_00.00.51.926-00.00.54.129/0001.jpg`.
- `datasets/MPII-MD/annotations`: put here `annotations-someone.csv`, `dataSplit.txt` and `uniqueTestIds.txt`.
- `datasets/MPII-MD/features`: leave this folder empty.
Then, compute the C3D and ResNet features by typing in a Python console:

```python
from datasets import MPII_MD

dataset = MPII_MD()
dataset.compute_c3d_descriptors()
dataset.compute_resnet_descriptors()
```
Download one of the pre-trained models from the Releases page, then edit `main.py` as follows:

- Change line 16 and set the dataset you intend to use:

  ```python
  dataset = MPII_MD()
  ```

- Disable the training flag (line 48):

  ```python
  # Training
  if False:
  ```

- Set the path to the pre-trained model in line 57:

  ```python
  m.load_weights('model.pkl')
  ```
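Taken together, the relevant lines of `main.py` would end up looking roughly like the sketch below (surrounding code omitted; `MPII_MD` and the model object `m` come from the repository's own code, so this fragment is not runnable on its own):

```python
dataset = MPII_MD()          # line 16: pick the dataset, MVAD() or MPII_MD()

# Training                   # line 48: training disabled for evaluation
if False:
    ...

m.load_weights('model.pkl')  # line 57: path to the downloaded pre-trained model
```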