
Spatial-Temporal-Adaptive-Attention-for-Video-Captioning

This is an implementation of "Hierarchical LSTMs with Adaptive Attention for Visual Captioning". It is an extension of the paper "Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning", accepted at the International Joint Conference on Artificial Intelligence (IJCAI) 2017 (https://github.com/zhaoluffy/hLSTMat). This work has been accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019.

The three corresponding versions are: concatenation fusion, dynamic fusion, and ensemble fusion.

Requirements

Python 2.7.6

Theano 0.8.2

Processed data

You need to download a pretrained ResNet model to extract features. We provide our extracted ResNet video features and processed captions at: https://drive.google.com/open?id=1HymvVvAEygM6UJm41dQkQ4IbTWcHT0iQ. Download this dataset, then set RAB_FEATURE_BASE_PATH in config.py to your feature path and RAB_DATASET_BASE_PATH in config.py to your processed data path. You should also specify in config.py where to store your results.
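A minimal sketch of the relevant settings in config.py. The names RAB_FEATURE_BASE_PATH and RAB_DATASET_BASE_PATH come from the repository; the paths and the result-path variable name below are placeholders, not the actual defaults:

```python
# config.py (sketch) -- point these at your local copies of the data.
RAB_FEATURE_BASE_PATH = '/data/msvd/resnet_features/'  # extracted ResNet video features
RAB_DATASET_BASE_PATH = '/data/msvd/processed_caps/'   # processed caption data

# The exact name of the result-path setting is an assumption; check config.py.
RESULT_PATH = '/data/msvd/results/'                    # where training results are written
```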

Evaluation

If you'd like to evaluate BLEU/METEOR/CIDEr scores during training, don't forget to download coco-caption (https://github.com/tylin/coco-caption) and Jobman (http://deeplearning.net/software/jobman/install.html).

You should also add the coco-caption path and the Jobman path to $PYTHONPATH, or make them importable as in the sketch below.
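If you prefer not to modify your shell environment, a minimal Python-side alternative is to extend sys.path before any evaluation imports. The paths below are placeholders for wherever you cloned coco-caption and installed Jobman:

```python
import sys

# Equivalent to adding these directories to $PYTHONPATH; run before importing
# the evaluation code.
sys.path.insert(0, '/path/to/coco-caption')  # provides pycocoevalcap (BLEU/METEOR/CIDEr)
sys.path.insert(0, '/path/to/jobman')        # provides the jobman package
```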

Others

If you have any questions, drop us an email at xiangpengli.cs@gmail.com.

If you find this work useful, please cite:

@article{gao2019hierarchical,
  title={Hierarchical LSTMs with adaptive attention for visual captioning},
  author={Gao, Lianli and Li, Xiangpeng and Song, Jingkuan and Shen, Heng Tao},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={42},
  number={5},
  pages={1112--1131},
  year={2019},
  publisher={IEEE}
}