Our paper has been accepted by ACCV 2018. You can find the pre-print here.
- Python version: 3.6
- Required packages: tensorflow 1.7.0, imageio, numpy, scipy, pillow, bokeh, seaborn
- Git clone MovieQA_benchmark from GitHub, and change `MovieQAPath.benchmark_dir` to the path you cloned it to.
- Download the Faster R-CNN pretrained model from the model zoo, and change `MovieQAPath.faster_rcnn_graph` to the path you downloaded it to.
- Download all data from MovieQA to MovieQA_benchmark. (Note: you have to register first.)
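The bullets above both edit attributes of a `MovieQAPath` module. As a rough illustration, such a path-configuration class might look like the sketch below; only the attribute names (`benchmark_dir`, `faster_rcnn_graph`, `image_dir`) come from this README, and the default locations are placeholder assumptions you should point at your own directories.

```python
import os

# Hypothetical sketch of a central path-configuration class.
# Only the attribute names come from this README; the defaults
# below are placeholders, not the repo's actual values.
class MovieQAPath:
    # Where you cloned MovieQA_benchmark (assumed: home directory)
    benchmark_dir = os.path.expanduser('~/MovieQA_benchmark')
    # Frozen Faster R-CNN graph downloaded from the model zoo (assumed filename)
    faster_rcnn_graph = os.path.expanduser(
        '~/models/faster_rcnn/frozen_inference_graph.pb')
    # Directory where extracted video frames will be stored
    image_dir = os.path.join(benchmark_dir, 'images')
```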
- Extract all frames from the video clips. The frames will be stored in `MovieQAPath.image_dir`.

      python -m process.video [--check] [--no_extract]
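For illustration, the frame-dumping half of a script like `process.video` could be sketched as below. The file-naming scheme (`frame_000001.jpg`) and the use of Pillow are assumptions for the sketch, not the repo's actual conventions.

```python
import os

from PIL import Image  # pillow is in the required packages


def save_frames(frames, out_dir, prefix='frame'):
    """Save an iterable of HxWx3 uint8 arrays as zero-padded JPEGs.

    A minimal sketch of dumping decoded video frames to disk; the
    real `process.video` may name and lay out files differently.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i, frame in enumerate(frames, start=1):
        path = os.path.join(out_dir, '%s_%06d.jpg' % (prefix, i))
        Image.fromarray(frame).save(path)
        paths.append(path)
    return paths
```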
- Prepare the GloVe embedding [link] at the destination set in `./embed/args.py`, and change the current directory to `./embed`. Then type: (Note: please refer to `./embed/args.py` for more information.)

      python data.py [--debug]
      python train.py
      python deploy.py
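If you need to inspect the embedding yourself, a GloVe text file (one `word dim1 dim2 ...` entry per line) can be parsed into a word-to-vector dict as in this minimal sketch; the repo's actual loading logic lives under `./embed` and may differ.

```python
import numpy as np


def load_glove(path):
    """Parse a GloVe text file into a dict mapping word -> float32 vector.

    Generic sketch of the standard GloVe format, not the repo's own loader.
    """
    embeddings = {}
    with open(path, encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip().split(' ')
            embeddings[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return embeddings
```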
- Process all sentences in MovieQA, including tokenizing, generating sentence embeddings, and sampling frames.

      python -m process.text_v3 --one
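A common baseline for turning tokenized sentences into fixed-length embeddings is mean pooling of the word vectors; whether `process.text_v3` uses mean pooling is an assumption, but the sketch below shows the general idea.

```python
import numpy as np


def sentence_embedding(sentence, embeddings, dim=300):
    """Embed a sentence as the mean of its word vectors.

    Mean pooling is only an illustrative baseline here; the repo's
    actual sentence-embedding method may differ. Out-of-vocabulary
    words are skipped, and an all-OOV sentence maps to the zero vector.
    """
    tokens = sentence.lower().split()
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return np.zeros(dim, dtype=np.float32)
    return np.mean(vectors, axis=0)
```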
- Extract the bounding box features.

      python extract_bbox.py
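The real `extract_bbox.py` runs the frozen Faster R-CNN graph in TensorFlow; as a schematic stand-in, the sketch below only illustrates the last step of turning a `(H, W, C)` feature map plus a normalized `[y1, x1, y2, x2]` box into a `C`-dimensional feature vector by average pooling the covered region. Everything here is illustrative, not the repo's implementation.

```python
import numpy as np


def box_feature(feature_map, box):
    """Average-pool the region of a CNN feature map covered by a box.

    `feature_map` is (H, W, C); `box` is [y1, x1, y2, x2] in [0, 1].
    The slice is clamped to at least one cell so degenerate boxes
    still yield a vector.
    """
    h, w, _ = feature_map.shape
    y1, x1, y2, x2 = box
    r1, c1 = int(y1 * h), int(x1 * w)
    r2 = max(int(y2 * h), r1 + 1)
    c2 = max(int(x2 * w), c1 + 1)
    return feature_map[r1:r2, c1:c2].mean(axis=(0, 1))
```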
- You can train now. If you want to use a different model or tune different hyper-parameters, you can follow this example: (Note: please refer to `train.py` for more information about the flags.)

      python train.py --mode subt+feat --mod model_full.1 --hp 02
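The three flags in the command above might be parsed as in this sketch; the flag names come from this README, but their defaults, help text, and any validation are assumptions (the real definitions live in `train.py`).

```python
import argparse

# Illustrative flag parsing for the training command shown above.
# Defaults and help strings are placeholders, not the repo's values.
parser = argparse.ArgumentParser()
parser.add_argument('--mode', default='subt+feat',
                    help='input modalities, e.g. subtitles + visual features')
parser.add_argument('--mod', default='model_full.1',
                    help='which model variant to train')
parser.add_argument('--hp', default='01',
                    help='hyper-parameter preset id')

args = parser.parse_args(
    ['--mode', 'subt+feat', '--mod', 'model_full.1', '--hp', '02'])
```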