This release of the dataset consists of
- 82783 MS COCO training images and 40504 MS COCO validation images (images are obtained from the [MS COCO website](http://mscoco.org/dataset/#download))
- 248349 questions for training and 121512 questions for validation (3 per image)
- 2483490 answers for training and 1215120 answers for validation (10 per question)
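The counts above follow directly from the per-image and per-question ratios. As a quick sanity check, the sketch below recomputes them from the image counts; the helper name `expected_counts` is hypothetical and is not part of the VQA tools.

```python
# Image counts and ratios as stated in the list above.
IMAGES = {"train": 82783, "val": 40504}
QUESTIONS_PER_IMAGE = 3
ANSWERS_PER_QUESTION = 10


def expected_counts(num_images):
    """Return (questions, answers) implied by the stated ratios."""
    questions = num_images * QUESTIONS_PER_IMAGE
    answers = questions * ANSWERS_PER_QUESTION
    return questions, answers


for split, num_images in sorted(IMAGES.items()):
    questions, answers = expected_counts(num_images)
    print("%s: %d images, %d questions, %d answers"
          % (split, num_images, questions, answers))
```

The computed values match the figures listed for both the training and validation splits.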
There are two types of tasks
- Open-ended task
- Multiple-choice task (18 choices per question)
Requirements:
- Python 2.7
- scikit-image (visit this page for installation)
- matplotlib (visit this page for installation)
./Annotations
- Download the annotation files from here, extract them, and place them in this folder.
- After downloading and extracting, this folder should contain the following four files:
  - OpenEnded_mscoco_train2014.json
  - OpenEnded_mscoco_val2014.json
  - MultipleChoice_mscoco_train2014.json
  - MultipleChoice_mscoco_val2014.json
- Annotation files from the Beta v0.1 release (10k MSCOCO images, 30k questions, 300k answers) can be found here.
./Images
- Create a directory named train2014, download the training images from the MS COCO website, extract them, and place them in the train2014 folder.
- Create a directory named val2014, download the validation images from the MS COCO website, extract them, and place them in the val2014 folder.
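After following the setup steps for ./Annotations and ./Images, it can help to confirm that everything landed in the right place. The sketch below is one possible check, assuming it is run from the repository root and that both folders sit directly under it. The function names `expected_paths` and `missing_paths` are hypothetical and are not part of vqaTools.

```python
import os

# Task and split names taken from the annotation file names listed above.
TASKS = ("OpenEnded", "MultipleChoice")
SPLITS = ("train2014", "val2014")


def expected_paths(root):
    """Return the annotation files and image folders the setup should produce."""
    paths = []
    for task in TASKS:
        for split in SPLITS:
            name = "%s_mscoco_%s.json" % (task, split)
            paths.append(os.path.join(root, "Annotations", name))
    for split in SPLITS:
        paths.append(os.path.join(root, "Images", split))
    return paths


def missing_paths(root):
    """Return the expected paths that do not exist under root."""
    return [p for p in expected_paths(root) if not os.path.exists(p)]


if __name__ == "__main__":
    # Assumption: the current directory is the repository root.
    for path in missing_paths("."):
        print("missing: " + path)
```

An empty result from `missing_paths(".")` suggests the four annotation files and the two image folders are all present; it does not verify that the image folders actually contain the extracted images.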
./PythonHelperTools
- This directory contains the Python API to read and visualize the VQA dataset
- vqaDemo.py (demo script)
- vqaTools (API to read and visualize data)
./PythonEvaluationTools
- This directory contains the Python evaluation code
- vqaEvalDemo.py (evaluation demo script)
- vqaEvaluation (evaluation code)
./Results
- OpenEnded_mscoco_train2014_fake_results.json (an example of a fake results file to run the demo)
- Visit the [VQA evaluation page](http://visualqa.org/evaluation) for more details.
- Aishwarya Agrawal (Virginia Tech)
- The API code is based on the MSCOCO API code
- The format of the code for evaluation is based on MSCOCO evaluation code