This project hosts the code for our CVPR 2017 paper.
- Cesc Chunseong Park, Byeongchang Kim and Gunhee Kim. Attend to You: Personalized Image Captioning with Context Sequence Memory Networks. In CVPR, 2017. (Spotlight) [arxiv]
We address the personalization issue in image captioning, which previous research has not yet explored. For a query image, we aim to generate a descriptive sentence, accounting for prior knowledge such as the user's active vocabulary in previous documents. As applications of personalized image captioning, we tackle two post automation tasks, hashtag prediction and post generation, on our newly collected Instagram dataset consisting of 1.1M posts from 6.3K users. We propose a novel captioning model named the Context Sequence Memory Network (CSMN).
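At the heart of CSMN is an attention read over a memory holding the image feature, the user's context words, and previously generated words. The following is a minimal NumPy sketch of such a soft-attention memory read; the slot contents, dimensions, and function names are illustrative simplifications, not the paper's exact formulation:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax
    e = np.exp(x - np.max(x))
    return e / e.sum()

def memory_read(query, memory):
    """Soft attention read over memory slots (illustrative).

    query:  (d,)   current decoder state
    memory: (n, d) one row per slot, e.g. image feature, user context
            words, previously generated words (simplified from CSMN)
    """
    scores = memory @ query           # similarity of the query to each slot
    weights = softmax(scores)         # attention distribution over slots
    return weights @ memory, weights  # weighted sum is the memory output

rng = np.random.default_rng(0)
output, weights = memory_read(rng.standard_normal(8), rng.standard_normal((5, 8)))
```

The attention weights form a distribution over slots, so the read output stays in the same feature space as the memory rows.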
If you use this code as part of any published research, please cite the following paper.
```bibtex
@inproceedings{attend2u:2017:CVPR,
  author    = {Cesc Chunseong Park and Byeongchang Kim and Gunhee Kim},
  title     = "{Attend to You: Personalized Image Captioning with Context Sequence Memory Networks}",
  booktitle = {CVPR},
  year      = 2017
}
```
- Clone this repository

```bash
git clone https://github.com/cesc-park/attend2u
```

- Install required Python modules

```bash
pip install -r requirements.txt
```

- Download the pre-trained ResNet-101 checkpoint

```bash
cd ${project_root}/scripts
./download_pretrained_resnet_101.sh
```

- Download our dataset (coming soon)

```bash
cd ${project_root}/scripts
./download_dataset.sh
```

- Generate the formatted dataset and extract ResNet-101 pool5 features

```bash
cd ${project_root}/scripts
./extract_features.sh
```
Run the training script. You can train the model with multiple GPUs.

```bash
python -m train --num_gpus 4 --batch_size 200
```
Run the evaluation script. You can evaluate the model with multiple GPUs.

```bash
python -m eval --num_gpus 2 --batch_size 500
```
Coming soon!
Here are post generation examples:
Here are hashtag generation examples:
Here are some (slightly wrong but still) interesting post generation examples:
Here are some (slightly wrong but still) interesting hashtag generation examples:
We implement our model using the TensorFlow package. Thanks to the TensorFlow developers. :)
We also thank Instagram for their API and Instagram users for their valuable posts.
Additionally, we thank coco-caption developers for providing caption evaluation tools.
Cesc Chunseong Park, Byeongchang Kim and Gunhee Kim
Vision and Learning Lab @ Computer Science and Engineering, Seoul National University, Seoul, Korea
MIT license