Neural style transfer is the process of taking the style of one image and applying it to the content of another image. For background, see the review "Neural Style Transfer: A Review" by Yongcheng Jing, Yezhou Yang, Zunlei Feng, Jingwen Ye, Yizhou Yu, and Mingli Song.
In this practice, we choose "Model-Optimisation-Based Offline Neural Methods", also called "fast style transfer"; specifically, a Per-Style-Per-Model method (one trained model per style).
The following describes how we built Vangogh Crazy World. By following this practice, you can learn how to deploy the same scenario to the Web, UWP (Windows), Android, and the Intel Neural Compute Stick (NCS) for AIoT usage.
For more about Vincent van Gogh, please refer to the wiki.
The source code in this project is written in TensorFlow.
You can find instructions below to train your own model file with this sample code.
We also share some tips from our experience while working on this project.
This project provides the following:
- Training Van Gogh gallery with Python
- Inference with a real-time camera and still images
- Deployment on Windows applications
- Deployment on Android applications
- Deployment on Web pages
- Deployment on the Intel Neural Compute Stick (NCS) for AIoT usage
git clone https://github.com/acerwebai/VangoghCrazyWorld.git
You can download the pre-trained models from here; you should find the checkpoint files for each model.
- Python 3.6
- (Optional) If your machine supports an NVIDIA GPU with CUDA, please refer to the installation guides from NVIDIA:
  - CUDA v9.0: https://developer.nvidia.com/cuda-90-download-archive
  - cuDNN v7.3.0 for CUDA 9.0: https://developer.nvidia.com/rdp/cudnn-archive
  - Note: CUDA and cuDNN versions are interdependent; cuDNN v7.3.0 is built for CUDA 9.0
- TensorFlow 1.12.0
  - pip install tensorflow==1.12.0 for CPU
  - pip install tensorflow-gpu==1.12.0 for GPU
- Pillow 3.4.2, scipy 0.18.1, numpy 1.11.2, and ffmpeg 3.1.3 or later
Creating a virtual environment gives you a directory containing a Python binary and everything needed to run VangoghCrazyWorld.
First, install virtualenv via
pip install virtualenv
Then you can create a virtual environment using this command:
virtualenv -p python3 $HOME/tmp/VangoghCrazyWorld-venv/
And activate the virtual environment like this (on Linux/macOS the activate script is under bin/ instead of Scripts/):
source $HOME/tmp/VangoghCrazyWorld-venv/Scripts/activate
With that, you can isolate working environments project by project, so please work in this virtual environment for the following installations.
Change directory to VangoghCrazyWorld, where the git clone was placed:
cd VangoghCrazyWorld
All required packages are listed in requirements.txt; all you need to do is pip install the dependencies:
pip install -r requirements.txt
Note: If your machine does not have an NVIDIA GPU, replace tensorflow-gpu with tensorflow inside requirements.txt.
Before training, you need to get the dataset from COCO and the VGG19 weights from MatConvNet, or execute setup.sh to fetch both:
./setup.sh
Now you have all the packages needed to run the pre-trained models. You can do a trial run with the starrynight style model that we have pre-trained. For example, to evaluate the images in examples/content with starrynight-300-255-NHWC_nbc4_bs1_7e00_1e03_0.01, the instruction is:
python evaluate.py --data-format NHWC --num-base-channels 4 --checkpoint tf-models/starrynight-300-255-NHWC_nbc4_bs1_7e00_1e03_0.01 \
--in-path examples/content \
--out-path examples/results \
--allow-different-dimensions
where
- --data-format: NHWC is for TensorFlow-series frameworks; NCHW is for non-TensorFlow frameworks, e.g. ONNX, which WinML requires.
- --num-base-channels: reduces the model size to improve inference time on the Web and other lower-compute platforms.
- --checkpoint: the path where you placed the pre-trained model checkpoint
- --in-path: the path to input images; can be a folder or a file
- --out-path: the path to output images; can be a folder or a file
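To make the --data-format choice concrete, here is a small sketch (our own illustration, not part of the repo; the runtime labels are hypothetical) that picks a layout per deployment target and assembles the evaluate.py call:

```python
import shlex

def data_format_for(target):
    # TensorFlow-family runtimes (TensorFlow, TF.js, TFLite) use NHWC;
    # non-TensorFlow runtimes such as ONNX/WinML expect NCHW
    return "NHWC" if target in {"tensorflow", "tfjs", "tflite"} else "NCHW"

def evaluate_cmd(target, checkpoint, in_path, out_path):
    # Assemble the evaluate.py invocation from the flags described above
    args = ["python", "evaluate.py",
            "--data-format", data_format_for(target),
            "--num-base-channels", "4",
            "--checkpoint", checkpoint,
            "--in-path", in_path,
            "--out-path", out_path]
    return " ".join(shlex.quote(a) for a in args)

print(evaluate_cmd("tensorflow",
                   "tf-models/starrynight-300-255-NHWC_nbc4_bs1_7e00_1e03_0.01",
                   "examples/content", "examples/results"))
```

Swapping the target to, say, "onnx" flips the layout flag to NCHW while the rest of the command stays the same.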
Let's start the training:
python style.py --data-format NHWC --num-base-channels 4 --style examples/style/starrynight-300-255.jpg \
--checkpoint-dir ckpts \
--test examples/content/farm.jpg \
--test-dir examples/result \
--content-weight 7e0 \
--style-weight 1e3 \
--checkpoint-iterations 1000 \
--learning-rate 1e-3 \
--batch-size 1
where
Note: you need to create a folder "ckpts" in the root of this project to save checkpoint files.
- --data-format: NHWC is for TensorFlow-series frameworks; NCHW is for non-TensorFlow frameworks, e.g. ONNX, which WinML requires.
- --num-base-channels: reduces the model size to improve inference time on the Web and other lower-compute platforms.
- --checkpoint-dir: the path to save checkpoints in
- --style: style image path
- --train-path: path to training images folder
- --test: test image path
- --test-dir: test image save dir
- --epochs: number of epochs
- --batch-size: number of images fed per batch
- --checkpoint-iterations: checkpoint save frequency
- --vgg-path: path to VGG19 network
- --content-weight: content weight
- --style-weight: style weight
- --tv-weight: total variation regularization weight
- --learning-rate: learning rate
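To make --batch-size and --checkpoint-iterations concrete, a little arithmetic (a sketch; 82,783 assumes the COCO train2014 split, so substitute your own dataset size if it differs):

```python
import math

def iterations_per_epoch(num_images, batch_size):
    # One iteration consumes one batch of training images
    return math.ceil(num_images / batch_size)

def checkpoints_per_epoch(num_images, batch_size, checkpoint_iterations):
    # A checkpoint is written every checkpoint_iterations iterations
    return iterations_per_epoch(num_images, batch_size) // checkpoint_iterations

# COCO train2014 (82,783 images), batch size 1, checkpoint every 1000 iterations
print(iterations_per_epoch(82783, 1))         # → 82783 iterations per epoch
print(checkpoints_per_epoch(82783, 1, 1000))  # → 82 checkpoints per epoch
```

A larger batch size shortens the epoch in iterations, so you may want to lower --checkpoint-iterations accordingly to keep a similar checkpoint cadence.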
You can evaluate the trained models via:
python evaluate.py --data-format NHWC --num-base-channels 4 --checkpoint tf-models/starrynight-300-255-NHWC_nbc4_bs1_7e00_1e03_0.01 \
--in-path examples/content/farm.jpg \
--out-path examples/results/
In this practice, we offer three style-similarity levels to let you experience different degrees of stylization. They are tuned by content-weight, style-weight, and learning-rate:
- --content-weight
- --style-weight
- --learning-rate
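As an illustration, the three levels could be tabulated like this (a sketch; the level names are our own labels, and the parameter triples are the combinations compared in the results tables further down):

```python
# Hypothetical labels for the three style-similarity levels; a higher
# content weight preserves more of the content, i.e. a lighter style.
STYLE_LEVELS = {
    "light":  {"content_weight": "7e1", "style_weight": "6e2", "learning_rate": "1e-2"},
    "medium": {"content_weight": "7e0", "style_weight": "6e2", "learning_rate": "1e-2"},
    "strong": {"content_weight": "7e0", "style_weight": "1e3", "learning_rate": "1e-2"},
}

for name, p in STYLE_LEVELS.items():
    # Print the style.py flags that correspond to each level
    print(name,
          "--content-weight", p["content_weight"],
          "--style-weight", p["style_weight"],
          "--learning-rate", p["learning_rate"])
```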
If you need a frozen model file, follow the instructions for the freeze_graph tool bundled with TensorFlow:
python -m tensorflow.python.tools.freeze_graph --input_graph=tf-models/starrynight-300-255-NHWC_nbc8_bs1_7e00_1e03_0.001/graph.pbtxt \
--input_checkpoint=tf-models/starrynight-300-255-NHWC_nbc8_bs1_7e00_1e03_0.001/saver \
--output_graph=tf-models/starrynight.pb --output_node_names="output"
The implementation is based on Fast Style Transfer in TensorFlow from lengstrom.
Here is the source code for you to practice with on your local machine.
We also share some experience on how to fine-tune the hyperparameters to better transfer target content into Van Gogh's style.
Our implementation is based on fast-style-transfer, with the pooling function revised from max pooling to average pooling. Here is the pooling concept for your reference.
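The difference between the two pooling choices can be shown on a toy window of activations (a pure-Python sketch, not the repo's TensorFlow ops):

```python
def max_pool(window):
    # Keeps only the strongest activation in the window
    return max(window)

def avg_pool(window):
    # Averages over the window, smoothing the response;
    # in style transfer this tends to give smoother textures
    return sum(window) / len(window)

window = [0.1, 0.9, 0.2, 0.4]  # a flattened 2x2 pooling window
print(max_pool(window))  # → 0.9
print(avg_pool(window))  # → 0.4
```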
Because the VGG19 network extracts features from the style image after resizing it to 256x256, we found it helps to resize the style image to be close to 256x256. The resulting hyperparameters then transfer the style to target content more faithfully. For example:
| Content | Style | Result |
| --- | --- | --- |
| CC BY 2.0 by Bryce Edwards | starry night at 300x255 | |
| CC BY 2.0 by Bryce Edwards | starry night at 1280x1014 | |
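The resize toward 256x256 can be sketched with a small helper (our own illustration, not part of the repo) that scales a style image so its longer side matches VGG19's 256-pixel working size while preserving the aspect ratio:

```python
def fit_to(width, height, target=256):
    # Scale so the longer side equals `target`, preserving aspect ratio
    scale = target / max(width, height)
    return round(width * scale), round(height * scale)

# The 1280x1014 starry night scan shrinks to roughly VGG19's working size;
# the 300x255 version is already close to it.
print(fit_to(1280, 1014))  # → (256, 203)
```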
Following are some results with different content weights and style weights, for reference.
Based on starry night at 300x255:
| Content weight | Style weight | Learning rate | Result |
| --- | --- | --- | --- |
| 7e1 | 6e2 | 1e-2 | |
| 7e0 | 6e2 | 1e-2 | |
| 7e0 | 1e3 | 1e-2 | |
Following are some examples of training target styles with different parameters: content weight (cw), style weight (sw), learning rate (lr), and batch size 1.
| Content | Result | Van Gogh Style |
| --- | --- | --- |
| CC BY 2.0 by Sinchen.Lin | cw:7e0, sw:1e3, lr:1e-2 | The Starry Night |
| CC BY 2.0 by ppacificvancouver | cw:7e0, sw:1e3, lr:1e-3 | Vincent's Bedroom in Arles |
| CC BY 2.0 by Andrew Gould | cw:7e0, sw:1e3, lr:1e-3 | The Red Vineyard |
| CC BY 2.0 by Sam Beebe | cw:7e0, sw:1e3, lr:1e-3 | Self-Portrait |
| CC BY 2.0 by Eli Christman | cw:7e0, sw:1e3, lr:1e-3 | Sien with a Cigar |
| CC BY 2.0 by nan palmero | cw:7e0, sw:1e3, lr:1e-3 | Soup Distribution in a Public Soup Kitchen |
| CC BY 2.0 by Ms. Phoenix | cw:7e0, sw:1e3, lr:1e-3 | Sunflowers (1889) |
| CC BY 2.0 by Andrew Gould | cw:7e0, sw:6e2, lr:1e-3 | Wheatfield with Crows |
| CC BY 2.0 by Andrew Gould | cw:7e0, sw:6e2, lr:1e-4 | Rest from Work |
This project is licensed under the MIT License; see LICENSE.md.
Thanks to the authors of the following projects:
- The source code of this practice is largely borrowed from the fast-style-transfer GitHub repository.
- We refer to some ideas in Neural Style Transfer: A Review.