Neural Networks using compressed JPEG images.

This repository provides code to train and use neural networks on compressed JPEG images. No pre-trained weights are or will be made available.

This implementation relies on the jpeg2dct module from the Uber research team. The SSD implementation used here was taken from this repository and then modified.

All the networks proposed in this repository are modified versions of the three following architectures

Installation

The provided code can be used directly or installed as a package. The following steps install the dependencies in a virtual environment:

# Making virtualenv
mkdir .venv
cd .venv
python3 -m venv jpeg_deep
source jpeg_deep/bin/activate

cd ..

# Installing all the dependencies (the code was tested with the specified version numbers on python 3.+)
pip install keras
pip install tensorflow-gpu==1.14.0
pip install pillow
pip install opencv-python
pip install jpeg2dct
pip install albumentations
pip install tqdm
pip install bs4
pip install cython
pip install pycocotools
pip install matplotlib

Training

The training uses a system of configuration files and experiments. This system helps save the parameters of a given run. When training starts, an experiment folder is created with copies of the configuration files, the weights and the logs. Example config files are available in the config folder. The config files define all the training and testing parameters.

System variables

To simplify deployment on different machines, the following variables need to be defined (see the Classification/Detection sections for details on the dataset paths):

# Setting the main dirs for the training datasets
export DATASET_PATH_TRAIN=<path_to_train_directory>
export DATASET_PATH_VAL=<path_to_validation_directory>
export DATASET_PATH_TEST=<path_to_test_directory>

# Setting the directory where the experiment folder will be created
export EXPERIMENTS_OUTPUT_DIRECTORY=<path_to_output_directory>
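
Before launching a run, it can help to check that all four variables are actually set. The following is a minimal sketch, not part of the repository: the variable names come from the block above, while the helper read_required_variables is hypothetical.

```python
import os

# Variable names are the ones listed above; the helper itself is hypothetical.
REQUIRED_VARIABLES = (
    "DATASET_PATH_TRAIN",
    "DATASET_PATH_VAL",
    "DATASET_PATH_TEST",
    "EXPERIMENTS_OUTPUT_DIRECTORY",
)


def read_required_variables(environ=None):
    """Return the required variables as a dict, or raise if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARIABLES if not environ.get(name)]
    if missing:
        raise EnvironmentError("Undefined variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARIABLES}
```

Calling read_required_variables() with no argument reads the current process environment; passing a dict makes the check easy to try without touching the shell.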

Starting the training

Once you have defined all the variables and modified the config files to your needs, simply run the following command (you will need to update some of the parameters when not using horovod):

python scripts/training.py -c <config_dir_path> --no-horovod

The config file in the <config_dir_path> needs to be named "config.py" for the script to run correctly.
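
The naming requirement above can be checked before starting a long run. This is a small sketch under the assumption that only the file name matters; find_config_file is a hypothetical helper, not a function of the repository.

```python
from pathlib import Path


def find_config_file(config_dir):
    """Return the path to config.py inside config_dir, or raise if it is absent."""
    config_file = Path(config_dir) / "config.py"
    if not config_file.is_file():
        raise FileNotFoundError(f"Expected a file named config.py in {config_dir}")
    return config_file
```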

For more details on classification training on the ImageNet dataset, refer to this section. For more details on training on the Pascal VOC dataset, refer to this section. For more details on training on the MS-COCO dataset, refer to this section.

Training using horovod

The training script supports horovod. I highly recommend training on multiple GPUs for classification, given the size of the dataset. An example file for training with horovod using slurm is provided: jpeg_deep.sl.

cd slurm
sbatch jpeg_deep.sl

If you do not run on a computing cluster that uses slurm, please refer to the original horovod repository.

Predict

No pre-trained weights are/will be made available. To get this section running, you'll have to retrain the networks from scratch.

Display the results

Displaying the results can be done using the prediction.py script. To use the script, you first have to train for at least one epoch (the prediction presupposes that you have an experiment folder).

The prediction is done on the test set. To use a different dataset, modify the config_temp.py file in the generated experiment folder.

For the vgg16 based classifiers: The prediction script uses the test generator specified in the config file to get the data. Hence, with the provided examples, you may need first to convert the weights to a fully convolutional version of the network. This can be done using the classification2ssd.py script.

Once this is done, simply run the following command:

python scripts/prediction.py <experiment_path> <weights_path>

Prediction time

We also provide a way to test the speed of the trained networks. This is done using the prediction_time.py script.

To test the speed of the networks, a batch of data is preloaded into memory, prediction is run over this batch P times, and the whole procedure is repeated N times. The result is the averaged time. Loading weights is optional.

python scripts/prediction_time.py <experiment_path> -nr 10 -w <weights_path>
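
The measurement procedure described above can be sketched as follows. This is an illustration only, not the code of prediction_time.py: the function names are hypothetical, predict stands for any callable that runs the network on a preloaded batch, and the choice to average per run (rather than per prediction) and the FPS formula are assumptions.

```python
import time


def average_prediction_time(predict, batch, predictions_per_run, runs):
    """Average wall-clock time of one run made of `predictions_per_run` calls to `predict`."""
    run_times = []
    for _ in range(runs):  # the overall measurement is repeated N times
        start = time.perf_counter()
        for _ in range(predictions_per_run):  # P predictions over the same preloaded batch
            predict(batch)
        run_times.append(time.perf_counter() - start)
    return sum(run_times) / len(run_times)


def frames_per_second(average_run_time, batch_size, predictions_per_run):
    """Images processed per second, assuming every prediction handles a full batch."""
    return batch_size * predictions_per_run / average_run_time
```

Because the batch is loaded once and reused, the measured time covers the network forward pass and not disk access or decoding.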

Classification (ImageNet)

Results on ImageNet

The tables below show the accuracy obtained compared with the state of the art. All the presented results are on the validation dataset. All the FPS values were measured on an NVIDIA GTX 1080 using the prediction_time.py script, with a batch size of 8.

Official Networks	top-1	top-5	FPS
VGG16	73.0	91.2	N/A
VGG-DCT	42.0	66.9	N/A
ResNet50	75.78	92.65	N/A
LC-RFA	75.92	92.81	N/A
LC-RFA-Thinner	75.39	92.57	N/A
Deconvolution-RFA	76.06	92.02	N/A

VGG based Networks (our trainings)	top-1	top-5	FPS
VGG16	71.9	90.8	267
VGG-DCT	65.5	86.4	553
VGG-DCT Y	62.6	84.6	583
VGG-DCT Deconvolution	65.9	86.7	571

ResNet50 based Networks (our trainings)	top-1	top-5	FPS
ResNet50	74.73	92.33	324
LC-RFA	74.82	92.58	318
LC-RFA Y	73.25	91.40	329
LC-RFA-Thinner	74.62	92.33	389
LC-RFA-Thinner Y	72.48	91.04	395
Deconvolution-RFA	74.55	92.39	313

Training on ImageNet

The dataset can be downloaded here. Choose the version that suits your needs; I used the 2012 (Object Detection) data.

Once the data is downloaded, it should be stored following this tree in order to use the provided generators (as long as you have separate train and validation folders, you should be okay):

imagenet
|
|_ train
|  |_ n01440764
|  |_ n01443537
|  |_ ...
|
|_ validation
   |_ n01440764
   |_ n01443537
   |_ ...
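
A quick way to confirm that a download matches this tree is to list the class folders of each split. The sketch below is an assumption-laden illustration: check_imagenet_layout and list_class_folders are hypothetical helpers and do not reflect how the provided generators read the data.

```python
from pathlib import Path


def list_class_folders(split_dir):
    """Return the sorted names of the class sub-folders (e.g. n01440764) of a split."""
    return sorted(entry.name for entry in Path(split_dir).iterdir() if entry.is_dir())


def check_imagenet_layout(root, splits=("train", "validation")):
    """Map each split name to its class folders, raising if a split folder is missing."""
    root = Path(root)
    classes = {}
    for split in splits:
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"Missing split folder: {split_dir}")
        classes[split] = list_class_folders(split_dir)
    return classes
```

Comparing the two lists returned for train and validation is a cheap way to spot a class folder that is missing from one of the splits.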

Then you'll just need to set the configuration files to fit your needs and follow the procedure described in the training section. Keep in mind that the provided configuration files were used in a distributed training, hence the hyper-parameters fit this particular setting. If you don't train that way, you'll need to change them.

Also, the system variables should be set to the ImageNet folder (if you use the provided config files):

# Setting the main dirs for the training datasets
export DATASET_PATH_TRAIN=<path_to_train_directory>/imagenet
export DATASET_PATH_VAL=<path_to_validation_directory>/imagenet
export DATASET_PATH_TEST=<path_to_test_directory>/imagenet

Detection (Pascal VOC)

Results on the PASCAL VOC dataset

Results for training on the Pascal VOC dataset are presented below. Networks were trained either on the 2007 train/val set (07) or on the 2007+2012 train/val sets (07+12), and evaluated on the 2007 test set.

Official Networks	mAP (07)	mAP (07+12)	FPS
SSD300	68.0	74.3	N/A
SSD300 DCT	39.2	47.8	N/A

Networks, VGG based (our trainings)	mAP (07)	mAP (07+12)	FPS
SSD300	65.0	74.0	102
SSD300 DCT	48.9	60.0	262
SSD300 DCT Y	50.7	59.8	278
SSD300 DCT Deconvolution	38.4	53.5	282

Networks, ResNet50 based (our trainings)	mAP (07)	mAP (07+12)	FPS
SSD300-Resnet50 (retrained)	61.3	73.1	108
SSD300 DCT LC-RFA	61.7	70.7	110
SSD300 DCT LC-RFA Y	62.1	71.0	109
SSD300 DCT LC-RFA-Thinner	58.5	67.5	176
SSD300 DCT LC-RFA-Thinner Y	60.6	70.2	174
SSD300 DCT Deconvolution-RFA	54.7	68.8	104

Training on the PASCAL VOC dataset

The data can be downloaded from the official website.

After downloading, you should have directories following this structure:

VOCdevkit
|
|_ VOC2007
|  |_ Annotations
|  |_ ImageSets
|  |_ JPEGImages
|  |_ ...
|
|_ VOC2012
   |_ Annotations
   |_ ImageSets
   |_ JPEGImages
   |_ ...
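
To illustrate how the three folders relate, the sketch below pairs each image with its annotation file. It relies on the standard Pascal VOC layout, where ImageSets/Main/trainval.txt lists one image identifier per line; list_voc_samples is a hypothetical helper and not the generator used in this repository.

```python
from pathlib import Path


def list_voc_samples(devkit_dir, year="VOC2007", image_set="trainval"):
    """Return (image_path, annotation_path) pairs for one VOC year and image set."""
    voc_dir = Path(devkit_dir) / year
    ids_file = voc_dir / "ImageSets" / "Main" / f"{image_set}.txt"
    samples = []
    for line in ids_file.read_text().splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        image_id = fields[0]  # the identifier is the first field of the line
        samples.append(
            (
                voc_dir / "JPEGImages" / f"{image_id}.jpg",
                voc_dir / "Annotations" / f"{image_id}.xml",
            )
        )
    return samples
```

For the 07+12 setting, the same function could be called once per year and the two lists concatenated.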

Then you'll just need to set the configuration files to fit your needs and follow the procedure described in the training section. The hyper-parameters provided for the training were not used in a parallel setting.

Also, the system variables should be set to the Pascal VOC folder (if you use the provided config files):

# Setting the main dirs for the training datasets
export DATASET_PATH_TRAIN=<path_to_train_directory>/VOCdevkit
export DATASET_PATH_VAL=<path_to_validation_directory>/VOCdevkit
export DATASET_PATH_TEST=<path_to_test_directory>/VOCdevkit

Detection (MS-COCO)

Details in the dataset path

Running the documentation for a deeper usage of the provided code

I know from experience that diving into someone else's code to adapt it to your own project is often hard and confusing at first. To help you if you ever want to toy with the code, built-in documentation is provided. It uses a modified version of the keras documentation generator (here).

To generate the documentation:

pip install mkdocs

cd docs

python autogen.py

To display the documentation:

# From root of the repository
mkdocs serve

Method limitations

The presented method has some limitations, especially for general-purpose deployments. The two main issues I see are described below.

Image Resizing

Resizing images in the RGB domain is straightforward, whereas resizing in the DCT domain is more complicated. Although theoretically doable, methods for doing so are not implemented here. The following articles explore the possibility of resizing images directly in the frequency domain:

For classification, the impact is limited as long as the images are about the same size as the original training images, because the network can be made fully convolutional. For detection, this is a bit more complicated, as the SSD in the presented implementation does not scale well (although it should theoretically be able to do so). This is due to the original design of the network and the need for padding layers. I intend to test a modified version of the network if I find some time to do so.

Training Pipeline

The second limitation concerns training. Data augmentation has to be carried out in the RGB domain, so the data-augmentation pipeline is the following: JPEG => RGB => data-augmentation => JPEG => Compressed Input. The extra decoding and re-encoding steps slow down the training.
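
The order of the steps in this pipeline can be sketched with four pluggable callables. This is only an illustration of the data flow, not the repository's generator code: the function jpeg_to_compressed_input and its parameter names are hypothetical.

```python
def jpeg_to_compressed_input(jpeg_bytes, decode, augment, encode, to_dct):
    """Apply the four pipeline steps in order and return the compressed-domain input.

    All four callables are supplied by the caller (hypothetical interface):
    decode:  JPEG bytes -> RGB image
    augment: RGB image -> augmented RGB image
    encode:  RGB image -> JPEG bytes
    to_dct:  JPEG bytes -> DCT coefficients
    """
    rgb = decode(jpeg_bytes)          # JPEG => RGB
    augmented = augment(rgb)          # RGB => data-augmentation
    new_jpeg = encode(augmented)      # augmented RGB => JPEG
    return to_dct(new_jpeg)           # JPEG => Compressed Input
```

In practice, decode and encode could be backed by an image library such as Pillow or OpenCV, and to_dct by the loads function of jpeg2dct.numpy; these pairings are assumptions about one possible setup. The sketch makes the cost visible: every augmented sample pays for one decode and one re-encode before the DCT coefficients are read.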

YannSc/jpeg_deep

Folders and files

Latest commit

History

Repository files navigation