AdaS: Adaptive Scheduling of Stochastic Gradients



Introduction

AdaS is an optimizer with an adaptively scheduled learning rate for training Convolutional Neural Networks (CNNs).

  • AdaS exhibits the rapid minimization characteristics that adaptive optimizers like AdaM are favoured for
  • AdaS exhibits generalization (low testing loss) characteristics on par with SGD-based optimizers, improving on the poor generalization of adaptive optimizers
  • AdaS introduces no computational overhead over adaptive optimizers (see experimental results)
  • In addition to optimization, AdaS introduces new quality metrics for CNN training (quality metrics)

This repository contains a PyTorch implementation of the AdaS learning rate scheduler algorithm.
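
The sketch below shows where an adaptive learning-rate scheduler of this kind typically slots into a standard PyTorch training loop. It is a minimal, hedged illustration: the commented-out AdaS construction and step() calls are assumptions about the interface, not the package's documented API, and the model and data are toy stand-ins; see the Usage section for the actual configuration-driven entry point.

```python
# Minimal sketch: where an adaptive LR scheduler such as AdaS would sit in a
# standard PyTorch training loop. The commented-out AdaS lines are assumed,
# not the package's documented API; the model and data are toy stand-ins.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.03,
                            momentum=0.9, weight_decay=5e-4)
# scheduler = AdaS(optimizer, beta=0.8)  # assumed interface; see package docs

for epoch in range(2):
    inputs = torch.randn(8, 3, 32, 32)        # stand-in for a CIFAR10 batch
    targets = torch.randint(0, 10, (8,))
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    # scheduler.step()  # AdaS would adapt per-layer learning rates here (assumed)
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```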

License

AdaS is released under the MIT License (refer to the LICENSE file for more information).

| Permissions    | Conditions                   | Limitations |
| -------------- | ---------------------------- | ----------- |
| Commercial use | License and copyright notice | Liability   |
| Distribution   |                              | Warranty    |
| Modification   |                              |             |
| Private use    |                              |             |

Citing AdaS

@misc{hosseini2020adas,
    title={AdaS: Adaptive Scheduling of Stochastic Gradients},
    author={Mahdi S. Hosseini and Konstantinos N. Plataniotis},
    year={2020},
    eprint={2006.06587},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Empirical Classification Results on CIFAR10 and CIFAR100

Figure 1: Training performance using different optimizers across two datasets and two CNNs.

Table 1: Image classification performance (test accuracy) of ResNet34 training with a fixed epoch budget.

QC Metrics

Please refer to QC on the Wiki for more information on the two metrics, knowledge gain and mapping condition, used to monitor the training quality of CNNs (a rough illustrative sketch follows below).
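
Both metrics are based on a low-rank factorization of the convolution weights (see the paper and Wiki for the exact definitions). As a rough illustration only, the sketch below computes the singular values of an unfolded Conv2d weight and forms simple proxies for knowledge gain and mapping condition; the helper name and the proxy formulas are assumptions, not the repository's implementation.

```python
# Rough illustration only: per-layer singular values of an unfolded Conv2d
# weight, from which AdaS-style quality metrics are derived. The helper name
# and the proxy formulas below are assumptions, not the repository's code.
import torch


def layer_singular_values(conv_weight: torch.Tensor) -> torch.Tensor:
    """Singular values of a Conv2d weight unfolded to (C_out, C_in * k * k)."""
    matrix = conv_weight.reshape(conv_weight.shape[0], -1)
    return torch.linalg.svdvals(matrix)


weight = torch.nn.Conv2d(64, 128, kernel_size=3).weight.detach()
s = layer_singular_values(weight)
knowledge_gain_proxy = s.sum() / (len(s) * s.max())    # normalized "energy" of the layer
mapping_condition_proxy = s.max() / s.min()            # condition number of the mapping
print(f"KG proxy: {knowledge_gain_proxy.item():.3f}, "
      f"MC proxy: {mapping_condition_proxy.item():.1f}")
```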

Requirements

Software/Hardware

We use Python 3.7.

Please refer to Requirements on the Wiki for the complete guidelines.

Computational Overhead

AdaS introduces minimal to no overhead over adaptive optimizers; for example, mSGD+StepLR, mSGD+AdaS, and AdaM all take 40-43 sec/epoch to train ResNet34 on CIFAR10 on the same PC/GPU platform:

| Optimizer | Learning Rate Scheduler | Epoch Time (avg.) | RAM Consumed | GPU Memory Consumed |
| --------- | ----------------------- | ----------------- | ------------ | ------------------- |
| mSGD      | StepLR                  | 40-43 seconds     | ~2.75 GB     | ~3.0 GB             |
| mSGD      | AdaS                    | 40-43 seconds     | ~2.75 GB     | ~3.0 GB             |
| AdaM      | None                    | 40-43 seconds     | ~2.75 GB     | ~3.0 GB             |
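
Numbers like these can be reproduced with a thin wrapper around your own training loop. The sketch below is a generic measurement helper using standard PyTorch utilities; the `train_one_epoch` argument is a placeholder for your own function, not part of this repository, and note that `max_memory_allocated` reports memory allocated by PyTorch, which can differ from what nvidia-smi shows.

```python
# Generic sketch for measuring per-epoch wall-clock time and peak GPU memory.
# `train_one_epoch` is a placeholder for your own training-loop function.
import time
import torch


def timed_epoch(train_one_epoch, *args, **kwargs):
    if torch.cuda.is_available():
        torch.cuda.reset_peak_memory_stats()
    start = time.perf_counter()
    result = train_one_epoch(*args, **kwargs)
    elapsed = time.perf_counter() - start
    peak_gb = (torch.cuda.max_memory_allocated() / 1024 ** 3
               if torch.cuda.is_available() else 0.0)
    print(f"epoch time: {elapsed:.1f} s, peak GPU memory: {peak_gb:.2f} GB")
    return result
```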

Installation

There are two versions of the AdaS code contained in this repository.

  1. a Python package version of the AdaS code, which can be pip-installed.
  2. a static Python module (unpackaged), runnable as a script.

All source code can be found in src/adas.

For more information, also refer to Installation on the Wiki.

Usage

Moving forward, I will refer to console usage of this library; IDE usage is no different. Training options are split two ways:

  1. all environment/infrastructure options (GPU usage, output paths, etc.) are specified using command-line arguments.
  2. training-specific options (network, dataset, hyper-parameters, etc.) are specified using a config.yaml configuration file:
###### Application Specific ######
dataset: 'CIFAR10'
network: 'VGG16'
optimizer: 'SGD'
scheduler: 'AdaS'


###### Suggested Tune ######
init_lr: 0.03
early_stop_threshold: 0.001
optimizer_kwargs:
  momentum: 0.9
  weight_decay: 5e-4
scheduler_kwargs:
  beta: 0.8

###### Suggested Default ######
n_trials: 5
max_epoch: 150
num_workers: 4
early_stop_patience: 10
mini_batch_size: 128
p: 1 # options: 1, 2.
loss: 'cross_entropy'

For complete instructions on configuration and parameter setup, please refer to Configuration on the Wiki.
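
As a quick illustration of how such a file can be consumed, the snippet below loads it with PyYAML and reads a few of the options shown above; this is a generic sketch, not the repository's own configuration loader.

```python
# Generic sketch: load the config.yaml shown above with PyYAML and read a few
# options. This is not the repository's own configuration loader.
import yaml

with open("config.yaml") as f:
    config = yaml.safe_load(f)

print(config["optimizer"], "+", config["scheduler"])    # e.g. SGD + AdaS
print("init_lr:", config["init_lr"])                    # e.g. 0.03
print("beta:", config["scheduler_kwargs"]["beta"])      # e.g. 0.8
```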

Common Issues (running list)

  • None :)

TODO

  • Add medical imaging datasets (e.g. digital pathology, X-ray, and CT scans)
  • Extension of AdaS to Deep Neural Networks

Pytest

Note the following:

  • Our Pytest suite writes/downloads data and files to /tmp, so if you don't have a /tmp folder (e.g. on Windows), adjust the paths before running the tests yourself; one portable alternative is sketched below
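
One portable option (a suggestion, not the test suite's current behaviour) is to derive the location from the platform's temporary directory instead of hard-coding /tmp:

```python
# Portable alternative to a hard-coded /tmp path: use the platform's
# temporary directory. A suggestion, not the test suite's current behaviour.
import tempfile
from pathlib import Path

TMP_DIR = Path(tempfile.gettempdir()) / "adas_tests"
TMP_DIR.mkdir(parents=True, exist_ok=True)
print(TMP_DIR)  # e.g. /tmp/adas_tests on Linux, ...\Temp\adas_tests on Windows
```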
