boringlee24/dash

DASH

DASH schedules deep learning training jobs across heterogeneous GPU types in a cluster.

Hardware setup

16 NVIDIA Tesla K80 GPUs

8 NVIDIA Tesla V100 GPUs

Software environment

  • Python 3.6
  • TensorFlow 1.14
  • CUDA 10.0
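
As a convenience, the versions above could be checked before launching anything. The following is a minimal sketch, not part of DASH: the helper names `matches` and `check_environment` are hypothetical, and treating `1.14.x` as satisfying `1.14` is an assumption. The CUDA version is not checked here.

```python
import sys

# Versions listed in this README. Prefix matching (1.14.0 satisfies 1.14)
# is an assumption of this sketch.
REQUIRED = {"python": "3.6", "tensorflow": "1.14"}


def matches(found, required):
    """True if `found` equals `required` or extends it with a dotted suffix."""
    return found == required or found.startswith(required + ".")


def check_environment():
    """Return {name: (found_version, is_ok)} for Python and TensorFlow."""
    found = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        import tensorflow as tf
        found["tensorflow"] = tf.__version__
    except Exception:
        # TensorFlow is missing or failed to import.
        found["tensorflow"] = None
    return {
        name: (version, version is not None and matches(version, REQUIRED[name]))
        for name, version in found.items()
    }


if __name__ == "__main__":
    for name, (version, ok) in sorted(check_environment().items()):
        print("{}: found {}, {}".format(name, version, "OK" if ok else "MISMATCH"))
```

Running the script on each node prints one line per component, so a mismatched node can be spotted before the servers are started.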

Benchmark models and dataset

  1. DenseNet: link
  2. ResNet: link
  3. VGG: link
  4. MobileNet: link
  5. MnasNet: link

CIFAR10 dataset: link

Start DASH

Go to directory final/final4_new/.

First, allocate an external node (the scheduler node) to act as the TCP client.

On each GPU node, start the TCP server:

python gpu_server.py args

Then, on the external node, start DASH scheduling on the benchmark:

python main.py args
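
The steps above describe a TCP client on the scheduler node talking to a TCP server on each GPU node. The sketch below illustrates that general pattern using only the Python standard library. It is not DASH's actual protocol: the newline-delimited JSON framing, the message fields (`cmd`, `job`, `gpu`), and the function names are all assumptions made for illustration.

```python
import json
import socket
import threading


def serve_once(server_sock):
    """GPU-node side: accept one connection, read one JSON line, reply."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        request = json.loads(reader.readline())
        # Hypothetical reply: acknowledge and echo the request back.
        reply = {"ok": True, "echo": request}
        conn.sendall((json.dumps(reply) + "\n").encode("utf-8"))


def send_request(host, port, message):
    """Scheduler side: send one JSON line and wait for one JSON line back."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall((json.dumps(message) + "\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return json.loads(reader.readline())


def demo():
    """Run server and client in one process on localhost."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()
    try:
        # Hypothetical message fields, not taken from DASH.
        return send_request("127.0.0.1", port, {"cmd": "start", "job": 1, "gpu": "V100"})
    finally:
        worker.join()
        server_sock.close()


if __name__ == "__main__":
    print(demo())
```

In a real deployment the server would listen on an address reachable from the scheduler node and loop over incoming connections; the single-request, localhost-only demo here only shows the request/reply round trip.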