
MutNN

WARNING: This is alpha software. Have fun!

MutNN is an experimental ONNX runtime that supports seamless multi-GPU graph execution on CUDA GPUs and provides baseline implementations of both model and data parallelism.

V0.2: Adds support for multi-graph / multi-tenant NN execution!

Developed by Benjamin Ghaemmaghami & Saurabh Gupta

Usage

import onnx2parla as o2p
import numpy as np

def load_batch(start, end):
    batch_size = end - start
    # cast to float32: MutNN assumes float32 datatypes (see Known Issues)
    return np.random.random((batch_size, 3, 224, 224)).astype(np.float32)

def store_batch(res):
    print(res)

# config with batch_size = 16, total_batches = 256
cfg = o2p.Config(store_batch, load_batch, 16, 256)
o2p_model = o2p.build("resnet18v1.onnx", cfg)

o2p_model.run()
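For real workloads, `load_batch` would typically slice a preloaded dataset rather than generate fresh random data, and `store_batch` would collect results rather than print them. A minimal sketch of that pattern (the `dataset` array and `results` list here are illustrative, not part of the MutNN API):

```python
import numpy as np

# Illustrative stand-in: 256 RGB images at 224x224 resolution
dataset = np.random.random((256, 3, 224, 224)).astype(np.float32)
results = []

def load_batch(start, end):
    # Slice the requested batch; keep float32, since MutNN assumes
    # float32 datatypes (see Known Issues)
    return dataset[start:end]

def store_batch(res):
    # Accumulate results instead of printing them
    results.append(res)

# The runtime invokes the hooks batch by batch, e.g.:
store_batch(load_batch(0, 16))
print(results[0].shape)  # (16, 3, 224, 224)
```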

Operator Support

Current operator support is limited, with most of the focus placed on supporting CNNs.

The full list is here

Where can I get ONNX graphs?

The ONNX Model Zoo is a great place to find a wide selection of pre-trained neural network graphs. All of the major frameworks can also export their models to ONNX graphs.

Known Issues / Limitations

  • Datatypes are assumed to be float32
  • No support for multiple graph inputs/outputs
  • Conv operator can occasionally crash when run on GPU
  • No ONNX versioning support, most operators support only the newest version

Diagnosing Problems

First, check your ONNX graph with a tool like Netron to ensure that all operators in your graph are supported by ONNX2Parla.

Next, enable pass debugging mode

cfg = o2p.Config(...)
cfg.debug_passes = True

and inspect the generated GML files to see if your graph is correctly being processed by the system.
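GML is a standard graph format, so the generated files can also be inspected programmatically with `networkx`. A minimal sketch of the round trip (the graph below is an illustrative stand-in, not MutNN's actual pass output, and the file name is hypothetical):

```python
import networkx as nx

# Illustrative stand-in for a pass-debug dump; MutNN's actual
# node names and attributes will differ.
g = nx.DiGraph()
g.add_edge("input", "Conv_0")
g.add_edge("Conv_0", "Relu_1")
nx.write_gml(g, "pass_example.gml")

# Load the dump and list its nodes and edges
loaded = nx.read_gml("pass_example.gml")
print(loaded.number_of_nodes())  # 3
for u, v in loaded.edges():
    print(u, "->", v)
```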
