
neon

neon is Intel Nervana's reference deep learning framework, committed to the best performance on all hardware and designed for ease of use and extensibility.

For fast iteration and model exploration, neon offers the fastest performance among deep learning libraries (2x the speed of cuDNNv4; see benchmarks).

  • 2.5s/macrobatch (3072 images) on AlexNet on Titan X (Full run on 1 GPU ~ 26 hrs)
  • Training VGG with 16-bit floating point on 1 Titan X takes ~10 days (original paper: 4 GPUs for 2-3 weeks)

We use neon internally at Intel Nervana to solve our customers' problems across many domains. We are hiring across several roles. Apply here!

See the new features in our latest release. We want to highlight that neon v2.0.0+ has been optimized for much better performance on CPUs by enabling Intel Math Kernel Library (MKL). Remember to turn on MKL by adding -b mkl when running neon on Intel Xeon and Xeon Phi CPUs! The DNN (Deep Neural Networks) component of MKL that is used by neon is provided free of charge and downloaded automatically as part of the neon installation.

Quick Install

On a macOS or Linux machine, enter the following to download and install neon (conda users see the guide), and use it to train your first multi-layer perceptron. To force a Python 2 or Python 3 install, replace make below with either make python2 or make python3.

    git clone https://github.com/NervanaSystems/neon.git
    cd neon
    make
    . .venv/bin/activate
    # Run an example script with the optimized CPU (MKL) backend; without `-b mkl`, neon defaults to the non-optimized CPU backend (cpu):
    python examples/mnist_mlp.py -b mkl
    # Alternatively, use a YAML file (defaults to the GPU backend if one is available; add a line containing `backend: mkl` to enable the MKL backend):
    neon examples/mnist_mlp.yaml

Recommended Settings for neon with MKL on Intel Architectures

The Intel Math Kernel Library takes advantage of the parallelization and vectorization capabilities of Intel Xeon and Xeon Phi systems. When hyperthreading is enabled on the system, we recommend the following KMP_AFFINITY setting to make sure parallel threads are mapped 1:1 to the available physical cores.

    export OMP_NUM_THREADS=<Number of Physical Cores>
    export KMP_AFFINITY=compact,1,0,granularity=fine

For more information about KMP_AFFINITY, please check here. We encourage users to experiment and establish their own best-performing settings.
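
One way to fill in `<Number of Physical Cores>` is to ask the operating system. The sketch below is an assumption and not part of neon: the helper name `physical_cores` is hypothetical, it relies on `lscpu` (Linux) or `sysctl -n hw.physicalcpu` (macOS), and it falls back to the logical CPU count from `getconf`, which overcounts when hyperthreading is enabled.

```shell
#!/bin/sh
# Hypothetical helper: print the number of physical cores (not hyperthreads).
physical_cores() {
    n=""
    if command -v lscpu >/dev/null 2>&1; then
        # Linux: each unique (core, socket) pair is one physical core.
        n=$(lscpu -p=Core,Socket 2>/dev/null | grep -v '^#' | sort -u | wc -l | tr -d ' ')
    elif [ "$(uname -s)" = "Darwin" ]; then
        # macOS: the kernel reports the physical core count directly.
        n=$(sysctl -n hw.physicalcpu 2>/dev/null)
    fi
    case "$n" in
        # Empty, zero, or non-numeric result: fall back to the logical CPU count.
        ''|0|*[!0-9]*) n=$(getconf _NPROCESSORS_ONLN) ;;
    esac
    echo "$n"
}

# Apply the recommended settings from this section.
OMP_NUM_THREADS="$(physical_cores)"
export OMP_NUM_THREADS
export KMP_AFFINITY=compact,1,0,granularity=fine
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS KMP_AFFINITY=$KMP_AFFINITY"
```

Because the variables are exported, the snippet needs to be sourced (for example with `. ./set_mkl_env.sh`, a hypothetical file name) in the same shell that later launches neon, rather than executed as a separate process.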

Documentation

The complete documentation for neon is available here.

Support

For any bugs or feature requests, please:

  1. Search the open and closed issues list to see if we're already working on what you have uncovered.
  2. Check that your issue/request hasn't already been addressed in our Frequently Asked Questions (FAQ) or neon-users Google group.
  3. File a new issue or submit a new pull request if you have some code you'd like to contribute.

For other questions and discussions, please post a message to the neon-users Google group.

License

We are releasing neon under the open-source Apache 2.0 License. We welcome you to contact us with your use cases.
