MNIST in Neon and TensorFlow

This repository includes implementations of a deep learning model using two different frameworks: Intel Nervana Neon and Google TensorFlow.

These implementations are intended to illustrate the differences in the programming models presented by the two frameworks.

The Problem

The model solves a classic problem from the machine learning community: assign a 28 × 28 pixel grayscale image of a handwritten digit to the correct one of ten classes.

The model is trained and tested on the 70,000 images of the MNIST database. Each image reaches the model as a flat vector of 784 values (28 × 28), which is why that dimension appears throughout the code below.

The Implementations

Parsing Arguments

The Neon implementation uses a NeonArgparser instance to parse command-line arguments:

if __name__ == '__main__':
    main(NeonArgparser(__doc__).parse_args())
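
For orientation, the parsed arguments carry the attributes that this script uses later; the names below are taken from the calls shown in the rest of this document:

    args = NeonArgparser(__doc__).parse_args()
    args.data_dir   # cache directory for the MNIST download
    args.epochs     # number of training epochs, passed to fit()
    args.rounding   # stochastic-rounding flag, passed to the optimizer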

The TensorFlow implementation uses an ArgumentParser instance. Its single data_dir argument specifies where downloaded training data is cached. parse_known_args() separates the flags the parser recognizes from those it does not; tf.app.run() then invokes our main() function, forwarding the unrecognized arguments to it.


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--data_dir', type=str, default='/tmp/mnist_data',
                        help='Directory for storing input data')
    FLAGS, unparsed = parser.parse_known_args()
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
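
For example, to cache the data somewhere other than the default (the directory here is just an illustration):

(tensorflow) :neon-tf-mnist $ python tf_mnist_mlp.py --data_dir /tmp/my_mnist_cache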

Preparing Data

The Neon implementation uses an MNIST instance to acquire the data sets. The MNIST instance handles downloading the MNIST database into a local cache.

    dataset = MNIST(path=args.data_dir)
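
The returned object exposes the iterators that the rest of the script consumes:

    train_set = dataset.train_iter   # consumed by mlp.fit() below
    valid_set = dataset.valid_iter   # consumed by mlp.eval() below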

The TensorFlow implementation uses the input_data module's read_data_sets() function, which likewise downloads the database into a local cache.

    mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
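
The returned object bundles the data splits and batching that the rest of the script touches; a quick sketch of that interface as it is used below:

    batch_xs, batch_ys = mnist.train.next_batch(128)  # images (128, 784), one-hot labels (128, 10)
    test_images = mnist.test.images                   # fed to the accuracy graph at the end
    test_labels = mnist.test.labels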

Defining the Model

The Neon implementation defines the model as two Affine layers with Gaussian initialization. The first layer has a rectified linear activation, and the second a Logistic activation.

The model is instantiated directly with these two layers.

    mlp = Model(
        layers=[
            Affine(nout=100, init=Gaussian(loc=0.0, scale=0.01),
                   activation=Rectlin()),
            Affine(nout=10, init=Gaussian(loc=0.0, scale=0.01),
                   activation=Logistic(shortcut=True))])

The TensorFlow implementation defines the model as a collection of placeholder, variable, and operation objects.

These objects are actually references into a graph representation of the model. This representation expresses the dependencies among the inputs, intermediate values, outputs, and the matrix operations that connect them.

    x = tf.placeholder(tf.float32, [None, 784])
    W1 = tf.Variable(tf.random_normal_initializer()([784, 100]))
    b1 = tf.Variable(tf.random_normal_initializer()([100]))
    W2 = tf.Variable(tf.random_normal_initializer()([100, 10]))
    b2 = tf.Variable(tf.random_normal_initializer()([10]))
    y = tf.matmul(tf.nn.relu(tf.matmul(x, W1) + b1), W2) + b2
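
Both definitions describe the same two-layer network (784 → 100 → 10). The only asymmetry is the final activation: Neon applies a Logistic to the last layer's output, while the TensorFlow graph leaves raw logits for the softmax cross-entropy used below. A framework-free NumPy sketch of the forward pass both express:

    import numpy as np

    def forward(x, W1, b1, W2, b2):
        # x is a (N, 784) batch of flattened images; returns (N, 10) class scores
        h = np.maximum(0.0, x @ W1 + b1)  # first Affine layer with Rectlin/relu
        return h @ W2 + b2                # second Affine layer (logits)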

Defining the Optimizer

The Neon implementation defines the optimizer as an instance of GradientDescentMomentum:

    optimizer = GradientDescentMomentum(
        0.1, momentum_coef=0.9, stochastic_round=args.rounding)

The TensorFlow implementation defines the optimizer as a collection of placeholder and training-operation objects.

Again, these objects are references into a graph representation of the optimizer.

    y_ = tf.placeholder(tf.float32, [None, 10])
    train_step = tf.train.MomentumOptimizer(0.1, 0.9).minimize(
      tf.reduce_mean(
          tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y)))
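
Both optimizers perform stochastic gradient descent with momentum, with learning rate 0.1 and momentum coefficient 0.9. A sketch of one common formulation of the update rule (the two frameworks differ slightly in how they book-keep the velocity term):

    def momentum_update(w, grad, velocity, lr=0.1, momentum=0.9):
        # velocity accumulates an exponentially decaying sum of past gradients
        velocity = momentum * velocity - lr * grad
        return w + velocity, velocity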

Fitting the Model

The Neon implementation fits the model to the training set by passing the optimizer to its fit() method. The cost function is specified here as a GeneralizedCost layer wrapping the CrossEntropyBinary function; this pairs with the Logistic(shortcut=True) activation in the model definition, which lets Neon use the simplified gradient of the combined logistic/cross-entropy expression. The number of training epochs is taken from the command-line arguments.

    mlp.fit(
        dataset.train_iter,
        optimizer=optimizer,
        num_epochs=args.epochs,
        cost=GeneralizedCost(costfunc=CrossEntropyBinary()),
        callbacks=callbacks)

The TensorFlow implementation fits the model to the training set by

  1. registering a default session in which to execute the graph,

  2. initializing the global variables,

  3. acquiring a batch of training data,

  4. running the optimizer with the batch mapped to placeholders in the model, and

  5. repeating steps 3 and 4.

    sess = tf.InteractiveSession()
    tf.global_variables_initializer().run()
    for _ in range(4690):
        batch_xs, batch_ys = mnist.train.next_batch(128)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
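
At 128 images per step, 4,690 steps works out to roughly 600,000 image presentations, on the order of ten passes over MNIST's 60,000 training images.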

Displaying Accuracy

The Neon implementation evaluates the model on the validation set by calling its eval() method with the Misclassification metric, which returns the error rate; the script reports one minus that value as the classification accuracy.

    error_rate = mlp.eval(dataset.valid_iter, metric=Misclassification())
    neon_logger.display('Classification accuracy = %.4f' % (1 - error_rate))

The TensorFlow implementation defines an accuracy measurement as a collection of tensor operations.

These objects are actually references into a graph representation of the accuracy formula. The script evaluates the accuracy by running this graph with the test set mapped to the placeholders.

    accuracy = tf.reduce_mean(                             # fraction of correct predictions
        tf.cast(
            tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1)),   # per-example correctness (boolean)
            tf.float32))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                        y_: mnist.test.labels}))

Running

To run the Neon implementation, follow the instructions for installing Neon, then enter:

(.venv2) :neon-tf-mnist $ python neon_mnist_mlp.py 

To run the TensorFlow implementation, follow the instructions for installing TensorFlow, then enter:

(tensorflow) :neon-tf-mnist $ python tf_mnist_mlp.py
