dtmooreiv/distributedNeuralNetwork

This project uses code from Michael Nielsen's Neural Networks and Deep Learning repository: https://github.com/mnielsen/neural-networks-and-deep-learning

It currently handles provisioning a GPU instance on AWS and installing the project's dependencies. We plan to extend the project to support multi-node training.

We treat the first node in the hosts file generated by buildInventory.js as the master and the remaining nodes as slaves. In a production system, it might be worthwhile for the master node NOT to be a GPU-capable server, since the master does not need a GPU and benefits instead from a fast CPU. I have not investigated this, but I expect the effect to be marginal.

### To do

  • In network_master, dispatch_request needs to update the weights and biases of net.
  • In network_master, rounds_done needs to be updated so that the master stops calling the train method on the slaves once the work is done.

### To run

  • Start network_slaves on all slave nodes.
  • Ensure there is a file named hosts at '/home/ubuntu/hosts'. This file is normally built by buildInventory.js, but since you (Michael) will mostly be testing this on your own machine, create one at that location yourself. The only part that matters is that the IPs of the slave nodes appear from the third line to the end of the file, and that at least one space separates each IP from any other characters on the same line.
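
The hosts-file layout described above can be read with a few lines of Python. This is only a sketch of the format as stated in this README, not the code network_master actually uses; the function names `parse_slave_ips` and `read_slave_ips` are hypothetical, and skipping blank lines is an assumption.

```python
def parse_slave_ips(hosts_text):
    """Extract slave IPs from the text of a hosts file.

    Assumes the layout described in this README: slave IPs start on the
    third line, and each IP is the first whitespace-delimited token on
    its line. Skipping blank lines is an assumption of this sketch.
    """
    ips = []
    # The first two lines are not slave entries, so start at index 2.
    for line in hosts_text.splitlines()[2:]:
        fields = line.split()
        if fields:
            ips.append(fields[0])
    return ips


def read_slave_ips(hosts_path="/home/ubuntu/hosts"):
    """Read the hosts file at hosts_path and return the slave IPs."""
    with open(hosts_path) as hosts_file:
        return parse_slave_ips(hosts_file.read())
```

When testing locally, calling `read_slave_ips()` on a hand-written hosts file is a quick way to check that the file has the expected shape before starting the master.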
