Smart and distributed hyperparameter/architecture search for neural nets and other models.
If you are going to experiment with creating your own clusters, it is highly encouraged
to use a separate AWS account (make sure to provide the proper path to a credentials file).
Currently, when a cluster is terminated, it kills all running instances it can find.
Isolating netron clusters from other running instances is on the TODO list.
- Isolate netron clusters from other instances.
- JobReporter. How to store the progress of training? How to display the progress? Separate dashboard?
- Plugins for model evolution:
  - GridSearch
  - RandomSearch
  - NEAT/HyperNEAT
  - Bayesian Optimization
  - Reversible learning
  - QLearning?
  - ...
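To make the first two plugin ideas concrete, here is a minimal sketch of what a grid-search candidate generator boils down to. This is a hypothetical illustration, not netron's actual plugin API: it simply enumerates every combination of a parameter grid.

```python
import itertools

def grid_search(grid):
    """Yield every combination of the parameter grid as a dict.

    `grid` maps parameter names to lists of candidate values,
    e.g. {"lr": [0.1, 0.01], "units": [32, 64]}.
    """
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

# Example: 2 learning rates x 2 layer sizes -> 4 candidate configurations
candidates = list(grid_search({"lr": [0.1, 0.01], "units": [32, 64]}))
```

A RandomSearch plugin would instead draw a fixed number of random combinations from the same grid.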
- Install Python dependencies: `sudo pip install -r requirements.txt`
- Install the latest VirtualBox and Vagrant.
- Start a Vagrant box with `vagrant up`. It will start a virtual machine with MongoDB running inside a Docker container.
If you are running on a Linux server, you don't need Vagrant and a virtual machine; you can simply start Mongo in a Docker container:

`docker pull mongo`

`docker run -p 27017:27017 --name netron-mongo -d mongo`

To stop the container, run `docker stop netron-mongo`; to start it again, run `docker start netron-mongo`.
This will find a network to model sin(x).
- Start a server: `python server.py --input_shape 1 --output_dim 1 --data sin_data.npz --solver GridSearch --grid simple_params_grid.json`
- Start a worker: `python worker.py --server http://localhost:8080 --nb_epoch 10 --patience 5`
Check `python server.py -h` and `python worker.py -h` for an explanation of the arguments.
You can check the results of the training by opening http://localhost:8080/stats/ in your browser.
This will find networks for the MNIST database using RandomSearch.
- Start a server: `python server.py --input_shape 1,28,28 --output_dim 10 --data mnist_data.npz --solver RandomSearch --grid convnet_grid.json --params_sample_size 1 --structure_sample_size 1`
- Start a worker: `python worker.py --server http://localhost:8080 --nb_epoch 10 --patience 5`
`--structure_sample_size` specifies how many network structures to sample.
`--params_sample_size` specifies how many sets of parameters to sample for every network structure.
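As a rough illustration of how these two options interact (a hedged sketch, not netron's actual sampler): for each sampled structure, a fresh batch of parameter sets is drawn, so the total number of candidates to train is the product of the two sizes.

```python
import random

def sample_candidates(structures, param_grid,
                      structure_sample_size, params_sample_size, seed=0):
    """Pair randomly chosen structures with randomly drawn parameter sets.

    Hypothetical helper for illustration only: `structures` is a list of
    candidate architectures, `param_grid` maps parameter names to value lists.
    """
    rng = random.Random(seed)
    candidates = []
    for _ in range(structure_sample_size):
        structure = rng.choice(structures)
        for _ in range(params_sample_size):
            params = {name: rng.choice(values)
                      for name, values in param_grid.items()}
            candidates.append((structure, params))
    return candidates

# 2 structures x 3 parameter sets each -> 6 candidates to train
candidates = sample_candidates(
    structures=["dense-dense", "conv-dense"],
    param_grid={"lr": [0.1, 0.01], "units": [32, 64]},
    structure_sample_size=2,
    params_sample_size=3,
)
```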
Training data must be stored as a compressed numpy file in `netron/server/static/`. To create such a file for your training data:

`np.savez_compressed("data_filename.npz", X_train=your_x_train, y_train=your_y_train)`

where `your_x_train` and `your_y_train` are numpy arrays.
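For example, to package data for the sin(x) experiment above (the filename `sin_data.npz` matches the server example; remember to place the resulting file in `netron/server/static/`):

```python
import numpy as np

# 1000 samples of sin(x); shapes match --input_shape 1 and --output_dim 1
x_train = np.linspace(-np.pi, np.pi, 1000).reshape(-1, 1)
y_train = np.sin(x_train)

# Save under the keys the server expects: X_train and y_train
np.savez_compressed("sin_data.npz", X_train=x_train, y_train=y_train)
```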
- Open a separate branch and make your changes there.
- Open a pull request to master (for now; this will change later).