trevorlindsay/lstm

RNN Project

Model Configuration File

Overview

The model is built and run from the settings in the configuration file. The file is divided into two sections: Data and Model. Each parameter is described below.

Data

Required

  1. Data::Input:: - path to the CSV input data
  2. Data::Id:: - instrument id column
  3. Data::Time:: - time column
  4. Data::Weight:: - column to use for sample weights
  5. Data::NumericalFeatures:: - comma-separated list of numerical features (including the time index and instrument id)
  6. Data::CategoricalFeatures:: - comma-separated list of categorical features

Required Only for Training

  1. Data::OutputInSample:: - path to the CSV file for in-sample output predictions
  2. Data::OutputOutSample:: - path to the CSV file for out-of-sample output predictions
  3. Data::Target:: - target column to predict
  4. Data::StartInSample:: - time index at which the in-sample period begins (can be omitted for alpha predictions)
  5. Data::EndInSample:: - time index at which the in-sample period ends
  6. Data::StartOutSample:: - time index at which the out-of-sample period begins
  7. Data::EndOutSample:: - time index at which the out-of-sample period ends
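As an illustration of how the four time-index parameters partition the data, here is a minimal sketch in Python. The function name `split_by_time`, the list-of-dicts row format, and the choice to treat both ends of each range as inclusive are assumptions made for this example; the README does not say how the project handles the boundaries.

```python
def split_by_time(rows, time_col, start_in, end_in, start_out, end_out):
    """Partition rows into in-sample and out-of-sample sets by time index.

    Assumption: both ranges are inclusive at each end.
    """
    in_sample = [r for r in rows if start_in <= r[time_col] <= end_in]
    out_sample = [r for r in rows if start_out <= r[time_col] <= end_out]
    return in_sample, out_sample


# Ten rows with time indices 0..9 for a single hypothetical instrument "A".
rows = [{"t": t, "id": "A"} for t in range(10)]
in_sample, out_sample = split_by_time(rows, "t", 0, 6, 7, 9)
```

With these example values, time indices 0 through 6 fall in-sample and 7 through 9 fall out-of-sample.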

Required Only for Alpha Predictions (model read from file)

  1. Data::AlphaDirectory:: - path where prediction output is saved when the model is read from file

Model

Required (even if model read from file)

  1. Model::RNN:: - number of neurons in each of the model’s RNN layers (comma-separated list, one number for each layer)
  2. Model::Dense:: - number of neurons in each of the model’s fully-connected layers (comma-separated list, one number for each layer)
  3. Model::Activation:: - activation function to use (must be one of: tanh, relu, sigmoid, softmax)
  4. Model::BatchSize:: - number of instruments to feed through model at once
  5. Model::StepSize:: - number of timesteps to train model on at once (set to -1 to use the max available)
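The model parameters above could be interpreted along the following lines. This is a hedged sketch, not the project's own parsing code: the helper names (`parse_layer_sizes`, `check_activation`, `resolve_step_size`) are hypothetical, and rejecting non-positive layer sizes is an assumption.

```python
ALLOWED_ACTIVATIONS = {"tanh", "relu", "sigmoid", "softmax"}


def parse_layer_sizes(value):
    """Turn a comma-separated string such as "128, 64" into [128, 64]."""
    sizes = [int(part) for part in value.split(",")]
    if any(n <= 0 for n in sizes):
        # Assumption: a layer must have at least one neuron.
        raise ValueError(f"layer sizes must be positive: {sizes}")
    return sizes


def check_activation(name):
    """Accept only the four activation names listed in the README."""
    if name not in ALLOWED_ACTIVATIONS:
        raise ValueError(f"unsupported activation: {name!r}")
    return name


def resolve_step_size(step_size, max_available):
    """Interpret -1 as 'use the maximum number of timesteps available'."""
    return max_available if step_size == -1 else step_size


rnn_layers = parse_layer_sizes("128, 64")   # one entry per RNN layer
steps = resolve_step_size(-1, 50)           # falls back to the maximum
```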

Required Only for Training

  1. Model::NumEpochs:: - number of epochs for training
  2. Model::LearningRate:: - learning rate for Adam optimizer
  3. Model::KeepProb:: - dropout keep probability applied between layers
  4. Model::InitScale:: - max of range for initializing weights
  5. Model::MaxGradNorm:: - maximum allowed gradient norm; gradients with a larger norm are clipped to this value
  6. Model::OutputDirectory:: - directory to save model
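The name MaxGradNorm suggests norm-based gradient clipping. Assuming that is what the project does (the README does not state the exact scheme), the idea can be sketched in plain Python as follows. The function name `clip_by_norm` and the flat list of gradient values are simplifications for illustration.

```python
import math


def clip_by_norm(grads, max_grad_norm):
    """Rescale grads so their L2 norm does not exceed max_grad_norm.

    Returns the (possibly rescaled) gradients and the original norm.
    """
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_grad_norm:
        # Already within the limit: leave the gradients unchanged.
        return list(grads), norm
    scale = max_grad_norm / norm
    return [g * scale for g in grads], norm


# A gradient of norm 5 clipped to norm 1 keeps its direction.
clipped, original_norm = clip_by_norm([3.0, 4.0], 1.0)
```

Rescaling preserves the direction of the gradient and only shrinks its magnitude, which is why this form is commonly used when training recurrent networks.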

Required Only When Model Read from File (must omit otherwise)

  1. Model::InputDirectory:: - path to saved model

Optional

  1. Model::NumCores:: - number of CPU cores to use when running model (if not specified, will be set to max available)
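The requirement rules above can be summarized as a small check. The sketch below assumes the configuration has already been loaded into a Python dict keyed by names such as "Data::Input"; that representation, the key spelling without the trailing "::", and the function name `missing_keys` are assumptions, since the README does not show the file syntax. The run mode is inferred from the presence of Model::InputDirectory, which must be omitted unless the model is read from file.

```python
ALWAYS_REQUIRED = [
    "Data::Input", "Data::Id", "Data::Time", "Data::Weight",
    "Data::NumericalFeatures", "Data::CategoricalFeatures",
    "Model::RNN", "Model::Dense", "Model::Activation",
    "Model::BatchSize", "Model::StepSize",
]
TRAINING_ONLY = [
    "Data::OutputInSample", "Data::OutputOutSample", "Data::Target",
    "Data::StartInSample", "Data::EndInSample",
    "Data::StartOutSample", "Data::EndOutSample",
    "Model::NumEpochs", "Model::LearningRate", "Model::KeepProb",
    "Model::InitScale", "Model::MaxGradNorm", "Model::OutputDirectory",
]
FROM_FILE_ONLY = ["Data::AlphaDirectory", "Model::InputDirectory"]


def missing_keys(config):
    """List the required keys that are absent from config."""
    from_file = "Model::InputDirectory" in config
    required = ALWAYS_REQUIRED + (FROM_FILE_ONLY if from_file else TRAINING_ONLY)
    return [key for key in required if key not in config]


# A config that only names a saved model is missing everything else
# needed for alpha predictions, but none of the training-only keys.
gaps = missing_keys({"Model::InputDirectory": "saved_model"})
```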

Using Tensorboard Summaries

Whenever a model is run, it produces summaries that can be viewed with TensorBoard. The summaries track two scalar values (mean squared error and r-squared) while the model trains.
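For reference, the two tracked scalars can be computed as in the sketch below. This uses the standard unweighted definitions; whether the project applies the sample weights from Data::Weight to these metrics is not stated, so the unweighted form here is an assumption.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; undefined when all targets are identical."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


targets = [1.0, 2.0, 3.0]
perfect = r_squared(targets, [1.0, 2.0, 3.0])    # predictions match exactly
baseline = r_squared(targets, [2.0, 2.0, 2.0])   # always predict the mean
```

A model that always predicts the target mean scores an r-squared of zero, so positive values indicate an improvement over that baseline.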

To view the summaries with TensorBoard, run the following command: tensorboard --logdir=$PATH/TO/SUMMARIES

Then open the URL printed by the command (usually http://0.0.0.0:6006) in a web browser to view the graphs (note: there are known issues with using TensorBoard in Safari).

[Figure: example of TensorBoard output]
