
# Timeseries Forecasting with Deep Learning

This Python project uses LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) recurrent neural networks to forecast time series, using Keras + Theano. We compare the results produced by each of these deep networks with those from a linear regression model.

Dataset: Number of daily births in Quebec, Jan. 01 '77 - Dec. 31 '90 (Hipel & McLeod, 1994)
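
For context, here is a minimal sketch of how such a series can be loaded and framed as a supervised learning problem with a fixed look-back window. The column selection and the `make_windows` helper are illustrative assumptions, not the repo's actual code (`evaluate.py` does the real preprocessing):

```python
import numpy as np
import pandas as pd

# Illustrative only: the value column is assumed to be the last CSV column.
df = pd.read_csv('data/number-of-daily-births-in-quebec.csv')
series = df.iloc[:, -1].dropna().values.astype('float32')

def make_windows(series, look_back):
    """Frame a 1-D series as (look-back window, next value) pairs."""
    X, y = [], []
    for i in range(len(series) - look_back):
        X.append(series[i:i + look_back])
        y.append(series[i + look_back])
    return np.array(X), np.array(y)

X, y = make_windows(series, look_back=37)  # 37 = best look-back reported below
```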

## Usage

I suggest you install Virtualenv before trying this out.

```
git clone https://github.com/dhrushilbadani/deeplearning-timeseries.git
cd deeplearning-timeseries
virtualenv ENV
source ENV/bin/activate
pip install --upgrade pip
pip install keras h5py pandas scikit-learn
python evaluate.py
```

## Architecture & Model Properties

We use Keras' Sequential model to construct the recurrent neural networks. There are 3 layers:

- Layer 1: either an LSTM layer (output dimension 10, statefulness enabled) or a GRU layer (output dimension 4).
- Layer 2: a Dropout layer with dropout probability 0.2, to prevent overfitting.
- Layer 3: a fully-connected Dense layer with output dimension 1.
- Default optimizer: rmsprop; default number of epochs: 150.
- Error metric: Mean Squared Error (MSE).
This architecture can certainly be optimized further; I just haven't had the chance to experiment much, given my laptop's constraints!
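
Roughly, the LSTM variant of the stack described above could be assembled like this (a sketch against the old Keras 1.x API, not the repo's exact code; see `lstm_model.py` for the real implementation):

```python
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

look_back = 37   # input window length (see Results below)
batch_size = 1   # stateful layers need a fixed batch size

model = Sequential()
# Layer 1: stateful LSTM with output dimension 10; stateful layers
# require batch_input_shape instead of input_shape.
model.add(LSTM(10, batch_input_shape=(batch_size, look_back, 1), stateful=True))
# Layer 2: dropout with probability 0.2 to curb overfitting.
model.add(Dropout(0.2))
# Layer 3: fully-connected output layer with dimension 1.
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='rmsprop')
```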

## Results & Observations

1. The LSTM-RNN model performed best, with an MSE of 1464.78 (look back = 37).
2. Naively making the RNN "deeper" did not yield immediate gains; I didn't fine-tune the parameters (`output_dim`, for example) though.
3. Making the LSTM network stateful (setting `stateful=True` when initializing the LSTM layer) did yield a significant performance improvement. In stateless LSTM layers, the cell states are reset after each sequence. With `stateful=True`, however, the states are propagated to the next batch, i.e. the state computed for the sample at index `trainX[i]` is used as the initial state for the sample `trainX[i + k]` in the next batch, where `k` is the batch size (see the training-loop sketch after this list). You can read more about this in the Keras docs.
4. Using Glorot initialization yielded a performance improvement. However, He uniform initialization (uniform initialization scaled by fan_in) yielded even better results than Glorot.
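
For illustration, here is roughly what a stateful training loop looks like in (old-API) Keras: with `stateful=True` the framework never resets states on its own, so they are cleared manually between epochs. `trainX`/`trainY` and the epoch count are assumptions carried over from the description above, not the repo's exact code:

```python
# Stateful training sketch: keep batch order fixed (shuffle=False) so state
# carries over meaningfully, and reset states once per pass over the data.
# trainX is assumed to be reshaped to (samples, look_back, 1).
for epoch in range(150):                      # 150 = default epoch count above
    model.fit(trainX, trainY, nb_epoch=1,     # nb_epoch is the Keras 1.x name
              batch_size=batch_size, shuffle=False)
    model.reset_states()
```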

## Files

- ```data/number-of-daily-births-in-quebec.csv```: the dataset.
- ```lstm_model.py```: contains the class ```LSTM_RNN``` for LSTM-based recurrent neural networks.
- ```gru_model.py```: contains the class ```GRU_RNN``` for GRU-based recurrent neural networks.
- ```evaluate.py```: loads and preprocesses the dataset, creates the LSTM-RNN, GRU-RNN and linear regression models, and outputs results (a baseline sketch follows this list).
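
For reference, the linear regression comparison can be sketched like this, reusing `X` and `y` from the windowing example above; the split ratio is an assumption, not necessarily what `evaluate.py` uses:

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Baseline sketch: ordinary least squares on the same look-back windows.
split = int(len(X) * 0.67)                     # train/test split is assumed
baseline = LinearRegression().fit(X[:split], y[:split])
mse = mean_squared_error(y[split:], baseline.predict(X[split:]))
print('Linear regression MSE: %.2f' % mse)
```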
## To-do

- K-fold cross-validation.
- Add plots to aid visualization.
## References

1. On the use of 'Long-Short Term Memory' neural networks for time series prediction, Gómez-Gil et al., 2014.
2. Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Srivastava et al., 2014.
3. Learning to Forget: Continual Prediction with LSTM, Gers, Schmidhuber & Cummins, 2000.
4. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, Chung et al., 2014.
