
RecNet - Recurrent Neural Network Framework


About

RecNet is an easy-to-use framework for recurrent neural networks. It implements deep uni- and bidirectional conventional/LSTM/GRU architectures in Python on top of the Theano library. The intention is an easy-to-handle, lightweight implementation that offers the opportunity to try out new ideas and to implement current research.

Currently implemented features:

  • Conventional Recurrent Layers (tanh/relu activation)
  • LSTM (with and without peepholes) and GRU [1,2]
  • uni/bidirectional Training
  • Layer Normalization [3]
  • Softmax Output
  • SGD, Nesterov momentum, RMSprop and AdaDelta optimization [4, 5]
  • Dropout Training [6]
  • MSE, Cross-Entropy Loss and Weighted Cross-Entropy Loss
  • normal and log Connectionist Temporal Classification [7]
  • Regularization (L1/L2)
  • Noisy Inputs
  • Mini Batch Training

Examples of use:

  • Little timer task
  • Numbers recognition using CTC
  • Phoneme recognition

How to install it

git clone https://github.com/joergfranke/recnet.git
cd recnet
python setup.py install

In case of errors, try updating pip/setuptools.

How to use it

1. Please provide your data in the form of two lists and store them in a klepto file. One list contains sequences of features and the other the corresponding targets. Each element of a list should be a matrix of shape [sequence length, feature/target size].

    import klepto

    # store the feature and target lists in a klepto archive
    d = klepto.archives.file_archive("train_data_set.klepto")
    d['x'] = input_features #example shapes [ [123,26] , [254,26] , [180,26] , [340,26] , ... ]
    d['y'] = output_targets #example shapes [ [123,61] , [254,61] , [180,61] , [340,61] , ... ]
    d.dump()
    d.clear()
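If you just want to try the pipeline without real data, the two lists can be filled with dummy numpy arrays of the described shape. This is only a minimal sketch; the sequence lengths, 26 input features and 61 target classes are arbitrary example values:

    import numpy as np

    # hypothetical dummy data: four sequences of varying length,
    # 26 input features and 61 target classes per time step
    sequence_lengths = [123, 254, 180, 340]
    input_features = [np.random.rand(T, 26).astype('float32') for T in sequence_lengths]
    output_targets = [np.zeros((T, 61), dtype='float32') for T in sequence_lengths]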

2. Instantiate RecNet, define the parameters and create the model.

rn = rnnModel()
rn.parameter["train_data_name"] = "train_data_set.klepto"
rn.parameter["net_size"      ] = [      2,     10,         2]
rn.parameter["net_unit_type" ] = ['input',  'GRU', 'softmax']
rn.parameter["net_arch"      ] = [    '-',    'bi',     'ff']
rn.parameter["optimization"  ] = "adadelta"
rn.parameter["loss_function" ] = "cross_entropy"
rn.create()

Please find a full list of possible parameters below.

3. Use the provided functions for generating mini batches, training, validation and forward passes.

mb_train_x, mb_train_y, mb_mask = rn.get_mini_batches("train")
for j in range(train_batch_quantity):
    net_out, train_error = rn.train_fn( mb_train_x[j], mb_train_y[j], mb_mask[j] )
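Validation and plain forward passes work analogously. The following sketch assumes the valid_fn/forward_fn signatures documented under Functionality below and uses the first validation mini batch as a placeholder:

# validation: returns the network output and the error without updating the weights
mb_valid_x, mb_valid_y, mb_valid_mask = rn.get_mini_batches("valid")
net_out, valid_error = rn.valid_fn( mb_valid_x[0], mb_valid_y[0], mb_valid_mask[0] )

# forward pass: returns only the network output
net_out = rn.forward_fn( mb_valid_x[0], mb_valid_mask[0] )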

Please find complete training and usage scripts in the provided examples.

Documentation

Parameters

| Parameter | Description | Value |
| --- | --- | --- |
| train_data_name | Name of the training data set | String |
| valid_data_name | Name of the validation data set | String |
| data_location | Path/directory to the data set in klepto files | Path |
| batch_size | Size of the mini batches | Integer >= 1 |
| output_location | Path/directory for saving the log/prm files | Path |
| output_type | Log during training to the console, to a log file or both | "console" / "file" / "both" |
| net_size | Input size, size of each hidden layer, output size | List of integers |
| net_unit_type | Unit type of each layer (input, GRU, LSTM, conv, GRU_ln, ...) | List of unit types |
| net_act_type | Activation function of each layer (tanh, relu, softplus) | List of activation functions |
| net_arch | Architecture of each layer (unidirectional, bidirectional, feed forward) | List of architectures |
| epochs | Number of epochs to train | Integer >= 1 |
| learn_rate | Learning rate for the optimization algorithm | Float [0.0001...0.5] |
| optimization | Optimization algorithm | "sgd" / "rmsprop" / "nesterov_momentum" / "adadelta" |
| momentum | Momentum for some optimization algorithms | Float [0...1] |
| decay_rate | Decay rate for some optimization algorithms | Float [0...1] |
| use_dropout | Use of (vertical) dropout between layers | False / True |
| dropout_level | Probability of dropout | Float [0...1] |
| regularization | Use of regularization (L1/L2) | False / "L1" / "L2" |
| reg_factor | Influence of regularization | Float [0...1] |
| noisy_input | Add noise to the input | True / False |
| noise_level | Factor for the noise level | Float [0...1] |
| loss_function | Loss function (weighted or normal cross-entropy, MSE, CTC) | MSE / w2_cross_entropy / cross_entropy / CTC / CTClog |
| bound_weight | Weight for weighted cross-entropy | Integer |
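For illustration, a more complete configuration using these parameters could look as follows. This is only a sketch; the file names, paths and values are arbitrary examples, not recommendations:

rn = rnnModel()
# hypothetical configuration; all keys come from the parameter table above
rn.parameter["train_data_name"] = "train_data_set.klepto"
rn.parameter["valid_data_name"] = "valid_data_set.klepto"
rn.parameter["data_location"  ] = "data_set/"
rn.parameter["output_location"] = "outcome/"
rn.parameter["output_type"    ] = "both"
rn.parameter["batch_size"     ] = 10
rn.parameter["epochs"         ] = 20
rn.parameter["net_size"       ] = [     26,    100,        61]
rn.parameter["net_unit_type"  ] = ['input', 'LSTM', 'softmax']
rn.parameter["net_arch"       ] = [    '-',   'bi',      'ff']
rn.parameter["optimization"   ] = "adadelta"
rn.parameter["learn_rate"     ] = 0.0001
rn.parameter["use_dropout"    ] = True
rn.parameter["dropout_level"  ] = 0.5
rn.parameter["regularization" ] = False
rn.parameter["noisy_input"    ] = False
rn.parameter["loss_function"  ] = "cross_entropy"
rn.create()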

Functionality

| Function | Description | Arguments | Return |
| --- | --- | --- | --- |
| create | Create the model and compile functions | List of functions to compile ['train', 'valid', 'forward'] | - |
| pub | Publish to console or log file | String of text | - |
| get_mini_batches | Create mini batches from a data set | 'train' / 'valid' / 'test', optional: 'data_name' | mini batch features, targets, mask |
| dump | Make a dump of the current model | - | - |
| train_fn | Train the model with a mini batch | features, targets, mask | training error, network output |
| valid_fn | Determine the validation error without an update | features, targets, mask | validation error, network output |
| forward_fn | Determine the output for a mini batch | features, mask | network output |
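Putting these functions together, a training run with per-epoch validation, logging and model dumps could be sketched as follows. The loop structure and variable names are illustrative, based on the signatures above:

# hypothetical training loop built from the documented functions
for epoch in range(rn.parameter["epochs"]):
    mb_train_x, mb_train_y, mb_train_mask = rn.get_mini_batches("train")
    for j in range(len(mb_train_x)):
        net_out, train_error = rn.train_fn( mb_train_x[j], mb_train_y[j], mb_train_mask[j] )

    mb_valid_x, mb_valid_y, mb_valid_mask = rn.get_mini_batches("valid")
    for j in range(len(mb_valid_x)):
        net_out, valid_error = rn.valid_fn( mb_valid_x[j], mb_valid_y[j], mb_valid_mask[j] )

    rn.pub("finished epoch " + str(epoch))  # log to console and/or file
    rn.dump()                               # save the current model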

Credits

References

  1. Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9.8 (1997): 1735-1780.
  2. Chung, Junyoung, et al. "Empirical evaluation of gated recurrent neural networks on sequence modeling." arXiv preprint arXiv:1412.3555 (2014).
  3. Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. "Layer normalization." arXiv preprint arXiv:1607.06450 (2016).
  4. Zeiler, Matthew D. "ADADELTA: an adaptive learning rate method." arXiv preprint arXiv:1212.5701 (2012).
  5. Hinton, Geoffrey, Nitish Srivastava, and Kevin Swersky. "Lecture 6a: Overview of mini-batch gradient descent." Coursera lecture slides, https://class.coursera.org/neuralnets-2012-001/lecture. [Online].
  6. Zaremba, Wojciech, Ilya Sutskever, and Oriol Vinyals. "Recurrent neural network regularization." arXiv preprint arXiv:1409.2329 (2014).
  7. Graves, Alex, et al. "Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks." Proceedings of the 23rd international conference on Machine learning. ACM, 2006.

Further work

  • Extend documentation
  • Add tests
  • Implementations:
    • CTC decoder
    • Parametrize initialization
    • Learn initialization
    • Annealed Gradient Descent
    • Mix of SGD and others like AdaDelta
