import numpy as np         # needed for np.dtype below (harmless if already imported above)
import theano.tensor as T  # needed for the theano variables below

do_test_1 = True
if do_test_1:

    data_path = '/u/arvie/PHD/Neural_Language_Models/penn_tree_bank_data/train.txt'
    data_path2 = '/u/arvie/PHD/Neural_Language_Models/penn_tree_bank_data/test.txt'

    dataset_options = dict(dictionary=char2code, level="character", preprocess=lower,
                           bos_token=None, eos_token=None)
    EOW_id = char2code[' ']  # end-of-word id; reused below as the padding token
    padding_token = EOW_id
    print('EOW_id = ' + str(EOW_id))
    batch_size = 5

    stream = get_stream_and_vocab_dict_baseline(data_path_list=[data_path, data_path2],
                                                dataset_options=dataset_options,
                                                max_sent_size=13, max_subword_size=10, debug_print=False,
                                                EOW_id=EOW_id, padding_token=padding_token, batch_size=batch_size,
                                                data_dtype=np.dtype(np.uint16),
                                                mask_dtype=np.dtype(np.float32))  # dtypes must match the theano variables below

    print(stream.sources)
    '''
    The theano variable names must match the stream's source names, e.g.
        x = T.tensor3('features', dtype='uint16')
    and their dtypes must match the stream's, i.e. float32 masks and uint16 data.
    '''

    print('Starting classification example 1 with real data')

    x = T.tensor3('features', dtype='uint16')
    x_mask = T.tensor3('features_mask', dtype='float32')
    y = T.matrix('targets', dtype='uint16')
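    # Sanity check: a minimal sketch, assuming the stream's sources are
    # ('features', 'features_mask', 'targets') as the variables above suggest
    # (verify against the print(stream.sources) output). Fuel's
    # DataStream.get_epoch_iterator(as_dict=True) yields name -> array dicts.
    batch = next(stream.get_epoch_iterator(as_dict=True))
    for name, var in [('features', x), ('features_mask', x_mask), ('targets', y)]:
        print(name + ': shape ' + str(batch[name].shape) + ', dtype ' +
              str(batch[name].dtype) + ' (theano expects ' + var.dtype + ')')

    # What the elided training setup might look like: a hypothetical sketch of
    # Blocks' MainLoop API, not this file's actual construction ('cost' would
    # come from whatever model is built on x, x_mask, and y):
    # from blocks.algorithms import GradientDescent, Scale
    # from blocks.graph import ComputationGraph
    # from blocks.main_loop import MainLoop
    # from blocks.extensions import FinishAfter, Printing
    # cg = ComputationGraph(cost)
    # algorithm = GradientDescent(cost=cost, parameters=cg.parameters,
    #                             step_rule=Scale(learning_rate=0.01))
    # main_loop = MainLoop(algorithm=algorithm, data_stream=stream,
    #                      extensions=[FinishAfter(after_n_epochs=1), Printing()])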
    # Train (the model, cost, and main_loop construction are elided here)
    main_loop.run()

    print('DONE TRAINING')


if __name__ == "__main__":
    # Grab a GPU
    gpu_board = lock_GPU()
    print('STARTING TRAINING BASELINE MODEL')

    data_path = '/u/arvie/PHD/Neural_Language_Models/penn_tree_bank_data/out/char_level/'

    # Load config parameters
    config = Config(data_path=data_path, sets=['train', 'valid'])

    # Create data stream
    train_stream = get_stream_and_vocab_dict_baseline(data_path_list=config.params['data_path_list_train'],
                                                      dataset_options=config.params['dataset_options_train'],
                                                      max_sent_size=config.params['max_sent_size'],
                                                      max_subword_size=config.params['max_subword_size'],
                                                      debug_print=config.params['debug_print'],
                                                      EOW_id=config.params['EOW_id'],
                                                      padding_token=config.params['padding_token'],
                                                      batch_size=config.params['batch_size'],
                                                      data_dtype=config.params['data_dtype'],
                                                      mask_dtype=config.params['mask_dtype'])
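
    # Hypothetical sketch of the matching validation stream; the '*_valid'
    # config keys are assumptions (only the '*_train' keys appear above), so
    # confirm they exist in Config before uncommenting:
    # valid_stream = get_stream_and_vocab_dict_baseline(data_path_list=config.params['data_path_list_valid'],
    #                                                   dataset_options=config.params['dataset_options_valid'],
    #                                                   max_sent_size=config.params['max_sent_size'],
    #                                                   max_subword_size=config.params['max_subword_size'],
    #                                                   debug_print=config.params['debug_print'],
    #                                                   EOW_id=config.params['EOW_id'],
    #                                                   padding_token=config.params['padding_token'],
    #                                                   batch_size=config.params['batch_size'],
    #                                                   data_dtype=config.params['data_dtype'],
    #                                                   mask_dtype=config.params['mask_dtype'])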

    print(train_stream.sources)
    run_training(config, train_stream, use_bokeh=False)