# Smoke test 1: build the PTB character-level data stream used by the
# baseline model and declare the matching Theano input variables.
do_test_1 = True
if do_test_1:
    data_path = '/u/arvie/PHD/Neural_Language_Models/penn_tree_bank_data/train.txt'
    data_path2 = '/u/arvie/PHD/Neural_Language_Models/penn_tree_bank_data/test.txt'
    # Options forwarded to the dataset constructor inside the stream builder.
    # NOTE(review): assumes char2code maps single characters to integer ids
    # and `lower` is a callable preprocessor -- confirm against their definitions.
    dataset_options = dict(dictionary=char2code,
                           level="character",
                           preprocess=lower,
                           bos_token=None,
                           eos_token=None)
    # The space character doubles as the end-of-word marker and padding symbol.
    EOW_id = char2code[' ']
    padding_token = EOW_id
    print('EOW_ID = ' + str(EOW_id))
    batch_size = 5
    stream = get_stream_and_vocab_dict_baseline(
        data_path_list=[data_path, data_path2],
        dataset_options=dataset_options,
        max_sent_size=13,
        max_subword_size=10,
        debug_print=False,
        EOW_id=EOW_id,
        # Pass the dedicated variable rather than EOW_id directly -- same
        # value, but makes the padding/EOW relationship explicit.
        padding_token=padding_token,
        batch_size=batch_size,
        data_dtype=np.dtype(np.uint16),
        mask_dtype=np.dtype(np.float32))
    # these are important
    print(stream.sources)
    # Remember that your theano variables need to match the stream names
    #     x = T.tensor3('features', dtype='uint16')
    # and dtypes need to match i.e. float32 and uint16
    print('Starting classification example 1 with real data')
    x = T.tensor3('features', dtype='uint16')
    x_mask = T.tensor3('features_mask', dtype='float32')
    y = T.matrix('targets', dtype='uint16')
# Train
# NOTE(review): main_loop is expected to be defined earlier in this file
# (not visible in this chunk) -- confirm it exists before this point.
main_loop.run()
print('DONE TRAINING')


if __name__ == "__main__":
    # Grab a GPU
    gpu_board = lock_GPU()
    # Fixed typo in the log message: 'STRARTING' -> 'STARTING'.
    print('STARTING TRAINING BASELINE MODEL')
    data_path = '/u/arvie/PHD/Neural_Language_Models/penn_tree_bank_data/out/char_level/'
    # Load config parameters
    config = Config(data_path=data_path, sets=['train', 'valid'])
    # Create data stream -- every stream argument comes straight from the
    # loaded config so the training run is fully described by Config.
    train_stream = get_stream_and_vocab_dict_baseline(
        data_path_list=config.params['data_path_list_train'],
        dataset_options=config.params['dataset_options_train'],
        max_sent_size=config.params['max_sent_size'],
        max_subword_size=config.params['max_subword_size'],
        debug_print=config.params['debug_print'],
        EOW_id=config.params['EOW_id'],
        padding_token=config.params['padding_token'],
        batch_size=config.params['batch_size'],
        data_dtype=config.params['data_dtype'],
        mask_dtype=config.params['mask_dtype'])
    print(train_stream.sources)
    run_training(config, train_stream, use_bokeh=False)