Try to guess the numerical rating that corresponds with the text review.

This one doesn't do so well; I haven't experimented much with the
configuration, and many changes could likely improve on its current
<10% prediction rate. An LSTM is maybe not so helpful for this
categorization problem.

GPU command:
    THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python imdb_lstm.py
'''
max_features = 20000
maxlen = 100  # cut texts after this number of words (among top max_features most common words)
batch_size = 16

print("Loading data...")
(X_train, y_train), (X_test, y_test), w = load_imdb_data(
    binary=False, seed=113, maxlen=maxlen, max_features=max_features)

# for categories, convert label lists to binary class arrays
nb_classes = np.max(y_train) + 1
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

print('Build model...')
model = Sequential()
model.add(Embedding(max_features, 256))
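The `np_utils.to_categorical` calls above one-hot encode the integer rating labels into binary class arrays. A minimal standalone sketch of that idea (the function name mirrors the Keras helper, but this version is just for illustration):

```python
import numpy as np

def to_categorical(y, nb_classes):
    # One-hot encode integer labels: label i becomes a row with a 1 in column i.
    y = np.asarray(y, dtype=int)
    Y = np.zeros((len(y), nb_classes))
    Y[np.arange(len(y)), y] = 1.0
    return Y

# e.g. four review-rating classes
labels = [0, 2, 1, 3]
print(to_categorical(labels, 4))
```

Each row sums to 1, with the hot column matching the original label; `nb_classes = np.max(y_train) + 1` sizes the matrix from the largest label seen.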
Modified version of the Keras LSTM example: trains an LSTM network on the
IMDB sentiment analysis data set. In addition to predicting on test data,
it also stores the model's weights and intermediate activation values for
the training and test data.

GPU command:
    THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python imdb_lstm.py
'''
max_features = 20000
maxlen = 100  # cut texts after this number of words (among top max_features most common words)
batch_size = 16

# had some luck with seed 111
print("Loading data...")
(X_train, y_train), (X_test, y_test), w = load_imdb_data(
    binary=True, max_features=max_features, maxlen=maxlen, seed=37)

print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

print('Build model...')
model = Sequential()
model.add(Embedding(max_features, 256))
model.add(LSTM(256, 128))  # try using a GRU instead, for fun
model.add(Dropout(0.5))
model.add(Dense(128, 1))
model.add(Activation('sigmoid'))
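The `maxlen = 100` cut above is the usual Keras-style sequence truncation and padding step, so every review becomes a fixed-length row. A minimal pure-Python sketch of that behavior (mirroring the default 'pre' padding/truncation of Keras's `pad_sequences`; this standalone version is an illustration, not the library call):

```python
def pad_sequences(seqs, maxlen, value=0):
    # Truncate each sequence to its last `maxlen` tokens and left-pad
    # shorter ones with `value`, so every row has length `maxlen`.
    out = []
    for s in seqs:
        s = list(s)[-maxlen:]
        out.append([value] * (maxlen - len(s)) + s)
    return out

print(pad_sequences([[1, 2, 3, 4, 5], [7, 8]], maxlen=4))
# → [[2, 3, 4, 5], [0, 0, 7, 8]]
```

Fixed-length rows are what let the batched `X_train` be a regular 2-D array feeding the `Embedding` layer.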