# Apply the same shuffle permutation to the labels as was applied to the inputs.
y_shuffled = y[shuffle_indices]

# Split train/dev sets: hold out the last 10% of the shuffled data.
# max(1, ...) guards tiny datasets — with n_dev_samples == 0 the slices
# below would produce an EMPTY train set (a[:-0] == a[:0]).
# TODO: replace this single holdout split with a proper cross-validation procedure.
n_dev_samples = max(1, int(0.1 * len(y)))
x_train, x_dev = x_shuffled[:-n_dev_samples], x_shuffled[-n_dev_samples:]
y_train, y_dev = y_shuffled[:-n_dev_samples], y_shuffled[-n_dev_samples:]
print("Train/Dev split: {:d}/{:d}".format(len(y_train), len(y_dev)))

# Tokenize both splits unconditionally. The original code only built
# x_train_tokens inside the vocab-cache-miss branch, so any run that found a
# cached vocab raised NameError at vocab.transform() below. It also passed
# raw (untokenized) x_dev to transform() while train data was tokenized.
x_train_tokens = [tokenize(sample) for sample in tqdm(x_train)]
x_dev_tokens = [tokenize(sample) for sample in tqdm(x_dev)]

# Load the vocabulary from cache when available; otherwise fit it on the
# training tokens only (dev split never influences the vocab) and cache it.
vocab_file = '{}/vocab.pkl'.format(runs_dir)
if os.path.exists(vocab_file):
    with open(vocab_file, 'rb') as fi:
        vocab = cPickle.load(fi)
else:
    vocab = Vocabulary(min_freq=5)
    vocab.fit(x_train_tokens)
    with open(vocab_file, 'wb') as fo:
        cPickle.dump(vocab, fo)

# Persist the integer-encoded splits to HDF5.
# NOTE(review): `f` is left open on purpose — the dataset handles created
# below may be used later in this file; confirm it is closed downstream.
f = h5py.File(dataset_file, 'w')
x_train_dataset = f.create_dataset('x_train', shape=(len(x_train), FLAGS.length), dtype=np.int32)
y_train_dataset = f.create_dataset('y_train', shape=y_train.shape, dtype=np.int32)
x_dev_dataset = f.create_dataset('x_dev', shape=(len(x_dev), FLAGS.length), dtype=np.int32)
y_dev_dataset = f.create_dataset('y_dev', shape=y_dev.shape, dtype=np.int32)
y_train_dataset[:] = y_train
y_dev_dataset[:] = y_dev

# Fix every sequence to FLAGS.length before integer-encoding, so the encoded
# arrays match the dataset shapes declared above.
vocab.max_sequence_length = FLAGS.length
x_train_dataset[:] = vocab.transform(x_train_tokens).astype(np.int32)
# Fix: encode the TOKENIZED dev samples (the original passed raw x_dev).
x_dev_dataset[:] = vocab.transform(x_dev_tokens).astype(np.int32)

# Read the encoded arrays back into memory for downstream use.
x_train = x_train_dataset[:]
x_dev = x_dev_dataset[:]