# split training and validation datasets
random.shuffle(train_data)
num = len(train_data)
val_num = int(num * config['val_rate'])
val_data = train_data[num - val_num:num]
train_data = train_data[0:num - val_num]

# build vocabulary from the training split
vocab = Vocab()
for line in train_data:
    line = line.strip().split('\t')[1].split(' ')
    vocab.add_list(line)
word2index, index2word = vocab.get_vocab(max_size=config['max_size'],
                                         min_freq=config['min_freq'])
vocab_size = len(index2word)
oov_size = len(word2index) - len(index2word)

# persist the vocabulary mappings so they can be reused at inference time
with open(word2index_path, 'wb') as handle:
    pickle.dump(word2index, handle)
with open(index2word_path, 'wb') as handle:
    pickle.dump(index2word, handle)

# load pre-trained GloVe vectors for the vocabulary and cache them to disk
glove = load_glove(config['glove_path'], vocab_size, word2index)
np.save(glove_path, glove)
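
# The load_glove helper called above is defined elsewhere in the project; a
# minimal sketch of what it might look like is below. It assumes the standard
# GloVe text format (one token per line: the token followed by its vector
# components) and a fixed embedding dimension `dim`; words without a
# pre-trained vector keep small random values. The actual implementation may
# differ.
def load_glove_sketch(glove_path, vocab_size, word2index, dim=300):
    # start every row with small random values so OOV words still get a vector
    embeddings = np.random.uniform(-0.05, 0.05, (vocab_size, dim)).astype(np.float32)
    with open(glove_path, 'r', encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip().split(' ')
            word, vector = parts[0], parts[1:]
            idx = word2index.get(word)
            # only keep vectors for words that survived the max_size/min_freq cut
            if idx is not None and idx < vocab_size:
                embeddings[idx] = np.array(vector, dtype=np.float32)
    return embeddings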