Preprocessor.fit_on_corpus Examples in Python

Programming language: Python

Namespace / package name: preprocess

Class / type: Preprocessor

Method / function: fit_on_corpus

Examples at hotexamples.com: 1

Preprocessor.fit_on_corpus in Python: 1 example found. This is the top-rated real-world example of preprocess.Preprocessor.fit_on_corpus in Python, extracted from open source projects. You can rate examples to help improve their quality.

Frequent Methods

Preprocessor(30)

add(4)

execute(3)

load(3)

import_video(3)

get_vocabulary(2)

get_states(2)

get_standard_form(2)

get_representer(2)

gen_data_vec(2)

setNextPitchCorner(2)

count_lines(1)

bgsub(1)

load_data(1)

_line_cleanup(1)

lda(1)

investigate_whitelist(1)

index_list_to_word_list(1)

apply(1)

basic_preprocess(1)

get_values_all(1)

get_training_data(1)

get_train_test_data_tag(1)

get_testing_data(1)

get_target_names(1)

build_vocab(1)

convert_text_to_index(1)

build_vocabulary_and_categories(1)

get_feature_names(1)

get_data(1)

get_all_text(1)

_clean_data(1)

getSentences(1)

generateTrainData(1)

convert_index_to_text(1)

gaussian(1)

format_to_nn(1)

format_to_lines(1)

fit_on_corpus(1)

get_all_tag_idx(1)

Example #1

    import pickle

    import jieba
    # Import path assumed from the package name listed above.
    from preprocess import Preprocessor

    # write2word2vec, wordlist, and gl are defined elsewhere in the original project.
    write2word2vec(wordlist)

    # Use an out-of-the-box dictionary
    sent = '“年”字有多少笔？笔顺编号:311212,，?？!！.。>》\\、'
    sentence = [word for word in jieba.cut(Preprocessor().replace_line(sent))]
    p = Preprocessor()
    p.load_dictionary(dict_name='../data/dbqa.word2vec.wordlist.txt')
    print(len(p.word_to_index))
    print('/'.join(sentence))
    indices = p.word_list_to_index_list(sentence)
    print(indices)
    print('/'.join(p.index_list_to_word_list(indices)))

    # You may also want to fit the dictionary from the corpus
    # p.reset()
    p.fit_on_corpus(insert_new_word_into_dict=False)
    # p.save_dictionary()
    print('Vocab size:', p.vocab_size)

    # questions: list of sentences, where each sentence is a list of word indices
    # answers: list of sentences, where each sentence is a list of word indices
    # labels: a 1-D numpy array
    # These three variables should have the same length.
    questions, answers, labels = p.get_training_data()
    print('writing ' + gl.train_pkl)
    with open(gl.train_pkl, 'wb') as pkl:
        pickle.dump([questions, answers, labels], pkl)
    print('done')

    # To convert a sequence of indices back into a sequence of words:
    print('/'.join(p.index_list_to_word_list(questions[0])))
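The example relies on a word-to-index dictionary that can be fit on a corpus and then used to convert between words and indices. The Preprocessor source is not shown here, so the following is only a minimal sketch of how such a dictionary might work. The class name `TinyVocab`, the `<unk>` placeholder, and the `min_count` parameter are hypothetical and are not part of the original project; only the method names `fit_on_corpus`, `word_list_to_index_list`, and `index_list_to_word_list` are borrowed from the example.

```python
from collections import Counter


class TinyVocab:
    """Hypothetical stand-in for a word-index dictionary (not the project's Preprocessor)."""

    UNK = "<unk>"  # assumed placeholder for out-of-vocabulary words

    def __init__(self):
        self.word_to_index = {self.UNK: 0}
        self.index_to_word = [self.UNK]

    @property
    def vocab_size(self):
        return len(self.index_to_word)

    def fit_on_corpus(self, sentences, min_count=1):
        """Add every word seen at least min_count times, most frequent first."""
        counts = Counter(word for sentence in sentences for word in sentence)
        for word, count in counts.most_common():
            if count >= min_count and word not in self.word_to_index:
                self.word_to_index[word] = len(self.index_to_word)
                self.index_to_word.append(word)

    def word_list_to_index_list(self, words):
        """Map words to indices; unknown words map to the UNK index."""
        unk = self.word_to_index[self.UNK]
        return [self.word_to_index.get(word, unk) for word in words]

    def index_list_to_word_list(self, indices):
        """Map indices back to words."""
        return [self.index_to_word[i] for i in indices]


# A tiny pre-tokenized corpus; the original example tokenizes with jieba instead.
corpus = [["how", "many", "strokes"], ["how", "many", "words"]]
vocab = TinyVocab()
vocab.fit_on_corpus(corpus)

indices = vocab.word_list_to_index_list(["how", "many", "pages"])
print(indices)
print("/".join(vocab.index_list_to_word_list(indices)))
print("Vocab size:", vocab.vocab_size)
```

Reserving index 0 for unknown words is one common design choice; whether the original Preprocessor does the same, and what its `insert_new_word_into_dict` flag controls, cannot be determined from the example alone.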