# Demo: retrieve the corpus document most similar to a query text under two
# different metrics, then build dense topic vectors for the whole corpus.
# NOTE(review): similar_euc_documents, similar_cos_documents, lda_output,
# data, lda, serial_corp, gensim and np are defined elsewhere in this file /
# notebook — not visible in this chunk.
mytext = ["Some text about christianity and bible"]
# Nearest document by Euclidean distance in topic space (top_n=1 → best match).
doc_ids, docs = similar_euc_documents(text=mytext, doc_topic_probs=lda_output, documents = data, top_n=1, verbose=True)
print('\n', docs[0][:500])
# Nearest document by cosine similarity in topic space, for comparison.
doc_ids, docs = similar_cos_documents(text=mytext, doc_topic_probs=lda_output, documents = data, top_n=1, verbose=True)
print('\n', docs[0][:500])
# Topic distribution for every document; minimum_probability=0 keeps
# zero-weight topics so all vectors cover the same topic set and are
# directly comparable with cossim.
all_top_vecs = [lda.get_document_topics(serial_corp[n], minimum_probability=0) \
    for n in range(len(serial_corp))]
def find_most_similar(sim_vec, all_top_vecs, title_lst, vec_in_corp='Y', n_results=7):
    '''
    Calculates cosine similarity across the entire corpus and returns the
    n_results number of most similar documents

    sim_vec:      topic vector of the query document (gensim sparse format)
    all_top_vecs: topic vectors for every corpus document (same format)
    title_lst:    document titles — not used in the visible portion; presumably
                  used to label results further down (TODO confirm)
    vec_in_corp:  'Y' if sim_vec belongs to the corpus (its self-match is
                  then dropped), 'N' otherwise
    n_results:    number of most-similar documents to keep
    '''
    # Cosine similarity of the query vector against every corpus document.
    cos_sims = [gensim.matutils.cossim(sim_vec, vec) for vec in all_top_vecs]
    if vec_in_corp == 'N':
        # Descending sort: highest similarity first.
        most_similar_ind = np.argsort(cos_sims)[::-1][:n_results]
    if vec_in_corp == 'Y':
        # Query is itself in the corpus: take one extra hit, then drop the
        # first (the query's self-match with similarity 1.0).
        most_similar_ind = np.argsort(cos_sims)[::-1][:n_results+1][1:]
    # NOTE(review): the function body appears truncated in this chunk — no
    # return statement is visible; the remainder presumably follows elsewhere.
# ---- tail of a topic-model builder whose `def` is above this chunk ----
# (nesting reconstructed from context — verify against the full file)
            words.append(word)
        # One line per topic: "<topic id>\t<comma-joined keywords>".
        f_keyword.write(str(topic[0]) + '\t' + ','.join(words) + '\n')
    return lsi
## main
if __name__ == '__main__':
    # Output files for per-cluster keyword listings of each model.
    cluster_keyword_lda_filepath = './data_out/cluster_keywords_lda.txt'
    cluster_keyword_lsi_filepath = './data_out/cluster_keywords_lsi.txt'
    # Raw corpus (Excel); create_data / lda_model / lsi_model / data_process
    # are defined elsewhere in this file.
    corpus_path = './data/self_info_results_all.xls'
    dictionary, corpus, corpus_tfidf = create_data(corpus_path)
    # Train both models with 6 topics and write their keyword files.
    lda = lda_model(dictionary, corpus, corpus_tfidf, 6, cluster_keyword_lda_filepath)
    lsi = lsi_model(dictionary, corpus, corpus_tfidf, 6, cluster_keyword_lsi_filepath)
    # show cluster keywords for LDA
    f = open(cluster_keyword_lda_filepath, 'r', encoding='utf-8')
    cluster_keyword_lda = f.read()
    print(cluster_keyword_lda)
    f.close()
    # test
    # NOTE(review): this file handle is never closed — consider
    # `with open(...) as fh: test_data = fh.readlines()`.
    # utf-8-sig strips a leading BOM if present.
    test_data = open('./data/test_chictr.txt', 'r', encoding='utf-8-sig').readlines()
    test_dictionary, test_corpus, test_corpus_tfidf = data_process(test_data)
    # Per-document topic distributions for the held-out test corpus.
    topics_test = lda.get_document_topics(test_corpus)
    for i in range(len(test_data)):
        print(i, 'topic distribution: ', topics_test[i], '\n')