Python LdaModel.get_term_topics Examples

Programming language: Python

Namespace/package name: gensim.models

Class/type: LdaModel

Method/function: get_term_topics

Examples at hotexamples.com: 3

Python LdaModel.get_term_topics - 3 examples found. These are the top-rated real-world Python examples of gensim.models.LdaModel.get_term_topics, extracted from open source projects.

Frequently used methods

LdaModel(30)

top_topics(30)

show_topics(30)

show_topic(30)

get_document_topics(30)

save(30)

print_topics(30)

print_topic(24)

log_perplexity(23)

get_topics(21)

get_topic_terms(13)

update(12)

diff(4)

load(3)

get_term_topics(3)

sync_state(3)

inference(3)

__getitem__(1)

get_vector(1)

callbacks(1)

bound(1)

Example #1

# `dct` is assumed to be a gensim Dictionary and `lda` a trained LdaModel.
# get_document_topics(bow, minimum_probability=None, minimum_phi_value=None, per_word_topics=False)
# Parameters:
#   bow (list) – Bag-of-words representation of the document to get topics for.
#   minimum_probability (float) – Ignore topics with probability below this value (None by default). If set to None, a value of 1e-8 is used to prevent 0s.
#   per_word_topics (bool) – If True, also returns a list of topics for each word, sorted in descending order of likelihood, and a list of word ids with each word's corresponding topics' phi values, multiplied by the feature length (i.e., word count).
#   minimum_phi_value (float) – If per_word_topics is True, this is a lower bound on the term probabilities that are included (None by default). If set to None, a value of 1e-8 is used to prevent 0s.
# Returns:
#   Topic distribution for the given document bow, as a list of (topic_id, topic_probability) 2-tuples.
test = dct.doc2bow("I love Kitten".lower().strip().split())
print(lda.get_document_topics(test))
print(lda[test])

# get_term_topics(word_id, minimum_probability=None)
# Returns the topics most relevant to the given word.
# Each topic is represented as a tuple of (topic_id, term_probability).
print(lda.get_term_topics(0))

# ----- Show the composition of a single topic -----
# get_topic_terms(topicid, topn=10)
# Returns a list of the form [(word_id, probability), ...].
print(lda.get_topic_terms(0))
# show_topic(topicid, topn=10)
# Returns a list of the form [(word, probability), ...].
print(lda.show_topic(0))
# print_topic(topicno, topn=10)
# Returns a string of the form '-0.340 * "category" + 0.298 * "$M$" + 0.183 * "algebra" + ...'.
print(lda.print_topic(0))

# ----- Show the composition of all topics -----
# show_topics(num_topics=10, num_words=10, log=False, formatted=True)
# Returns a list of the form [(0, '-0.340 * "category" + 0.298 * "$M$" + 0.183 * "algebra" + ...'), ...]
print(lda.show_topics())

Example #2

File: predict.py Project: kw01sg/biomedical-topic-modelling

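from gensim.corpora import Dictionary
from gensim.models import LdaModel

def get_term_topics(model: LdaModel, dictionary: Dictionary, term: str):
    # Return the term's topics, or None if the term is out of vocabulary.
    if term in dictionary.token2id:
        return model.get_term_topics(dictionary.token2id[term])
    return None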

Example #3

File: LDAGensimEx.py Project: kangbok/NLPPractice

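# -*- coding: utf-8 -*-
import pickle

from gensim.corpora import Dictionary
from gensim.models import LdaModel

# Load the pickled corpus: a list of tokenized documents.
with open("../data/corpus_test.pkl", "rb") as f:
    texts = pickle.load(f)

# Map each token to an integer id, then convert every document to bag-of-words.
corpus_dictionary = Dictionary(texts)
corpus = [corpus_dictionary.doc2bow(text) for text in texts]

TOPIC_NUM = 10
lda = LdaModel(corpus=corpus, num_topics=TOPIC_NUM)

# Topic distribution of a document containing word ids 0 and 1 once each.
doc_topic_matrix = lda.get_document_topics([(0, 1), (1, 1)])
# Topics relevant to word id 1, as (topic_id, probability) tuples.
term_topic_matrix = lda.get_term_topics(1)
# Top words of topic 1, as (word_id, probability) tuples.
topic_term_matrix = lda.get_topic_terms(1)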