Python MyMecab Examples

Programming language: Python

Namespace/Package: mymecab

Class/Type: MyMecab

Examples at hotexamples.com: 2

Python MyMecab - 2 examples found. These are the top-rated real-world examples of mymecab.MyMecab, extracted from open source projects.

Frequently used methods

MyMecab(1)

getFeature(1)

Example #1

File: tfidf.py Project: rupy/mecab_tfidf

from collections import defaultdict
from math import log

from mymecab import MyMecab


class WordCount:
    def __init__(self):
        self.m = MyMecab()
        self.tf = {}
        self.all_tf = defaultdict(int)
        self.df = defaultdict(int)
        self.tfidf = defaultdict(dict)

    def calc_tfidf(self, documents):
        doc_num = len(documents)
        # Iterate over document names and document contents
        for doc_name, doc in documents.items():
            d = defaultdict(int)
            # Run morphological analysis and extract the base form of each word
            words = self.m.getFeature(doc, MyMecab.MECAB_FEATURE_BASE)
            # Count the extracted words
            for word in words:
                d[word] += 1  # term frequency (tf) within this document
                self.all_tf[word] += 1  # corpus-wide count, used later when computing document frequency (df)
            self.tf[doc_name] = d
        # Loop over every word seen in the corpus
        for word in self.all_tf.keys():
            # Loop over all documents
            for doc_name in documents.keys():
                # If the word was counted in this document
                if word in self.tf[doc_name] and self.tf[doc_name][word] > 0:
                    # Compute the document frequency (df)
                    self.df[word] += 1
            for doc_name in documents.keys():
                # Compute tf-idf
                if word in self.tf[doc_name]:
                    self.tfidf[doc_name][word] = self.tf[doc_name][word] * log(doc_num / float(self.df[word]))
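The same tf-idf computation can be tried without MeCab installed. The following is a minimal sketch, not the project's code: `WhitespaceTokenizer` and its `FEATURE_BASE` constant are hypothetical stand-ins for `MyMecab` and `MyMecab.MECAB_FEATURE_BASE`, and the tokenizer simply lowercases and splits on whitespace instead of performing morphological analysis. The scoring itself follows the example above: raw term count multiplied by `log(N / df)`.

```python
from collections import defaultdict
from math import log


class WhitespaceTokenizer:
    """Hypothetical stand-in for MyMecab: no morphological analysis,
    just lowercasing and whitespace splitting."""

    FEATURE_BASE = "base"  # placeholder for MyMecab.MECAB_FEATURE_BASE

    def getFeature(self, text, feature):
        # The feature argument is ignored in this stand-in.
        return text.lower().split()


def calc_tfidf(documents, tokenizer):
    """Return {doc_name: {word: tf * log(N / df)}} using raw term counts."""
    doc_num = len(documents)
    tf = {}
    df = defaultdict(int)
    for doc_name, doc in documents.items():
        counts = defaultdict(int)
        for word in tokenizer.getFeature(doc, tokenizer.FEATURE_BASE):
            counts[word] += 1
        tf[doc_name] = counts
        # Each distinct word in this document adds one to its document frequency.
        for word in counts:
            df[word] += 1

    tfidf = defaultdict(dict)
    for doc_name, counts in tf.items():
        for word, count in counts.items():
            tfidf[doc_name][word] = count * log(doc_num / df[word])
    return tfidf


documents = {
    "a": "the cat sat",
    "b": "the dog sat down",
    "c": "the bird flew",
}
scores = calc_tfidf(documents, WhitespaceTokenizer())
```

In this toy corpus "the" appears in all three documents, so its score is `log(3 / 3) = 0`, while "cat" appears in only one and scores `log(3)`. The sketch counts document frequency in the same pass as term frequency, which avoids the second loop over every word and document pair used in the example above.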

Example #2

File: tfidf.py Project: rupy/mecab_tfidf

def __init__(self):
    self.m = MyMecab()
    self.tf = {}
    self.all_tf = defaultdict(int)
    self.df = defaultdict(int)
    self.tfidf = defaultdict(dict)
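The constructor mixes a plain `dict` with `defaultdict(int)` and `defaultdict(dict)`. As a small illustration of how those three containers behave (the words and document name below are made up for the example):

```python
from collections import defaultdict

all_tf = defaultdict(int)      # missing keys start at 0
for word in ["neko", "inu", "neko"]:
    all_tf[word] += 1          # no KeyError the first time a word is seen

tfidf = defaultdict(dict)      # missing keys start as an empty dict
tfidf["doc1"]["neko"] = 0.5    # the inner dict is created on first access

tf = {}                        # plain dict: a missing key raises KeyError
try:
    tf["doc1"]
    missing_raises = False
except KeyError:
    missing_raises = True
```

One caveat worth keeping in mind: reading a missing key from a `defaultdict` with `[]` inserts it as a side effect, whereas a membership test with `in` does not.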