else:
    keep_words = DEFAULT_DICT_SIZE

if os.path.exists(outp + '_wordids.txt.bz2') and os.path.exists(outp + '_corpus.pkl.bz2'):
    # reuse the dictionary and corpus produced by a previous run
    dictionary = Dictionary.load_from_text(outp + '_wordids.txt.bz2')
    wiki = TextCorpus.load(outp + '_corpus.pkl.bz2')
else:
    wiki = TextCorpus(inp)
    # only keep the most frequent words
    wiki.dictionary.filter_extremes(no_below=20, no_above=0.1, keep_n=keep_words)
    wiki.dictionary.save_as_text(outp + '_wordids.txt.bz2')
    wiki.save(outp + '_corpus.pkl.bz2')
    # load back the id->word mapping directly from file;
    # this seems to save more memory, compared to keeping the wiki.dictionary object from above
    dictionary = Dictionary.load_from_text(outp + '_wordids.txt.bz2')

# build tfidf
if os.path.exists(outp + '_tfidf.mm'):
    mm = MmCorpus(outp + '_tfidf.mm')
else:
    tfidf = TfidfModel(wiki, id2word=dictionary, normalize=True)
    # tfidf.save(outp + '.tfidf_model')
    # save tfidf vectors in matrix market format
    mm = tfidf[wiki]
    MmCorpus.serialize(outp + '_tfidf.mm', mm, progress_cnt=10000)
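
# The block above persists three artifacts: the filtered dictionary
# (outp + '_wordids.txt.bz2'), the pickled corpus (outp + '_corpus.pkl.bz2'),
# and the tf-idf vectors in Matrix Market format (outp + '_tfidf.mm').
# Below is a minimal sketch of how a downstream script might load the
# dictionary and tf-idf matrix back for further processing; 'wiki_en' is a
# hypothetical placeholder and must match the outp prefix used above.
#
# from gensim.corpora import Dictionary, MmCorpus
#
# outp = 'wiki_en'  # hypothetical prefix; use the same one as above
#
# # restore the id -> word mapping from the bzip2-compressed text dump
# dictionary = Dictionary.load_from_text(outp + '_wordids.txt.bz2')
#
# # stream the serialized tf-idf vectors from disk (memory-friendly)
# mm = MmCorpus(outp + '_tfidf.mm')
# print(mm)  # reports document count, vocabulary size and non-zero entries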