"""Run TopMine phrase mining over a corpus file given on the command line.

Usage: python run_phrase_mining.py <corpus_file>

Writes the partitioned documents, the vocabulary, and the frequent
phrases via the project's `utils` storage helpers.
"""
import sys

import phrase_mining
import utils

arguments = sys.argv

# Fail with a usage message instead of an opaque IndexError when the
# corpus path is missing.
if len(arguments) < 2:
    sys.exit('usage: python run_phrase_mining.py <corpus_file>')

# print() call syntax works on both Python 2 and 3 (the original used the
# Python-2-only print statement).
print('Running Phrase Mining...')

file_name = arguments[1]

# Minimum number of occurrences required for each phrase.
min_support = 4

# Threshold for merging two words into a phrase. A lower alpha
# leads to higher recall and lower precision.
alpha = 4

# Length of the maximum phrase size.
max_phrase_size = 10

phrase_miner = phrase_mining.PhraseMining(
    file_name, min_support, max_phrase_size, alpha)
partitioned_docs, index_vocab = phrase_miner.mine()
frequent_phrases = phrase_miner.get_frequent_phrases(min_support)

utils.store_partitioned_docs(partitioned_docs)
utils.store_vocab(index_vocab)
utils.store_frequent_phrases(frequent_phrases)
]  # NOTE(review): closes a list (presumably `stop_word_files = [...]`) whose opening is earlier in the file — not visible in this chunk.

# Optional 7th CLI argument selects a stopwords file by index from
# `stop_word_files`; otherwise the miner runs with its default stopwords.
if len(arguments) > 6:
    stopwords_file = stop_word_files[int(arguments[6])]
    phrase_miner = og_phrase_mining.PhraseMining(file_name, min_support, max_phrase_size, alpha, stopwords_file)
else:
    phrase_miner = og_phrase_mining.PhraseMining(file_name, min_support, max_phrase_size, alpha)
# phrase_miner = og_phrase_mining.PhraseMining(file_name, min_support, max_phrase_size, alpha);

# Mine the corpus: returns the documents partitioned into phrases and the
# index-to-word vocabulary mapping.
partitioned_docs, index_vocab = phrase_miner.mine()
# print(partitioned_docs)
frequent_phrases = phrase_miner.get_frequent_phrases(min_support)
# print(frequent_phrases)

# Persist intermediate and final outputs under src/topmine/<folder>/,
# keyed by the corpus file's basename. `folder` and `file_name` are
# defined earlier in the file (outside this chunk).
utils.store_partitioned_docs(
    partitioned_docs,
    path="src/topmine/{}/intermediate_output/{}.partitioneddocs.txt".format(
        folder, file_name.split('/')[-1]))
utils.store_vocab(
    index_vocab,
    path="src/topmine/{}/intermediate_output/{}.vocab.txt".format(
        folder, file_name.split('/')[-1]))
utils.store_frequent_phrases(
    frequent_phrases,
    path='src/topmine/{}/output/{}.frequent_phrases.txt'.format(
        folder, file_name.split('/')[-1]))

# Report how many frequent phrases were found.
print(len(frequent_phrases))