Python cleanText Examples

Programming language: Python

Namespace/package name: preprocess

Method/function: cleanText

Examples at hotexamples.com: 3

Python cleanText - 3 examples found. These are top-rated real-world Python examples of preprocess.cleanText, extracted from open source projects.

Example #1

File: sim_textrank.py Project: Girija-Mage/Girija-Mage.github.io

import os

import networkx

from preprocess import cleanText


def simTextRank(filePath, summarySentenceCount):
    '''
    Generate a summary using the similarity between sentences.
    '''
    global sentence_dictionary
    sentence_dictionary = {}
    sentences = []

    # Preprocess the input file.
    sentence_dictionary, sentences, size = cleanText(filePath)

    # The graph nodes are the sentence IDs (the dictionary keys).
    graph = generateGraph(list(sentence_dictionary.keys()))

    # PageRank returns a dictionary mapping each node to its score.
    pageRank = networkx.pagerank(graph)

    output = "\n".join([
        sentences[sentenceID] for sentenceID in sorted(
            sorted(pageRank, key=pageRank.get, reverse=True)
            [:summarySentenceCount])
    ])

    # The with-statement closes the file on exit; no explicit close() is needed.
    with open(os.path.join(app.config['DOWNLOAD_FOLDER'], 'sim_textrank.txt'),
              "w",
              encoding="utf-8") as outFile:
        outFile.write(output)
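
The nested `sorted` calls in the example above can be read as two separate steps: pick the highest-scoring sentence IDs, then put them back in ascending ID order. The following is a minimal sketch of that idiom on its own, assuming sentence IDs are integers that index into `sentences` in document order. The function name `top_sentences_in_order` and the sample scores are hypothetical and not part of the original project.

```python
def top_sentences_in_order(scores, sentences, count):
    """Return the `count` highest-scoring sentences, ordered by sentence ID."""
    # Inner sort: sentence IDs ordered by score, best first, cut to `count`.
    top_ids = sorted(scores, key=scores.get, reverse=True)[:count]
    # Outer sort: restore ascending ID order so the summary follows the text.
    return [sentences[sentence_id] for sentence_id in sorted(top_ids)]


# Hypothetical PageRank-style scores keyed by sentence ID (illustrative data).
scores = {0: 0.10, 1: 0.35, 2: 0.05, 3: 0.30, 4: 0.20}
sentences = ["s0", "s1", "s2", "s3", "s4"]

summary = "\n".join(top_sentences_in_order(scores, sentences, 3))
print(summary)
```

Splitting the expression into two named steps keeps the behaviour of the one-liner while making the purpose of each sort explicit.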

Example #2

import math

from preprocess import cleanText


def process(arg1, arg2, arg3):
    '''
    :param arg1: path to the file containing the text to be summarized
    :param arg2: number of sentences to be extracted as the summary
    :param arg3: size of the window to be used in the co-occurrence relation
    '''
    global window, n, numberofSentences, textRank, sentenceDictionary, size, sentences
    if arg1 is not None and arg2 is not None and arg3 is not None:
        sentenceDictionary, sentences, size = cleanText(arg1)
        window = int(arg3)
        numberofSentences = int(arg2)
        n = int(math.ceil(min(0.1 * size, 7 * math.log(size))))
        generatepositionaldistribution()
        keyphrases = textrank()
        summarize(arg1, keyphrases, numberofSentences)
    else:
        print("not enough parameters")
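
The expression for `n` in the example above takes the smaller of a linear term and a logarithmic term and rounds it up. Below is a small sketch that isolates that arithmetic so its behaviour at different sizes is easier to see. The function name `compute_n` is hypothetical, and what `n` is used for in the original project is not shown in the snippet.

```python
import math


def compute_n(size):
    """Evaluate ceil(min(0.1 * size, 7 * ln(size))) for a positive size."""
    # math.log raises ValueError for size <= 0, so fail early with a clear message.
    if size <= 0:
        raise ValueError("size must be positive")
    return int(math.ceil(min(0.1 * size, 7 * math.log(size))))


# For small inputs the linear term 0.1 * size is the smaller one;
# for large inputs the logarithmic term 7 * ln(size) takes over.
for size in (1, 25, 1000):
    print(size, compute_n(size))
```

Note that the original expression would raise an error if `cleanText` returned a `size` of 0; the guard above is an added assumption about how that case might be handled.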

Example #3

File: text_rank_similarity_Marathi.py Project: shef4793/TextSummarizer

import io

import networkx

from preprocess import cleanText


def textRankSimilarity(filePath, summarySentenceCount):
    global sentenceDictionary
    sentenceDictionary = {}
    sentences = []
    sentenceDictionary, sentences, size = cleanText(filePath)
    #printDictionary()
    graph = generateGraph(list(sentenceDictionary.keys()))
    pageRank = networkx.pagerank(graph)
    output = "\n".join([
        sentences[sentenceID] for sentenceID in sorted(
            sorted(pageRank, key=pageRank.get, reverse=True)
            [:summarySentenceCount])
    ])

    print("\nSummary:")
    print(output)

    # The with-statement closes the file on exit; no explicit close() is needed.
    with io.open("../Marathi/summaries/" + filePath.split('/')[-1] +
                 "_TextRankSimilaritySummarizer",
                 "w",
                 encoding='utf-8') as outFile:
        outFile.write(output)
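
The output path in the example above is built by string concatenation and by splitting on `'/'`. As a hedged alternative, the sketch below builds the same kind of path with `os.path`, which follows the platform's own path rules. The helper names `summary_path` and `write_summary` and the `out_dir` parameter are hypothetical, introduced only for illustration.

```python
import os


def summary_path(file_path, out_dir, suffix="_TextRankSimilaritySummarizer"):
    """Build the summary file path from the input file's base name."""
    # os.path.basename follows the platform's path rules, so it also
    # handles backslash separators on Windows, unlike split('/').
    return os.path.join(out_dir, os.path.basename(file_path) + suffix)


def write_summary(path, text):
    """Write the summary text as UTF-8."""
    # The with-statement closes the file on exit, so close() is not called.
    with open(path, "w", encoding="utf-8") as out_file:
        out_file.write(text)


print(summary_path("corpus/news/doc.txt", "summaries"))
```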