Python PreProcess.getWordList Examples

Programming language: Python

Namespace/package name: preprocess

Class/type: PreProcess

Method/function: getWordList

Examples on hotexamples.com: 2

Python PreProcess.getWordList - 2 examples found. These are the top-rated real-world Python examples of preprocess.PreProcess.getWordList, extracted from open source projects.

Example #1

File: word_frequency.py Project: ThapaMahesh/text-analysis

import os
import time

from preprocess import PreProcess  # package name as listed in the page header

start_time = time.time()
thisTime = start_time
files = []
baseFolder = os.path.dirname(os.path.abspath(__file__))
dataFolder = os.path.join(baseFolder, "data")
resultFolder = os.path.join(baseFolder, "result")
count = 0
commonWordList = {}
for i in os.listdir(dataFolder):
    if i.endswith('.txt'):
        thisFile = os.path.join(dataFolder, i)
        # The with-block closes the file even if processing raises.
        with open(thisFile, "r", encoding="utf8") as reflection:
            processData = PreProcess(reflection.read())
            # Note: the file was already consumed by the read() above,
            # so this second read() returns an empty string.
            wordList = processData.getWordList(reflection.read(), True)
            wordFrequency = processData.wordFrequency(wordList)

            # Add this file's (word, count) pairs to the running totals.
            for wordTuple in wordFrequency:
                commonWordList[wordTuple[0]] = (
                    commonWordList.get(wordTuple[0], 0) + wordTuple[1])

            print("--- %s seconds ---" % (time.time() - thisTime))
            thisTime = time.time()

# Append a CSV header to the result file (the excerpt ends shortly after).
result = open(os.path.join(resultFolder, "wordfrequency.csv"), "a+")
result.write("Word,WordCount\n")
iter = 0
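
The excerpt above calls `reflection.read()` twice on the same file handle; in Python the second call returns an empty string because the stream is already at end of file. The following is a minimal sketch, not the project's code, of reading each document once and merging per-document counts with `collections.Counter`. The tokenizer `simple_word_list` is a hypothetical stand-in for `PreProcess.getWordList`, whose real behavior is not shown on this page, and `io.StringIO` objects stand in for the opened `.txt` files.

```python
import csv
import io
import re
from collections import Counter


def simple_word_list(text):
    """Hypothetical stand-in for PreProcess.getWordList: lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def merge_word_counts(streams):
    """Read each stream exactly once and accumulate word counts across all of them."""
    total = Counter()
    for stream in streams:
        text = stream.read()  # read once and reuse the result
        total.update(simple_word_list(text))
    return total


def counts_to_csv(counts):
    """Render counts as CSV text with the same header the excerpt writes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Word", "WordCount"])
    for word, count in counts.most_common():
        writer.writerow([word, count])
    return buffer.getvalue()


# In-memory documents stand in for the files under the data folder.
documents = [io.StringIO("the cat sat"), io.StringIO("The cat ran")]
common_counts = merge_word_counts(documents)
csv_text = counts_to_csv(common_counts)
print(csv_text)
```

Keeping the text in a variable means the same string can be passed to as many processing steps as needed, and `Counter.update` replaces the manual "add if present, else set" logic.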

Example #2

File: spellTest.py Project: ThapaMahesh/text-analysis

import os
import time

from preprocess import PreProcess  # package name as listed in the page header
# findErrors is used below, but its definition is not part of this excerpt.

thisTime = time.time()  # assumption: the excerpt never shows where thisTime is set
dataFolder = os.path.join(os.path.dirname(os.path.abspath(__file__)), "data")
count = 0
for i in os.listdir(dataFolder):
    if i.endswith('.txt'):
        # if count != 3:
        #     count = count + 1
        #     continue
        thisFile = os.path.join(dataFolder, i)
        fileName = os.path.basename(thisFile)
        # Read the whole file, then let the with-block close it.
        with open(thisFile, "r", encoding='utf8') as reflection:
            processData = PreProcess(reflection.read())

        print("\n\n")
        print(fileName)

        # Pipeline: word list -> contractions removed -> lemmatized.
        wordList = processData.getWordList(True, True)
        withoutContractions = processData.removeContractions(wordList)
        # print(withoutContractions)
        lemmaWordList = processData.lemmatizeWordList(withoutContractions)
        print("WordList Time: --- %s seconds ---\n\n\n\n\n" % (time.time() - thisTime))

        spellErrors = findErrors(lemmaWordList, thisTime)

        # Summary followed by a tab-separated table header.
        print("File: " + fileName)
        print("Words calculated: " + str(len(wordList)))
        print("Error Word Count: " + str(spellErrors['errorCount']))
        print("ErrorWord\t\t\tCorrection\t\t\tSuggestions")
        print("---------\t\t\t----------\t\t\t-----------")
        for eachError in spellErrors["errorList"]:
            ...  # loop body is cut off in the source excerpt
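
Both excerpts report elapsed time by subtracting a stored timestamp from `time.time()`. As a hedged sketch of the same idea, the context manager below records how long each named stage takes. The name `stage_timer` and the stage labels are assumptions made for illustration and do not come from the text-analysis project; `time.perf_counter()` is used here instead of `time.time()` because it is a clock intended for measuring short durations.

```python
import time
from contextlib import contextmanager


@contextmanager
def stage_timer(label, timings):
    """Store the duration of the enclosed block in `timings` under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the enclosed block raises.
        timings[label] = time.perf_counter() - start


timings = {}
with stage_timer("tokenize", timings):
    words = "A Small Example Sentence".split()
with stage_timer("lowercase", timings):
    words = [word.lower() for word in words]

# Report each stage in the same style as the excerpts' print statements.
for label, seconds in timings.items():
    print("%s: --- %.6f seconds ---" % (label, seconds))
```

Collecting the durations in a dictionary keeps the measurement separate from the reporting, so a stage can be timed without a hand-maintained `thisTime` variable.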