Python Lemmatizer.lemmatize_text Examples

Programming language: Python

Namespace/package name: lemmatizer

Class/type: Lemmatizer

Method/function: lemmatize_text

Examples at hotexamples.com: 1

Python Lemmatizer.lemmatize_text - 1 code example found. This is the top-rated real-world Python example of lemmatizer.Lemmatizer.lemmatize_text, extracted from open source projects. You can rate examples to help improve the quality of the examples shown.

Frequently used methods

Lemmatizer(15)

lemmatize(9)

apply_lemma_rule(1)

create(1)

filter(1)

gen_absolute_lemma_rule(1)

gen_lemma_rule(1)

get_lang(1)

is_absolute_lemma_rule(1)

lemmatize_text(1)

lemmatizer(1)

makeDictionaryMap(1)

Example #1

File: train_lda_de.py Project: Quving/newsminer

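import os
import pickle
import re

from lemmatizer import Lemmatizer    # project-local module, per the namespace listed above
from nltk.corpus import stopwords    # assumed source of stopwords.words('german')
from progressbar import progressbar  # assumed: the progressbar2 package


def prepare_articles(articles, from_cache=False):
    filename = "data/lda-trainingdata.pickle"

    # Reuse previously prepared training data if requested.
    if from_cache:
        with open(filename, 'rb') as file:
            return pickle.load(file)

    texts = []
    lemmatizer = Lemmatizer()
    german_stop_words = set(stopwords.words('german'))

    for article in progressbar(articles):
        article_text = ""
        body = article.fulltext if article.fulltext else article.content
        for text in [article.description, article.title, body]:
            if text:
                # Remove the '... [+ xxx chars]' pattern (and any other bracketed span).
                text = re.sub(r'\[.*?\]', '', text)
                # Keep purely alphanumeric tokens and tokens containing a period.
                text = " ".join([x for x in text.split() if x.isalnum() or '.' in x])
                # The trailing space keeps the last word of one field from
                # fusing with the first word of the next.
                article_text += lemmatizer.lemmatize_text(text=text, verbose=False) + " "

        # Tokenize on whitespace and drop German stop words.
        article_text = [x for x in article_text.split() if x not in german_stop_words]
        texts.append(article_text)

    # Cache lda-trainingdata
    os.makedirs("data", exist_ok=True)
    with open(filename, 'wb') as file:
        pickle.dump(texts, file)

    return texts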
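
The cleaning steps that run before and after lemmatize_text can be tried in isolation. The following is a minimal sketch using only the standard library. The helper names clean_text and remove_stop_words, the sample sentence, and the tiny stop-word set are hypothetical and are not part of the newsminer project. The Lemmatizer call itself is left out because its implementation is not shown here.

```python
import re

# Matches bracketed fragments such as "[+ 1234 chars]" (non-greedy).
BRACKETED = re.compile(r"\[.*?\]")


def clean_text(text):
    """Drop bracketed fragments, then keep only tokens that are purely
    alphanumeric or that contain a period (same filter as the example above)."""
    text = BRACKETED.sub("", text)
    return " ".join(tok for tok in text.split() if tok.isalnum() or "." in tok)


def remove_stop_words(text, stop_words):
    """Split on whitespace and drop tokens found in stop_words (hypothetical helper)."""
    return [tok for tok in text.split() if tok not in stop_words]


if __name__ == "__main__":
    sample = "Die Regierung plant neue Regeln, sagte er. [+ 1234 chars]"
    cleaned = clean_text(sample)
    print(cleaned)  # Die Regierung plant neue sagte er.
    print(remove_stop_words(cleaned, {"neue", "sagte"}))  # ['Die', 'Regierung', 'plant', 'er.']
```

Two side effects of this filter are visible in the output: a token with a trailing comma ("Regeln,") is dropped entirely rather than stripped of its punctuation, and the stop-word comparison is case-sensitive and sees tokens such as "er." with the period still attached.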