Python Tagger.parseToNodeListの例

プログラミング言語: Python

名前空間/パッケージ名: fugashi

クラス/型: Tagger

メソッド/関数: parseToNodeList

hotexamples.comのコード掲載数: 5

Python Tagger.parseToNodeList - 5件のコード例が見つかりました。すべてオープンソースプロジェクトから抽出されたPythonのfugashi.Tagger.parseToNodeListの実例で、最も評価が高いものを厳選しています。コード例の評価を行っていただくことで、より質の高いコード例が表示されるようになります。

よく使われるメソッド

表示非表示

Tagger(21)

parse(7)

parseToNodeList(5)

nbest(1)

コード例 #1

ファイルを表示

ファイル: test_basic.py プロジェクト: HiromuHota/fugashi

def test_accent(text, accent):
    # This checks for correct handling of feature fields containing commas as reported in #13
    tagger = Tagger()
    tokens = tagger.parseToNodeList(text)
    # Skip if UnidicFeatures17 is used because it doesn't have 'atype' attribute
    if tokens and isinstance(tokens[0].feature, UnidicFeatures17):
        pytest.skip()
    accent_ = [tok.feature.aType for tok in tokens]
    assert accent_ == accent

コード例 #2

ファイルを表示

def test_tokens(text, saved):
    # testing the token objects is tricky, so instead just check surfaces
    #TODO: maybe save serialized nodes to compare?
    tagger = Tagger()
    tokens = [str(tok) for tok in tagger.parseToNodeList(text)]
    assert tokens == saved

コード例 #3

ファイルを表示

def test_accent(text, accent):
    # This checks for correct handling of feature fields containing commas as reported in #13
    tagger = Tagger()
    accent_ = [tok.feature.aType for tok in tagger.parseToNodeList(text)]
    assert accent_ == accent

コード例 #4

ファイルを表示

def test_pos(text, tags):
    # There should be a pos property when using the default tagger
    tagger = Tagger()
    tags_ = [tok.pos for tok in tagger.parseToNodeList(text)]
    assert tags == tags_

コード例 #5

ファイルを表示

ファイル: benchmark-fugashi.py プロジェクト: polm/ja-tokenizer-benchmark

#!/usr/bin/env python
from fugashi import Tagger

tt = Tagger()
from collections import Counter

wc = Counter()

for line in open('wagahai.txt'):
    for word in tt.parseToNodeList(line.strip()):
        wc[word.surface] += 1