# Tests for the tokenizer. Assumption: tokenize and join_tokens are
# importable from a local tokenizer module; adjust the import to match
# the actual package layout.
from tokenizer import tokenize, join_tokens


def test_strip_punctuation():
    assert [t.text for t in tokenize("a, s, d? f!")] == ['a', 's', 'd', 'f']

def test_split_words():
    assert [t.text for t in tokenize("hello there world")] == ['hello', 'there', 'world']

def test_join_tokens():
    tokens = list(tokenize("hello there world is great day"))
    big_token = join_tokens(tokens[1:3])
    assert big_token.text == 'there world'
    assert big_token.start == 6   # 'there' begins at index 6
    assert big_token.end == 17    # 'world' ends just before index 17

def test_split_at_hyphen():
    assert [t.text for t in tokenize("cluj-napoca")] == ['cluj', 'napoca']

def test_preserve_start_and_end():
    tokens = list(tokenize("a .sunny, day!"))
    assert [t.start for t in tokens] == [0, 3, 10]   # punctuation is skipped
    assert [t.end for t in tokens] == [1, 8, 13]     # end is exclusive (slice style)
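
# ---------------------------------------------------------------------------
# A minimal reference sketch of tokenize/join_tokens consistent with the
# tests above. This is an assumption, not the project's actual
# implementation: it infers from the tests that words are runs of
# alphanumerics, that punctuation and hyphens act as separators, and that
# start/end follow Python slice convention (end is exclusive).
# ---------------------------------------------------------------------------
# import re
# from dataclasses import dataclass
#
# @dataclass
# class Token:
#     text: str
#     start: int  # index of the token's first character in the source string
#     end: int    # index one past the last character (slice convention)
#
# def tokenize(text):
#     # \w+ matches maximal alphanumeric runs, so "cluj-napoca" splits at
#     # the hyphen and "a .sunny, day!" drops the surrounding punctuation.
#     for match in re.finditer(r"\w+", text):
#         yield Token(match.group(), match.start(), match.end())
#
# def join_tokens(tokens):
#     # Merge consecutive tokens into one spanning token: texts joined by a
#     # single space, offsets taken from the first and last token.
#     return Token(" ".join(t.text for t in tokens),
#                  tokens[0].start, tokens[-1].end)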