Python _en_tokenize Examples

Programming language: Python

Namespace/package name: en.parser

Method/function: _en_tokenize

Examples at hotexamples.com: 9

Python _en_tokenize - 9 examples found. These are the top-rated real-world Python examples of en.parser._en_tokenize, extracted from open source projects. You can rate the examples to help improve their quality.

Example #1

File: __init__.py Project: aburan28/pattern

def tokenize(s, punctuation=PUNCTUATION, abbreviations=["bv.", "blz.", "e.d.", "m.a.w.", "nl."], replace={}):
    # 's in Dutch preceded by a vowel indicates plural ("auto's"): don't replace.
    s = _en_tokenize(s, punctuation, abbreviations, replace)
    # Re-attach the clitic that the tokenizer split off: "' s morgens" => "'s morgens".
    s = [t.replace("' s morgens", "'s morgens") for t in s]
    s = [t.replace("' s middags", "'s middags") for t in s]
    s = [t.replace("' s avonds", "'s avonds") for t in s]
    return s
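
The three `replace` calls in this example can be tried in isolation. The following is a minimal sketch, not the library's code: it does not call the real `_en_tokenize`, the helper name `rejoin_time_clitics` is hypothetical, and the input list is an assumed example of tokenizer output (one space-separated string per sentence) in which `'s` has been split into `' s`, as inferred from the strings the example searches for.

```python
# Assumed tokenizer output: one space-separated string per sentence.
# The "' s" split is inferred from the example above, not produced by _en_tokenize here.
sentences = ["Hij werkt ' s morgens en ' s avonds ."]


def rejoin_time_clitics(sentences):
    """Re-attach the Dutch clitic 's to the part-of-day word that follows it."""
    for part_of_day in ("morgens", "middags", "avonds"):
        sentences = [
            t.replace("' s " + part_of_day, "'s " + part_of_day)
            for t in sentences
        ]
    return sentences


fixed = rejoin_time_clitics(sentences)
print(fixed)  # ["Hij werkt 's morgens en 's avonds ."]
```

Each pass rewrites every sentence, so the cost grows with the number of phrases handled.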

Example #2

File: __init__.py Project: neer300/dataproject

def tokenize(s,
             punctuation=PUNCTUATION,
             abbreviations=["bv.", "blz.", "e.d.", "m.a.w.", "nl."],
             replace={}):
    # 's in Dutch preceded by a vowel indicates plural ("auto's"): don't replace.
    s = _en_tokenize(s, punctuation, abbreviations, replace)
    # Re-attach the clitic that the tokenizer split off: "' s morgens" => "'s morgens".
    s = [t.replace("' s morgens", "'s morgens") for t in s]
    s = [t.replace("' s middags", "'s middags") for t in s]
    s = [t.replace("' s avonds", "'s avonds") for t in s]
    return s

Example #3

File: __init__.py Project: quincysmiith/pattern

import re  # required by re.sub() below

def tokenize(s,
             punctuation=PUNCTUATION,
             abbreviations=abbreviations,
             replace={"'n": " 'n"}):
    # 's in Dutch preceded by a vowel indicates plural ("auto's"): don't replace.
    s = _en_tokenize(s, punctuation, abbreviations, replace)
    # Re-attach the clitic that the tokenizer split off: "' s avonds" => "'s avonds".
    s = [
        re.sub(r"' s (ochtends|morgens|middags|avonds)", r"'s \1", t)
        for t in s
    ]
    return s
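
This variant folds the separate string replacements into one regular expression with a backreference, and also covers "ochtends". A minimal sketch of just that step, assuming hypothetical tokenizer output (the real `_en_tokenize` is not called, and the name `CLITIC` is introduced here for illustration):

```python
import re

# Assumed tokenizer output (hypothetical data, not produced by _en_tokenize here).
sentences = ["' s ochtends koffie , ' s avonds thee ."]

# One pattern covers all four parts of the day; \1 re-inserts the matched word.
CLITIC = re.compile(r"' s (ochtends|morgens|middags|avonds)")

fixed = [CLITIC.sub(r"'s \1", t) for t in sentences]
print(fixed)  # ["'s ochtends koffie , 's avonds thee ."]
```

Precompiling the pattern is optional; `re.sub` with the same pattern string behaves the same way.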

Example #4

File: __init__.py Project: EthanBlackburn/pattern

def tokenize(s, punctuation=PUNCTUATION, abbreviations=ABBREVIATIONS, replace={}):
    return _en_tokenize(s, punctuation, abbreviations, replace)

Example #5

File: __init__.py Project: navtej/pattern

import re  # required by re.sub() below

def tokenize(s, punctuation=PUNCTUATION, abbreviations=abbreviations, replace={"'n": " 'n"}):
    # 's in Dutch preceded by a vowel indicates plural ("auto's"): don't replace.
    s = _en_tokenize(s, punctuation, abbreviations, replace)
    # Re-attach the clitic that the tokenizer split off: "' s avonds" => "'s avonds".
    s = [re.sub(r"' s (ochtends|morgens|middags|avonds)", r"'s \1", t) for t in s]
    return s

Example #6

def tokenize(s,
             punctuation=PUNCTUATION,
             abbreviations=ABBREVIATIONS,
             replace={}):
    return _en_tokenize(s, punctuation, abbreviations, replace)

Example #7

def tokenize(s,
             punctuation=PUNCTUATION,
             abbreviations=["bv.", "blz.", "e.d.", "m.a.w.", "nl."],
             replace={}):
    # 's in Dutch preceded by a vowel indicates plural ("auto's"): don't replace.
    return _en_tokenize(s, punctuation, abbreviations, replace)

Example #8

File: __init__.py Project: cloudappsetup/pattern

def tokenize(s, punctuation=PUNCTUATION, abbreviations=["bv.", "blz.", "e.d.", "m.a.w.", "nl."], replace={}):
    # 's in Dutch preceded by a vowel indicates plural ("auto's"): don't replace.
    return _en_tokenize(s, punctuation, abbreviations, replace)

Example #9

def tokenize(s, punctuation=PUNCTUATION, abbreviations=ABBREVIATIONS, replace=replacements):
    s = _en_tokenize(s, punctuation, abbreviations, replace)
    # Python 2 only: `unicode` is a builtin there; byte strings are left untouched.
    s = [t.replace("&rsquo ;", u"’") if isinstance(t, unicode) else t for t in s]
    return s
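
The `isinstance(s, unicode)` check only makes sense on Python 2, where `unicode` is a builtin type. On Python 3 every `str` is Unicode, so the same post-processing step needs no type check. A minimal sketch under that assumption, using hypothetical input in which the entity `&rsquo;` has been split into `&rsquo ;` (inferred from the string the example replaces; the real `_en_tokenize` is not called):

```python
# Assumed tokenizer output (hypothetical): the entity "&rsquo;" was split into "&rsquo ;".
sentences = ["It&rsquo ;s fine ."]

# On Python 3 all str objects are Unicode, so the replacement applies unconditionally.
# "\u2019" is the right single quotation mark that the entity stands for.
fixed = [t.replace("&rsquo ;", "\u2019") for t in sentences]
print(fixed)
```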