Python IndicNlpException Examples

Programming language: Python

Namespace/package name: indicnlp.common

Class/type: IndicNlpException

Code examples on hotexamples.com: 7

Python IndicNlpException: 7 code examples found. These are real-world uses of indicnlp.common.IndicNlpException in Python, extracted from open source projects and selected from the top-rated examples.
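
For orientation, here is a minimal sketch of the raise-and-catch pattern that the examples on this page share. It does not import the library: DemoIndicNlpException is a local stand-in (assumed, like the real class, to derive from Exception), and DEMO_SUPPORTED_LANGUAGES and require_supported are hypothetical names introduced only for illustration.

```python
class DemoIndicNlpException(Exception):
    """Local stand-in for indicnlp.common.IndicNlpException (assumption)."""


# Hypothetical subset of language codes, for illustration only.
DEMO_SUPPORTED_LANGUAGES = {"hi", "ta", "bn"}


def require_supported(lang):
    """Raise if lang is not in the demo set; otherwise return it unchanged."""
    if lang not in DEMO_SUPPORTED_LANGUAGES:
        raise DemoIndicNlpException("Language {} not supported".format(lang))
    return lang


def describe(lang):
    """Turn the exception into a readable status string for the caller."""
    try:
        require_supported(lang)
    except DemoIndicNlpException as exc:
        return "error: {}".format(exc)
    return "ok: {}".format(lang)
```

Calling describe("hi") takes the success path, while describe("xx") goes through the except branch and reports the message carried by the exception.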

Code example #1

File: indic_scripts.py Project: suman101112/online-hate-speech-recog

def offset_to_char(off, lang):
    """
    Applicable to Brahmi-derived Indic scripts
    """
    if not is_supported_language(lang):
        raise IndicNlpException('Language {} not supported'.format(lang))
    # Add the offset to the first code point of the language's script range
    return chr(off + li.SCRIPT_RANGES[lang][0])

Code example #2

File: indic_scripts.py Project: suman101112/online-hate-speech-recog

def get_phonetic_info(lang):
    if not is_supported_language(lang):
        raise IndicNlpException('Language {} not supported'.format(lang))
    # Tamil has its own phonetic tables; every other language shares the common ones
    phonetic_data = ALL_PHONETIC_DATA if lang != li.LC_TA else TAMIL_PHONETIC_DATA
    phonetic_vectors = ALL_PHONETIC_VECTORS if lang != li.LC_TA else TAMIL_PHONETIC_VECTORS

    return (phonetic_data, phonetic_vectors)

Code example #3

File: indic_detokenize.py Project: vyshnavigutta369/indic_nlp_library

def trivial_detokenize(s, lang='hi'):
    """
    Trivial detokenizer for languages of the Indian subcontinent
    """
    # Urdu is rejected explicitly; all other languages use the Indic detokenizer
    if lang == 'ur':
        raise IndicNlpException('No detokenizer available for Urdu')
    return trivial_detokenize_indic(s)

Code example #4

File: indic_scripts.py Project: suman101112/online-hate-speech-recog

def is_indiclang_char(c, lang):
    """
    Applicable to Brahmi-derived Indic scripts
    Note that DANDA and DOUBLE_DANDA have the same Unicode codepoint for all Indic scripts
    """
    if not is_supported_language(lang):
        raise IndicNlpException('Language {} not supported'.format(lang))
    o = get_offset(c, lang)
    # In-range offset, or one of the two punctuation marks shared across scripts
    return (SCRIPT_OFFSET_START <= o < SCRIPT_OFFSET_RANGE) \
            or ord(c) == li.DANDA or ord(c) == li.DOUBLE_DANDA
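
The membership test in this example can be sketched without the library as follows. The table and constants are assumptions made for the sketch: DEMO_SCRIPT_START holds the Unicode block starts for Devanagari (U+0900) and Tamil (U+0B80), DEMO_BLOCK_SIZE assumes each block is 0x80 code points wide, and the two danda constants are the Unicode code points U+0964 and U+0965. The values of SCRIPT_OFFSET_START and SCRIPT_OFFSET_RANGE used by the original code are not shown on this page, so 0 and 0x80 are only stand-ins here.

```python
# Assumed values for illustration; not read from the library.
DEMO_SCRIPT_START = {"hi": 0x0900, "ta": 0x0B80}
DEMO_BLOCK_SIZE = 0x80
DANDA = 0x0964         # DEVANAGARI DANDA
DOUBLE_DANDA = 0x0965  # DEVANAGARI DOUBLE DANDA


def demo_is_script_char(c, lang):
    """True if c falls in lang's block or is a danda shared across scripts."""
    offset = ord(c) - DEMO_SCRIPT_START[lang]
    in_block = 0 <= offset < DEMO_BLOCK_SIZE
    return in_block or ord(c) in (DANDA, DOUBLE_DANDA)
```

The danda check is what lets a character such as U+0964 count as Tamil text even though its code point sits in the Devanagari block.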

Code example #5

def trivial_detokenize(text, lang='hi'):
    """Detokenize string for languages of the Indian subcontinent

    A trivial detokenizer which:

        - decides whether punctuation attaches to left/right or both
        - handles number sequences
        - handles quotes smartly (deciding left or right attachment)

    Args:
        text (str): tokenized text to process
        lang (str): language code; defaults to 'hi'

    Returns:
        str: detokenized string

    Raises:
        IndicNlpException: If language is not supported
    """
    if lang == 'ur':
        raise IndicNlpException('No detokenizer available for Urdu')
    return trivial_detokenize_indic(text)
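
A caller that wants to keep going when a language is rejected can catch the exception and fall back to the untouched input. The sketch below assumes nothing about the real detokenizer: demo_detokenize is a hypothetical stand-in that only pulls a trailing period or danda onto the previous token, and DemoIndicNlpException is a local substitute for the library's class.

```python
class DemoIndicNlpException(Exception):
    """Local stand-in for indicnlp.common.IndicNlpException (assumption)."""


def demo_detokenize(text, lang="hi"):
    """Hypothetical detokenizer: rejects Urdu, otherwise attaches '.' and danda leftward."""
    if lang == "ur":
        raise DemoIndicNlpException("No detokenizer available for Urdu")
    return text.replace(" \u0964", "\u0964").replace(" .", ".")


def detokenize_or_passthrough(text, lang):
    """Return detokenized text, or the original text if the language is rejected."""
    try:
        return demo_detokenize(text, lang)
    except DemoIndicNlpException:
        # Fall back to the tokenized input rather than aborting the pipeline.
        return text
```

Whether a silent fallback is appropriate depends on the application; logging the exception before returning may be preferable in a real pipeline.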

Code example #6

File: indic_scripts.py Project: suman101112/online-hate-speech-recog

def in_coordinated_range(c, lang):
    if not is_supported_language(lang):
        raise IndicNlpException('Language {} not supported'.format(lang))
    return in_coordinated_range_offset(get_offset(c, lang))

Code example #7

File: indic_scripts.py Project: suman101112/online-hate-speech-recog

def get_offset(c, lang):
    if not is_supported_language(lang):
        raise IndicNlpException('Language {} not supported'.format(lang))
    return ord(c) - li.SCRIPT_RANGES[lang][0]
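
get_offset here and offset_to_char in example #1 are inverses of each other: one subtracts the start of the script range, the other adds it back. A minimal sketch of that round trip, assuming a two-entry table of Unicode block starts in place of li.SCRIPT_RANGES (DEMO_SCRIPT_START and all demo_ function names are hypothetical):

```python
# Assumed block starts: Devanagari for 'hi', Tamil for 'ta'.
DEMO_SCRIPT_START = {"hi": 0x0900, "ta": 0x0B80}


def demo_get_offset(c, lang):
    """Position of character c relative to the start of lang's script block."""
    return ord(c) - DEMO_SCRIPT_START[lang]


def demo_offset_to_char(off, lang):
    """Character at the given offset inside lang's script block."""
    return chr(off + DEMO_SCRIPT_START[lang])


def demo_map_char(c, src, tgt):
    """Map c from the src script to the same offset in the tgt script."""
    return demo_offset_to_char(demo_get_offset(c, src), tgt)
```

Because the Brahmi-derived Unicode blocks are laid out in parallel, the same offset tends to name the corresponding letter in each script: DEVANAGARI LETTER KA (U+0915) and TAMIL LETTER KA (U+0B95) both sit at offset 0x15.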