Python _is_punctuation Examples

Programming Language: Python

Namespace/Package Name: bert.tokenization

Method/Function: _is_punctuation

Examples at hotexamples.com: 3

Python _is_punctuation - 3 examples found. These are the top rated real world Python examples of bert.tokenization._is_punctuation extracted from open source projects. You can rate examples to help us improve the quality of examples.

Example #1

Show file

File: tokenization_test.py Project: lcz10/test

    def test_is_punctuation(self):
        self.assertTrue(tokenization._is_punctuation(u"-"))
        self.assertTrue(tokenization._is_punctuation(u"$"))
        self.assertTrue(tokenization._is_punctuation(u"`"))
        self.assertTrue(tokenization._is_punctuation(u"."))

        self.assertFalse(tokenization._is_punctuation(u"A"))
        self.assertFalse(tokenization._is_punctuation(u" "))

Example #2

Show file

File: utils.py Project: julian-pani/bert-text

def customize_tokenizer(text, do_lower_case=False):
    tokenizer = tokenization.BasicTokenizer(do_lower_case=do_lower_case)
    temp_x = ""
    text = tokenization.convert_to_unicode(text)
    for c in text:
        if tokenizer._is_chinese_char(ord(c)) or tokenization._is_punctuation(
                c) or tokenization._is_whitespace(
                    c) or tokenization._is_control(c):
            temp_x += " " + c + " "
        else:
            temp_x += c
    if do_lower_case:
        temp_x = temp_x.lower()
    return temp_x.split()

Example #3

Show file

 def _is_punctuation(self, char):
     return bert_tokenization._is_punctuation(char)  # pylint: disable=protected-access