Python w3lib_replace_entitiesの例

プログラミング言語: Python

名前空間/パッケージ名: w3lib.html

メソッド/関数: w3lib_replace_entities

hotexamples.comのコード掲載数: 2

Python w3lib_replace_entities - 2件のコード例が見つかりました。すべてオープンソースプロジェクトから抽出されたPythonのw3lib.html.w3lib_replace_entitiesの実例で、最も評価が高いものを厳選しています。コード例の評価を行っていただくことで、より質の高いコード例が表示されるようになります。

コード例 #1

ファイルを表示

def extract_regex(regex, text, replace_entities=True, flags=0):
    """Extract a list of unicode strings from the given text/encoding using the following policies:
    * if the regex contains a named group called "extract" that will be returned
    * if the regex contains multiple numbered groups, all those will be returned (flattened)
    * if the regex doesn't contain any group the entire regex matching is returned
    """
    if isinstance(regex, six.string_types):
        regex = re.compile(regex, flags=flags)

    if 'extract' in regex.groupindex:
        # named group
        try:
            extracted = regex.search(text).group('extract')
        except AttributeError:
            strings = []
        else:
            strings = [extracted] if extracted is not None else []
    else:
        # full regex or numbered groups
        strings = regex.findall(text)

    # strings = flatten(strings) # 这东西会把多维列表铺平
    if not replace_entities:
        return strings

    values = []
    for value in strings:
        if isinstance(value, (list, tuple)):  # w3lib_replace_entities 不能接收list tuple
            values.append([w3lib_replace_entities(v, keep=['lt', 'amp']) for v in value])
        else:
            values.append(w3lib_replace_entities(value, keep=['lt', 'amp']))

    return values

コード例 #2

ファイルを表示

def extract_regex(regex, text, replace_entities=True):
    """Extract a list of unicode strings from the given text/encoding using the following policies:
    * if the regex contains a named group called "extract" that will be returned
    * if the regex contains multiple numbered groups, all those will be returned (flattened)
    * if the regex doesn't contain any group the entire regex matching is returned
    """
    if isinstance(regex, six.string_types):
        regex = re.compile(regex, re.UNICODE)

    if 'extract' in regex.groupindex:
        # named group
        try:
            extracted = regex.search(text).group('extract')
        except AttributeError:
            strings = []
        else:
            strings = [extracted] if extracted is not None else []
    else:
        # full regex or numbered groups
        strings = regex.findall(text)

    strings = flatten(strings)
    if not replace_entities:
        return strings
    return [w3lib_replace_entities(s, keep=['lt', 'amp']) for s in strings]