Python _html4_parse Examples

Programming language: Python

Namespace/package name: calibre.ebooks.oeb.parse_utils

Method/function: _html4_parse

Examples on hotexamples.com: 3

Python _html4_parse - 3 examples found. These are the top-rated real-world Python examples of calibre.ebooks.oeb.parse_utils._html4_parse, extracted from open source projects.

Example #1

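# Imports assumed from context; the original snippet does not show them.
from html5_parser import parse
from lxml import etree

from calibre.utils.xml_parse import safe_xml_fromstring


def html_to_lxml(raw):
    # Wrap the fragment in a <div> so there is a single known container.
    raw = '<div>%s</div>' % raw
    root = parse(raw,
                 keep_doctype=False,
                 namespace_elements=False,
                 maybe_xhtml=False,
                 sanitize_names=True)
    root = next(root.iterdescendants('div'))
    root.set('xmlns', "http://www.w3.org/1999/xhtml")
    raw = etree.tostring(root, encoding='unicode')
    try:
        # First attempt: strict XML parsing of the serialized tree.
        return safe_xml_fromstring(raw, recover=False)
    except Exception:
        # Attribute names containing a colon can break strict parsing,
        # so drop them and retry.
        for x in root.iterdescendants():
            remove = [attr for attr in x.attrib if ':' in attr]
            for a in remove:
                del x.attrib[a]
        raw = etree.tostring(root, encoding='unicode')
        try:
            return safe_xml_fromstring(raw, recover=False)
        except Exception:
            # Last resort: hand the markup to calibre's _html4_parse.
            from calibre.ebooks.oeb.parse_utils import _html4_parse
            return _html4_parse(raw)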

Example #2

File: opds.py Project: zhanghb1994/calibre

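# Imports assumed from context; the original snippet does not show them.
from lxml import etree, html

from calibre.utils.xml_parse import safe_xml_fromstring


def html_to_lxml(raw):
    # Wrap the fragment in a <div> so lxml returns a single root element.
    raw = '<div>%s</div>' % raw
    root = html.fragment_fromstring(raw)
    root.set('xmlns', "http://www.w3.org/1999/xhtml")
    raw = etree.tostring(root, encoding=None)
    try:
        # First attempt: strict XML parsing of the serialized tree.
        return safe_xml_fromstring(raw, recover=False)
    except Exception:
        # Attribute names containing a colon can break strict parsing,
        # so drop them and retry.
        for x in root.iterdescendants():
            remove = [attr for attr in x.attrib if ':' in attr]
            for a in remove:
                del x.attrib[a]
        raw = etree.tostring(root, encoding=None)
        try:
            return safe_xml_fromstring(raw, recover=False)
        except Exception:
            # Last resort: hand the markup to calibre's _html4_parse.
            from calibre.ebooks.oeb.parse_utils import _html4_parse
            return _html4_parse(raw)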

Example #3

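# Import assumed from context; the original snippet does not show it.
from lxml import etree, html


def html_to_lxml(raw):
    # Wrap the fragment in a <div> so lxml returns a single root element.
    raw = u'<div>%s</div>' % raw
    root = html.fragment_fromstring(raw)
    root.set('xmlns', "http://www.w3.org/1999/xhtml")
    raw = etree.tostring(root, encoding=None)
    try:
        # First attempt: strict XML parsing of the serialized tree.
        return etree.fromstring(raw)
    except Exception:
        # Attribute names containing a colon can break strict parsing,
        # so drop them and retry.
        for x in root.iterdescendants():
            remove = [attr for attr in x.attrib if ':' in attr]
            for a in remove:
                del x.attrib[a]
        raw = etree.tostring(root, encoding=None)
        try:
            return etree.fromstring(raw)
        except Exception:
            # Last resort: hand the markup to calibre's _html4_parse.
            from calibre.ebooks.oeb.parse_utils import _html4_parse
            return _html4_parse(raw)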