Python HtmlParser Examples

Programming language: Python

Namespace/package name: lib.html_parser

Class/type: HtmlParser

Examples on hotexamples.com: 6

Python HtmlParser: 6 examples found. These are the top-rated real-world Python examples of lib.html_parser.HtmlParser, extracted from open source projects.

Frequently used methods

clean_body(1)

download_links(1)

links(1)

organisations(1)

title(1)

to_json(1)

Example #1

File: test_html_parser.py Project: jackscotti/smart-aleck

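# Import path taken from the package name lib.html_parser listed for this class.
# `page` is not defined in this excerpt; it is presumably an HTML fixture
# set up elsewhere in the test module.
from lib.html_parser import HtmlParser

def test_to_json():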
    p = HtmlParser(page)
    jsoned_document = p.to_json()

    assert "A fantastic title!" in jsoned_document['title']

    clean_body = jsoned_document['body']

    assert "A fantastic body!" in clean_body
    assert "The header" not in clean_body
    assert "The footer" not in clean_body

    assert jsoned_document["links"] == [
        '/defra',
        'www.links1.com',
        'www.links2.com',
        'www.links3.com',
        'http://www.gov.uk/stats.pdf'
    ]

    assert jsoned_document["download_links"] == [
        'http://www.gov.uk/stats.pdf'
    ]

    assert jsoned_document["organisations"] == [
        'DEFRA'
    ]

Example #2

File: test_html_parser.py Project: jackscotti/smart-aleck

def test_clean_body():
    p = HtmlParser(page)
    clean_body = p.clean_body()

    assert "A fantastic body!" in clean_body
    assert "The header" not in clean_body
    assert "The footer" not in clean_body

Example #3

File: test_html_parser.py Project: jackscotti/smart-aleck

def test_links():
    p = HtmlParser(page)

    assert p.links() == [
        '/defra',
        'www.links1.com',
        'www.links2.com',
        'www.links3.com',
        'http://www.gov.uk/stats.pdf'
    ]

Example #4

File: test_html_parser.py Project: jackscotti/smart-aleck

def test_organisations():
    p = HtmlParser(page)

    assert p.organisations() == [
        'DEFRA'
    ]

Example #5

File: test_html_parser.py Project: jackscotti/smart-aleck

def test_download_links():
    p = HtmlParser(page)

    assert p.download_links() == [
        'http://www.gov.uk/stats.pdf'
    ]

Example #6

File: test_html_parser.py Project: jackscotti/smart-aleck

def test_title():
    p = HtmlParser(page)
    assert "A fantastic title!" in p.title()
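
The six tests above depend on a `page` fixture and on the `HtmlParser` class itself, neither of which appears on this page. To make the expected behaviour concrete, here is a minimal sketch built on the standard library's `html.parser.HTMLParser`. Everything in it is an assumption made for illustration: the `PAGE` markup, the `SketchHtmlParser` name, the rule that `<header>` and `<footer>` text is dropped from the body, the `class="organisation"` marker, and the file extensions treated as downloads. It is not the smart-aleck project's implementation, which may work quite differently.

```python
from html.parser import HTMLParser

# Hypothetical fixture; the real `page` used by the tests is not shown.
PAGE = """
<html>
  <head><title>A fantastic title!</title></head>
  <body>
    <header>The header</header>
    <main>
      <a class="organisation" href="/defra">DEFRA</a>
      <p>A fantastic body!</p>
      <a href="www.links1.com">one</a>
      <a href="www.links2.com">two</a>
      <a href="www.links3.com">three</a>
      <a href="http://www.gov.uk/stats.pdf">stats</a>
    </main>
    <footer>The footer</footer>
  </body>
</html>
"""


class SketchHtmlParser(HTMLParser):
    """Illustrative stand-in for lib.html_parser.HtmlParser (not the real class)."""

    SKIPPED = {"header", "footer"}  # assumption: these sections are not body text

    def __init__(self, html):
        super().__init__()
        self._stack = []            # currently open tags, outermost first
        self._title = []
        self._body = []
        self._links = []
        self._organisations = []
        self._in_org_link = False
        self.feed(html)
        self.close()                # flush any buffered text

    def handle_starttag(self, tag, attrs):
        self._stack.append(tag)
        if tag == "a":
            attrs = dict(attrs)
            if attrs.get("href"):
                self._links.append(attrs["href"])
            # Assumption: organisation links carry exactly this class value.
            self._in_org_link = attrs.get("class") == "organisation"

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_org_link = False
        if tag in self._stack:
            # Pop back to (and including) the matching open tag.
            while self._stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_org_link:
            self._organisations.append(text)
        if "title" in self._stack:
            self._title.append(text)
        elif "body" in self._stack and not self.SKIPPED & set(self._stack):
            self._body.append(text)

    def title(self):
        return " ".join(self._title)

    def clean_body(self):
        return " ".join(self._body)

    def links(self):
        return list(self._links)

    def download_links(self):
        # Assumption: a download is any link ending in a document extension.
        return [link for link in self._links
                if link.lower().endswith((".pdf", ".csv"))]

    def organisations(self):
        return list(self._organisations)

    def to_json(self):
        # Despite the name, the tests index the result like a dict.
        return {
            "title": self.title(),
            "body": self.clean_body(),
            "links": self.links(),
            "download_links": self.download_links(),
            "organisations": self.organisations(),
        }
```

With `page = PAGE` and `HtmlParser = SketchHtmlParser`, the assertions in the six examples hold for this sketch. The tests index the result of `to_json()` by key, so the sketch returns a plain dict rather than a JSON string.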