import logging
import re
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

from bs4 import BeautifulSoup

# Note: `q` (the URL queue) and `Page` (the storage model) are assumed to be
# defined elsewhere in this module.


def web_crawling(url):
    """
    Extract the page content for a URL, parse the HTML, store the text,
    and add every outgoing link to the Redis-backed URL queue.
    """
    logging.info('Extracting content for: %s', url)

    # Extract the page content
    try:
        page = urlopen(url)
        content = page.read()
    except (HTTPError, URLError):
        return

    logging.info('Starting to parse content for: %s', url)
    soup = BeautifulSoup(content, 'html.parser')

    # Parse and store the text content of the page, dropping non-content tags
    for s in soup(['style', 'script', '[document]', 'head', 'title']):
        s.extract()
    page = Page(url, soup.getText())
    page.save()
    logging.info('Stored content for: %s', url)

    # Find all absolute links and add them to the URL queue
    links = soup.find_all('a', attrs={'href': re.compile('^http://')})
    for link in links:
        href = link.get('href')
        q.put(href)
        logging.info('Added %s to URL queue for processing', href)

    logging.info('Finished parsing content for: %s', url)
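# The crawler above calls q.put(href), and its docstring mentions a Redis
# queue, but the queue object itself is not shown in this snippet. Below is
# a minimal sketch of what such a Redis-backed queue could look like, using
# the redis-py client; the class name CrawlQueue, the 'crawl:urls' key, and
# the connection settings are assumptions, not part of the original code.
import redis


class CrawlQueue:
    """A tiny FIFO queue on top of a Redis list, exposing put()/get()."""

    def __init__(self, key='crawl:urls', host='localhost', port=6379):
        self.key = key
        self.redis = redis.Redis(host=host, port=port)

    def put(self, url):
        # Push the URL onto the tail of the list.
        self.redis.rpush(self.key, url)

    def get(self, timeout=5):
        # Block until a URL is available or the timeout expires.
        item = self.redis.blpop(self.key, timeout=timeout)
        if item is None:
            return None
        # blpop returns a (key, value) tuple of bytes.
        return item[1].decode('utf-8')


# One possible way the module-level queue used by web_crawling could be built.
q = CrawlQueue()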
def search_result(search_query):
    """Return the stored pages that match the search query."""
    return Page.get(search_query)
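# A sketch of a worker loop that drives web_crawling from the queue: seed it
# with a start URL, then keep pulling URLs until the queue stays empty. The
# seed URL and the stop condition are illustrative assumptions only, and no
# deduplication of already-visited URLs is attempted here.
def crawl_worker(seed_url='http://example.com'):
    q.put(seed_url)
    while True:
        next_url = q.get()
        if next_url is None:
            # Nothing arrived within the timeout; assume the crawl is done.
            break
        web_crawling(next_url)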