Python HtmlResponse._set_body Examples

Programming language: Python

Namespace/package name: scrapy.http.response.html

Class/type: HtmlResponse

Method/function: _set_body

Examples from hotexamples.com: 2

Python HtmlResponse._set_body - 2 examples found. These are the top-rated real-world Python examples of scrapy.http.response.html.HtmlResponse._set_body, extracted from open source projects.

Example #1

from scrapy.http.response.html import HtmlResponse

response = HtmlResponse(
    'file://*****:*****@class="subject-item"]')
    #xpath
    for subject in subjects:
        #print(subject)
        title = subject.xpath('.//h2/a/text()').extract()  # xpath() returns a SelectorList; extract() turns it into a list of strings
        #print(type(title))
        print(title[0].strip())

        rate = subject.xpath('.//span[@class="rating_nums"]/text()')
        print(rate[0].extract().strip())  #lxml

    #css
    for subject in subjects:
        title = subject.css('h2 a::text')
        print(title[0].extract().strip())

        rate = subject.css('span.rating_nums::text').re(r'^9\..*')  # ratings of 9 and above
        if rate:
            print(rate[0].strip())
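
The CSS loop above keeps only ratings that match `^9\..*`. As a minimal sketch of what such a filter accepts and rejects, the snippet below applies the stricter `^9\.\d+` variant with the standard `re` module, independent of Scrapy. The function name `filter_high_ratings` and the sample values are hypothetical and exist only for illustration; stripping whitespace before matching is also an assumption of this sketch.

```python
import re

# A leading 9, a literal dot, then one or more digits (e.g. "9.1").
HIGH_RATING = re.compile(r"^9\.\d+")


def filter_high_ratings(ratings):
    """Return the stripped rating strings that start with '9.'."""
    kept = []
    for raw in ratings:
        text = raw.strip()  # assumption: surrounding whitespace is not significant
        if HIGH_RATING.match(text):
            kept.append(text)
    return kept


# Hypothetical rating strings, not taken from any real page.
sample_ratings = [" 9.1 ", "8.4", "9.0", "19.5", "9"]
high = filter_high_ratings(sample_ratings)
print(high)
```

Anchoring the pattern with `^` is what rejects a value such as "19.5", and requiring the dot rejects a bare "9".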

Example #2

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# File  : aa.py
# Author: HuXianyong
# Date  : 2019/8/17 9:39

from scrapy.http.response.html import HtmlResponse

# Create an empty response; the encoding is given so that a str body can be encoded.
response = HtmlResponse('', encoding='utf-8')

# _set_body is a private method; it is used here to load the HTML after construction.
with open('../test.html', encoding='utf-8') as f:
    response._set_body(f.read())

# XPath
subjects = response.xpath('//li[@class="subject-item"]')
for subject in subjects:
    # title = subject.xpath('.//h2/a/text()').getall()
    # title = subject.xpath('.//h2/a/text()').extract()
    title = subject.xpath('.//h2/a/text()').get()
    if title is not None:  # get() returns None when nothing matches
        print(title.strip())
    rate = subject.xpath('.//span[@class="rating_nums"]/text()').get()
    print(rate)

# CSS
# subjects = response.css('li.subject-item')
# for subject in subjects:
#     title = subject.css('h2 a::text').get()
#     print(title)
#     # rate = subject.css('span.rating_nums::text').get()
#     rate = subject.css('span.rating_nums::text').re(r'^9\.\d+')
#     print(rate)