Python HtmlResponse._set_body Examples

Programming Language: Python

Namespace/Package Name: scrapy.http.response.html

Class/Type: HtmlResponse

Method/Function: _set_body

Examples at hotexamples.com: 2

Python HtmlResponse._set_body - 2 examples found. These are the top-rated real-world Python examples of scrapy.http.response.html.HtmlResponse._set_body extracted from open source projects.

Example #1

from scrapy.http.response.html import HtmlResponse

response = HtmlResponse(
    'file://*****:*****@class="subject-item"]')
    #xpath
    for subject in subjects:
        #print(subject)
        title = subject.xpath('.//h2/a/text()').extract()  # extract() on a SelectorList returns a list of str
        #print(type(title))
        print(title[0].strip())

        rate = subject.xpath('.//span[@class="rating_nums"]/text()')
        print(rate[0].extract().strip())  #lxml

    #css
    for subject in subjects:
        title = subject.css('h2 a::text')
        print(title[0].extract().strip())

        rate = subject.css('span.rating_nums::text').re(r'^9\..*')  # ratings of 9.0 and above
        if rate:
            print(rate[0].strip())

Example #2

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# File  : aa.py
# Author: HuXianyong
# Date  : 2019/8/17 9:39

from scrapy.http.response.html import HtmlResponse

# An encoding is required so that _set_body() can encode a str body.
response = HtmlResponse('', encoding='utf-8')

# _set_body() is a private Scrapy method; it is used here to load a local file.
with open('../test.html', encoding='utf-8') as f:
    response._set_body(f.read())

# xpath: the file is already read, so parsing happens outside the with block
subjects = response.xpath('//li[@class="subject-item"]')
for subject in subjects:
    # title = subject.xpath('.//h2/a/text()').getall()
    # title = subject.xpath('.//h2/a/text()').extract()
    title = subject.xpath('.//h2/a/text()').get()
    print(title.strip())
    rate = subject.xpath('.//span[@class="rating_nums"]/text()').get()
    print(rate)

# CSS
# subjects = response.css('li.subject-item')
# for subject in subjects:
#     title = subject.css('h2 a::text').get()
#     print(title)
#     # rate = subject.css('span.rating_nums::text').get()
#     rate = subject.css('span.rating_nums::text').re(r'^9\.\d+')
#     print(rate)