Python Selector.append Examples

Programming language: Python

Namespace/package name: scrapy

Class/type: Selector

Method/function: append

Examples at hotexamples.com: 2

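Python Selector.append - 2 examples found. These are top-rated real-world Python examples tagged scrapy.Selector.append, extracted from open source projects. In both examples, append is called on a plain Python list built from selector results (a list of URLs in the first, lists of extracted strings in the second), not on a Selector object.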

Example #1

    # Module-level imports this method relies on
    import requests
    from scrapy import Request, Selector

    def parse(self, response):
        """Parse the main sitemap.xml and collect the sub-sitemap links."""
        # Selector(text=...) expects str, so pass the decoded text, not response.body (bytes)
        links = Selector(text=response.text).xpath("//loc/text()").getall()
        # Number of the last listed sitemap,
        # e.g. "1" in https://iz.ru/export/sitemap/1/xml
        sitemap_n = int(links[-1].split("sitemap/")[1].split("/")[0])

        # The main sitemap.xml on this site is not updated often enough, so probe
        # the following sitemap numbers until one comes back without links.
        # Note: requests.get blocks while it waits, which stalls Scrapy's event loop.
        sitemap_n += 1
        while True:
            link = "https://iz.ru/export/sitemap/{}/xml".format(sitemap_n)
            sitemap_links = Selector(text=requests.get(link).text).xpath("//loc/text()").getall()
            if not sitemap_links:
                break
            links.append(link)
            sitemap_n += 1

        # Request the sitemaps newest first; the "until_date" cutoff is not applied in this method
        for link in links[::-1]:
            yield Request(url=link, callback=self.parse_sitemap)
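
The probing loop above can be separated from the network call, which makes it easier to test and lets you cap the number of probes. The following is only a sketch under stated assumptions: `discover_sitemaps`, `has_links`, `max_probes`, and the example.com URLs are hypothetical names introduced here, not part of the original spider or of Scrapy.

```python
from typing import Callable, List


def discover_sitemaps(
    known: List[str],
    url_template: str,
    start: int,
    has_links: Callable[[str], bool],
    max_probes: int = 100,
) -> List[str]:
    """Return `known` plus each consecutive numbered sitemap that has links.

    `has_links` is a caller-supplied check (hypothetical): it should return
    True when the sitemap at the given URL contains at least one <loc> entry.
    """
    found = list(known)  # copy, so the caller's list is not mutated
    for n in range(start, start + max_probes):
        url = url_template.format(n)
        if not has_links(url):
            break  # first empty sitemap ends the search
        found.append(url)
    return found


# Usage with a fake check instead of real HTTP requests.
template = "https://example.com/export/sitemap/{}/xml"
existing = {template.format(2), template.format(3)}
result = discover_sitemaps([template.format(1)], template, 2, lambda u: u in existing)
print(result)
```

The `max_probes` cap is a design choice of this sketch: unlike `while True`, the loop cannot run indefinitely if the server keeps returning non-empty pages.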

Example #2

    def parse_listing_contents(self, response):
        item = TvshowsItem()
        item["show_name"] = \
            response.xpath('//*[@id="main"]/section/div[1]/div/section[1]/section/div[1]/h2/a/text()').extract()[0]
        item["status"] = \
            response.xpath('//*[@id="media_v4"]/section/div[1]/div/section[1]/p[1]/text()').extract()[0].strip()
        item["network"] = \
            response.xpath('//*[@id="media_v4"]/section/div[1]/div/section[1]/p[2]/a/text()').extract()[0]
        item["language"] = \
            response.xpath('//*[@id="media_v4"]/section/div[1]/div/section[1]/p[4]/text()').extract()[0].strip()
        item["tv_db_score"] = \
            response.xpath('//*[@id="main"]/section/div[1]/div/section[1]/section/div[1]/div/div/span[2]/text()').extract()[0].strip()

        # Re-parse the genre panel once instead of on every loop iteration
        genre_panel = Selector(text=response.xpath(
            '//*[@id="media_v4"]/section/div[1]/div/section[2]').extract()[0])
        i = 1
        genres = []
        while True:
            # Link text of the i-th <li>; empty once the list runs out
            names = genre_panel.xpath('//ul/li[' + str(i) + ']/a/text()').extract()
            if not names:
                break
            if not genres:
                genres = names
            else:
                genres.append(names[0])
            i += 1
        item["genre"] = genres

        # Re-parse the cast panel once instead of on every loop iteration
        casts_panel = Selector(text=response.xpath(
            '//*[@id="main"]/section/div[1]/div/section[2]/ol').extract()[0])
        i = 1
        casts = []
        while True:
            # Link text of the i-th <li>; empty once the list runs out
            names = casts_panel.xpath('//li[' + str(i) + ']/p[1]/a/text()').extract()
            if not names:
                break
            if not casts:
                casts = names
            else:
                casts.append(names[0])
            i += 1
        item["casts"] = casts

        yield item