Python Selector.rsplit Examples

Programming language: Python

Namespace/package name: scrapy

Class/type: Selector

Method/function: rsplit

Examples at hotexamples.com: 1

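Python Selector.rsplit - 1 example found. This is a top-rated real-world Python example of scrapy.Selector.rsplit, extracted from an open source project. In the example, rsplit is called on the string returned by extract_first(), not on the Selector object itself.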

Frequently Used Methods

Selector(30)

css(30)

split(30)

xpath(30)

re(24)

extract(22)

replace(11)

strip(9)

__len__(8)

remove_namespaces(7)

startswith(7)

find(6)

select(6)

__contains__(4)

extract_first(3)

index(3)

append(2)

register_namespace(2)

re_first(2)

group(2)

get(2)

findall(2)

endswith(1)

rsplit(1)

json(1)

select_by_visible_text(1)

isdigit(1)

Example #1

    def parse(self, response, **kwargs):
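        """
        Extract the article data from the crawled page and append one record
        per author to articles.json.

        Relies on module-level names that are not shown here: json,
        scrapy.Selector, and conf (a helper module of the source project).
        """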
        title = response.xpath('//*[@id="wrap"]/h1/text()').extract_first()
        if title:
            # Use the public url attribute instead of the private _get_url()
            url_to_full_version = response.url
            first_160 = ''.join(
                response.xpath(
                    '//*[@id="woe"]/section/div/p/text()').extract())[:160]
            base_date = response.xpath(
                '//*[@id="wrap"]/div/div[2]/text()').extract_first()
            date_formatted = conf.exec_func_chain(base_date, [
                conf.clean_records_regex, lambda v: v[0:-2],
                lambda v: conf.parse_dtts(v, '%b %d, %Y')
            ])

            tags = response.xpath(
                '//*[@id="woe"]/section[3]/div/div/a/text()').extract()
            authors_section = response.xpath(
                '//*[@id="wrap"]/div/div[1]/div/span/a')
            for row in authors_section:
                # '//' is the descendant shorthand; the original '///' is not
                # valid XPath 1.0 syntax
                full_author_url = Selector(text=row.extract()).xpath('//@href') \
                    .extract_first()
                author_fullname = conf.clean_records_regex(
                    Selector(text=row.extract()).xpath(
                        '//span/text()').extract_first())
                if date_formatted >= conf.crawl_date[0].get(
                        'LastExecutionDate'):
                    conf.write_data_append(
                        'articles.json',
                        json.dumps({
                            'title':
                            title,
                            'urlFullVersion':
                            url_to_full_version,
                            'first160':
                            first_160,
                            'dateFormatted':
                            date_formatted,
                            'tags':
                            tags,
                            'authorUrl':
                            f"{conf.gd_base_url}"
                            f"{full_author_url}",
                            'authorName':
                            author_fullname,
                            'author_key':
                            full_author_url.rsplit('/')[-2]
                        }))
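
The author_key expression above depends on how str.rsplit splits a URL path. The following is a minimal sketch of that step, using only the standard library. The function name author_key_from_url and the sample URL '/authors/jane-doe/' are hypothetical and do not come from the source project; the None check is an added safeguard, since extract_first() returns None when nothing matches.

```python
from typing import Optional


def author_key_from_url(author_url: Optional[str]) -> Optional[str]:
    """Return the second-to-last '/'-separated piece of author_url.

    Hypothetical helper: mirrors full_author_url.rsplit('/')[-2] from the
    example, but returns None instead of raising when the input is None
    or contains no '/'.
    """
    if author_url is None:
        return None
    # Without a maxsplit argument, rsplit('/') returns the same list as
    # split('/'); the direction only matters when maxsplit is given.
    parts = author_url.rsplit('/')
    if len(parts) < 2:
        return None
    return parts[-2]


# A trailing slash produces an empty last piece, so [-2] is the slug.
print(author_key_from_url('/authors/jane-doe/'))  # jane-doe

# Without the trailing slash, [-2] is the parent segment instead.
print(author_key_from_url('/authors/jane-doe'))   # authors

# With maxsplit=2, only the two rightmost separators are split.
print('/authors/jane-doe/'.rsplit('/', 2))        # ['/authors', 'jane-doe', '']
```

Because the result changes with the presence of a trailing slash, indexing with [-2] is only safe when the scraped href values consistently end with '/'.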