def _parse_related_products(self, response):
    """Extract the "related products" recommendations embedded in the page
    and attach them to the product item from ``response.meta``.

    The recommendation widget ships its markup as an escaped HTML string
    inside a JavaScript literal (``html:'...'}]},``); this pulls that
    fragment out, parses it, and collects (title, url) pairs.

    Returns the product item, with ``prod['related_products']`` set to
    ``{strategy_key: [RelatedProduct, ...]}`` when the block is present.
    """
    prod = response.meta['product']
    # Locate the escaped HTML fragment inside the JS payload.
    match = re.search(r"html:'(.+?)'\}\]\},", response.body_as_unicode())
    if not match:
        # No recommendations block on this page; return the item as-is.
        return prod
    fragment = Selector(text=match.group(1))
    # Recommendation strategy name, used as the key of the result dict.
    key_name = is_empty(fragment.css('.rrStrat::text').extract())
    rel_prods = []
    for item in fragment.css('.rrRecs > ul > li'):
        title = is_empty(item.css('.rrItemName > a ::text').extract())
        link = is_empty(item.css('a.rrLinkUrl::attr(href)').extract())
        # The real target URL is carried in the 'ct' query parameter of
        # the tracking link. Fall back to the raw link when 'ct' is
        # missing (the original indexed qs['ct'] and raised KeyError).
        qs = urlparse.parse_qs(urlparse.urlparse(link).query)
        target = is_empty(qs.get('ct', [link]))
        rel_prods.append(RelatedProduct(title=title, url=target))
    prod['related_products'] = {key_name: rel_prods}
    return prod
def start_requests(self):
    """Authenticate, discover how many listing pages exist, and yield one
    ``scrapy.Request`` per page.

    Falls back to a single request for ``self.start_urls[0]`` when the
    pager/amount element is not found in the authorised response.
    """
    with requests.Session() as s:
        # Mount the retrying adapter on the session that is actually
        # used (the original mounted adapters on a throwaway session
        # that was immediately shadowed by this `with` block).
        adapter = requests.adapters.HTTPAdapter(max_retries=3)
        s.mount('http://', adapter)
        s.mount('https://', adapter)
        # The original template escaped every brace ({{...}}), so
        # .format() substituted nothing and literal asterisks were sent.
        # Use real placeholders for the credentials.
        body = '{{"login": {{"username": "{email}", "password": "{password}"}}}}'.format(
            email=self.login, password=self.password)
        # Set auth cookies. A credential payload belongs in a POST body;
        # the original issued a GET carrying a request body.
        s.post(self.AUTH_URL, data=body, headers=self.AUTH_HEADERS,
               timeout=5)
        # An authorised request for the first listing page.
        response = s.post(self.start_urls[0], headers=self.AUTH_HEADERS,
                          timeout=5)
        page_html = response.text

    total_match = Selector(text=page_html).xpath(
        '//div[@class="pager"]/p[@class="amount"]/text()').extract()
    if total_match:
        count_match = re.search(r'(\d+) gesamt', total_match[0])
        if count_match:
            total = int(count_match.group(1))
            # 50 items per page; round up with integer math. The
            # original used float-producing `/ 50`, checked against a
            # mismatched `* 25`, and compared a number to a string.
            page_count = (total + 49) // 50
            for i in range(1, page_count + 1):
                yield scrapy.Request(
                    url=self.start_urls[0] + '?p=' + str(i),
                    callback=self.parse_links,
                    headers=self.HEADERS,
                    dont_filter=True)
    else:
        yield scrapy.Request(url=self.start_urls[0],
                             callback=self.parse_links,
                             headers=self.HEADERS,
                             dont_filter=True)