Python iter_samples Examples

Programming language: Python

Namespace/package name: scrapely.tests

Method/function: iter_samples

Examples at hotexamples.com: 5

Python iter_samples - 5 examples found. These are the top-rated real-world Python examples of scrapely.tests.iter_samples, extracted from open source projects.
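
The examples on this page call iter_samples with a sample-set name such as 'pageparsing' and unpack each item as a (source, expected data) pair. The page does not show the helper itself, so the following is only a minimal sketch of how such a generator could work. The function name iter_samples_sketch, the samples_dir parameter, and the samples_<prefix>_<n>.html / .json file naming are assumptions made for illustration, not scrapely's confirmed implementation.

```python
import json
import os
from itertools import count


def iter_samples_sketch(samples_dir, prefix, html_encoding="utf-8", **json_kwargs):
    """Yield (html_text, expected_data) pairs for numbered sample files.

    Assumed (hypothetical) layout: <samples_dir>/samples_<prefix>_<n>.html
    with a matching .json file, for n = 0, 1, 2, ...
    """
    for i in count():
        base = os.path.join(samples_dir, "samples_%s_%d" % (prefix, i))
        if not os.path.exists(base + ".html"):
            break  # stop at the first missing index
        # Read the raw page bytes and decode them with the requested encoding.
        with open(base + ".html", "rb") as f:
            html_text = f.read().decode(html_encoding)
        # Extra keyword arguments (e.g. object_hook) go to the JSON decoder.
        with open(base + ".json", encoding="utf-8") as f:
            expected = json.load(f, **json_kwargs)
        yield html_text, expected
```

Writing the helper as a generator keeps each test loop simple: a test iterates over the pairs and compares what the parser produces against the stored expectation.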

Example #1

File: test_pageparsing.py Project: Bghyun/scrapely

    def test_site_pages(self):
        """
        Tests from real pages. More reliable and easy to build for more complicated structures
        """
        for source, annotations in iter_samples('pageparsing'):
            template = HtmlPage(body=source)
            parser = TemplatePageParser(TokenDict())
            parser.feed(template)
            for annotation in parser.annotations:
                test_annotation = annotations.pop(0)
                for s in annotation.__slots__:
                    if s == "tag_attributes":
                        for pair in getattr(annotation, s):
                            self.assertEqual(list(pair), test_annotation[s].pop(0))
                    else:
                        self.assertEqual(getattr(annotation, s), test_annotation[s])
            self.assertEqual(annotations, [])

Example #2

File: test_pageparsing.py Project: xyb/scrapely

    def test_site_pages(self):
        """
        Tests from real pages. More reliable and easy to build for more complicated structures
        """
        for source, annotations in iter_samples('pageparsing'):
            template = HtmlPage(body=source)
            parser = TemplatePageParser(TokenDict())
            parser.feed(template)
            for annotation in parser.annotations:
                test_annotation = annotations.pop(0)
                for s in annotation.__slots__:
                    if s == "tag_attributes":
                        for pair in getattr(annotation, s):
                            self.assertEqual(list(pair),
                                             test_annotation[s].pop(0))
                    else:
                        self.assertEqual(getattr(annotation, s),
                                         test_annotation[s])
            self.assertEqual(annotations, [])

Example #3

File: test_scraper.py Project: xyb/scrapely

    def test_extraction(self):

        samples_encoding = 'latin1'
        [(html1, data1), (html2, data2)] = list(iter_samples(
            'scraper_loadstore', html_encoding=samples_encoding))
        sc = Scraper()
        page1 = HtmlPage(body=html1, encoding=samples_encoding)
        sc.train_from_htmlpage(page1, data1)

        page2 = HtmlPage(body=html2, encoding=samples_encoding)
        extracted_data = sc.scrape_page(page2)
        self._assert_extracted(extracted_data, data2)

        # check that extraction still works after a serialize/deserialize round trip
        f = StringIO()
        sc.tofile(f)
        f.seek(0)
        sc = Scraper.fromfile(f)
        extracted_data = sc.scrape_page(page2)
        self._assert_extracted(extracted_data, data2)

Example #4

File: test_htmlpage.py Project: 4iji/scrapely

    def test_site_samples(self):
        """test parse_html from real cases"""
        for i, (source, parsed) in enumerate(
                iter_samples('htmlpage', object_hook=_decode_element)):
            self._test_sample(source, parsed, i)
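
The object_hook=_decode_element argument in the call above suggests that extra keyword arguments are handed on to the JSON decoder, although the page does not state this. As a rough illustration of what an object_hook does in the standard library's json module, the sketch below converts matching JSON objects into Python objects while they are decoded. The Element class and decode_element_sketch function are hypothetical stand-ins, not scrapely's _decode_element.

```python
import json


class Element:
    """Hypothetical stand-in for a parsed-HTML fragment record."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __eq__(self, other):
        return (isinstance(other, Element)
                and (self.start, self.end) == (other.start, other.end))


def decode_element_sketch(dct):
    """object_hook: turn dicts with 'start' and 'end' keys into Element objects."""
    if "start" in dct and "end" in dct:
        return Element(dct["start"], dct["end"])
    return dct  # leave every other JSON object as a plain dict


# json.loads calls the hook once for each JSON object it decodes.
parsed = json.loads('[{"start": 0, "end": 6}, {"note": "plain"}]',
                    object_hook=decode_element_sketch)
```

With a hook like this, the expected data stored next to each sample can be compared directly against the objects a parser produces, instead of against raw dictionaries.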

Example #5

    def test_site_samples(self):
        """test parse_html from real cases"""
        for i, (source, parsed) in enumerate(
                iter_samples('htmlpage', object_hook=_decode_element)):
            self._test_sample(source, parsed, i)