Python HTMLImageLinkExtractor Examples

Programming language: Python

Namespace/package name: scrapy.contrib.linkextractors.image

Number of code examples on hotexamples.com: 5

Python HTMLImageLinkExtractor - 5 code examples found. These are real-world Python examples of scrapy.contrib.linkextractors.image.HTMLImageLinkExtractor, extracted from open-source projects and selected from the top-rated ones.

Frequently used methods

HTMLImageLinkExtractor(2)

extract_links(2)

Code example #1

File: test_contrib_linkextractors.py Project: pkufranky/scrapy

    def test_extraction(self):
        '''Test the extractor's behaviour among different situations'''

        lx = HTMLImageLinkExtractor(locations=('//img', ))
        links_1 = lx.extract_links(self.response)
        self.assertEqual(links_1,
            [ Link(url='http://example.com/sample1.jpg', text=u'sample 1'),
              Link(url='http://example.com/sample2.jpg', text=u'sample 2'),
              Link(url='http://example.com/sample4.jpg', text=u'sample 4') ])

        lx = HTMLImageLinkExtractor(locations=('//img', ), unique=False)
        links_2 = lx.extract_links(self.response)
        self.assertEqual(links_2,
            [ Link(url='http://example.com/sample1.jpg', text=u'sample 1'),
              Link(url='http://example.com/sample2.jpg', text=u'sample 2'),
              Link(url='http://example.com/sample4.jpg', text=u'sample 4'),
              Link(url='http://example.com/sample4.jpg', text=u'sample 4 repetition') ])

        lx = HTMLImageLinkExtractor(locations=('//div[@id="wrapper"]', ))
        links_3 = lx.extract_links(self.response)
        self.assertEqual(links_3,
            [ Link(url='http://example.com/sample1.jpg', text=u'sample 1'),
              Link(url='http://example.com/sample2.jpg', text=u'sample 2'),
              Link(url='http://example.com/sample4.jpg', text=u'sample 4') ])

        lx = HTMLImageLinkExtractor(locations=('//a', ))
        links_4 = lx.extract_links(self.response)
        self.assertEqual(links_4,
            [ Link(url='http://example.com/sample2.jpg', text=u'sample 2'),
              Link(url='http://example.com/sample3.html', text=u'sample 3') ])
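
The first two assertions above differ only in `unique`: by default a repeated image URL is reported once, and with `unique=False` the repetition is kept. A minimal standard-library sketch of that behaviour follows. It is not Scrapy's implementation; `ImageLinkCollector`, `extract_image_links`, and `SAMPLE_HTML` are hypothetical names, the fixture is invented to match the URLs in the test, and taking the link text from the `alt` attribute is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLinkCollector(HTMLParser):
    """Collect (url, text) pairs from <img> tags. A stand-in, not Scrapy's code."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        src = attrs.get("src")
        if src:
            # Resolve relative paths against the page URL.
            # Assumption: the alt attribute supplies the link text.
            self.links.append((urljoin(self.base_url, src), attrs.get("alt", "")))


def extract_image_links(html, base_url, unique=True):
    """Return image links in document order, optionally de-duplicated by URL."""
    collector = ImageLinkCollector(base_url)
    collector.feed(html)
    collector.close()
    if not unique:
        return collector.links
    seen = set()
    result = []
    for url, text in collector.links:
        if url not in seen:  # keep only the first occurrence of each URL
            seen.add(url)
            result.append((url, text))
    return result


# Hypothetical fixture, shaped to match the URLs the test expects.
SAMPLE_HTML = """
<div id="wrapper">
  <img src="sample1.jpg" alt="sample 1">
  <a href="sample2.jpg"><img src="sample2.jpg" alt="sample 2"></a>
  <img src="sample4.jpg" alt="sample 4">
  <img src="sample4.jpg" alt="sample 4 repetition">
</div>
"""

PAGE_URL = "http://example.com/index.html"
unique_links = extract_image_links(SAMPLE_HTML, PAGE_URL)
all_links = extract_image_links(SAMPLE_HTML, PAGE_URL, unique=False)
```

This sketch covers only the `//img` case; the `//a` assertion in the test shows that the real extractor also follows `href` attributes, which is not modelled here.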

Code example #2

File: test_contrib_linkextractors.py Project: serkanh/scrapy

    def test_extraction(self):
        """Test the extractor's behaviour among different situations"""

        lx = HTMLImageLinkExtractor(locations=("//img",))
        links_1 = lx.extract_links(self.response)
        self.assertEqual(
            links_1,
            [
                Link(url="http://example.com/sample1.jpg", text=u"sample 1"),
                Link(url="http://example.com/sample2.jpg", text=u"sample 2"),
                Link(url="http://example.com/sample4.jpg", text=u"sample 4"),
            ],
        )

        lx = HTMLImageLinkExtractor(locations=("//img",), unique=False)
        links_2 = lx.extract_links(self.response)
        self.assertEqual(
            links_2,
            [
                Link(url="http://example.com/sample1.jpg", text=u"sample 1"),
                Link(url="http://example.com/sample2.jpg", text=u"sample 2"),
                Link(url="http://example.com/sample4.jpg", text=u"sample 4"),
                Link(url="http://example.com/sample4.jpg", text=u"sample 4 repetition"),
            ],
        )

        lx = HTMLImageLinkExtractor(locations=('//div[@id="wrapper"]',))
        links_3 = lx.extract_links(self.response)
        self.assertEqual(
            links_3,
            [
                Link(url="http://example.com/sample1.jpg", text=u"sample 1"),
                Link(url="http://example.com/sample2.jpg", text=u"sample 2"),
                Link(url="http://example.com/sample4.jpg", text=u"sample 4"),
            ],
        )

        lx = HTMLImageLinkExtractor(locations=("//a",))
        links_4 = lx.extract_links(self.response)
        self.assertEqual(
            links_4,
            [
                Link(url="http://example.com/sample2.jpg", text=u"sample 2"),
                Link(url="http://example.com/sample3.html", text=u"sample 3"),
            ],
        )
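
The `locations` argument in the test scopes extraction to the nodes matched by an XPath expression, so `//div[@id="wrapper"]` only yields images inside that element. The idea can be sketched with the standard library alone. This is an illustration under assumptions, not the extractor's code: `images_in` and `PAGE` are hypothetical, the fixture is invented and must be well-formed XML, and `xml.etree.ElementTree` supports only a subset of XPath (hence `.//img` rather than `//img`).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical, well-formed fixture; the real test fixture is not shown in the source.
PAGE = """
<html><body>
  <div id="wrapper">
    <img src="sample1.jpg" alt="sample 1"/>
  </div>
  <div id="footer">
    <img src="logo.png" alt="logo"/>
  </div>
</body></html>
"""


def images_in(page, location, base_url):
    """Return (url, alt) pairs for <img> elements at or below each node matching `location`."""
    root = ET.fromstring(page)
    found = []
    for region in root.findall(location):
        # iter() yields the region itself as well as its descendants.
        for img in region.iter("img"):
            found.append((urljoin(base_url, img.get("src")), img.get("alt", "")))
    return found


BASE = "http://example.com/"
everywhere = images_in(PAGE, ".//img", BASE)
wrapper_only = images_in(PAGE, ".//div[@id='wrapper']", BASE)
```

With this fixture, `everywhere` contains both images while `wrapper_only` contains only the one inside the wrapper `div`.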

Code example #3

File: test_contrib_linkextractors.py Project: richard-ma/CodeReading

    def test_extraction(self):
        '''Test the extractor's behaviour among different situations'''

        lx = HTMLImageLinkExtractor(locations=('//img', ))
        links_1 = lx.extract_links(self.response)
        self.assertEqual(links_1, [
            Link(url='http://example.com/sample1.jpg', text=u'sample 1'),
            Link(url='http://example.com/sample2.jpg', text=u'sample 2'),
            Link(url='http://example.com/sample4.jpg', text=u'sample 4')
        ])

        lx = HTMLImageLinkExtractor(locations=('//img', ), unique=False)
        links_2 = lx.extract_links(self.response)
        self.assertEqual(links_2, [
            Link(url='http://example.com/sample1.jpg', text=u'sample 1'),
            Link(url='http://example.com/sample2.jpg', text=u'sample 2'),
            Link(url='http://example.com/sample4.jpg', text=u'sample 4'),
            Link(url='http://example.com/sample4.jpg',
                 text=u'sample 4 repetition')
        ])

        lx = HTMLImageLinkExtractor(locations=('//div[@id="wrapper"]', ))
        links_3 = lx.extract_links(self.response)
        self.assertEqual(links_3, [
            Link(url='http://example.com/sample1.jpg', text=u'sample 1'),
            Link(url='http://example.com/sample2.jpg', text=u'sample 2'),
            Link(url='http://example.com/sample4.jpg', text=u'sample 4')
        ])

        lx = HTMLImageLinkExtractor(locations=('//a', ))
        links_4 = lx.extract_links(self.response)
        self.assertEqual(links_4, [
            Link(url='http://example.com/sample2.jpg', text=u'sample 2'),
            Link(url='http://example.com/sample3.html', text=u'sample 3')
        ])

Code example #4

File: test_contrib_linkextractors.py Project: richard-ma/CodeReading

    def test_urls_type(self):
        '''Test that the resulting urls are regular strings and not unicode objects'''
        lx = HTMLImageLinkExtractor()
        links = lx.extract_links(self.response)
        self.assertTrue(all(isinstance(link.url, str) for link in links))
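
The type check in this test relies on `all(...)` over a generator expression. A small sketch of the same pattern is shown below; `Link` here is a hypothetical namedtuple standing in for Scrapy's link class, and `urls_are_str` is an illustrative helper, not part of the library.

```python
from collections import namedtuple

# Hypothetical stand-in for the Link objects returned by extract_links().
Link = namedtuple("Link", ["url", "text"])


def urls_are_str(links):
    """Return True if every link's url is a str instance."""
    return all(isinstance(link.url, str) for link in links)


good = [Link("http://example.com/sample1.jpg", "sample 1")]
# A bytes URL is not a str in Python 3, so it fails the check.
bad = good + [Link(b"http://example.com/sample2.jpg", "sample 2")]
```

Note that `all()` returns True for an empty iterable, so a check written this way also passes when no links are extracted at all; asserting that the list is non-empty first would make such a test stricter.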

Code example #5

File: test_contrib_linkextractors.py Project: pkufranky/scrapy

    def test_urls_type(self):
        '''Test that the resulting urls are regular strings and not unicode objects'''
        lx = HTMLImageLinkExtractor()
        links = lx.extract_links(self.response)
        self.assertTrue(all(isinstance(link.url, str) for link in links))