Python TemplateMaker.annotate Examples

Programming language: Python

Namespace/package name: scrapely.template

Class/type: TemplateMaker

Method/function: annotate

Examples on hotexamples.com: 15

Python TemplateMaker.annotate - 15 code examples found. These are real-world Python examples of scrapely.template.TemplateMaker.annotate, all extracted from open source projects and selected from the top-rated ones.

Frequently used methods

TemplateMaker(15)

annotate(7)

get_template(7)

selected_data(5)

annotations(3)

annotate_fragment(2)

select(1)

Example #1

File: test_template.py Project: netconstructor/scrapely

 def test_annotations(self):
     tm = TemplateMaker(self.PAGE)
     tm.annotate("field1", best_match("text to annotate"), best_match=False)
     annotations = [x[0] for x in tm.annotations()]
     self.assertEqual(
         annotations, [{u"annotations": {u"content": u"field1"}}, {u"annotations": {u"content": u"field1"}}]
     )

Example #2

File: test_template.py Project: scrapy/scrapely

 def test_annotations(self):
     tm = TemplateMaker(self.PAGE)
     tm.annotate('field1', best_match('text to annotate'), best_match=False)
     annotations = [x[0] for x in tm.annotations()]
     self.assertEqual(annotations,
         [{u'annotations': {u'content': u'field1'}},
          {u'annotations': {u'content': u'field1'}}])

Example #3

File: test_template.py Project: scrapy/scrapely

 def test_annotate_multiple(self):
     tm = TemplateMaker(self.PAGE)
     tm.annotate('field1', best_match('text to annotate'), best_match=False)
     tpl = tm.get_template()
     ex = InstanceBasedLearningExtractor([(tpl, None)])
     self.assertEqual(ex.extract(self.PAGE)[0],
         [{u'field1': [u'Some text to annotate here', u'Another text to annotate there']}])

Example #4

File: test_template.py Project: scrapy/scrapely

 def test_annotate_ignore_unpaired(self):
     tm = TemplateMaker(self.PAGE)
     tm.annotate('field1', best_match("and that's"), best_match=False)
     tpl = tm.get_template()
     ex = InstanceBasedLearningExtractor([(tpl, None)])
     self.assertEqual(ex.extract(self.PAGE)[0],
         [{u'field1': [u"More text with unpaired tag <img />and that's it"]}])

Example #5

File: test_template.py Project: xyb/scrapely

 def test_annotations(self):
     tm = TemplateMaker(self.PAGE)
     tm.annotate('field1', best_match('text to annotate'), best_match=False)
     annotations = [x[0] for x in tm.annotations()]
     self.assertEqual(annotations,
         [{u'annotations': {u'content': u'field1'}},
          {u'annotations': {u'content': u'field1'}}])

Example #6

File: test_template.py Project: xyb/scrapely

 def test_annotate_ignore_unpaired(self):
     tm = TemplateMaker(self.PAGE)
     tm.annotate('field1', best_match("and that's"), best_match=False)
     tpl = tm.get_template()
     ex = InstanceBasedLearningExtractor([(tpl, None)])
     self.assertEqual(ex.extract(self.PAGE)[0],
         [{u'field1': [u"More text with unpaired tag <img />and that's it"]}])

Example #7

File: test_template.py Project: xyb/scrapely

 def test_annotate_multiple(self):
     tm = TemplateMaker(self.PAGE)
     tm.annotate('field1', best_match('text to annotate'), best_match=False)
     tpl = tm.get_template()
     ex = InstanceBasedLearningExtractor([(tpl, None)])
     self.assertEqual(ex.extract(self.PAGE)[0],
         [{u'field1': [u'Some text to annotate here', u'Another text to annotate there']}])

Example #8

File: test_template.py Project: netconstructor/scrapely

 def test_annotate_multiple(self):
     tm = TemplateMaker(self.PAGE)
     tm.annotate("field1", best_match("text to annotate"), best_match=False)
     tpl = tm.get_template()
     ex = InstanceBasedLearningExtractor([tpl])
     self.assertEqual(
         ex.extract(self.PAGE)[0], [{u"field1": [u"Some text to annotate here", u"Another text to annotate there"]}]
     )

Example #9

File: __init__.py Project: CodeOps/scrapely

 def train_from_htmlpage(self, htmlpage, data):
     assert data, "Cannot train with empty data"
     tm = TemplateMaker(htmlpage)
     for field, values in data.items():
         if (isinstance(values, (bytes, str)) or
                 not hasattr(values, '__iter__')):
             values = [values]
         for value in values:
             value = str_to_unicode(value, htmlpage.encoding)
             tm.annotate(field, best_match(value))
     self.add_template(tm.get_template())
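
The `isinstance(values, (bytes, str))` check in the example above matters on Python 3, where `str` defines `__iter__`, so a bare `hasattr(values, '__iter__')` test would treat a single string as a sequence of characters. A minimal standalone sketch of that normalization step follows; `as_value_list` is a hypothetical helper name used only for illustration and is not part of scrapely.

```python
def as_value_list(values):
    """Wrap a scalar or a single string in a list; copy other iterables.

    Hypothetical helper for illustration only, not a scrapely function.
    """
    # Strings and bytes are iterable, but they represent one value here.
    if isinstance(values, (bytes, str)) or not hasattr(values, '__iter__'):
        return [values]
    return list(values)


print(as_value_list("Some title"))    # a single string stays one value
print(as_value_list(["a", "b"]))      # a list is passed through
print(as_value_list(42))              # a non-iterable scalar is wrapped
```

Each element of the resulting list could then be handed to `tm.annotate(field, best_match(value))` as the example does.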

Example #10

File: __init__.py Project: bopopescu/vinalo

 def train_from_htmlpage(self, htmlpage, data):
     assert data, "Cannot train with empty data"
     tm = TemplateMaker(htmlpage)
     for field, values in data.items():
         if not hasattr(values, '__iter__'):
             values = [values]
         for value in values:
             if isinstance(value, str):
                 value = value.decode(htmlpage.encoding or 'utf-8')
             tm.annotate(field, best_match(value))
     self.add_template(tm.get_template())
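
The `value.decode(...)` call in the example above relies on Python 2, where `str` is a byte string; on Python 3, `str` has no `decode` method. The same intent could be sketched for Python 3 as below. `to_text` is a hypothetical helper name, not a scrapely function, and its `'utf-8'` fallback simply mirrors the `htmlpage.encoding or 'utf-8'` expression in the example.

```python
def to_text(value, encoding=None):
    """Decode bytes to str; return any other value unchanged.

    Hypothetical helper for illustration only, not part of scrapely.
    """
    if isinstance(value, bytes):
        # Fall back to UTF-8 when no encoding is known (an assumption
        # taken from the example above).
        return value.decode(encoding or 'utf-8')
    return value


print(to_text(b'caf\xc3\xa9'))      # bytes are decoded with the UTF-8 fallback
print(to_text('already text'))      # str values pass through untouched
```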

Example #11

 def train_from_htmlpage(self, htmlpage, data):
     assert data, "Cannot train with empty data"
     tm = TemplateMaker(htmlpage)
     for field, values in data.items():
         if (isinstance(values, (bytes, str))
                 or not hasattr(values, '__iter__')):
             values = [values]
         for value in values:
             value = str_to_unicode(value, htmlpage.encoding)
             tm.annotate(field, best_match(value))
     self.add_template(tm.get_template())

Example #12

File: scraper.py Project: bry0n969/scrapely-hack

 def train(self, url=None, data=None, html=None, encoding='utf-8'):
     assert data, "Cannot train with empty data"
     page = self._get_page(url, encoding, html)
     tm = TemplateMaker(page)
     for field, values in data.items():
         if not hasattr(values, '__iter__'):
             values = [values]
         for value in values:
             # Python 2 idiom: str is a byte string here, so decode it to unicode.
             if isinstance(value, str):
                 value = value.decode(encoding)
             tm.annotate(field, best_match(value))
     self.templates.append(tm.get_template())

Example #13

File: scrapely-hack.py Project: carriercomm/scraperwiki-scraper-vault

 def train(self, url=None, data=None, html=None, encoding='utf-8'):
     assert data, "Cannot train with empty data"
     page = self._get_page(url, encoding, html)
     tm = TemplateMaker(page)
     for field, values in data.items():
         if not hasattr(values, '__iter__'):
             values = [values]
         for value in values:
             # Python 2 idiom: str is a byte string here, so decode it to unicode.
             if isinstance(value, str):
                 value = value.decode(encoding)
             tm.annotate(field, best_match(value))
     self.templates.append(tm.get_template())

Example #14

File: test_template.py Project: netconstructor/scrapely

 def test_annotate_fragment_already_annotated(self):
     tm = TemplateMaker(self.PAGE)
     tm.annotate("field1", best_match("text to annotate"))
     self.assertRaises(FragmentAlreadyAnnotated, tm.annotate, "field1", best_match("text to annotate"))

Example #15

File: test_template.py Project: xyb/scrapely

 def test_annotate_fragment_already_annotated(self):
     tm = TemplateMaker(self.PAGE)
     tm.annotate('field1', best_match('text to annotate'))
     self.assertRaises(FragmentAlreadyAnnotated, tm.annotate, 'field1', best_match("text to annotate"))