Python html_to_paragraph_list Examples

Programming language: Python

Namespace/package name: ebdata.templatemaker.textlist

Method/function: html_to_paragraph_list

Examples at hotexamples.com: 4

Python html_to_paragraph_list - 4 examples found. These are the top-rated real-world Python examples of ebdata.templatemaker.textlist.html_to_paragraph_list, extracted from open source projects.

Example #1

File: models.py Project: AndrewJHart/everyblock_code

 def auto_excerpt(self):
     """
     Attempts to detect the text of this page (ignoring all navigation and
     other clutter), returning a list of strings. Each string represents a
     paragraph.
     """
     from ebdata.textmining.treeutils import make_tree
     tree = make_tree(self.html)
     if self.seed.rss_full_entry:
         # Full-entry RSS seeds: convert the tree directly, with no cleanup.
         from ebdata.templatemaker.textlist import html_to_paragraph_list
         paras = html_to_paragraph_list(tree)
     else:
         if self.seed.strip_noise:
             # Strip template markup by comparing against a companion page.
             from ebdata.templatemaker.clean import strip_template
             try:
                 html2 = self.companion_page().html
             except IndexError:
                 # companion_page() raised IndexError; skip noise stripping.
                 pass
             else:
                 tree2 = make_tree(html2)
                 strip_template(tree, tree2)
         if self.seed.guess_article_text:
             # Let article_text() guess which part of the tree is the article.
             from ebdata.templatemaker.articletext import article_text
             paras = article_text(tree)
         else:
             # Otherwise convert the whole (possibly cleaned) tree.
             from ebdata.templatemaker.textlist import html_to_paragraph_list
             paras = html_to_paragraph_list(tree)
     return paras
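
The branching in auto_excerpt can be easier to follow when it is reduced to a pure function over the three seed flags. The following is only a sketch: choose_extraction_steps is a hypothetical helper that is not part of ebdata, and it returns the names of the steps rather than running them. It also assumes that a companion page is available, whereas the original skips strip_template when companion_page() raises IndexError.

```python
def choose_extraction_steps(rss_full_entry, strip_noise, guess_article_text):
    """Return, in order, the names of the ebdata steps auto_excerpt would run.

    Hypothetical helper for illustration only; it mirrors the if/else
    structure of auto_excerpt and assumes a companion page exists.
    """
    if rss_full_entry:
        # Full-entry seeds bypass noise stripping and article guessing.
        return ["html_to_paragraph_list"]

    steps = []
    if strip_noise:
        steps.append("strip_template")
    if guess_article_text:
        steps.append("article_text")
    else:
        steps.append("html_to_paragraph_list")
    return steps


if __name__ == "__main__":
    print(choose_extraction_steps(False, True, True))
```

Note that rss_full_entry takes precedence: when it is set, the other two flags have no effect.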

Example #2

 def auto_excerpt(self):
     """
     Attempts to detect the text of this page (ignoring all navigation and
     other clutter), returning a list of strings. Each string represents a
     paragraph.
     """
     from ebdata.textmining.treeutils import make_tree
     tree = make_tree(self.html)
     if self.seed.rss_full_entry:
         from ebdata.templatemaker.textlist import html_to_paragraph_list
         paras = html_to_paragraph_list(tree)
     else:
         if self.seed.strip_noise:
             from ebdata.templatemaker.clean import strip_template
             try:
                 html2 = self.companion_page().html
             except IndexError:
                 pass
             else:
                 tree2 = make_tree(html2)
                 strip_template(tree, tree2)
         if self.seed.guess_article_text:
             from ebdata.templatemaker.articletext import article_text
             paras = article_text(tree)
         else:
             from ebdata.templatemaker.textlist import html_to_paragraph_list
             paras = html_to_paragraph_list(tree)
     return paras
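
The docstring describes the result as a list of strings, one per paragraph. To make that shape concrete without installing ebdata, here is a rough standard-library stand-in. It is not ebdata's implementation: paragraphs_from_html, the BLOCK_TAGS set, and the whitespace handling are all assumptions chosen for illustration, and it takes an HTML string rather than a tree built by make_tree.

```python
from html.parser import HTMLParser

# Assumed paragraph boundaries; ebdata may use a different set of tags.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
# Text inside these tags is never treated as page text.
SKIP_TAGS = {"script", "style"}


class _ParagraphCollector(HTMLParser):
    """Collects text between block-level tags into a list of paragraphs."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []
        self._buffer = []
        self._skip_depth = 0

    def flush(self):
        # Collapse runs of whitespace and drop paragraphs that end up empty.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if not self._skip_depth:
            self._buffer.append(data)


def paragraphs_from_html(html):
    """Hypothetical stand-in: split an HTML string into paragraph strings."""
    collector = _ParagraphCollector()
    collector.feed(html)
    collector.close()
    collector.flush()  # emit any trailing text after the last block tag
    return collector.paragraphs


if __name__ == "__main__":
    print(paragraphs_from_html("<div><p>First  paragraph.</p><p>Second<br>Third</p></div>"))
```

Inline tags such as b or a do not start a new paragraph in this sketch, so their text stays in the surrounding paragraph.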

Example #3

 def assertConverts(self, html, expected):
     self.assertEqual(html_to_paragraph_list(make_tree(html)), expected)
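
A helper like assertConverts is typically defined on a unittest.TestCase so that each test reads as one HTML input and one expected paragraph list. The sketch below shows that pattern in a self-contained form. Because ebdata is not assumed to be installed, fake_make_tree and fake_html_to_paragraph_list are hypothetical stand-ins with deliberately simplified behavior; they are not the real make_tree or html_to_paragraph_list.

```python
import re
import unittest


def fake_make_tree(html):
    # Stand-in for make_tree: this sketch simply passes the string through.
    return html


def fake_html_to_paragraph_list(tree):
    # Stand-in for html_to_paragraph_list: split on <p> and </p>, strip any
    # remaining tags, collapse whitespace, and drop empty chunks.
    paragraphs = []
    for chunk in re.split(r"</?p\s*>", tree):
        text = " ".join(re.sub(r"<[^>]+>", "", chunk).split())
        if text:
            paragraphs.append(text)
    return paragraphs


class ParagraphListTestCase(unittest.TestCase):
    def assertConverts(self, html, expected):
        # Same shape as the helper above, wired to the stand-ins.
        self.assertEqual(fake_html_to_paragraph_list(fake_make_tree(html)), expected)

    def test_two_paragraphs(self):
        self.assertConverts("<p>One</p><p>Two <b>bold</b></p>", ["One", "Two bold"])

    def test_empty_input(self):
        self.assertConverts("", [])
```

With the real library on the path, the two stand-ins would be replaced by imports from ebdata.textmining.treeutils and ebdata.templatemaker.textlist, and the expected lists would need to match that library's actual output.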

Example #4

File: textlist.py Project: AndrewJHart/everyblock_code

 def assertConverts(self, html, expected):
     self.assertEqual(html_to_paragraph_list(make_tree(html)), expected)