Python Cleaner.whitelist_tags Examples

Programming language: Python

Namespace/Package: lxml.html.clean

Class/Type: Cleaner

Method/Function: whitelist_tags

Examples at hotexamples.com: 1

Python Cleaner.whitelist_tags - 1 example found. This is the top-rated real-world example of lxml.html.clean.Cleaner.whitelist_tags extracted from open source projects. You can rate examples to help us improve their quality.

Frequently used methods

Cleaner(30)

clean_html(30)

style(30)

kill_tags(30)

javascript(30)

remove_tags(23)

scripts(21)

page_structure(19)

meta(19)

links(16)

remove_unknown_tags(15)

comments(14)

allow_tags(13)

safe_attrs_only(12)

embedded(11)

forms(11)

frames(9)

annoying_tags(8)

html(7)

processing_instructions(7)

inline_style(4)

safe_attrs(3)

xpath(2)

add_nofollow(2)

__call__(2)

allow_tag(1)

javasript(1)

remove_attributes(1)

host_whitelist(1)

replace(1)

frame(1)

embeded(1)

script(1)

allow_attributes(1)

startswith(1)

__init__(1)

whitelist_tags(1)

allow_embedded_url(1)

Example #1

File: html_helper.py Project: wuhaifengdhu/DevinHelper

# -*- coding: utf-8 -*-
from __future__ import print_function

import re
import os
import lxml
from bs4 import BeautifulSoup
from lxml.html.clean import Cleaner
from lxml.etree import XMLSyntaxError
from store_helper import StoreHelper
from text_helper import TextHelper

cleaner = Cleaner()
cleaner.javascript = True  # This is True because we want to activate the javascript filter
cleaner.style = True  # This is True because we want to activate the styles & stylesheet filter
cleaner.inline_style = True
cleaner.whitelist_tags = set([])
cleaner.remove_tags = [
    'p', 'ul', 'li', 'b', 'br', 'article', 'div', 'body', 'div',
    'h1', 'h2', 'h3', 'h4', 'h5', 'span'
]
cleaner.kill_tags = ['footer', 'a', 'noscript', 'header', 'label']


class HTMLHelper(object):
    @staticmethod
    def remove_tag(web_source):
        text = re.sub(r'<[^>]+>', '', web_source)
        return text

    @staticmethod
    def get_text(web_source):
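
The snippet above sets whitelist_tags but never shows what the attribute controls. In lxml's Cleaner, whitelist_tags names the embedded-content tags that host_whitelist is allowed to spare from removal. The following is a minimal sketch of that interaction, not code from the DevinHelper project. The sample markup, the host names, and the clean_with helper are hypothetical, and the fallback import is an assumption about which lxml release is installed.

```python
# Import location is an assumption: recent lxml releases ship the cleaner in
# the separate lxml_html_clean package, older ones in lxml.html.clean.
try:
    from lxml_html_clean import Cleaner
except ImportError:
    from lxml.html.clean import Cleaner

# Hypothetical markup and host names, used only for illustration.
SAMPLE = (
    '<div>'
    '<iframe src="https://www.youtube.com/embed/demo"></iframe>'
    '<iframe src="https://untrusted.example/widget"></iframe>'
    '<p>Hello</p>'
    '</div>'
)


def clean_with(whitelist_tags):
    """Clean SAMPLE with one trusted host and the given whitelist_tags."""
    cleaner = Cleaner(
        host_whitelist=['www.youtube.com'],  # hosts trusted for embedded content
        whitelist_tags=whitelist_tags,       # tags the host whitelist applies to
    )
    return cleaner.clean_html(SAMPLE)


# iframes pointing at a whitelisted host may survive the embedded-content filter.
permissive = clean_with({'iframe'})

# An empty set, as in the example above, exempts no tag at all.
strict = clean_with(set())

print(permissive)
print(strict)
```

With the empty set used in html_helper.py, no embedded tag can be exempted by host, so every iframe or embed element is dropped regardless of its source. Because that file leaves host_whitelist at its default empty value, the assignment most likely has no visible effect there.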