Python Cleaner.replace Examples

Programming language: Python

Namespace/Package: lxml.html.clean

Class/Type: Cleaner

Method/Function: replace

Examples at hotexamples.com: 1

Python Cleaner.replace - 1 example found. This is the top-rated real-world example of lxml.html.clean.Cleaner.replace extracted from open source projects. You can rate examples to help us improve their quality.

Frequently used methods

Cleaner(30)

clean_html(30)

style(30)

kill_tags(30)

javascript(30)

remove_tags(23)

scripts(21)

page_structure(19)

meta(19)

links(16)

remove_unknown_tags(15)

comments(14)

allow_tags(13)

safe_attrs_only(12)

embedded(11)

forms(11)

frames(9)

annoying_tags(8)

html(7)

processing_instructions(7)

inline_style(4)

safe_attrs(3)

xpath(2)

add_nofollow(2)

__call__(2)

allow_tag(1)

javasript(1)

remove_attributes(1)

host_whitelist(1)

replace(1)

frame(1)

embeded(1)

script(1)

allow_attributes(1)

startswith(1)

__init__(1)

whitelist_tags(1)

allow_embedded_url(1)

Example #1

File: models.py Project: franciskung/myewb2

import re                            # needed by re.sub() below
from lxml.html.clean import Cleaner  # the class this page indexes

def intro(self):
    # Short bodies are returned untouched.
    if len(self.body) < 250:
        return self.body

    if not self.external_link:
        # thanks http://stackoverflow.com/questions/250357/smart-truncate-in-python
        intro = self.body[:250].rsplit(' ', 1)[0]
        intro += '...'
        intro = Cleaner(scripts=False,      # disable it all except page_structure
                        javascript=False,   # as proper cleaning is done on save;
                        comments=False,     # here we just want to fix any
                        links=False,        # dangling tags caused by truncation
                        meta=False,
                        # page_structure=True,
                        embedded=False,
                        frames=False,
                        forms=False,
                        annoying_tags=False,
                        remove_unknown_tags=False,
                        safe_attrs_only=False).clean_html(intro)
        return intro
    else:
        # woot http://stackoverflow.com/questions/753052/strip-html-from-strings-in-python
        # Python 2 module name; on Python 3 the class lives in html.parser
        from HTMLParser import HTMLParser

        class MLStripper(HTMLParser):
            def __init__(self):
                self.reset()
                self.fed = []
                self.opentags = 0   # depth inside script/style/title

            def handle_starttag(self, tag, attrs):
                if tag in ('script', 'style', 'title'):
                    self.opentags = self.opentags + 1

            def handle_endtag(self, tag):
                if tag in ('script', 'style', 'title'):
                    self.opentags = self.opentags - 1

            def handle_data(self, d):
                if self.opentags == 0:
                    # blatant hack to beautify mailchimp weekly roundup imports
                    if d not in ("Find out about what",
                                 "s new in EWB this week!",
                                 "|EWB",
                                 " Weekly Roundup",
                                 "Not displaying correctly?"):
                        self.fed.append(d)

            def get_data(self):
                return ''.join(self.fed)

        s = MLStripper()
        s.feed(self.body)
        intro = s.get_data()
        intro = intro.replace("\n", "")     # str.replace on the extracted text
        intro = re.sub(r' +', ' ', intro)   # collapse runs of spaces
        intro = intro[:250].rsplit(' ', 1)[0]
        intro += '...'
        return intro
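The second branch of the example above strips tags with the Python 2 `HTMLParser` module and then cuts the text at a word boundary. The following is a minimal Python 3 sketch of the same idea using only the standard library. The names `TextExtractor` and `plain_intro` and the `limit` parameter are hypothetical, chosen for illustration; they are not part of the myewb2 project or of lxml. Unlike the original, this sketch returns short text unchanged instead of always appending `...`, and it omits the project-specific MailChimp filter.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of script/style/title."""

    SKIPPED = ("script", "style", "title")

    def __init__(self):
        # On Python 3 the base initialiser must run; reset() alone is not enough.
        super().__init__()
        self.chunks = []
        self.skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.chunks.append(data)

    def get_text(self):
        return "".join(self.chunks)


def plain_intro(body, limit=250):
    """Strip tags from body, then truncate at the last space before limit."""
    parser = TextExtractor()
    parser.feed(body)
    parser.close()  # flush any text still buffered by the parser
    text = parser.get_text().replace("\n", "")
    text = re.sub(r" +", " ", text)  # collapse runs of spaces
    if len(text) <= limit:
        return text  # assumption: short text needs no ellipsis
    return text[:limit].rsplit(" ", 1)[0] + "..."
```

A call such as `plain_intro(post_body)` would stand in for the `else` branch; the `if` branch, which keeps the markup and relies on `Cleaner(...).clean_html()` to close dangling tags, still needs lxml.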