Python format_html_tokens Examples

Programming language: Python

Namespace/Package: article_extraction.html

Method/Function: format_html_tokens

Examples at hotexamples.com: 2

Python format_html_tokens - 2 examples found. These are the top-rated real-world examples of article_extraction.html.format_html_tokens extracted from open source projects.

Example #1

File: test_html.py Project: mylove00025/article_extraction

    def test_format_html_tokens(self):
        # Tag tokens are either dropped or replaced by "\n" tokens;
        # plain words pass through unchanged.
        tokens = ["<p>", "this", "is", "a", "test", "</p>",
                  "<a>", "link", "</a>", "text",
                  "<h1>", "header", "</h1>"]
        expected_result = ["this", "is", "a", "test", "\n", "\n",
                           "link", "text", "\n", "header", "\n"]
        result = format_html_tokens(tokens)
        self.assertListEqual(result, expected_result)
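The test above pins down the input and output of format_html_tokens but not its rules. The sketch below is one hypothetical rule set that reproduces the expected list: a closing `</p>` becomes two newline tokens, `<h1>` and `</h1>` each become one, and every other tag is dropped. The names `format_html_tokens_sketch`, `TAG_PATTERN` and `REPLACEMENTS` are invented for illustration; the real function in article_extraction.html may well work differently.

```python
import re

# Matches a token that is a single opening or closing tag, e.g. "<p>" or "</h1>".
TAG_PATTERN = re.compile(r"^</?([a-zA-Z][a-zA-Z0-9]*)[^>]*>$")

# Hypothetical mapping: (tag name, is_closing) -> replacement tokens.
# Chosen only to match the test case; tags not listed are dropped.
REPLACEMENTS = {
    ("p", True): ["\n", "\n"],   # end of paragraph -> blank line
    ("h1", False): ["\n"],       # header starts on its own line
    ("h1", True): ["\n"],        # header ends its line
}


def format_html_tokens_sketch(tokens):
    """Replace tag tokens with newline tokens (or nothing); keep words."""
    result = []
    for token in tokens:
        match = TAG_PATTERN.match(token)
        if match is None:
            result.append(token)  # plain word, keep as-is
            continue
        key = (match.group(1).lower(), token.startswith("</"))
        result.extend(REPLACEMENTS.get(key, []))
    return result


if __name__ == "__main__":
    sample = ["<p>", "this", "is", "a", "test", "</p>",
              "<a>", "link", "</a>", "text",
              "<h1>", "header", "</h1>"]
    print(format_html_tokens_sketch(sample))
```

Keeping the rules in a lookup table makes it easy to add other block-level tags without touching the loop.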

Example #2

File: mss.py Project: mylove00025/article_extraction

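    def extract_article(self, document):
        """Extract the article from the page contents."""
        # Parse and clean the raw HTML, then split it into tokens.
        html_document = clean_html(html.document_fromstring(document))
        tokens = tokenize_html(html_document)
        # Score every token and keep the subsequence selected from the scores.
        scores = [self.scoring.score(term) for term in tokens]
        terms = extract_maximum_subsequence(tokens, scores)
        # Convert the remaining tag tokens into plain-text formatting.
        terms = format_html_tokens(terms)
        # Remove the space that directly follows a newline inside a token.
        terms = [re.sub(r"\n ", "\n", term, flags=re.UNICODE) for term in terms]
        contents = create_text(terms)
        return contents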
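The body of extract_maximum_subsequence is not shown in this example. Assuming, from its name and arguments, that it returns the contiguous run of tokens whose scores add up to the largest total, a minimal sketch could use Kadane's algorithm. The name `extract_maximum_subsequence_sketch` and the sample scores are invented for illustration and are not taken from the project.

```python
def extract_maximum_subsequence_sketch(tokens, scores):
    """Return the contiguous slice of tokens with the highest score sum.

    Assumption: an empty slice (sum 0) is returned when no run of
    tokens has a positive total.
    """
    best_sum, best_start, best_end = 0.0, 0, 0
    current_sum, current_start = 0.0, 0
    for index, score in enumerate(scores):
        if current_sum <= 0:
            # A non-positive running total cannot help: restart here.
            current_sum, current_start = score, index
        else:
            current_sum += score
        if current_sum > best_sum:
            best_sum = current_sum
            best_start, best_end = current_start, index + 1
    return tokens[best_start:best_end]


if __name__ == "__main__":
    sample_tokens = ["nav", "<p>", "good", "text", "</p>", "footer"]
    sample_scores = [-2.0, 0.5, 1.0, 1.5, 0.5, -3.0]
    print(extract_maximum_subsequence_sketch(sample_tokens, sample_scores))
```

With scores that are negative for boilerplate tokens and positive for article tokens, the selected slice starts and ends where the running total is highest, which is why the result can then be passed straight to format_html_tokens.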