Python html_text Examples

Programming language: Python

Namespace/Package: osp.corpus.utils

Method/Function: html_text

Examples at hotexamples.com: 5

Python html_text - 5 examples found. These are the top-rated real-world examples of osp.corpus.utils.html_text extracted from open source projects.

Example #1

File: syllabus.py Project: ivanistheone/open-syllabus-project

def text(self):
    """
    Extract the raw plain text.

    Returns:
        str: The text content.
    """
    ft = self.libmagic_file_type

    # Empty:
    if ft == 'inode/x-empty':
        return None

    # Plaintext:
    elif ft == 'text/plain':
        with open(self.path, 'r') as fh:
            return fh.read()

    # HTML/XML:
    elif ft == 'text/html':
        return utils.html_text(self.path)

    # PDF:
    elif ft == 'application/pdf':
        return utils.pdf_text(self.path)

    # Everything else:
    else:
        return utils.docx_text(self.path)

Example #2

File: syllabus.py Project: MichaelEdage/open-syllabus-project

def text(self):
    """
    Extract the raw plain text.

    Returns:
        str: The text content.
    """
    ft = self.libmagic_file_type

    # Empty:
    if ft == 'inode/x-empty':
        return None

    # Plaintext:
    elif ft == 'text/plain':
        with open(self.path, 'r') as fh:
            return fh.read()

    # HTML/XML:
    elif ft == 'text/html':
        return utils.html_text(self.path)

    # PDF:
    elif ft == 'application/pdf':
        return utils.pdf_text(self.path)

    # Everything else:
    else:
        return utils.docx_text(self.path)

Example #3

# Import path taken from the namespace listed on this page.
from osp.corpus.utils import html_text


def test_extract_text(mock_osp):
    """
    Text inside HTML tags should be extracted.
    """
    html = '<p>text</p>'
    path = mock_osp.add_file(content=html, ftype='html')
    text = html_text(path)
    assert text == 'text'

Example #4

# Import path taken from the namespace listed on this page.
from osp.corpus.utils import html_text


def test_ignore_custom_tags(mock_osp):
    """
    Tags explicitly passed in `excluded` should be ignored.
    """
    html = """
    <h1>h1</h1>
    <h2>h2</h2>
    <h3>h3</h3>
    """
    path = mock_osp.add_file(content=html, ftype='html')
    text = html_text(path, ['h1', 'h2']).strip()
    assert text == 'h3'

Example #5

# Import path taken from the namespace listed on this page.
from osp.corpus.utils import html_text


def test_ignore_scripts_and_styles(mock_osp):
    """
    By default, <script> and <style> tags should be ignored.
    """
    html = """
    <style>style</style>
    <script>script</script>
    <p>text</p>
    """
    path = mock_osp.add_file(content=html, ftype='html')
    text = html_text(path).strip()
    assert text == 'text'
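
The tests in this listing pin down three behaviors: text inside tags is returned, `<script>` and `<style>` content is dropped by default, and a caller-supplied list of tag names replaces that default. The actual osp.corpus.utils.html_text implementation is not shown on this page, so the following is only a minimal sketch of a function with the same observable behavior, written with the standard library's `html.parser`. The names `html_text_sketch` and `_TextExtractor` are hypothetical, and the sketch assumes the file is UTF-8 and that excluded tags are lowercase, non-void elements with explicit closing tags.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping anything nested inside excluded tags.

    Hypothetical helper for illustration; not part of osp.corpus.utils.
    """

    def __init__(self, excluded):
        super().__init__(convert_charrefs=True)
        self.excluded = set(excluded)
        self.skip_depth = 0  # greater than 0 while inside an excluded element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Assumption: excluded tags always have a matching end tag.
        if tag in self.excluded:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.excluded and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when no excluded element is currently open.
        if self.skip_depth == 0:
            self.chunks.append(data)


def html_text_sketch(path, excluded=('script', 'style')):
    """Hypothetical stand-in for html_text: return the visible text of a file."""
    with open(path, 'r', encoding='utf-8') as fh:
        content = fh.read()
    parser = _TextExtractor(excluded)
    parser.feed(content)
    parser.close()
    return ''.join(parser.chunks)
```

Like the function exercised by the tests, the sketch takes the excluded tag list as an optional second positional argument, so passing `['h1', 'h2']` replaces the script/style default rather than extending it. Whitespace between elements is preserved, which is why the tests above call `.strip()` on the result.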