phihag/etreehtml-py

etreehtml is a small, copy & pasteable Python library to parse HTML code into xml.etree.ElementTree.

Installation

etreehtml has no dependencies and runs on Python 2.5 or newer, including Python 3.x, with or without 2to3. If third-party dependencies are an option, you should really use lxml instead of etreehtml.

To install it, copy the content of etreehtml.py and paste it above your code. Alternatively, you can of course download the file, put it in your application's module search path (for example, in the directory where your other code resides), and import it with import etreehtml.

Example code

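# If you imported the module instead of pasting it, first run:
#   from etreehtml import parseHTML, etree_text
doc = parseHTML('<html><p>Cont<br>ent</p></html>')
# './/p' matches any <p> descendant; recent ElementTree versions reject a leading '//' on elements
text = etree_text(doc.find('.//p'))
assert text == 'Content'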
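
For comparison, here is a minimal sketch of the same lookup using only the standard library's xml.etree.ElementTree. It is not how etreehtml is implemented. It assumes the markup is rewritten as well-formed XML (the br tag is self-closed), because the standard XML parser rejects the unclosed tag that parseHTML accepts in the example above. It also assumes Python 2.7 or 3.2 and newer, where Element.itertext() is available.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the HTML in the example above: <br> is written as <br/>,
# since the standard-library XML parser raises an error on unclosed tags.
root = ET.fromstring('<html><p>Cont<br/>ent</p></html>')

# './/p' searches all descendants of the root element for a <p> element.
p = root.find('.//p')

# itertext() yields the element's own text plus the text and tail text of
# its descendants, in document order; joining them gives the flattened text.
text = ''.join(p.itertext())
print(text)
```

The point of the comparison is that the tree returned by parseHTML can be queried with the same ElementTree calls, while accepting real-world HTML that the XML parser would refuse.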
