youngrok/toc


toc

toc is an HTML table of contents generator for Python. It parses HTML, generates a table of contents, and inserts anchors into the original HTML.

usage

toc_html, body = table_of_contents(html)
toc_html, body = table_of_contents(html, url='http://somedomain.com/somepath')
toc_html, body = table_of_contents(html, anchor_type='following-marker')
  • anchor_type
    • following-marker: adds an anchor tag to the end of each heading tag. The anchor text is #
    • stacked-number: adds an anchor tag to the beginning of each heading tag. The anchor text is a stacked number like 1.2.3.
  • toc_html: the generated table of contents
  • body: the modified HTML, with the anchors inserted
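
To make the two anchor types concrete, here is a rough, standard-library-only sketch of the general idea. It is not toc's actual implementation: `sketch_toc`, `iter_headings`, `text_of`, and the `toc-` id scheme are hypothetical names invented for illustration. It assumes well-formed XHTML input (toc itself parses with html5lib), and it assumes `url` is simply prefixed to the anchor links.

```python
import html
import re
from xml.dom import minidom

HEADING = re.compile(r"^h([1-6])$")


def iter_headings(node):
    """Yield h1..h6 elements below node, in document order."""
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            if HEADING.match(child.tagName):
                yield child
            else:
                yield from iter_headings(child)


def text_of(node):
    """Concatenate all text beneath a node."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_of(child) for child in node.childNodes)


def sketch_toc(xhtml, url="", anchor_type="stacked-number"):
    """Hypothetical stand-in for table_of_contents(); returns (toc_html, body)."""
    doc = minidom.parseString(xhtml)
    counters = [0] * 6  # one counter per heading level
    items = []
    for heading in list(iter_headings(doc)):
        level = int(heading.tagName[1])
        counters[level - 1] += 1
        counters[level:] = [0] * (6 - level)  # reset deeper levels
        number = ".".join(str(c) for c in counters[:level] if c)
        title = text_of(heading)  # read the title before adding the anchor

        anchor = doc.createElement("a")
        anchor.setAttribute("id", "toc-" + number)  # assumed id scheme
        anchor.setAttribute("href", url + "#toc-" + number)
        if anchor_type == "following-marker":
            # Marker "#" goes at the end of the heading.
            anchor.appendChild(doc.createTextNode("#"))
            heading.appendChild(anchor)
        else:
            # Stacked number such as "1.2." goes at the beginning.
            anchor.appendChild(doc.createTextNode(number + ". "))
            heading.insertBefore(anchor, heading.firstChild)

        items.append('<li><a href="%s#toc-%s">%s %s</a></li>'
                     % (html.escape(url), number, number, html.escape(title)))

    toc_html = "<ul>" + "".join(items) + "</ul>"
    return toc_html, doc.documentElement.toxml()
```

For example, `sketch_toc('<div><h1>Intro</h1><h2>Setup</h2></div>')` numbers the headings 1 and 1.1 and returns a flat list of links alongside the modified markup. The flat list is a simplification; a real table of contents would usually be nested by heading level.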

install

pip install toc

notes

  • toc uses html5lib as its HTML parser. It is much slower than lxml, the popular XML library for Python, but it parses more precisely, especially for HTML5.
  • I don't think ElementTree is more Pythonic than DOM, so I used minidom as the treebuilder and py-dom-xpath for XPath.
