A tool to create Document Structure trees from XHTML websites.
cfournie/docstruct
DocStruct - A Document Structure Parser

A tool to create Document Structure[1] (DS) trees from XHTML websites.

This was created as a term project for CSI 5386 (Fall 2009) at the University of Ottawa.

More detailed information on the project can be found in the paper located at http://cloud.github.com/downloads/cfournie/docstruct/paper.pdf

Directories

\module\ - Contains the Python parser tool
\spec\ - Contains example DS trees and the DS XML Schema

References

[1] R. Power, D. Scott, and N. Bouayad-Agha, "Document structure," Comput. Linguist., vol. 29, no. 2, pp. 211-260, 2003. Accessible at http://www.mitpressjournals.org/doi/abs/10.1162/089120103322145315