imclab/jaws

jaws, a metadata extraction toolkit

  >>> from jaws import Document
  >>> doc = Document.from_url(
  ...  'http://www.guardian.co.uk/'
  ...  'commentisfree/2012/sep/10/alzheimers-junk-food-catastrophic-effect')

extract content:

  >>> doc.html
  ...

extract cover image:

  >>> doc.image
  ...

extract author:

  >>> doc.author
  ...

extract title:

  >>> doc.title
  ...
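
The four attributes above can be gathered in one pass. The following is a minimal sketch, not part of jaws: `summarize` and `FIELDS` are hypothetical names, and a stand-in object is used in place of a real `Document` so the snippet runs without network access or jaws installed.

```python
from types import SimpleNamespace

# Attribute names taken from the README examples above.
FIELDS = ("title", "author", "image", "html")


def summarize(doc, fields=FIELDS):
    """Collect the listed attributes of a document-like object into a dict.

    Hypothetical helper: attributes missing on `doc` are reported as None.
    """
    return {name: getattr(doc, name, None) for name in fields}


# Stand-in for a jaws Document (an assumption, for illustration only).
fake_doc = SimpleNamespace(
    title="Example title",
    author="Example author",
    image=None,
    html="<p>Example body</p>",
)

summary = summarize(fake_doc)
```

With jaws installed, the `doc` object from the first example would presumably be passed instead: `summarize(doc)`.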

there's also a WSGI application, jaws.server.app, for exposing jaws as a web service
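
Any WSGI server should be able to host such an application. The sketch below uses only the standard library's `wsgiref`. Since the routes and behaviour of `jaws.server.app` are not described here, `placeholder_app` is a hypothetical stand-in, and `call_app` and `serve` are illustrative helper names rather than anything jaws provides.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def placeholder_app(environ, start_response):
    """Hypothetical stand-in for jaws.server.app (any WSGI callable works)."""
    body = b"jaws placeholder\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app once in-process and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


def serve(app, host="127.0.0.1", port=8000):
    """Serve a WSGI app with the reference server; blocks until interrupted."""
    with make_server(host, port, app) as httpd:
        httpd.serve_forever()


# To expose jaws itself (assumed import path, taken from the README):
#   from jaws.server import app
#   serve(app)
```

`wsgiref` is meant for development; a production WSGI server would be the usual choice for real traffic.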

development takes place at http://github.com/dreamindustries/jaws
