Othernet-Project/artexin

ArtExIn

ArtExIn is short for Article Extraction and Indexing. It is a set of modules for fetching HTML pages, extracting the relevant article content from them, and indexing the extracted text.

ArtExIn is developed by Outernet Inc, and it powers the preparation of web pages for broadcast over the Outernet network.
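
To give a rough feel for the extract-then-index idea described above, here is a minimal sketch that uses only the Python standard library. It does not use artexin's own modules or API; the names `ParagraphExtractor`, `extract_text`, and `build_index` are hypothetical, and treating the text inside `<p>` elements as "the article" is a simplifying assumption rather than the extraction method artexin implements. The fetch step is left out, and the HTML is supplied as a string.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect text inside <p> elements (a crude stand-in for article extraction)."""

    def __init__(self):
        super().__init__()
        self._in_paragraph = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_paragraph = False

    def handle_data(self, data):
        # Text outside <p> (navigation, footers, ...) is ignored.
        if self._in_paragraph:
            self.paragraphs[-1] += data


def extract_text(html):
    """Return the paragraph text of an HTML document, one paragraph per line."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(p.strip() for p in parser.paragraphs if p.strip())


def build_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


# Hypothetical sample page; a real pipeline would fetch this over HTTP first.
html = (
    "<html><body><nav>Home</nav>"
    "<p>Satellite data broadcast.</p><p>Free access.</p>"
    "</body></html>"
)
text = extract_text(html)
index = build_index({"page-1": text})
```

In this sketch, `index["satellite"]` holds the ids of all documents that mention the word, while the navigation text never reaches the index because it sits outside any paragraph.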

Installation

Install artexin using pip:

pip install git+git://github.com/Outernet-Project/artexin.git

Tests

Run the unit tests with:

python setup.py test

or, if you have tox installed:

tox

Reporting bugs

Please report all bugs to our issue tracker.
