Pomp

Pomp is a screen scraping and web crawling framework. It is like Scrapy, but simpler.

It is inspired by Scrapy, but has a simpler implementation and no hard dependency on Twisted.

Features:

  • pure Python
  • only one dependency, and only on Python 2.x: concurrent.futures (the backport package)
  • single-file applications, without project layouts or other restrictions
  • a meta framework like Paste (a framework for building scraping frameworks)
  • extensible networking: any sync or async method may be used
  • no parsing libraries in the core; use your favorites
  • can be distributed; designed to use an external queue
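
To make the "extensible networking" and "no parsing libraries in the core" points more concrete, here is a minimal, self-contained sketch of the idea. It does not use Pomp's actual API: `ThreadedDownloader`, `LinkAndTitleCrawler`, `crawl`, `fake_fetch` and `FAKE_SITE` are hypothetical names invented for illustration. The downloader is just an object that turns URLs into bodies, so the fetch function and the parsing approach can both be swapped freely.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "site" so the sketch runs without network access.
FAKE_SITE = {
    "/": '<a href="/a">A</a> <a href="/b">B</a>',
    "/a": "<h1>Page A</h1>",
    "/b": "<h1>Page B</h1>",
}


def fake_fetch(url):
    """Stand-in for any blocking fetch function (urllib, requests, ...)."""
    return FAKE_SITE.get(url, "")


class ThreadedDownloader:
    """Runs any blocking fetch callable on a thread pool."""

    def __init__(self, fetch, workers=2):
        self.fetch = fetch
        self.workers = workers

    def get(self, urls):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # Executor.map() yields results in the same order as the input.
            return list(zip(urls, pool.map(self.fetch, urls)))


class LinkAndTitleCrawler:
    """Parsing lives here, outside the crawl loop; regexes are only a placeholder."""

    def next_urls(self, body):
        return re.findall(r'href="([^"]+)"', body)

    def extract_items(self, url, body):
        return [(url, title) for title in re.findall(r"<h1>(.*?)</h1>", body)]


def crawl(start_url, downloader, crawler):
    """Breadth-first crawl that knows nothing about networking or parsing."""
    seen, queue, items = {start_url}, [start_url], []
    while queue:
        batch, queue = queue, []
        for url, body in downloader.get(batch):
            items.extend(crawler.extract_items(url, body))
            for link in crawler.next_urls(body):
                if link not in seen:
                    seen.add(link)
                    queue.append(link)
    return items


items = crawl("/", ThreadedDownloader(fake_fetch), LinkAndTitleCrawler())
print(items)
```

In this sketch, replacing `fake_fetch` with a real HTTP call or replacing the regexes with a proper HTML parser would not require any change to `crawl`.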

Pomp does not care about:

  • redirects
  • proxies
  • caching
  • database integration
  • cookies
  • authentication
  • etc.

If you want proxies, redirects or similar features, implement them yourself or use the great requests library as a Pomp downloader.
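
As a rough idea of what "implement them yourself" might look like for redirects, the sketch below wraps an arbitrary fetch function. It is not Pomp code: `follow_redirects`, `TooManyRedirects`, `fake_fetch` and the `(status, headers, body)` tuple shape are all assumptions made for illustration, and the fake responses stand in for a real HTTP client.

```python
# Hypothetical canned responses as (status, headers, body) tuples.
FAKE_RESPONSES = {
    "/old": (302, {"Location": "/new"}, ""),
    "/new": (200, {}, "hello"),
    "/loop": (302, {"Location": "/loop"}, ""),
}

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fake_fetch(url):
    """Stand-in for a real fetch function that does not follow redirects."""
    return FAKE_RESPONSES.get(url, (404, {}, ""))


class TooManyRedirects(Exception):
    pass


def follow_redirects(fetch, max_redirects=5):
    """Wrap a fetch callable so it follows Location headers up to a limit."""

    def wrapped(url):
        for _ in range(max_redirects + 1):
            status, headers, body = fetch(url)
            if status in REDIRECT_CODES and "Location" in headers:
                url = headers["Location"]
                continue
            return url, status, body
        raise TooManyRedirects(url)

    return wrapped


fetch = follow_redirects(fake_fetch)
result = fetch("/old")
print(result)
```

The wrapper approach keeps redirect handling in one place, so the same idea could be applied to proxies or caching without touching the crawling logic.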

Pomp examples

Pomp docs

Continuous integration status by drone.io: Latest CI test, codecov

PyPI status: Latest PyPI version, Number of PyPI downloads, Have wheel, License

Docs status: Documentation Status

Pomp is written and maintained by Evgeniy Tatarkin and is licensed under the BSD license.
