IMPORTANT NOTE: This version is no longer maintained. The new version can be found here: https://github.com/trungkak/ezcrawl
This is a web content extraction module written in Python, built largely on the lxml library. It works best on deep websites (sites that return information based on what you enter), such as Amazon, StackOverflow, and eBay.
Given a front-page URL, it extracts all product/article links from that page (including pagination). It can also extract user comments about the products/articles.
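
The link-extraction idea described above can be illustrated with a minimal sketch. This is not the module's actual API: the names LinkCollector, extract_links, and the "page=" pagination heuristic are hypothetical, and the sketch uses the standard library's html.parser in place of lxml so that it runs without extra dependencies. It parses an inline HTML sample rather than fetching a live page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute links from <a> tags, split into item and pagination links."""

    def __init__(self, base_url, page_marker="page="):
        super().__init__()
        self.base_url = base_url
        # Hypothetical heuristic: URLs containing this marker are pagination links.
        self.page_marker = page_marker
        self.item_links = []
        self.page_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative hrefs against the front-page URL.
        url = urljoin(self.base_url, href)
        target = self.page_links if self.page_marker in url else self.item_links
        if url not in target:  # skip duplicates, keep first-seen order
            target.append(url)


def extract_links(html, base_url):
    """Return (item_links, page_links) found in the given HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.item_links, parser.page_links


# Inline sample standing in for a downloaded front page.
SAMPLE_HTML = """
<ul>
  <li><a href="/item/1">First product</a></li>
  <li><a href="/item/2">Second product</a></li>
</ul>
<a href="?page=2">Next</a>
"""

items, pages = extract_links(SAMPLE_HTML, "https://example.com/list")
print(items)
print(pages)
```

A crawler built on this idea would then fetch each pagination link in turn and repeat the extraction until no unseen pages remain. Real sites usually need site-specific rules to tell product links apart from navigation links, which this sketch does not attempt.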