anuragal/python-crawler

python-crawler

A small web crawler written in Python 2.7.

Basic Features:

  1. Uses a maximum depth to stop crawling
  2. Can crawl URLs outside the starting domain
  3. Checks for an HTTP 200 response
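The three features above can be illustrated with a minimal sketch. This is not the repository's actual code: the names `crawl`, `fetch`, `max_depth`, and `stay_in_domain` are hypothetical, links are extracted with a simple regular expression rather than a real HTML parser, and the HTTP call is passed in as a function so the sketch stays independent of any particular library. The import fallback is there so it runs on Python 2.7 as well as Python 3.

```python
from collections import deque
import re

try:
    from urllib.parse import urljoin, urlparse  # Python 3
except ImportError:
    from urlparse import urljoin, urlparse  # Python 2.7

# Naive href extraction; an assumption made to keep the sketch short.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def crawl(start_url, fetch, max_depth=2, stay_in_domain=False):
    """Breadth-first crawl that stops expanding links at max_depth.

    fetch(url) is a caller-supplied (hypothetical) function that
    returns a (status_code, html) tuple.
    """
    start_host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []

    while queue:
        url, depth = queue.popleft()
        status, html = fetch(url)
        if status != 200:
            # Only pages answering with HTTP 200 are kept.
            continue
        visited.append(url)
        if depth >= max_depth:
            # Depth limit reached: record the page, do not follow its links.
            continue
        for href in HREF_RE.findall(html):
            link = urljoin(url, href)
            if stay_in_domain and urlparse(link).netloc != start_host:
                continue
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

With `stay_in_domain=False` (the default here), links to other domains are followed, which matches the second feature. Passing `fetch` as a parameter also makes the depth logic easy to try against an in-memory dictionary of pages.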

Needs:

  1. Multiprocessing support
  2. Handle 301 and 302 redirects
  3. Add crawled data into a database
  4. Respect robots.txt
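As one possible starting point for the last item, the standard library's robots.txt parser can decide whether a URL may be fetched. This is only a sketch under assumptions: the helper name `build_robots_checker` and the user agent string `python-crawler` are hypothetical, and downloading the robots.txt file itself is left to the caller, who passes its text in as a string.

```python
try:
    from urllib.robotparser import RobotFileParser  # Python 3
except ImportError:
    from robotparser import RobotFileParser  # Python 2.7


def build_robots_checker(robots_txt, user_agent="python-crawler"):
    """Return a function that tells whether a URL may be crawled.

    robots_txt is the already-downloaded text of a site's robots.txt;
    user_agent is a hypothetical name for this crawler.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    def allowed(url):
        return parser.can_fetch(user_agent, url)

    return allowed
```

A crawler loop could call the returned `allowed(url)` before each request and skip any URL for which it returns `False`.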
