
singhalvibhor05/Simple-crawler

Simple-crawler

A crawler is a program that starts with a URL on the web (e.g. http://python.org), fetches the web page at that URL, and parses all the links on that page into a repository of links. Next, it fetches the contents of one of the URLs from that repository, parses the links from this new content into the repository, and continues this process for all links in the repository until it is stopped or a given number of links has been fetched.
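The loop described above can be sketched roughly as follows. This is only an illustration, not the code in this repository: it assumes Python 3 and uses the standard library (`urllib.request` and `html.parser`) in place of urllib2 and Beautiful Soup so that it runs without extra installs. The names `LinkParser`, `extract_links`, `fetch_page`, and `crawl` are hypothetical, as is the choice to visit links in first-in, first-out order.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute http(s) links found in html, resolved against base_url."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.links:
        # Resolve relative links and drop any "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")):
            links.append(absolute)
    return links


def fetch_page(url):
    """Download a page and decode it as text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, max_pages, fetch=fetch_page):
    """Fetch up to max_pages URLs, starting from start_url."""
    repository = deque([start_url])  # links waiting to be fetched
    seen = {start_url}               # every link ever added to the repository
    fetched = []
    while repository and len(fetched) < max_pages:
        url = repository.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to download
        fetched.append(url)
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                repository.append(link)
    return fetched
```

Passing the download function in as the `fetch` parameter is a design choice made for this sketch: it lets the loop be tried against canned pages without touching the network.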

The program requires only Python 2.6.6 and Beautiful Soup to be installed.

Execute the file with two arguments: the starting URL and the number of URLs you want to fetch.
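One possible way to read those two arguments is shown below. This is a sketch under assumptions rather than the script's actual interface: it uses `argparse` from the Python 3 standard library, and the function name `parse_args` and the argument names `url` and `count` are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the starting URL and the number of URLs to fetch."""
    parser = argparse.ArgumentParser(
        description="Simple crawler (illustrative command-line sketch)"
    )
    parser.add_argument("url", help="starting URL, e.g. http://python.org")
    parser.add_argument("count", type=int, help="number of URLs to fetch")
    args = parser.parse_args(argv)
    if args.count < 1:
        # parser.error prints a usage message and exits with status 2.
        parser.error("count must be a positive integer")
    return args
```

With a parser like this, an invocation might look like `python crawler.py http://python.org 10`, where `crawler.py` is a placeholder for the script's file name.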

The idea is to explore two awesome Python libraries: Beautiful Soup and urllib2.

Please send suggested changes and improvements to drvsinghal@gmail.com.
