python_web_crawler

A basic web crawler, based on the Udacity CS101 final project.

Takes a seed URL as a string and crawls the web starting from that page.

The crawl stops after 100 pages. This limit may become a user-supplied setting in a future version.
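The crawl loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in `crawler.py`: the function name `crawl`, the `fetch` callback, the `max_pages` parameter, and the simple `href` regular expression are all assumptions made for the example. Passing `fetch` in as an argument keeps the sketch free of network code.

```python
import re
from collections import deque

# Hypothetical simplification: only matches double-quoted href attributes.
LINK_RE = re.compile(r'href="([^"]+)"')


def crawl(seed, fetch, max_pages=100):
    """Crawl outward from `seed` and return the URLs visited, in order.

    `fetch(url)` is assumed to return the page text, or '' on failure.
    """
    to_crawl = deque([seed])
    seen = {seed}      # URLs already queued, so none is fetched twice
    crawled = []
    while to_crawl and len(crawled) < max_pages:
        url = to_crawl.popleft()
        html = fetch(url)
        crawled.append(url)
        for link in LINK_RE.findall(html):
            if link not in seen:
                seen.add(link)
                to_crawl.append(link)
    return crawled
```

Exposing the page limit as a parameter, as `max_pages` does here, is one way the fixed limit of 100 could later be replaced by a user-supplied value.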

Returns a WebCorpus object and prints out:

- An index
- A graph of websites and their relationships to each other
- A list of the websites in the graph and how 'linked to' they are. A page that more pages link to has a higher rank.
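The three outputs listed above could look something like the sketch below. It is an illustration under stated assumptions, not the `WebCorpus` class from `webcorpus.py`: the names `build_corpus` and `inbound_counts` and the shape of the `pages` input are hypothetical, and ranking by a plain count of inbound links is a simplification that the repository's own ranking may not match.

```python
def build_corpus(pages):
    """Build a word index and a link graph.

    `pages` is assumed to map url -> (text, outlinks).
    """
    index = {}   # word -> list of URLs containing that word
    graph = {}   # url -> list of URLs it links to
    for url, (text, outlinks) in pages.items():
        graph[url] = list(outlinks)
        for word in text.split():
            urls = index.setdefault(word, [])
            if url not in urls:
                urls.append(url)
    return index, graph


def inbound_counts(graph):
    """Count how many distinct other pages link to each page."""
    counts = {url: 0 for url in graph}
    for url, outlinks in graph.items():
        for target in set(outlinks):
            if target != url:  # ignore self-links
                counts[target] = counts.get(target, 0) + 1
    return counts
```

Sorting the pages by their count, for example with `sorted(counts, key=counts.get, reverse=True)`, gives a list with the most 'linked to' pages first.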
