WebCrawler project

A simple Python program that crawls the web and records every unique URL it visits.

Entry point of the program -> main.py
Generates a database ('sites.db') with three tables -> urls, servers, domain
If the program is terminated, it resumes from the last visited site the next time it is started.
Run ./run_multiple_threads.sh with different URLs to crawl several sites at the same time
Visualizations available -> run servers_visualization.py to get a pie chart of servers
                         -> run user_analytics.py to get information about crawled domains
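
The recording and resume behaviour described above could be sketched roughly as follows. This is only an illustration, not the project's actual code: the table names (urls, servers, domain) come from this README, but the column layout and the helper names open_db, record_visit and last_visited are assumptions made for the example. It uses an in-memory SQLite database instead of 'sites.db' so it can be run without side effects.

```python
import sqlite3
from urllib.parse import urlparse


def open_db(path=":memory:"):
    """Create the three tables if missing (the column layout is assumed)."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS urls (
            id  INTEGER PRIMARY KEY AUTOINCREMENT,
            url TEXT UNIQUE NOT NULL
        );
        CREATE TABLE IF NOT EXISTS servers (
            name TEXT PRIMARY KEY,
            hits INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE IF NOT EXISTS domain (
            name TEXT PRIMARY KEY,
            hits INTEGER NOT NULL DEFAULT 0
        );
        """
    )
    return conn


def record_visit(conn, url, server_header):
    """Store a URL once; return True only if it had not been seen before."""
    cur = conn.execute("INSERT OR IGNORE INTO urls (url) VALUES (?)", (url,))
    if cur.rowcount == 0:
        return False  # the UNIQUE constraint rejected a duplicate
    host = urlparse(url).hostname or ""
    # Table names are fixed literals here, never user input.
    for table, name in (("servers", server_header), ("domain", host)):
        conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
        conn.execute(f"UPDATE {table} SET hits = hits + 1 WHERE name = ?", (name,))
    conn.commit()
    return True


def last_visited(conn):
    """Return the most recently recorded URL, or None for an empty database."""
    row = conn.execute("SELECT url FROM urls ORDER BY id DESC LIMIT 1").fetchone()
    return row[0] if row else None
```

With a layout like this, a restarted crawler could call last_visited() and continue from that URL, while the UNIQUE constraint on urls.url keeps duplicates out even when several crawler processes write to the same file.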

Repository contents:
database_abstractions
models
utils
verification
visualization
.gitignore
README.md
main.py
run_multiple_threads.sh
webcrawler_with_threads.py
