This is an open source, multi-threaded website crawler written in Python. There is still a lot of work to do, so feel free to help out with development.

beaupedraza/Spider

 
 


Overview

Spider is an open-source, multi-threaded website crawler written in Python. It is still a work in progress, and contributions to its development are welcome.
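To illustrate what "multi-threaded" can mean for a crawler, here is a minimal sketch of a worker pool that shares one queue of URLs. It is not taken from this project's source: the names `crawl`, `fetch_links`, and `num_workers` are hypothetical, and the demo uses a small in-memory "site" instead of real network requests.

```python
import queue
import threading


def crawl(start_url, fetch_links, num_workers=4):
    """Return the set of URLs reachable from start_url.

    fetch_links is a caller-supplied function (an assumption of this
    sketch) that maps a URL to an iterable of the URLs it links to.
    """
    to_visit = queue.Queue()
    seen = {start_url}
    seen_lock = threading.Lock()
    to_visit.put(start_url)

    def worker():
        while True:
            url = to_visit.get()
            if url is None:  # sentinel value: time to shut down
                to_visit.task_done()
                return
            try:
                for link in fetch_links(url):
                    # Only one thread at a time may check and update `seen`.
                    with seen_lock:
                        if link in seen:
                            continue
                        seen.add(link)
                    to_visit.put(link)
            except Exception:
                pass  # a real crawler would log the failure here
            finally:
                to_visit.task_done()

    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(num_workers)]
    for t in threads:
        t.start()

    to_visit.join()  # blocks until every queued URL has been processed
    for _ in threads:
        to_visit.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    # A tiny in-memory link graph stands in for real pages.
    site = {"a": ["b", "c"], "b": ["a", "d"], "c": [], "d": ["a"]}
    print(sorted(crawl("a", lambda url: site.get(url, []))))
```

New links are queued before the current URL is marked done, so `to_visit.join()` cannot return while work is still pending.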


Note: This crawler is one component of an open-source search engine, and its only job is to gather links. The analytics, data harvesting, and search algorithms are being developed as separate programs.
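As a rough idea of what "gathering links" involves, the sketch below pulls the `href` targets out of a page with the standard library's `html.parser.HTMLParser`. It is an illustration only, not the project's actual code: the class `LinkCollector` and the function `gather_links` are hypothetical names, and the example URLs are placeholders.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page.
                self.links.add(urljoin(self.base_url, value))


def gather_links(html, base_url):
    """Return the set of absolute link targets found in an HTML string."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    page = '<a href="/about">About</a> <a href="https://example.org/">Ext</a>'
    print(sorted(gather_links(page, "https://example.com/index.html")))
```

Storing the results in a set drops duplicate links, which suits a tool whose only output is the list of discovered URLs.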


License


Setup Procedure

Python 3.5 or later is required.

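Library

--> pip install HTMLParser

(On Python 3, the HTMLParser class is also available from the standard library module html.parser, so this step may not be needed.)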

To run the program

--> Download the project

--> Unzip the folder

--> Open a terminal

--> Change the terminal's current working directory to the project location

--> python main.py

--> Follow the script prompts


Links
