This project shows how to mine potential customers on Sina Micro-Blog. It crawls three kinds of data: big V accounts and their fans, user blog posts and comments, and user topics.
- big_V_fans_crawler.py is a multi-threaded crawler that collects the profile information of big V accounts and of their fans.
- comment_crawler.py is a multi-threaded crawler that collects user comment information.
- blog_crawler.py is a multi-threaded crawler that collects user blog information.
- simulation_crawler.py is a crawler that collects user profile information.
- topic_crawler.py is a crawler that collects user topic information.
- db_api.py is the interface between the crawlers and the database.
- rsa is a module that the project imports at run time.
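
The multi-threaded crawlers listed above are not reproduced here. As a rough illustration of the general pattern only, the sketch below fans a list of user IDs out to a thread pool. The names `fetch_user` and `crawl_users`, the record layout, and the worker count are all hypothetical and are not taken from the project's code; `fetch_user` returns a placeholder record instead of making a network request.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_user(user_id):
    """Hypothetical stand-in for one crawl request; returns a placeholder record."""
    # A real crawler would issue an HTTP request here and parse the response.
    return {"user_id": user_id, "fans": []}


def crawl_users(user_ids, max_workers=4):
    """Fetch every user ID concurrently and return the records in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as user_ids.
        return list(pool.map(fetch_user, user_ids))


if __name__ == "__main__":
    for record in crawl_users([101, 102, 103]):
        print(record)
```

A thread pool suits this kind of work because each request spends most of its time waiting on the network, so several requests can overlap. The project's own crawlers may organize their threads differently.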