The following crawlers are available for MyUSF:
- change_word_url.py: A crawler that collects URLs. To search for a different word, change the `keyword` variable in the code. The output is the list of matching URLs, with duplicates removed.
- input_search_url.py: A crawler that collects URLs. To search for a different word, enter the keyword on the command line. The output is the list of matching URLs, with duplicates removed.
- element_search.py: A crawler that collects HTML elements. The result list contains no repeated elements. You can change the keyword in the code.
- element_search_repeated.py: A crawler that collects HTML elements. The result list may contain repeated elements. You can change the keyword in the code.
- video_links_v2.py: An example crawler that collects the video links inside MyUSF. The output is keyed on the word you search for, with duplicates removed.
- search_word_repeated.py: A crawler that collects links containing a given keyword. The keyword may appear repeatedly, but the links in the result are unique.
- test_crawler.py: A test for regex correctness. You can change the keyword and the link to check that the regex is written correctly.
- content_page.py: A crawler that crawls a single content page.
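
The difference between the "unique" and "repeated" variants above can be illustrated with a minimal sketch. This is not the code from any of the scripts listed: the function name `find_links`, the `unique` flag, the regex, and the `example.edu` URLs are all assumptions made for illustration, and the sample runs on an inline HTML string instead of fetching MyUSF pages.

```python
import re


def find_links(html, keyword, unique=True):
    """Return href values in `html` whose URL contains `keyword`.

    Hypothetical helper for illustration only; the real scripts may
    match and filter differently.
    """
    # Capture the value of every single- or double-quoted href attribute.
    hrefs = re.findall(r'href=["\']([^"\']+)["\']', html)
    # Keep only the links that contain the keyword (case-insensitive).
    matches = [h for h in hrefs if keyword.lower() in h.lower()]
    if unique:
        # dict.fromkeys drops duplicates while keeping first-seen order.
        return list(dict.fromkeys(matches))
    return matches


# Made-up sample page; these URLs are placeholders, not MyUSF addresses.
sample_html = """
<a href="https://example.edu/video/1">Lecture 1</a>
<a href="https://example.edu/video/1">Lecture 1 (again)</a>
<a href="https://example.edu/news/2">News</a>
<a href='https://example.edu/video/3'>Lecture 3</a>
"""

unique_links = find_links(sample_html, "video")
all_links = find_links(sample_html, "video", unique=False)
print(unique_links)
print(all_links)
```

With `unique=True` the duplicated link appears once in the result; with `unique=False` it appears once per occurrence on the page. A regex is used here only because test_crawler.py suggests the scripts are regex-based; an HTML parser would be more robust on real pages.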