
MyUSF Crawler

This repository contains several crawlers for MyUSF:

  • change_word_url.py: A crawler that collects URLs. You can search for any word by changing the keyword variable inside the code. The important part is the (unique) URL that you want to crawl.
  • input_search_url.py: A crawler that collects URLs. You can search for any word by entering the keyword on the command line (see the sketch after this list). The important part is the (unique) URL that you want to crawl.
  • element_search.py: A crawler that extracts HTML elements. Repeated elements are not allowed in the result list; you can change the keyword in the code.
  • element_search_repeated.py: A crawler that extracts HTML elements. Repeated elements are allowed in the result list; you can change the keyword in the code.
  • video_links_v2.py: An example crawler that collects the video links inside MyUSF. The important part is the (unique) word that you want to search for.
  • search_word_repeated.py: A crawler that collects the links containing a given keyword. The keyword may appear repeatedly, but the links in the result are unique.
  • test_crawler.py: A test for regex correctness. You can change the keyword and the link to check that the regex is written correctly.
  • content_page.py: A crawler that crawls a single content page.
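
To give a concrete idea of how these scripts work, here is a minimal sketch of a keyword-based link crawler in the spirit of input_search_url.py and search_word_repeated.py: it takes a keyword from the command line, pulls links out of a page with a regex, and keeps only the unique matches. The start URL, the default keyword, and the regex are illustrative assumptions rather than the actual code; in particular, real MyUSF pages may require an authenticated session, which is not handled here.

```python
import re
import sys
import requests

# Hypothetical starting point; the real scripts target MyUSF pages,
# which may require an authenticated session.
START_URL = "https://myusf.usfca.edu/"
keyword = sys.argv[1] if len(sys.argv) > 1 else "video"

response = requests.get(START_URL, timeout=10)
response.raise_for_status()

# Pull every href out of the page and keep the ones containing the keyword.
links = re.findall(r'href="([^"]+)"', response.text)
matches = [link for link in links if keyword.lower() in link.lower()]

# Deduplicate while preserving order, mirroring the "unique links" behavior.
unique_matches = list(dict.fromkeys(matches))

for link in unique_matches:
    print(link)
```

Run it as, for example, python crawl_sketch.py video to list the unique links containing "video"; the dict.fromkeys step removes duplicate links while keeping the order in which they were found.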
