
Czt1998/BaiduIndex_Crawl


BaiduIndex_Crawl

Collects a movie's Baidu Index data for a particular time period.

Main code

  • main.py
    Entry point; calls BaiduIndex_Crawl.py and Get_date.py.
  • BaiduIndex_Crawl.py
    Main code that collects the data from Baidu Index.
  • Get_date.py
    Gets the movie's release date.
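
The way the three files fit together can be sketched roughly as follows. This is only an illustration under assumptions: `crawl_movie`, `fake_get_release_date` and `fake_crawl_index` are hypothetical names, and the stubs stand in for whatever Get_date.py and BaiduIndex_Crawl.py actually expose.

```python
def crawl_movie(movie_name, get_release_date, crawl_index):
    """Look up the release date, then collect index data for that date.

    get_release_date and crawl_index are injected callables standing in for
    whatever Get_date.py and BaiduIndex_Crawl.py actually expose (assumption).
    """
    release_date = get_release_date(movie_name)
    return movie_name, crawl_index(movie_name, release_date)


# Stub callables so the sketch runs without a browser or network access.
def fake_get_release_date(name):
    return "2000-01-01"  # placeholder, not a real release date


def fake_crawl_index(name, date):
    return {date: 0}  # placeholder index value


name, data = crawl_movie("example movie", fake_get_release_date, fake_crawl_index)
```

Passing the two steps in as callables keeps the sketch runnable on its own; the real main.py may simply import the two modules directly.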

Operating environment

Requires Python 3.5 and Selenium. Install the following first:

  1. selenium
  2. pytesseract
  3. Pillow
  4. phantomjs
  5. chromedriver
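
A quick way to confirm the list above is installed is a small check script. This is a minimal sketch, assuming the three Python packages are importable under their usual module names (Pillow imports as `PIL`) and that phantomjs and chromedriver are expected on `PATH`; `missing_dependencies` is a hypothetical helper, not part of this repository.

```python
import importlib.util
import shutil

# Python packages from the list above, keyed by pip name -> import name.
# Pillow is imported as "PIL".
PYTHON_PACKAGES = {
    "selenium": "selenium",
    "pytesseract": "pytesseract",
    "Pillow": "PIL",
}

# External executables from the list above, assumed to be on PATH.
EXECUTABLES = ["phantomjs", "chromedriver"]


def missing_dependencies():
    """Return the names of required dependencies that cannot be found."""
    missing = [
        pip_name
        for pip_name, import_name in PYTHON_PACKAGES.items()
        if importlib.util.find_spec(import_name) is None
    ]
    missing += [exe for exe in EXECUTABLES if shutil.which(exe) is None]
    return missing


if __name__ == "__main__":
    missing = missing_dependencies()
    print("Missing: " + ", ".join(missing) if missing else "All dependencies found.")
```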

Operation instructions

Fill in accounts → star.sh → main.py
Open Get_date.py, find 'AccountList' on line 11, and fill in several accounts, each in the form ['account','passwd']. Then start the task with star.sh, which runs main.py; main.py in turn calls Get_date.py and BaiduIndex_Crawl.py.
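
For illustration, a filled-in 'AccountList' might look like the sketch below. The credentials are placeholders, and `validate_accounts` and the round-robin `account_pool` are hypothetical additions showing one way such a list could be checked and used, not the repository's own code.

```python
from itertools import cycle

# Placeholder credentials in the ['account', 'passwd'] form described above.
AccountList = [
    ["account1", "passwd1"],
    ["account2", "passwd2"],
]


def validate_accounts(accounts):
    """Raise ValueError unless every entry is a pair of non-empty strings."""
    if not accounts:
        raise ValueError("AccountList is empty")
    for entry in accounts:
        if len(entry) != 2 or not all(isinstance(x, str) and x for x in entry):
            raise ValueError("bad entry: %r" % (entry,))
    return accounts


# Hypothetical rotation: hand out accounts in turn (an assumption about usage).
account_pool = cycle(validate_accounts(AccountList))
account, passwd = next(account_pool)
```

Validating the list up front means a malformed entry is reported before any login is attempted.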

Sample

  • Take 山楂树之恋 (Under the Hawthorn Tree) as an example.
  • First, use its name to get the release date from MTime.
  • Then use its name and date to get the data from Baidu Index.
  • Store the information like this:

    movie_name
    [date1:data1,date2:data2....]
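
The storage layout above can be produced with a few lines of Python. This is a sketch under assumptions: `format_record` is a hypothetical helper, the date strings and values are made up for illustration, and the real script may order or format entries differently.

```python
def format_record(movie_name, index_by_date):
    """Render one movie as two lines: the name, then [date1:data1,date2:data2,...]."""
    pairs = ",".join(
        "%s:%s" % (date, index_by_date[date]) for date in sorted(index_by_date)
    )
    return "%s\n[%s]" % (movie_name, pairs)


# Made-up dates and values purely for illustration; not real Baidu Index data.
sample = {"2000-01-02": 120, "2000-01-01": 100}
record = format_record("山楂树之恋", sample)
print(record)
```

Sorting by date keeps the output stable regardless of the order in which the values were collected.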
