BaiduIndex_Crawl

Collects a movie's Baidu Index data for a particular time period.

Main code

main.py
Calls BaiduIndex_Crawl.py and get_data.py.
BaiduIndex_Crawl.py
Main code that collects the data from Baidu Index.
get_data.py
Gets the movie's release date.
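
The division of work between the three files can be pictured with a minimal sketch. The function names `run_pipeline`, `get_release_date`, and `crawl_index` are hypothetical and do not come from the repository; the two lookups are passed in as parameters so the sketch makes no claim about how the real scripts are written.

```python
def run_pipeline(movie_names, get_release_date, crawl_index):
    """Look up each movie's release date, then collect its index data.

    get_release_date(name) and crawl_index(name, date) are hypothetical
    stand-ins for the work done by get_data.py and BaiduIndex_Crawl.py.
    """
    results = {}
    for name in movie_names:
        release_date = get_release_date(name)            # step 1: release date
        results[name] = crawl_index(name, release_date)  # step 2: index data
    return results


if __name__ == "__main__":
    # Stub lookups with placeholder values, only to show the call shape.
    demo = run_pipeline(
        ["movie_name"],
        get_release_date=lambda name: "date1",
        crawl_index=lambda name, date: {date: "data1"},
    )
    print(demo)
```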

Operating environment

Based on Python 3.5 and Selenium. Install the following first:

selenium
pytesseract
Pillow
phantomjs
chromedriver
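
Before running the crawler, it may help to confirm that everything in the list above can be found. The following is a small sketch, not part of the repository. It assumes the three Python packages are importable as `selenium`, `pytesseract`, and `PIL` (the import name of Pillow), and that `phantomjs` and `chromedriver` are executables on `PATH`.

```python
import importlib.util
import shutil

# Python packages from the list above, mapped to their import names.
# Pillow is imported as "PIL".
PYTHON_MODULES = {"selenium": "selenium", "pytesseract": "pytesseract", "Pillow": "PIL"}

# Stand-alone executables that are assumed to be on PATH.
BINARIES = ["phantomjs", "chromedriver"]


def missing_dependencies():
    """Return the names of the dependencies that cannot be found."""
    missing = []
    for package, module in PYTHON_MODULES.items():
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    for binary in BINARIES:
        if shutil.which(binary) is None:
            missing.append(binary)
    return missing


if __name__ == "__main__":
    missing = missing_dependencies()
    print("Missing:", ", ".join(missing) if missing else "nothing")
```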

Operation instructions

| Fill in account | star.sh | main.py |
| --- | --- | --- |
| Open `get_data.py`, find `AccountList` on line 11, and fill in several accounts in the form `['account','passwd']`. | Start with `star.sh`, which runs `main.py` to do the task. | It calls `get_data.py` and `BaiduIndex_Crawl.py`. |
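
As an illustration of the account format, the sketch below shows what a filled-in `AccountList` might look like, together with a small sanity check. The placeholder credentials and the helper `check_account_list` are hypothetical and are not part of the repository.

```python
# Hypothetical contents for AccountList; replace the placeholders
# with real Baidu account names and passwords.
AccountList = [
    ["account1", "passwd1"],
    ["account2", "passwd2"],
]


def check_account_list(accounts):
    """Raise ValueError unless every entry is an [account, passwd] pair of non-empty strings."""
    if not accounts:
        raise ValueError("AccountList is empty")
    for entry in accounts:
        if len(entry) != 2 or not all(isinstance(x, str) and x for x in entry):
            raise ValueError("bad entry: %r" % (entry,))
    return True


if __name__ == "__main__":
    check_account_list(AccountList)
    print("AccountList has", len(AccountList), "well-formed entries")
```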

Sample

Take 山楂树之恋 (Under the Hawthorn Tree) as an example.
First, use its name to get the release date from MTime.
Then, use its name and release date to get the data from Baidu Index.
Store the information like this:

movie_name
[date1:data1,date2:data2....]
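
The two-line layout above can be produced with a few lines of Python. This is only a sketch of the format as described here: the helper name `format_record` is hypothetical, the dates and values are placeholders rather than real Baidu Index figures, and the repository may write its output differently.

```python
def format_record(movie_name, index_by_date):
    """Render one movie as a name line followed by [date:data,...] sorted by date."""
    pairs = ",".join(
        "%s:%s" % (date, value) for date, value in sorted(index_by_date.items())
    )
    return "%s\n[%s]" % (movie_name, pairs)


if __name__ == "__main__":
    # Placeholder dates and values, for illustration only.
    print(format_record("movie_name", {"date2": "data2", "date1": "data1"}))
```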
Czt1998/BaiduIndex_Crawl