Acquires unstructured data from BookMyShow, cleans it, and feeds it to a MySQL server for remote access.

piyushsoni27/BMS_movies_data_extracter

The scraped data is organized into four tables:

CITY_Table : Contains all the cities supported by BookMyShow.

  • Columns : "city_id", "city_name"

MOVIE_Table : Contains the movies being screened in BMS-supported cities across India.

  • Columns : "movie_name", "movie_id", "city_id", "booking_links", "lang", "format_"

THEATRE_Table : Contains information (name, latitude, longitude, city) about all the theatres in India that support booking through BMS.

  • Columns : "theatre_ID", "theatre_name", "theatre_lat", "theatre_long", "city", "city_ID"

SHOW_Table : Contains information about shows (show timings, show dates, and percentage of seats available) for movies screening in theatres across India.

  • Columns : "show_ID", "theatre_ID", "movie_ID", "show_date", "show_timing", "seat_percent_available"
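To make the relationships between the four tables concrete, here is a minimal sketch of the schema and one join query. Only the table and column names come from the list above; the column types, the keys, and the helper names `build_demo_db` and `shows_in_city` are assumptions made for illustration. It uses Python's built-in `sqlite3` in memory as a stand-in for the MySQL server, so the real schema in this repository may differ.

```python
import sqlite3

# Hypothetical DDL: column names are from the README, types and keys are assumed.
SCHEMA = """
CREATE TABLE CITY_Table (
    city_id   TEXT PRIMARY KEY,
    city_name TEXT
);
CREATE TABLE MOVIE_Table (
    movie_name    TEXT,
    movie_id      TEXT,
    city_id       TEXT REFERENCES CITY_Table(city_id),
    booking_links TEXT,
    lang          TEXT,
    format_       TEXT
);
CREATE TABLE THEATRE_Table (
    theatre_ID   TEXT PRIMARY KEY,
    theatre_name TEXT,
    theatre_lat  REAL,
    theatre_long REAL,
    city         TEXT,
    city_ID      TEXT REFERENCES CITY_Table(city_id)
);
CREATE TABLE SHOW_Table (
    show_ID                TEXT PRIMARY KEY,
    theatre_ID             TEXT REFERENCES THEATRE_Table(theatre_ID),
    movie_ID               TEXT,
    show_date              TEXT,
    show_timing            TEXT,
    seat_percent_available REAL
);
"""


def build_demo_db():
    """Create an in-memory database containing the four (empty) tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def shows_in_city(conn, city_id):
    """Return (theatre_name, show_date, show_timing, seat_percent_available)
    for every show in theatres belonging to the given city."""
    query = """
        SELECT t.theatre_name, s.show_date, s.show_timing, s.seat_percent_available
        FROM SHOW_Table AS s
        JOIN THEATRE_Table AS t ON t.theatre_ID = s.theatre_ID
        WHERE t.city_ID = ?
        ORDER BY s.show_date, s.show_timing
    """
    return conn.execute(query, (city_id,)).fetchall()
```

The join goes through THEATRE_Table because SHOW_Table carries no city column of its own; a show's city is reached via its "theatre_ID".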

Files:

  • bms_scrapper.py : Extract movies information for a particular region.

    • get_topten_movies : Extracts the list of top ten movies screening in a particular region.
    • get_nearby_theatres : Lists the names of theatres near your location.
  • city_class.py : Class to extract cities supported by BookMyShow.
    city_table : city_ID, city_name

  • movie_class.py : Class to extract data on movies showing in all supported cities.
    MOVIE table : movie_ID, movie_name, city_ID

  • theatre_show.py :

    • Extracts details about theatres, such as latitude, longitude, and city.
    • Extracts details about shows, such as show timings and dates, theatre, and city.
  • sql.py :
    This script pushes the collected data to a remote MySQL server.
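As a rough illustration of the push step, the sketch below loads (city_ID, city_name) pairs into `city_table` with a parameterized bulk insert. It is not the code in sql.py: the function name `push_cities` and the sample rows are hypothetical, and an in-memory `sqlite3` database stands in for the remote MySQL server so the example runs without credentials. With a MySQL driver the placeholders would typically be `%s` rather than `?`.

```python
import sqlite3


def push_cities(conn, cities):
    """Insert (city_ID, city_name) pairs and return the row count of city_table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS city_table ("
        "city_ID TEXT PRIMARY KEY, city_name TEXT)"
    )
    # Parameterized statement: values are never spliced into the SQL string.
    # REPLACE overwrites a row with the same key, so re-running a scrape
    # does not create duplicate rows.
    conn.executemany(
        "REPLACE INTO city_table (city_ID, city_name) VALUES (?, ?)", cities
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM city_table").fetchone()[0]


# Hypothetical sample rows; real IDs and names would come from the scraper.
demo_conn = sqlite3.connect(":memory:")
row_count = push_cities(demo_conn, [("C1", "Demo City A"), ("C2", "Demo City B")])
```

Keying the table on `city_ID` is an assumption; it is what makes repeated pushes of the same scrape idempotent in this sketch.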

How to run:

To fetch information about a particular region, run bms_scrapper.py, for example:

bms = BMSData("ncr")
print(bms.get_topten_movies)
