Choose a movie genre, get a top recommendation
The goal of this project is for me to practice what I have learned in the Intermediate Python and Data Engineering chapter of the Ironhack program. For this project, I start with The Movies Dataset and data scraped from the Rotten Tomatoes Top Movies list. I import the data and use my newly acquired skills to build a data pipeline that processes it and produces a result. I try to demonstrate my proficiency with the tools we covered (functions, list comprehensions, string operations, and web scraping) in my pipeline.
The project folder is structured in the following way:
- main.py: contains the code for my data pipeline.
- INPUT: folder where the dataset should be placed in CSV format.
- OUTPUT: folder that contains the cleaned datasets and the output of my data pipeline.
- SRC: images and resources.
- FUNCTIONS: folder that contains the file functions.py, with all the auxiliary functions used in this project.
- Acquire the data from the dataset CSV and from web scraping.
- Clean the data and generate 2 new datasets to work with.
- Create the functions that explore the datasets with the given parameters.
- Return movie recommendations according to the parameters and the films' metadata.
- Run the main.py file with the 2 parameters, 'Year' and 'Genre'.
- Show the movie recommendations.
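The filtering and ranking steps above can be sketched roughly as follows. This is a minimal illustration, not the code in functions.py: the `recommend` function, the record fields (`title`, `year`, `genres`, `rating`), and the sample movies are all hypothetical placeholders standing in for the cleaned datasets.

```python
# Hypothetical stand-in for the cleaned dataset; titles and ratings are
# made-up placeholders, not values from The Movies Dataset.
movies = [
    {"title": "Movie A", "year": 1995, "genres": ["Comedy"], "rating": 7.1},
    {"title": "Movie B", "year": 1995, "genres": ["Comedy", "Romance"], "rating": 8.3},
    {"title": "Movie C", "year": 1995, "genres": ["Drama"], "rating": 9.0},
    {"title": "Movie D", "year": 2001, "genres": ["Comedy"], "rating": 8.9},
]


def recommend(movies, genre, year, top_n=1):
    """Return the titles of the top-rated movies matching a genre and year."""
    wanted = genre.strip().lower()
    # List comprehension: keep movies from the given year that list the genre.
    matches = [
        m for m in movies
        if m["year"] == year and wanted in [g.lower() for g in m["genres"]]
    ]
    # Highest rating first, then keep only the requested number of titles.
    ranked = sorted(matches, key=lambda m: m["rating"], reverse=True)
    return [m["title"] for m in ranked[:top_n]]


print(recommend(movies, "Comedy", 1995))
```

Matching the genre case-insensitively is a design choice in this sketch, so that `comedy` and `Comedy` give the same result.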
To run the program, the user needs to pass two arguments:
- A genre: -- or -s
- Category of fast food company as: --fastfoodtype or -f
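As a rough sketch of how two command-line arguments could be parsed with Python's standard `argparse` module: the flag names `--genre`/`-g` and `--year`/`-y` below are hypothetical, chosen only to match the 'Genre' and 'Year' parameters mentioned earlier, and `build_parser` is an illustrative helper. The flags actually accepted are the ones defined in main.py.

```python
import argparse


def build_parser():
    """Build a parser for two arguments (flag names are assumptions)."""
    parser = argparse.ArgumentParser(
        description="Recommend a movie for a given genre and year."
    )
    # Hypothetical flags; check main.py for the real ones.
    parser.add_argument("-g", "--genre", required=True,
                        help="movie genre, e.g. Comedy")
    parser.add_argument("-y", "--year", required=True, type=int,
                        help="release year, e.g. 1995")
    return parser


# Parse an explicit argument list here so the sketch runs on its own;
# a real script would call parse_args() with no arguments to read sys.argv.
args = build_parser().parse_args(["--genre", "Comedy", "--year", "1995"])
print(args.genre, args.year)
```

With flags like these, an invocation would look like `python main.py --genre Comedy --year 1995`.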