Project Pipelines

Choose a movie genre, get a top recommendation

Overview

The goal of this project is for me to practice what I have learned in the Intermediate Python and Data Engineering chapter of the Ironhack program. I start with The Movies Dataset and with data scraped from the Rotten Tomatoes Top Movies list. I import both sources and use my newly acquired skills to build a data pipeline that processes the data and produces a result. In the pipeline I try to demonstrate my proficiency with the tools we covered (functions, list comprehensions, string operations and web scraping).

Project Structure

The project folder is structured in the following way:

main.py: Contains the code for my data pipeline.
INPUT: Folder where the dataset should be placed, in CSV format.
OUTPUT: Folder that contains the cleaned datasets and the output of my data pipeline.
SRC: Images and resources.
FUNCTIONS: Folder that contains the file functions.py, with all the auxiliary functions used in this project.

1 - Cleaning and Analysis

I acquire the data from the dataset CSV and from web scraping.
I clean the data and generate 2 new datasets to work with.
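
A minimal sketch of what the cleaning step could look like, using only the standard library. The column names (title, release_date, genres), the stringified genre list and the helper names clean_row and clean_rows are assumptions made for illustration; they are not taken from functions.py.

```python
import ast
import csv
import io


def clean_row(row):
    """Return a cleaned copy of one raw CSV row, or None if it is unusable."""
    title = (row.get("title") or "").strip()
    release_date = (row.get("release_date") or "").strip()
    # Drop rows without a title or without a four-digit year.
    if not title or len(release_date) < 4 or not release_date[:4].isdigit():
        return None
    # Assumption: genres are stored as a stringified list of dicts.
    try:
        raw_genres = ast.literal_eval(row.get("genres") or "[]")
    except (ValueError, SyntaxError):
        raw_genres = []
    if not isinstance(raw_genres, list):
        raw_genres = []
    genres = [
        str(g["name"]).lower()
        for g in raw_genres
        if isinstance(g, dict) and "name" in g
    ]
    return {"title": title, "year": int(release_date[:4]), "genres": genres}


def clean_rows(rows):
    """Clean every row and keep only the usable ones."""
    cleaned = (clean_row(row) for row in rows)
    return [row for row in cleaned if row is not None]


# Small in-memory sample so the sketch runs without any file in INPUT.
SAMPLE_CSV = '''title,release_date,genres
 Toy Story ,1995-10-30,"[{'id': 16, 'name': 'Animation'}, {'id': 35, 'name': 'Comedy'}]"
,1995-12-15,"[{'id': 18, 'name': 'Drama'}]"
Heat,1995-12-15,"[{'id': 28, 'name': 'Action'}]"
'''

movies = clean_rows(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for movie in movies:
    print(movie)
```

The row without a title is dropped, titles are stripped of surrounding whitespace, and genre names are lowercased so that later lookups do not depend on capitalization.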

2 - Data Processing

I create the functions that explore the datasets with the given parameters.
They return movie recommendations according to the parameters and the metadata of the films.
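
As a rough illustration of this step, the sketch below filters a list of movie records by genre and year and returns the best-rated match. The record layout, the score field and the function name recommend are hypothetical, and the sample records are placeholders rather than real data.

```python
def recommend(movies, genre, year, top_n=1):
    """Return up to top_n movies of the given genre and year, best score first."""
    wanted = genre.strip().lower()
    matches = [
        movie
        for movie in movies
        if movie["year"] == year and wanted in movie["genres"]
    ]
    return sorted(matches, key=lambda movie: movie["score"], reverse=True)[:top_n]


# Placeholder records, only to show the shape of the input.
SAMPLE_MOVIES = [
    {"title": "Movie A", "year": 1995, "genres": ["comedy"], "score": 71},
    {"title": "Movie B", "year": 1995, "genres": ["comedy", "drama"], "score": 88},
    {"title": "Movie C", "year": 1996, "genres": ["comedy"], "score": 95},
]

for movie in recommend(SAMPLE_MOVIES, "Comedy", 1995):
    print(movie["title"], movie["score"])
```

Normalizing the genre with strip() and lower() before comparing keeps the lookup tolerant of input such as " Comedy ", which is one place where string operations and list comprehensions fit naturally.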

3 - Start the Query

Run the main.py file with the 2 parameters, 'Year' and 'Genre'.
The program shows the movie recommendations.

To run the program, the user needs to pass two arguments:

A genre: -- or -s
Category of fast food company as: --fastfoodtype or -f
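
For reference, a command-line interface for the 'Genre' and 'Year' parameters could be sketched with argparse as follows. The flag names --genre/-g and --year/-y are hypothetical and chosen only for this example; they are not necessarily the flags defined in main.py.

```python
import argparse


def build_parser():
    """Build a parser for the two query parameters (hypothetical flag names)."""
    parser = argparse.ArgumentParser(
        description="Recommend a movie for a given genre and year."
    )
    parser.add_argument("-g", "--genre", required=True, help="movie genre to search for")
    parser.add_argument("-y", "--year", required=True, type=int, help="release year")
    return parser


def parse_args(argv=None):
    """Parse argv, or sys.argv[1:] when argv is None."""
    return build_parser().parse_args(argv)


# Parsing an explicit list keeps the sketch runnable without a real command line.
args = parse_args(["--genre", "Comedy", "--year", "1995"])
print(args.genre, args.year)
```

With a parser like this, an invocation would take the form python main.py --genre Comedy --year 1995, and argparse would reject a missing or non-numeric year before the pipeline starts.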


AlexMndzF/project-pipelines
