
Task Recipes

In Task Recipes, we build a dynamic data pipeline API by ordering tasks in sequence. A task is a unit of operation within the pipeline. Every task must extend 'pipeline.task.Task' and override its execute method.
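The exact signature of execute is not shown in this README. The minimal sketch below assumes it receives a SparkSession plus the DataFrame produced by the previous task and returns a DataFrame for the next one; the UppercaseNamesTask name and the 'name' column are purely illustrative.

```python
# Sketch of a custom task, under the assumptions stated above.
# 'UppercaseNamesTask' and the 'name' column are hypothetical examples.
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F

from pipeline.task import Task


class UppercaseNamesTask(Task):
    def execute(self, spark: SparkSession, df: DataFrame) -> DataFrame:
        # Apply a simple transformation and hand the result to the next task.
        return df.withColumn("name", F.upper(F.col("name")))
```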

The tasks are passed as ordered arguments to the 'pipeline.executor.py' script.

An example argument list:

<APP_NAME> HTTP_READ <HTTP/HTTPS path> RECIPES_TRANSFORM WRITE json file:///Users/hasif/Learning/recipe/ BEEF_RECIPE_TRANSFORM PARTITION_WRITE hdfs://

  • <APP_NAME> - Application name.
  • HTTP_READ <HTTP/HTTPS path> - Reads data from the HTTP/HTTPS URL; the data type (e.g. json) is also passed as an argument.
  • RECIPES_TRANSFORM - Applies the transformation given in this exercise to the entire dataset.
  • WRITE file://<LOCAL_PATH> - Writes the result of the previous transformation to the given local path in the specified format.
  • BEEF_RECIPE_TRANSFORM - Applies the transformation given in this exercise to the dataset after filtering the 'ingredients' column for the keyword 'beef' (see the sketch after this list).
  • PARTITION_WRITE hdfs:// - Writes to HDFS in the specified format and location.
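The implementation of BEEF_RECIPE_TRANSFORM is not shown in this README. A minimal sketch of the filtering step it describes, assuming a DataFrame with a string column named 'ingredients' and a case-insensitive substring match (both assumptions, not taken from the repository):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("beef_filter_sketch").getOrCreate()

# Toy data standing in for the recipes dataset; only the 'ingredients'
# column matters for this sketch.
recipes_df = spark.createDataFrame(
    [("Chili", "beef, beans, tomato"), ("Salad", "lettuce, tomato")],
    ["name", "ingredients"],
)

# Case-insensitive substring match on 'ingredients' -- the matching
# strategy is an assumption, not code from the repository.
beef_df = recipes_df.filter(F.lower(F.col("ingredients")).contains("beef"))
beef_df.show()
```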

For example:

python executor.py ingest_recipe HTTP_READ json <INPUT_PATH> WRITE json file://<LOCAL_PATH>

python executor.py ingest_recipe HTTP_READ <INPUT_PATH> BEEF_RECIPE_TRANSFORM PARTITION_WRITE json hdfs://<HDFS_PATH> difficulty

python executor.py ingest_recipe HTTP_READ json <INPUT_PATH> RECIPES_TRANSFORM WRITE json file://<LOCAL_PATH> BEEF_RECIPE_TRANSFORM PARTITION_WRITE json hdfs://<HDFS_PATH> difficulty

The keywords associated with each task are listed below; a sketch of how a keyword might be resolved to its task class follows the list.

  • "HDFS_READ": "pipeline.task.Reader",
  • "HTTP_READ": "pipeline.task.HttpReader",
  • "WRITE": "pipeline.task.Writer",
  • "PARTITION_WRITE": "pipeline.task.PartitionWriter",
  • "FILE_WRITE": "pipeline.task.Writer",
  • "RECIPES_TRANSFORM": "pipeline.task.TransformRecipes",
  • "BEEF_RECIPE_TRANSFORM": "pipeline.task.BeefRecipes",
  • "EMAIL": "pipeline.task.EmailTask",
  • "SLACK": "pipeline.task.SlackTask",
  • "KAFKA": "pipeline.task.KafkaTask"
