ploomber

ploomber is a workflow management tool that accelerates experimentation and facilitates building production systems. It does so by providing incremental builds, interactive execution, and tools to inspect pipelines, and by facilitating testing and reducing boilerplate code.

Install

If you want to try out everything ploomber has to offer:

pip install ploomber[all]

Note that installing everything will attempt to install pygraphviz, which depends on graphviz, so you have to install graphviz first:

# if you are using conda (recommended)
conda install graphviz
# if you are using homebrew
brew install graphviz
# for other systems, see: https://www.graphviz.org/download/

If you want to start with the minimal set of dependencies:

pip install ploomber

Example

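import pandas as pd

from ploomber import DAG
from ploomber.products import File
from ploomber.tasks import PythonCallable, SQLDump
from ploomber.clients import SQLAlchemyClient

# NOTE: tmp_dir and uri are not defined in this snippet; tmp_dir is assumed to
# be a pathlib.Path to an output directory and uri a SQLAlchemy database URI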

dag = DAG()

# the first task dumps data from the db to the local filesystem
task_dump = SQLDump('SELECT * FROM example',
                    File(tmp_dir / 'example.csv'),
                    dag,
                    name='dump',
                    client=SQLAlchemyClient(uri),
                    chunksize=None)

def _add_one(upstream, product):
    """Add one to column a
    """
    df = pd.read_csv(str(upstream['dump']))
    df['a'] = df['a'] + 1
    df.to_csv(str(product), index=False)

# we convert the Python function to a Task
task_add_one = PythonCallable(_add_one,
                              File(tmp_dir / 'add_one.csv'),
                              dag,
                              name='add_one')

# declare how tasks relate to each other
task_dump >> task_add_one

# run the pipeline - incremental builds: ploomber will keep track of each
# task's source code and will only execute outdated tasks in the next run
dag.build()

# a DAG also serves as a tool to interact with your pipeline, for example,
# status will return a summary table
dag.status()
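
To make the idea of incremental builds more concrete, here is a minimal, standalone sketch of how a tool could decide which tasks are outdated by fingerprinting each task's source code. This is not ploomber's implementation: MiniTask, fingerprint, build and the metadata dictionary are hypothetical names invented for this illustration, and the sketch assumes tasks are passed in dependency order.

```python
import hashlib


def fingerprint(source):
    """Return a stable hash of a task's source code."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


class MiniTask:
    """Hypothetical stand-in for a task (not a ploomber class)."""

    def __init__(self, name, source):
        self.name = name
        self.source = source
        self.upstream = []

    def __rshift__(self, other):
        # `a >> b` records that b depends on a; returning b allows chaining
        other.upstream.append(self)
        return other


def build(tasks, metadata):
    """Execute only outdated tasks and return their names.

    tasks must be listed in dependency order. metadata maps a task name to
    the fingerprint stored by the previous build.
    """
    executed = []
    for task in tasks:
        current = fingerprint(task.source)
        source_changed = metadata.get(task.name) != current
        upstream_ran = any(up.name in executed for up in task.upstream)
        if source_changed or upstream_ran:
            # a real tool would run the task here; this sketch only records it
            executed.append(task.name)
            metadata[task.name] = current
    return executed


dump = MiniTask("dump", "SELECT * FROM example")
add_one = MiniTask("add_one", "df['a'] = df['a'] + 1")
dump >> add_one

metadata = {}
first = build([dump, add_one], metadata)   # nothing stored yet: both run
second = build([dump, add_one], metadata)  # nothing changed: nothing runs

add_one.source = "df['a'] = df['a'] + 2"
third = build([dump, add_one], metadata)   # only the edited task runs

dump.source = "SELECT a FROM example"
fourth = build([dump, add_one], metadata)  # edited task and its downstream run
```

In this sketch a task is considered outdated when its own source changed or when any of its upstream tasks was executed in the same build, which is why editing the first task also triggers the second one.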
