GitHub - tahanzania/orchest: Orchest is a tool for creating data science pipelines.

Website — Docs — Quickstart — Slack

Orchest is a web-based data science tool that works on top of your filesystem, so you can keep using your editor of choice. With Orchest you can focus on visually building and iterating on your pipeline ideas. Under the hood, Orchest runs a collection of containers to provide a scalable platform that can run on your laptop as well as on a large-scale cloud cluster.

Orchest lets you

Interactively build data science pipelines through its visual interface.
Automatically run your pipelines in parallel.
Develop your code in your favorite editor. Everything is filesystem based.
Tag the notebook cells you want to skip when running a pipeline. This is perfect for prototyping, as you do not have to maintain a perfectly clean notebook.
Run experiments by parametrizing your pipeline. Easily try out all of your modeling ideas.
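To illustrate what running a pipeline in parallel can mean, here is a minimal sketch that treats a pipeline as a dependency graph and runs every step as soon as the steps it depends on have finished. It is not Orchest's implementation. The function name `run_pipeline`, the `dependencies` mapping and the `run_step` callback are hypothetical, and the standard-library `graphlib` module requires Python 3.9 or newer.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_pipeline(dependencies, run_step, max_workers=4):
    """Run steps in dependency order, executing independent steps concurrently.

    `dependencies` maps each step name to the set of steps it depends on
    (hypothetical format, chosen for this sketch only).
    Returns the step names in the order in which they finished.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    finished = []
    running = {}  # future -> step name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every step whose dependencies are all done.
            for step in sorter.get_ready():
                running[pool.submit(run_step, step)] = step
            # Block until at least one running step completes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any exception from the step
                step = running.pop(future)
                finished.append(step)
                sorter.done(step)  # unlocks the steps that depend on it
    return finished
```

With a graph such as `{"load": set(), "clean": {"load"}, "features": {"load"}, "train": {"clean", "features"}}`, the `clean` and `features` steps become ready at the same time and can run side by side, while `train` waits for both.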

Installation

Requirements

Docker (tested on 19.03.9)

Linux/macOS/Windows (through WSL 2)

git clone https://github.com/orchest/orchest.git
cd orchest
./orchest.sh start

Note: on Windows, Docker should be configured to use WSL 2. Make sure you clone the repository inside the Linux environment. More info about Docker + WSL 2 can be found here: https://docs.docker.com/docker-for-windows/wsl/.
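Before running the start script, it can help to compare the installed Docker version against the one listed under Requirements. The sketch below only parses a version string such as the one printed by `docker --version`. The helper names are hypothetical, and treating 19.03.9 as a lower bound is an assumption, since the requirements only state that this version was tested.

```python
import re

# "tested on 19.03.9" per the requirements; using it as a minimum is an assumption.
TESTED_DOCKER_VERSION = (19, 3, 9)


def parse_docker_version(version_output):
    """Extract (major, minor, patch) from text like 'Docker version 19.03.9, build ...'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return tuple(int(part) for part in match.groups())


def at_least_tested_version(version_output):
    """Return True if the reported version is the tested version or newer."""
    return parse_docker_version(version_output) >= TESTED_DOCKER_VERSION
```

The input can be obtained with, for example, `subprocess.run(["docker", "--version"], capture_output=True, text=True).stdout`.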

Quickstart

Please refer to our docs for a more comprehensive quickstart tutorial.

Build your pipeline.

Each pipeline step executes a file (.ipynb, .py, .R, .sh) in a containerized environment.
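As a rough mental model of a step that executes a file in a containerized environment, the sketch below builds (but does not run) a `docker run` command for a step file based on its extension. This is an illustration under assumptions, not how Orchest launches steps internally: the runner table, the `/project` mount point and the function names are hypothetical, and the image name is supplied by the caller.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from step file extension to the command that executes it.
RUNNERS = {
    ".py": ["python"],
    ".R": ["Rscript"],
    ".sh": ["sh"],
    ".ipynb": ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace"],
}


def step_command(step_file):
    """Return the command that would execute a single step file."""
    suffix = PurePosixPath(step_file).suffix
    if suffix not in RUNNERS:
        raise ValueError(f"unsupported step file type: {step_file}")
    return RUNNERS[suffix] + [step_file]


def containerized_command(step_file, project_dir, image):
    """Wrap the step command in `docker run`, mounting the project directory."""
    return [
        "docker", "run", "--rm",
        "-v", f"{project_dir}:/project",  # share the project files with the container
        "-w", "/project",                 # run the step from the project root
        image,
    ] + step_command(step_file)
```

Because the project directory is mounted into the container, the step reads and writes the same files you edit on your own filesystem.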

Write your code.

Iteratively edit and run your code for each pipeline step with an interactive JupyterLab session.

Run your pipeline and see the results come in.

Outputs (both stdout and stderr) are directly viewable and stored on disk.
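The idea of keeping both output streams of a step on disk can be sketched as follows. This is a minimal illustration, not Orchest's logging code: the function name `run_and_log` and the `<name>.stdout.log` / `<name>.stderr.log` file naming are assumptions made for this example.

```python
import subprocess
from pathlib import Path


def run_and_log(name, command, log_dir):
    """Run a step command and store its stdout and stderr as separate files.

    Returns the exit code of the command.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    # Capture both streams as text instead of letting them go to the terminal.
    result = subprocess.run(command, capture_output=True, text=True)
    (log_dir / f"{name}.stdout.log").write_text(result.stdout)
    (log_dir / f"{name}.stderr.log").write_text(result.stderr)
    return result.returncode
```

Keeping the two streams in separate files makes it easy to check a failed step's error output without scrolling through its regular output.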

Contributing

Contributions are more than welcome! Please see our contributor guides for more details.

We love your feedback

We would love to hear what you think and potentially add features based on your ideas. Come chat with us on Slack.

