Python-based pipeline management software for clusters that makes running recursive and dynamically scheduled computations straightforward. So far it works with Grid Engine, LSF, Parasol, and on multi-core machines.

benedictpaten/toil

Toil is a massively scalable pipeline management system, written entirely in Python. Toil runs as easily on a laptop as it does on a bare-metal cluster or in the cloud, thanks to support for many batch systems, including Grid Engine, Parasol, and a custom Mesos framework.

Toil is robust and designed to run in highly unreliable computing environments such as Amazon's Spot Market. To that end, Toil does not rely on a distributed file system. Instead, it abstracts a pipeline's global storage as a JobStore that can reside either locally or on AWS. As a result, a workflow can be resumed even after every node in the cluster shuts down unexpectedly and all local data is lost.
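To illustrate why persisting workflow state outside the worker nodes makes a pipeline resumable, here is a minimal sketch. It does not use Toil and is not how Toil's JobStore is implemented: ToyJobStore, run_pipeline, and the fail_at parameter are hypothetical names invented for this example, and a local JSON file stands in for the persistent store.

```python
# Toy model of resuming from a persistent store. This is NOT Toil's JobStore;
# ToyJobStore and run_pipeline are hypothetical names used for illustration.
import json
import os
import tempfile


class ToyJobStore(object):
    """Records finished step names and results in a JSON file."""

    def __init__(self, path):
        self.path = path
        self.done = {}
        # Reload whatever an earlier, interrupted run managed to record.
        if os.path.exists(path):
            with open(path) as f:
                self.done = json.load(f)

    def record(self, name, result):
        self.done[name] = result
        with open(self.path, "w") as f:
            json.dump(self.done, f)


def run_pipeline(store, steps, fail_at=None):
    """Run steps in order, skipping any already recorded in the store."""
    executed = []
    for name, fn in steps:
        if name in store.done:
            continue  # finished before the interruption
        if name == fail_at:
            raise RuntimeError("simulated node loss during " + name)
        store.record(name, fn())
        executed.append(name)
    return executed


steps = [("a", lambda: 1), ("b", lambda: 2), ("c", lambda: 3)]
path = os.path.join(tempfile.mkdtemp(), "jobstore.json")

try:
    run_pipeline(ToyJobStore(path), steps, fail_at="b")
except RuntimeError:
    pass  # the simulated cluster went down after step "a"

# A fresh store object reloads the file and resumes with the unfinished steps.
resumed = run_pipeline(ToyJobStore(path), steps)
print(resumed)
```

Because the only state that matters lives in the store, the second run needs nothing from the first process. In this sketch the store is a local file; the README's point is that the same idea holds when the store lives on AWS and every node's local disk is lost.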

Writing a Toil script requires only basic knowledge of Python. Toil "Jobs" are the elemental unit of work in a Toil workflow. A Job can dynamically spawn other Jobs as needed, which gives intuitive and powerful control over the pipeline.
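The idea of a Job spawning other Jobs at run time can be sketched in plain Python. The example below is a toy model, not Toil's API: ToyJob, add_child, run_workflow, and split are hypothetical names chosen for illustration, and the "scheduler" is a simple in-process queue.

```python
# Toy model of dynamic job spawning. This is NOT Toil's API: ToyJob and
# run_workflow are hypothetical names used only for illustration.
from collections import deque


class ToyJob(object):
    """A unit of work that may create further jobs while it runs."""

    def __init__(self, fn, *args):
        self.fn = fn
        self.args = args
        self.children = []

    def add_child(self, fn, *args):
        # Called from inside a running job to schedule follow-up work.
        child = ToyJob(fn, *args)
        self.children.append(child)
        return child

    def run(self):
        return self.fn(self, *self.args)


def run_workflow(root):
    """Run the root job, then every job it (transitively) spawns."""
    results = []
    queue = deque([root])
    while queue:
        job = queue.popleft()
        results.append(job.run())
        queue.extend(job.children)  # children only exist after run()
    return results


def split(job, lo, hi):
    # Decide at run time: do the work directly, or spawn two smaller jobs.
    if hi - lo <= 2:
        return sum(range(lo, hi))
    mid = (lo + hi) // 2
    job.add_child(split, lo, mid)
    job.add_child(split, mid, hi)
    return 0


total = sum(run_workflow(ToyJob(split, 0, 8)))
print(total)
```

The shape of the workflow is not declared up front. Each job decides, based on its own inputs, whether more jobs are needed, which is what makes recursive computations natural to express.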

Prerequisites

Python 2.5 or later, but earlier than 3.0

pip 7.x

Apache Mesos 0.22.1, if using the Mesos batch system. On OS X, Mesos can be installed with Homebrew:

brew install mesos

Git, if cloning from the Toil GitHub repository

Installation

Toil uses the setuptools extras syntax for the dependencies of optional features, such as the Mesos batch system and the AWS JobStore. To install Toil with these extras, list the features you want when installing with pip:

pip install toil[aws,mesos]

Building & Testing

This step is only required if you cloned Toil from Git. Running:

make develop

will install Toil in editable mode. You can also specify extras to use in develop mode as follows:

make develop extras=[mesos,aws]

To run the tests, cd into the toil root directory and run:

make test

Finally, running:

make

by itself will print help for testing and building.
