

For more information and the full documentation please visit

To chat with the author or other users (many of whom use COSMOS to build bioinformatics NGS workflows), use gitter:

Join the chat at


pip install cosmos-wfm


COSMOS is a workflow management system for Python. It allows you to efficiently program complex workflows of command line tools that automatically take advantage of a compute cluster, and provides a web dashboard to monitor, debug, and analyze your jobs. COSMOS scales on a traditional cluster with a shared filesystem that is managed by a scheduler such as LSF or GridEngine. It is especially powerful when combined with spot instances on Amazon Web Services and StarCluster.

COSMOS was designed to solve the problem of compute-intensive and complex scientific data pipelines. Its primary objectives are to provide a simple but flexible API for specifying complex job DAGs, to offer a way to resume modified or failed workflows, and to make debugging and provenance as easy as possible.


COSMOS was published as an Application Note in the journal Bioinformatics, but has evolved a lot since its original inception. If you use COSMOS for research, please cite its manuscript. This means a lot to the author.

Since the original publication, it has been re-written and open-sourced by the original author, in a collaboration between The Lab for Personalized Medicine at Harvard Medical School, the Wall Lab at Stanford University, and Invitae, a clinical genetic sequencing diagnostics laboratory.


  • Written in Python, which is easy to learn, powerful, and popular. A programmer with limited experience can begin writing COSMOS workflows right away.
  • Powerful syntax for the creation of complex and highly parallelized workflows.
  • Reusable recipes and definitions of tools and sub-workflows allow for DRY code.
  • Keeps track of workflows, job information, and resource utilization and provenance in an SQL database.
  • The ability to visualize all jobs and job dependencies as a convenient image.
  • A web dashboard to monitor and debug running workflows and to browse the history of all workflows.
  • Alter and resume failed workflows.
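
To make the idea of a job DAG that can be resumed after a failure more concrete, here is a minimal, self-contained sketch. It does not use COSMOS and is not how COSMOS is implemented; `run_dag`, the task names, and the simulated failure are all hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for a real scheduler.

```python
from graphlib import TopologicalSorter


def run_dag(dependencies, actions, completed=frozenset()):
    """Run tasks in dependency order, skipping uids already completed.

    dependencies maps each task uid to the set of uids it depends on.
    Returns (uids executed in this call, all completed uids, failed uid or None).
    """
    done = set(completed)
    executed = []
    for uid in TopologicalSorter(dependencies).static_order():
        if uid in done:
            continue  # finished in an earlier run, so it is not re-executed
        try:
            actions[uid]()
        except Exception:
            return executed, done, uid  # stop at the first failure
        done.add(uid)
        executed.append(uid)
    return executed, done, None


# Hypothetical three-step chain: trim -> align -> call
dependencies = {"trim": set(), "align": {"trim"}, "call": {"align"}}

attempts = {"align": 0}


def flaky_align():
    # Fails on the first attempt only, to simulate a transient job failure.
    attempts["align"] += 1
    if attempts["align"] == 1:
        raise RuntimeError("simulated failure")


actions = {"trim": lambda: None, "align": flaky_align, "call": lambda: None}

first_run, done, failed = run_dag(dependencies, actions)
second_run, done, failed_again = run_dag(dependencies, actions, completed=done)
```

In this sketch the first call stops at the failing task, and the second call picks up from there without repeating the work that already succeeded.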

Multi-platform Support

  • Support for DRMs such as SGE and LSF. DRMAA coming soon. Adding support for more DRMs is very straightforward.
  • Support for MySQL, PostgreSQL, Oracle, and SQLite by using the SQLAlchemy ORM.
  • Extremely well suited for cloud computing, especially when used in conjunction with AWS and StarCluster.

Bug Reports

Please use the GitHub Issue Tracker.



Some pretty big changes here, made during a hackathon at Invitae where a lot of feedback and contributions were received. Primarily, the API was simplified and made more intuitive. A new COSMOS primitive called a Dependency was created, which we have found extremely useful for generalizing subworkflow recipes. The API is now considered to be much more stable.

  • Renamed Execution -> Workflow
  • Reworked the Workflow.add_task() API; see its docstring.
  • Renamed task.tags -> task.params.
  • Require that a task's params contain only keywords that exist in the parameters of the task's function.
  • Require that a user specify a task uid (unique identifier), which is now used for resuming instead of a Task's params.
  • Created cosmos.api.Dependency, which provides a way to specify a parent and input at the same time.
  • Removed the one2one, one2many, etc. helpers. We found that they confused people more than they helped.
  • Various stability improvements to the drmaa jobmanager module.
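
The two new requirements on params and uids can be illustrated with a small standalone sketch. This is not the COSMOS implementation and does not import COSMOS: `ToyWorkflow`, `check_params`, and `word_count` are hypothetical names, and the check is built on the standard-library `inspect` module under the assumption that a task is an ordinary Python function.

```python
import inspect


def check_params(func, params):
    """Raise ValueError if params has a keyword that func does not accept."""
    sig_params = inspect.signature(func).parameters
    if any(p.kind is p.VAR_KEYWORD for p in sig_params.values()):
        return  # a **kwargs function accepts any keyword
    unknown = sorted(set(params) - set(sig_params))
    if unknown:
        raise ValueError(f"{func.__name__} does not accept: {unknown}")


class ToyWorkflow:
    """Hypothetical stand-in that keys tasks by uid rather than by params."""

    def __init__(self):
        self.tasks = {}

    def add_task(self, func, params, uid):
        check_params(func, params)
        if uid in self.tasks:
            # Same uid as a previously added task: reuse it, as on a resume.
            return self.tasks[uid], False
        task = {"func": func, "params": dict(params), "uid": uid}
        self.tasks[uid] = task
        return task, True


def word_count(in_txt, out_txt, chars=False):
    """Hypothetical task function that builds a shell command."""
    flag = "-c" if chars else "-w"
    return f"wc {flag} {in_txt} > {out_txt}"


wf = ToyWorkflow()
params = dict(in_txt="a.txt", out_txt="a.count")
task, created = wf.add_task(word_count, params, uid="a")
same_task, created_again = wf.add_task(word_count, params, uid="a")
command = word_count(**task["params"])
```

Keying on an explicit uid means a task can be recognized on a later run even if its params were edited in the meantime, which is the motivation the changelog gives for the change.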


Python Workflow and Pipeline Management System






