Skip to content

fpierfed/owl_pipe

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

42 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

STPipe


WARNING

STPipe (formerly known as owl_pipe) is being moved to

https://svn6.assembla.com/svn/jwst/trunk/stpipe/

please update your bookmarks.



STPipe is a light-weight Pipeline framework. It is implemented as a Python 
module called stpipe with no dependencies other than a reasonably recent 
version of Python (see below for further information) and ConfigObj (with the 
optional verify.py module). It is either called "OWL Pipe" or stpipe. The two 
names are used interchangeably in this document.

STPipe allows developers to construct Python-based scientific pipelines out of
reusable code. The framework provides two useful containers: a Pipeline object 
and a Step object.

A Pipeline is an ordered collection of Steps, where a Step is some python code
(that performs some part of the computation) and configuration data needed to 
run that code.  As a simple example from basic astronomical processing, a Step
could perform bias subtraction while a pipeline could be basic image detrending.

Pipelines in stpipe are built by creating an ASCII configuration file. That 
file is where, among other things, the Steps forming that pipeline (and their 
input and outputs) are defined.
Each Step has a separate configuration file where its parameters can be 
specified.

Both Pipeline and Step instances have access to some instance variables:
    1. log: a Logger instance from the Python logging module.
    2. qualified_name: fully qualified name, mostly used in logs. For Pipeline
       instances it is <system>.<name>. For Step instances it is 
       <pipeline qualified_name>.<name>.
    3. name: the Pipeline/Step name.
Pipeline instances have these additional instance variables:
    1. system: the name of the Pipeline system.
    2. log_level: the log threshold.
    3. local_logs: boolean determining whether to write logs to disk or STDOUT.
    4. steps: list of Step instances for that Pipeline.
    5. clipboard: a dictionary for passing data in-memory across Steps (see 
       below).
Step instances have these additional instance variables:
    1. pipeline: the Pipeline instance the Step belongs to.
    2. Any parameter defined in the parameters dictionary in the Step 
       configuration file (see below).
    3. Any input/output containers defined in the main Pipeline configuration 
       file (see below).

Clearly, STPipe stands on the shoulder of giants, among which NOAO NHPPS and 
LSST pex_harness. As such it borrows features and ideas from its predecessors
quite heavily.



Requirements
STPipe uses the "ConfigObj" module together with the optional Validation code.
It also requires a reasonably recent version of Python, likely 2.5 or later.
Python or later can be obtained from http://www.python.org
ConfigObj can be found at http://www.voidspace.org.uk/python/configobj.html but
remember to download both configobj.py and validate.py or the .zip file which 
contains both. Alternatively, download from PyPI (or by using easy_install).



Installation
Installation uses the standard python distutil module as detailed in the Python 
web site: http://docs.python.org/install/index.html#install-index

Basically, 
    shell> python setup.py install
should suffice.



Configuration File Format
The Pipeline and Step definition files are written in INI format with some
extensions from ConfigObj (e.g. the use of '#' to indicate comments, string 
interpolation etc.) 
(see http://www.voidspace.org.uk/python/configobj.html#the-config-file-format).



Pipeline Definition File Structure
Required high level sections:
    1. pipeline

Optional high level sections:
    None

The pipeline section
Required keys:
    1. name: the name of the Pipeline, can be anything and is only used in logs.
    2. system: the name of the pipeline system. Again it can be anything and
       is only used in logs to group pipelines together.
    3. steps: a list of sections, each one defining a Step (see below).
Optional keys:
    1. log_level: if present, it must be one of the log levels defined by the
       Python logging module. It defaults to "DEBUG".
    2. local_log_mode: if present and true, a log file is written to disk in the
       local directory as <system>.<name>.log. If false, log messages are 
       written to STDOUT. It defaults to false.

Sections in the steps list (the section name is the name of the Pipeline Step it
is referring to)
Required keys:
    1. config_file: the path to the Step configuration file (see below).
    2. python_class: the fully qualified name of the Python class to instantiate
       for the Step.
Optional keys:
    1. input: a comma separated, list of strings enclosed in double quotes. Each
       string is itself a comma separated two element list of variable name and
       corresponding object class name. The object class name can be omitted if
       type validation is not needed. The Step instance will have these instance
       variables defined and pre-populated (from the Pipeline clipboard).
    2. output: a comma separated, list of strings enclosed in double quotes. 
       Each string is itself a comma separated two element list of variable name
       and corresponding object class name. The object class name can be omitted
       if type validation is not needed. The Step instance will have these 
       instance variables defined but not pre-populated. It is the 
       responsibility of the Step code to assign values to them so that STPipe
       can pass them to subsequent Steps can consume them.



Step Configuration File Structure
Required high level sections:
    None

Optional high level sections:
    1. parameters

The parameters section
Required keys:
    None
Optional keys:
    1. Parameter name = parameter value. The Step instance will have access to 
       parameters by name directly (i.e. as instance variables of the 
       appropriate type, accessed using the usual Python self. notation).

About

Simple pipeline framework in Python

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages