Skip to content

mcmartins/broccoli

Repository files navigation

Broccoli

Build Status Code Climate Issue Count

logo

README

Broccoli is a concurrent Job executor. It allows an user to save time and money when running concurrent software on HPC.

Assumptions

  1. A user has a long process to run
    1a. A user has a sequence of processes to run
  2. A user runs it on a rented platform
    2a. Costs are based on a per resources usage

Use Cases

This section will introduce the use cases for Broccoli.

Run 2 parallel Tasks

A user is writing a new paper on a new theory. He needs to test his theory by finding a proof or counter proof. To do this, the user runs prover9 and mace4 in parallel.
He can test his theory within the prover9 and try to find a counter proof on mace4 at the same time, in parallel, by executing the two proceses. The processing may take days running and uses a fair amount of resources, so he is using a rented cloud HPC infrastructure. By taking an uncertain amount of time and in order to save resources, and thus money, the user configures broccoli for running these tasks in parallel. Broccoli will ensure that when a proof or counter proof is found, e.g. the first Task that finishes with success, all tasks started for the study are cancelled/killed/stoped in order to save computing resources. If no process reaches a conclusion, it will stop after a configurable timeout parameter. The user writes the inputs for prover9 and mace4 and configures broccoli.
The broccoli input looks like:

{
  "jobName": "MyFirstBroccoli",
  "jobDescription": "Search for a proof or counter proof for my new theory.",
  "workingDir": "/tmp/",
  "timeout": 259200,
  "tasks": [
    {
      "taskName": "T1",
      "taskDescription": "Generate Consequences from a predefined Theory.",
      "wait": false,
      "failTolerant": false,
      "commands": [
        "prover9 -f Input.in > Prover.out",
        "mace4 -f Input.in > Mace.out"
      ],
      "children": [
        {
          "taskName": "T1.1.1",
          "taskDescription": "For each Prover9/Mace4 output check for proofs found and notify PI.",
          "wait": false,
          "commands": [
            "for file in `grep -rl PROVED --exclude='*.json' --exclude='*.log' .`; do sendmail user@example.com  < $file; done"
          ]
        }
      ]
    }
  ]
}

API

Description

This tool handles Jobs. A Job is constituted by a set of Tasks. Each Task has Children Tasks and so on. Top Level Tasks will run in parallel and Children Tasks will follow as soon parent Tasks finishes. The processing finishes when one top level Task finishes executing (including, if any, all its Children Tasks). Each Task might need preparation, so Tasks are resolved into Sub Tasks before execution.

This tool accepts as input a JSON String/File, parses it and starts the processing.

The input JSON includes the following information:

  • A Job to execute:
  1. Name to identify the Job
  2. Description to best describe the Job
  3. Working directory where the Job will run, and all the output will be stored (if not specified otherwise)
  4. Timeout in seconds to kill the Job if exceeds
  5. List of Tasks that constitute the Job
  • For each Task:
  1. Name to identify the Task
  2. Description to best describe the Task
  3. Preparation step where the user can define what to do before running the commands in the form of Sub Tasks
  4. Command the actual command to execute and accomplish the Task
  5. List of Children Tasks circular dependency for other Tasks

Example Input

The input follows the following JSON Schema.
An example of an input would be something like:

{
  "jobName": "Consequences",
  "jobDescription": "Search for new Theories.",
  "workingDir": "/tmp/autonomous",
  "timeout": 10000,
  "tasks": [
    {
      "taskName": "T1",
      "taskDescription": "Generate Consequences from a predefined Theory.",
      "wait": false,
      "failTolerant": true,
      "commands": [
        "../LADR-2009-11A/bin/prover9 -f Theory.in > Consequences.out"
      ],
      "children": [
        {
          "taskName": "T1.1",
          "taskDescription": "For each consequence found, replace in original Theory and Run prover/mace to find new Theories.",
          "wait": false,
          "failTolerant": true,
          "preparation": {
            "filterFile": "Consequences.out",
            "pattern": "^given.*?T.*?:\\s*\\d+([^.\\#]*.)",
            "writeFile": "New-Theory-*.in",
            "placeholder": "$placeholder",
            "copy": true
          },
          "commands": [
            "../LADR-2009-11A/bin/prover9 -f $file > $file_prover.out",
            "../LADR-2009-11A/bin/mace4 -f $file > $file_mace.out"
          ],
          "children": [
            {
              "taskName": "T1.1.1",
              "taskDescription": "For each Prover9/Mace4 output check for proofs found.",
              "wait": false,
              "commands": [
                "grep -rl PROVED --exclude='*.json' --exclude='*.log' . && echo 'New Theories Found.' || echo 'No new Theories Found.'"
              ]
            }
          ]
        }
      ]
    }
  ]
}

This input will look for equivalent theories by replacing Concequences in the original Theory. It will run 2 commands in parallel and if a proof or counterproof is found, the other command will be killed. It will do this for each Consequence.

Integration with WebGAP

A node.js module should be produced in order to wrap the python tool. This tool will be used as an asynchronous job executor. The tool will be invoked over the WebGAP API and will run inside docker containers.

Diagrams

Application Flow Diagram:

alt text

Job Diagram

alt text

Tests

These Test Cases can be viewed on Tests Source / Travis CI

Online Input Editor

WebGAP will have his own input editor. While is not available, the input for broccoli can be generated here: Input Form

How to setup environment

Using a RHEL based system, CentOS 7, execute the following steps:

# assuming we're going to use the user home as working directory to download and install everything
# download prover9 software
wget https://www.cs.unm.edu/~mccune/prover9/download/LADR-2009-11A.tar.gz
# unpack
tar -xzf LADR-2009-11A.tar.gz
# remove tar
rm LADR-2009-11A.tar.gz
# go to folder and compile
cd LADR-2009-11A/
make all
# as root add prover9 to path (change path accordingly)
su -
echo 'pathmunge /home/user/LADR-2009-11A/bin' > /etc/profile.d/LADR.sh
chmod +x /etc/profile.d/LADR.sh
# go to home and download broccoli
su - user
git clone https://github.com/mcmartins/broccoli.git

Install & Run

Install module

sudo python setup.py build install

Run module

python -m broccoli -v -i /path/to/<input.json>
# or
python -m broccoli -v -i '<JSON>'

Future Tasks

  • Make available a Test Environment Package (it can be a docker image)
  • Support piping Tasks by stdout. At the moment only Tasks generating files for Guidance Tasks is implemented.

LICENSE

Apache License, Version 2.0

About

Concurrent Job Executor for HPC

Resources

License

Stars

Watchers

Forks

Packages

No packages published