
marian-babik/python-nap


Python library to write Nagios (Monitoring) Plugins (NAP) with the following features:

  • Supports writing both active and passive checks
  • Combination of active and multiple passive checks via sequences
  • Passive check status sent via command file
  • Output formats for nagios, passive (command pipe) or check_mk (local check)
  • Wraps sys.stdout and sys.stderr to ensure the correct output format, with status and summary on the first line regardless of exceptions, code execution flow, etc. (see the sketch after this list)
  • Supports performance data (also for passive metrics)
  • Auto-defines basic command line arguments (-H, -v, -d, -w, -c, etc.)
  • Compatible with Python 2.7 and 3.6+
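
NAP handles the stdout/stderr wrapping internally. Purely to illustrate the technique (a minimal Python 3 sketch, not NAP's actual code), buffering stdout lets the plugin body print freely while the "STATUS - summary" line is still emitted first:

import io
import sys

STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

def run_wrapped(metric):
    # buffer everything the metric prints, then emit status/summary first
    buf, real_stdout = io.StringIO(), sys.stdout
    sys.stdout = buf
    status, summary = 3, "metric did not set a status"  # default to UNKNOWN
    try:
        status, summary = metric()
    except Exception as exc:
        status, summary = 3, "exception: %s" % exc
    finally:
        sys.stdout = real_stdout  # always restore, even on exceptions
    real_stdout.write("%s - %s\n" % (STATES.get(status, "UNKNOWN"), summary))
    real_stdout.write(buf.getvalue())  # detailed output follows the first line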

Synopsis:

import nap
import nap.core

app = nap.core.Plugin()
app.add_argument("--test", help="define additional arguments (using argparse syntax)")

@app.metric()
def test_metric(args, io):
    # code to take the measurement
    if args.test: # accessing arguments
        pass
        
    ret_code, result = nap.core.sub_process("echo \"detailed output\"", timeout=600)
    if ret_code == 0:
        io.status = nap.OK  # setting exit status
        io.summary = "no issues"  # setting summary line
    else:
        io.status = nap.CRITICAL
        io.summary = "test failed with %d" % ret_code
    
    print(result)  # detailed output via print
    io.write("another detailed output")  # or directly to buffer

    io.add_perf_data("cpu", 0.24)  # performance data
    io.add_perf_data("mem", 0.87, uom="%")
    
    # plugin status determined from io.status, return statement not needed

if __name__ == '__main__':
    app.run()

Sample run will output the following:
$ sample_plugin.py --help
usage: sample_plugin.py [-h] [--version] [-H HOSTNAME] [-w WARNING] [-c CRITICAL] [-d]
               [-p PREFIX] [-s SUFFIX] [-C COMMAND] [--dry-run] [-o OUTPUT]
               [--test TEST]

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -H HOSTNAME, --hostname HOSTNAME
                        Host name, IP Address, or unix socket (must be an
                        absolute path)
  -w WARNING, --warning WARNING
                        Offset to result in warning status
  -c CRITICAL, --critical CRITICAL
                        Offset to result in critical status
  -d, --debug           Specify debugging mode
  -p PREFIX, --prefix PREFIX
                        Text to prepend to every metric name
  -s SUFFIX, --suffix SUFFIX
                        Text to append to every metric name
  -C COMMAND, --command COMMAND
                        Nagios command pipe for submitting passive results
  --dry-run             Dry run, will not execute commands and submit passive
                        results
  -o OUTPUT, --output OUTPUT
                        Plugin output format; valid options are nagios or
                        check_mk (defaults to nagios)
  --test TEST           additional argument


$ python sample_plugin.py
OK - no issues | cpu=0.24;;;; mem=0.87%;;;;
detailed output
another detailed output
$ python sample_plugin.py -o check_mk 
0 test_metric cpu=0.24;;;;|mem=0.87%;;;;| no issues
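
The trailing semicolons in cpu=0.24;;;; are the standard Nagios perfdata fields label=value[UOM];warn;crit;min;max, left empty here; the second line follows the check_mk local-check format (status, item name, perfdata, output). The auto-defined -w/-c options are ordinary argparse arguments, so a metric can apply its own threshold logic, e.g. (a hypothetical sketch extending the synopsis; assumes nap.WARNING exists alongside nap.OK and nap.CRITICAL):

@app.metric()
def load_metric(args, io):
    load = 0.42  # hypothetical measurement
    io.add_perf_data("load", load)
    # assumption: -w/-c arrive as strings and are None when not given
    if args.critical is not None and load >= float(args.critical):
        io.set_status(nap.CRITICAL, "load %.2f above critical threshold" % load)
    elif args.warning is not None and load >= float(args.warning):
        io.set_status(nap.WARNING, "load %.2f above warning threshold" % load)
    else:
        io.set_status(nap.OK, "load %.2f within thresholds" % load)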

Writing passive plugins that report results via the Nagios command pipe is easy, e.g.:

@app.metric(passive=True)
def test_metric(args, io):
    io.set_status(nap.OK, "summary line")
    
$ python sample_plugin.py --dry-run -d
Dec 14 11:58:57 DEBUG core[98727]: Call sequence: [(<function test_metric at 0x106a00050>, 'test_metric', True)] 
Dec 14 11:58:57 DEBUG core[98727]:    Function call: test_metric
Dec 14 11:58:57 INFO core[98727]: [1481713137] PROCESS_SERVICE_CHECK_RESULT;localhost;test_metric;0;no issues | cpu=0.24;;;; mem=0.87%;;;; 
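
Without --dry-run, the line above is written to the Nagios command pipe given via -C/--command. The PROCESS_SERVICE_CHECK_RESULT format itself is the standard Nagios external-command syntax; a minimal hand-rolled equivalent (a hypothetical helper, not NAP's API) would be:

import time

def submit_passive(command_pipe, host, service, code, output):
    # standard Nagios external-command format:
    # [timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;plugin_output
    line = "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        int(time.time()), host, service, code, output)
    with open(command_pipe, "w") as pipe:
        pipe.write(line)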

A plugin that aggregates multiple metrics using a sequence of one active and multiple passive checks is also possible, e.g.:

import nap
import nap.core

app = nap.core.Plugin()
app.add_argument("--test", help="define additional arguments (using argparse syntax)")

@app.metric(seq=2, passive=True)
def test_m1(args, io):
    # test CPU
    io.set_status(nap.OK, "cpu ok")

@app.metric(seq=1, passive=True)
def test_m2(args, io):
    # test mem
    io.set_status(nap.CRITICAL, "out of memory")

@app.metric(seq=3, passive=False)
def test_all(args, io):
    print "active probe that aggregates m1 and m2"

    results = app.metric_results()

    statuses = [e[1] for e in results]
    print(statuses)
    if all(st == 0 for st in statuses):
        io.set_status(nap.OK, "All fine")
    elif 2 in statuses:
        io.set_status(nap.CRITICAL, "Not quite")
        
if __name__ == '__main__':
    app.run()

$ python sample_plugin.py --dry-run -d
Dec 16 09:50:08 DEBUG core[16183]: Call sequence: [(<function test_m2 at 0x10718c140>, 'test_m2', True), 
                                                   (<function test_m1 at 0x10718c1b8>, 'test_m1', True), 
                                                   (<function test_all at 0x10718c230>, 'test_all', False)] 
Dec 16 09:50:08 DEBUG core[16183]:    Function call: test_m2
Dec 16 09:50:08 INFO core[16183]: [1481878208] PROCESS_SERVICE_CHECK_RESULT;localhost;test_m2;2;out of memory
Dec 16 09:50:08 DEBUG core[16183]:    Function call: test_m1
Dec 16 09:50:08 INFO core[16183]: [1481878208] PROCESS_SERVICE_CHECK_RESULT;localhost;test_m1;0;cpu ok
Dec 16 09:50:08 DEBUG core[16183]:    Function call: test_all
CRITICAL - Not quite
active probe that aggregates m1 and m2
[2, 0]
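
The aggregation above tests statuses explicitly; a common alternative when combining many metrics is a "worst status wins" rule. One wrinkle: UNKNOWN (3) is numerically higher than CRITICAL (2), so a plain max() would rank "don't know" above a confirmed failure. A small standalone sketch (not part of NAP):

def worst_status(statuses):
    # Nagios exit codes: 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
    # Severity order used here: OK < UNKNOWN < WARNING < CRITICAL,
    # so a confirmed failure outranks an inconclusive result.
    severity = {0: 0, 3: 1, 1: 2, 2: 3}
    return max(statuses, key=lambda s: severity.get(s, 1))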

Batch processing of multiple passive metrics is also supported:

@app.metric(passive=False)
def test_all(args, io):
    print "active probe that publishes m1, m2, etc as passive "
    for m in ['m1', 'm2']:
        hostname = '{}_host'.format(m)
        service = m
        status = 'OK'
        sum_out = '{} summary message'.format(m)
        details = '{} details'.format(m)
        io.batch_passive_out(hostname, service, nap.core.get_code(status), sum_out, details)

    io.set_status(nap.OK, "All fine, all passive metrics published")
        
if __name__ == '__main__':
    app.run()
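
The nap.core.get_code call above presumably translates a Nagios status name into its exit code (an assumption about NAP's internals); the mapping itself is the standard Nagios one:

# standard Nagios status names and exit codes; a get_code-style helper
# (assumption about NAP's implementation) amounts to this lookup
NAGIOS_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}

def get_code(status_name):
    return NAGIOS_CODES[status_name.upper()]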

For more complex examples, see https://gitlab.cern.ch/etf/perfsonar-plugins, https://gitlab.cern.ch/etf/cmssam/-/blob/master/SiteTests/SE/cmssam_xrootd_endpnt.py, or https://gitlab.cern.ch/etf/jess/-/blob/master/bin/check_js