Smarthome monitoring

Checks are divided in 3 categories:

Generic - these tests are performed by the monitor node to check the whole system health;
Driver - these tests are implemented by driver authors using the Smarthome-NeXus library and are run by the monitor node for each node that has the corresponding driver enabled according to config;
System - these tests are implemented and run on the node, logged to the remote location, and these logs are monitored by the monitor node.

The monitor node will detect running nodes by listing the nodes connected to the SSH tunnel. (Are we fine with this?)

System tests format

System tests log to a timestamped synced log file, named checks, one entry per line.

System tests will advertise the checks running at start, every time a new logfile is created, and when checks change. This is the format:

{'type': 'advert', 'checks': [
	{'name': 'xxx', 'period': 100},
	...
]}

The monitor will assume that a node not advertising, or not logging a check result for more than its period is failing.

The check result format is:

{'type': 'result', 'name': 'xxx',
 'status': 'GREEN', 'note': '',
 'timestamp': 123456789}

Code locations

Generic checks are implemented in the monitor node itself; Driver checks are implemented in smarthome-checks, in the monitors folder; System checks are implemented by the node itself.

Smarthome-NeXus monitoring library

Invocation and module layout

The monitor module should expose one single callable:

check(hub_id)

-> [{'name': 'driver/test',
     'status': nexus.GREEN/YELLOW/RED,
     'note': str(), 'timestamp': int()}, ...]

(Note: the return value might be later changed to a nexus.Result instance)

And a global:

PERIOD = 60

The monitor infrastructure will take care of calling check() every PERIOD seconds and registering the result.

For development, a helper is provided, that will turn the module into a command-line tool:

if __name__ == '__main__':
	nexus.command_line()

The command line usage is as follows:

Usage: module.py [--config=<path>] <hub-id>

Options:
	--config=<path>  The path to Smarthome-NeXus JSON config
	                 [default: ~/.smarthome.json]

So here is a recommended skeleton of a monitor module:

#! /usr/bin/env python

import nexus as nx
import time

__all__ = ['PERIOD', 'check']

PERIOD = 60

def check(hub_id):
	return [{
		'name': 'helloworld/xxx',
		'status': nx.GREEN,
		'note': '',
		'timestamp': time.time()
	}]
	
if __name__ == '__main__':
	nx.command_line()

Invoking it will work like this:

$ helloworld.py demo-hub
[2014-06-06 21:31:19,392] [INFO] nexus: The check() function would run every... 1 second(s)
[2014-06-06 21:31:19,392] [INFO] nexus: Running check('xxx')
[2014-06-06 19:31:19 -00:00] helloworld/xxx: GREEN ("")

Configuration

The library needs to be configured to fetch the data for the tests. You don't need to worry about this when running inside the monitor infrastructure, but you'll need to this yourself to run the dev command-line tool.

A JSON configuration file looks like this (without comments):

{
	"data_location": "admin@1.2.3.4:smarthome/",
	"ssh_key": "~/.smarthome-services", // optional
	"loglevel": "DEBUG"
}

(Note: the monitor infrastructure will need more parameters)

API

In all calls, if hub_id is not specified it will default to the value check() was called with.

fetch_data(driver_name, n=100, hub_id=None)

fetch_data will return the n most recent lines from data files with the passed driver_name (following the common naming convention).

Returns None if not found.

fetch_logs(driver_name, n=100, hub_id=None)

fetch_logs will return the n most recent lines from log files with the passed driver_name (following the common naming convention).

Returns None if not found.

data_timestamp(driver_name, hub_id=None)

data_timestamp will return the modified timestamp of the most recent data file with the passed driver_name (following the common naming convention).

Returns None if not found.

logs_timestamp(driver_name, hub_id=None)

logs_timestamp will return the modified timestamp of the most recent log file with the passed driver_name (following the common naming convention).

Returns None if not found.

list_hub_ids()

list_hub_ids will return a list of all the Hub IDs that ever logged.

Name		Name	Last commit message	Last commit date
Latest commit History 60 Commits
monitor_node		monitor_node
nexus		nexus
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt
requirements_node.txt		requirements_node.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

monitor_node

monitor_node

nexus

nexus

.gitignore

.gitignore

README.md

README.md

requirements.txt

requirements.txt

requirements_node.txt

requirements_node.txt

setup.py

setup.py

Repository files navigation

Smarthome monitoring

System tests format

Code locations

Smarthome-NeXus monitoring library

Invocation and module layout

Configuration

API

fetch_data(driver_name, n=100, hub_id=None)

fetch_logs(driver_name, n=100, hub_id=None)

data_timestamp(driver_name, hub_id=None)

logs_timestamp(driver_name, hub_id=None)

list_hub_ids()

About

Releases

Packages

Languages

smart-home/smarthome-monitor

Folders and files

Latest commit

History

Repository files navigation

Smarthome monitoring

System tests format

Code locations

Smarthome-NeXus monitoring library

Invocation and module layout

Configuration

API

fetch_data(driver_name, n=100, hub_id=None)

fetch_logs(driver_name, n=100, hub_id=None)

data_timestamp(driver_name, hub_id=None)

logs_timestamp(driver_name, hub_id=None)

list_hub_ids()

About

Resources

Stars

Watchers

Forks

Languages