- What is om?
- Installation
- Basic Usage
- Configuration File
- Contributing to om
- Hacking on om
- License
- Copyright
Collect disk usage, memory and CPU load info on remote boxes without having to install any software, as long as you can SSH into them.
- CPU load, disk and memory usage
- Supports email alerts when resources reach a critical state (e.g. nearly full disk, low free memory, high CPU load)
$ pip install om
$ om 192.168.0.1
$ om 192.168.0.2,box2
$ om root:mypass@mybox:44445
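The command-line forms above suggest a `[user[:password]@]host[:port]` host string, with several hosts separated by commas. The following is only a sketch of how such a string could be parsed; the grammar is inferred from the examples, and `HostSpec`, `parse_host_spec` and `parse_host_list` are hypothetical names, not om's own parser. IPv6 literals are not handled.

```python
from typing import List, NamedTuple, Optional


class HostSpec(NamedTuple):
    """Hypothetical container for one parsed host string."""
    host: str
    username: Optional[str] = None
    password: Optional[str] = None
    port: Optional[int] = None


def parse_host_spec(spec: str) -> HostSpec:
    """Parse one '[user[:password]@]host[:port]' string (assumed grammar)."""
    username = password = port = None
    if "@" in spec:
        # Split on the last '@' so a password containing '@' stays intact.
        credentials, spec = spec.rsplit("@", 1)
        username, _, password = credentials.partition(":")
        password = password or None
    host, _, port_text = spec.partition(":")
    if port_text:
        port = int(port_text)
    return HostSpec(host, username, password, port)


def parse_host_list(arg: str) -> List[HostSpec]:
    """Split a comma-separated argument into individual host specs."""
    return [parse_host_spec(part.strip()) for part in arg.split(",") if part.strip()]


print(parse_host_spec("root:mypass@mybox:44445"))
print(parse_host_list("192.168.0.2,box2"))
```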
Using a Configuration File
$ om -c <config.json>
The configuration file can be used for more advanced setups. Some use cases are monitoring multiple hosts, setting alarm thresholds for disk, CPU or memory usage for one host or for all hosts, using custom handlers and monitoring process status (running/stopped).
Please note that although the config file allows a lot of customisation and host-specific settings, we suggest you avoid most of them. Our goal with om is to have the best defaults so you don't have to configure much.
The configuration file must have a ''hosts'' section to indicate which hosts you want to collect from. It can also include ''ssh'', ''plugins'' and ''handlers'' sections for global configuration of SSH (username, password, port), plugins and handlers.
Hosts allow host-specific settings for SSH and plugins for more advanced setups.
{
"hosts": {
# Hosts go here (required)
"host1": {
# IP or hostname goes here (required)
"host": "179.25.15.2",
"ssh": {
# Host-specific SSH settings go here (optional)
},
"plugins": {
# Host specific plugin configurations go here (optional)
}
}
},
"ssh": {
# Global SSH settings go here (optional)
},
"plugins": {
# Global plugin configurations go here (optional)
},
"handlers": {
# Handlers configurations go here
}
}
Remember that ''ssh'' is entirely optional: if your SSH keys are in place for the current user and your local agent can already use them to reach the machine, om loads them from the agent.
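As a rough illustration of the layout described above (a required ''hosts'' section whose entries each need a ''host'' key, plus optional ''ssh'', ''plugins'' and ''handlers'' sections), here is a small checker. It is a sketch, not om's own validation; `validate_config` and `KNOWN_SECTIONS` are hypothetical names. Python's standard `json` module rejects `#` comments, so the sample document below omits the annotations used in the skeleton.

```python
import json

# Top-level sections named in the text; anything else is flagged.
KNOWN_SECTIONS = {"hosts", "ssh", "plugins", "handlers"}


def validate_config(config: dict) -> list:
    """Return a list of problems; an empty list means the layout looks fine."""
    problems = []
    hosts = config.get("hosts")
    if not isinstance(hosts, dict) or not hosts:
        problems.append("'hosts' section is required and must not be empty")
        hosts = {}
    for name, settings in hosts.items():
        if not isinstance(settings, dict) or "host" not in settings:
            problems.append(f"host '{name}' is missing the required 'host' key")
    for section in sorted(set(config) - KNOWN_SECTIONS):
        problems.append(f"unknown top-level section '{section}'")
    return problems


config = json.loads('{"hosts": {"host1": {"host": "179.25.15.2"}}, "ssh": {}}')
print(validate_config(config))  # []
```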
{
"ssh": {
"username": "user001"
},
"hosts": {
"webserver01": {
"host": "webserver01.overseer.om",
"plugins": [
{
"type": "process_state",
"process_name": "nginx"
},
{
"type": "process_state",
"process_name": "postgres"
}
]
}
},
"plugins": [
{
"type": "disk_usage",
"thresholds": {
"usage": 90
}
}
],
"handlers": {
"stdout": {}
}
}
Plugins are the units that collect the metrics on the designated machines. Plugins are added as objects to the ''plugins'' list.
$ om -p
CPU load, memory and disk usage have builtin plugins and are always collected by om. No further configuration is required.
Checking whether the ''nginx'' and ''unicorn'' processes are running:
{
"hosts": {
"my_web_server": {
"host": "192.168.0.1",
"plugins": [
{
"type": "process_state",
"process_name" : "nginx"
},
{
"type": "process_state",
"process_name" : "unicorn"
}
]
}
}
...
}
For instance, disk usage is reported as critical when it reaches 80%. If you want a certain box to be reported as critical at 50% instead:
{
"hosts": {
"my_rails_app": {
"host": "125.22.13.12",
"plugins": [
{
"type": "disk_usage",
"thresholds": {
"usage": "50%"
}
}
]
}
}
...
}
You can also override the default value globally:
{
"hosts": {
"my_postgres": {
"host": "postgresbox",
"plugins_config": {
"disk_usage": {
"thresholds": {
"usage": "60%"
}
}
}
}
},
"plugins": [
{
"type": "disk_usage",
"thresholds": {
"usage": "50%"
}
}
]
}
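One way to picture how the built-in default, the global override and the host-specific override interact is a layered merge. The sketch below assumes the precedence is default, then global, then host (the most specific value wins); that order is an assumption, and `DEFAULT_THRESHOLDS` and `effective_thresholds` are hypothetical names rather than om internals.

```python
# Built-in defaults quoted in this README for the disk and memory plugins.
DEFAULT_THRESHOLDS = {
    "disk_usage": {"usage": "80%"},
    "memory_usage": {"usage": "70%"},
}


def effective_thresholds(plugin_type, global_thresholds=None, host_thresholds=None):
    """Merge thresholds, assuming host settings beat global ones beat defaults."""
    merged = dict(DEFAULT_THRESHOLDS.get(plugin_type, {}))
    merged.update(global_thresholds or {})
    merged.update(host_thresholds or {})
    return merged


print(effective_thresholds("disk_usage"))
print(effective_thresholds("disk_usage", {"usage": "50%"}))
print(effective_thresholds("disk_usage", {"usage": "50%"}, {"usage": "60%"}))
```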
Checks if disk usage is above a percentage threshold.
{
"type": "disk_usage",
"thresholds": {
"usage": "50%" #optional, default: 80%
}
}
Checks if memory usage is above a percentage threshold.
{
"type": "memory_usage",
"thresholds": {
"usage": "50%" #optional, default: 70%
}
}
Checks if the CPU load average is above a percentage threshold for the past 1-, 5- and 15-minute intervals.
{
"type": "cpu_load",
"thresholds": {
"avg_1min": "90%", #optional, default: 25%
"avg_5min": "80%", #optional, default: 50%
"avg_15min": "75%" #optional, default: 75%
}
}
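To make the threshold comparison concrete, the sketch below compares a set of readings against thresholds written either as numbers or as strings such as "90%". It assumes the readings are already expressed as percentages and that "above" means strictly greater than; both are assumptions, and `parse_percent` and `critical_metrics` are hypothetical helpers.

```python
def parse_percent(value) -> float:
    """Accept 90, 90.0 or '90%' and return the number as a float."""
    if isinstance(value, str):
        value = value.strip().rstrip("%")
    return float(value)


def critical_metrics(readings: dict, thresholds: dict) -> list:
    """Return the names of the readings that are strictly above their threshold."""
    return [
        name
        for name, limit in thresholds.items()
        if name in readings and parse_percent(readings[name]) > parse_percent(limit)
    ]


thresholds = {"avg_1min": "90%", "avg_5min": "80%", "avg_15min": "75%"}
readings = {"avg_1min": 95.0, "avg_5min": 60.0, "avg_15min": 75.0}
print(critical_metrics(readings, thresholds))  # ['avg_1min']
```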
Checks if a process with a given name is running on the host.
{
"type": "process_state",
"process_name": "<name of the process>"
}
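Since om only needs SSH access to the box, a process check can be pictured as running a standard tool remotely and reading its exit status. The sketch below uses the system `ssh` client and `pgrep -x`, which exits with 0 when a process with exactly that name exists and 1 when none does. This is an illustration of the idea, not a description of how the ''process_state'' plugin is implemented; `build_check_command` and `is_running` are hypothetical names.

```python
import shlex
import subprocess


def build_check_command(host: str, process_name: str) -> list:
    """Build an ssh command that runs 'pgrep -x <name>' on the remote host."""
    return ["ssh", host, "pgrep -x " + shlex.quote(process_name)]


def is_running(host: str, process_name: str) -> bool:
    """Return True if the process is found, False if not; raise on other errors."""
    result = subprocess.run(
        build_check_command(host, process_name),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Any other status (e.g. 255 from ssh itself) means the check did not run.
    raise RuntimeError(f"check failed with exit status {result.returncode}")


print(build_check_command("webserver01", "nginx"))
```

Calling `is_running("webserver01", "nginx")` requires working SSH access to that host, so only the command construction is exercised here.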
Handlers receive the metrics produced by the plugins and act on them. A handler can be as simple as the StdOut handler, which prints the results to stdout, or it can save the metrics to a database. Plugins can be configured with thresholds that determine whether a measured value indicates a risky situation. Handlers have access to this information and can choose to act only when a value is critical. For example, an Email handler can be configured to mail sysadmins only when a value becomes critical.
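The relationship between plugins, thresholds and handlers can be sketched with plain functions. The metric shape used here (a dict with ''host'', ''plugin'', ''value'' and ''critical'' keys) and the names `print_handler`, `only_critical` and `dispatch` are assumptions made for illustration; om's real handler interface may look different.

```python
def print_handler(metric: dict) -> None:
    """Handler that prints every metric it receives."""
    print(f"{metric['host']} {metric['plugin']}: {metric['value']}")


def only_critical(handler):
    """Wrap a handler so that it only sees metrics flagged as critical."""
    def wrapper(metric: dict) -> None:
        if metric.get("critical"):
            handler(metric)
    return wrapper


def dispatch(metrics, handlers) -> None:
    """Pass every metric to every handler."""
    for metric in metrics:
        for handler in handlers:
            handler(metric)


metrics = [
    {"host": "web01", "plugin": "disk_usage", "value": 93.0, "critical": True},
    {"host": "web01", "plugin": "memory_usage", "value": 41.0, "critical": False},
]
alerts = []
dispatch(metrics, [print_handler, only_critical(alerts.append)])
print(len(alerts))  # 1
```

Here the printing handler sees both metrics, while the wrapped handler collects only the critical one, mirroring how an email handler might stay silent until a threshold is crossed.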
Simple handler that just dumps the metrics to the standard output
Name: stdout
Parameters: none
Dumps the metrics to the standard output in JSON format
Name: json_stdout
Parameters: none
Sends email whenever critical values are found for metrics
Name: email
Parameters:
"handlers": {
"email": {
"smtp": "<smtp host>",
"port": <smtp port>,
"security": "<security mechanism used>", #optional, accepts starttls
"login": "<smtp user login>",
"password": "<smtp user password>",
"from": "<from mail>",
"to": ["<list of recipients>"],
"subject": "<subject for the mail>" #optional
}
}
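For orientation, the parameters above map naturally onto Python's standard `smtplib` and `email.message` modules. The sketch below shows one plausible way to use them; it is not om's email handler, `build_alert` and `send_alert` are hypothetical names, and the fallback subject "om alert" is invented for the example.

```python
import smtplib
from email.message import EmailMessage


def build_alert(settings: dict, body: str) -> EmailMessage:
    """Build the alert message from the handler settings."""
    message = EmailMessage()
    message["From"] = settings["from"]
    message["To"] = ", ".join(settings["to"])
    # "om alert" is a made-up fallback; the real default subject is not documented here.
    message["Subject"] = settings.get("subject", "om alert")
    message.set_content(body)
    return message


def send_alert(settings: dict, body: str) -> None:
    """Connect to the SMTP server, optionally upgrade to TLS, log in and send."""
    message = build_alert(settings, body)
    with smtplib.SMTP(settings["smtp"], settings["port"]) as server:
        if settings.get("security") == "starttls":
            server.starttls()
        server.login(settings["login"], settings["password"])
        server.send_message(message)


settings = {
    "smtp": "smtp.example.com",
    "port": 587,
    "security": "starttls",
    "login": "monitor",
    "password": "secret",
    "from": "om@example.com",
    "to": ["a@example.com", "b@example.com"],
}
print(build_alert(settings, "disk_usage on web01 is critical")["To"])
```

Only `build_alert` runs here; `send_alert` needs a reachable SMTP server and valid credentials.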
Saves the metrics to a Sqlite3 database
Name: sqlite3
Parameters:
"handlers": {
"sqlite3": {
"path": "<file path>",
"expiration_days": <days after which a metric is deleted>
}
}
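The effect of ''expiration_days'' can be sketched with Python's standard `sqlite3` module: every write is followed by a delete of rows older than the cutoff. The table layout and the names `open_store` and `save_metric` are assumptions for this example and do not describe the schema om actually uses.

```python
import sqlite3
import time

SECONDS_PER_DAY = 86400


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the database and make sure the metrics table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metrics "
        "(host TEXT, plugin TEXT, value REAL, created_at REAL)"
    )
    return conn


def save_metric(conn, host, plugin, value, expiration_days, now=None):
    """Insert one metric, then drop rows older than expiration_days."""
    now = time.time() if now is None else now
    cutoff = now - expiration_days * SECONDS_PER_DAY
    with conn:  # commits both statements together
        conn.execute(
            "INSERT INTO metrics VALUES (?, ?, ?, ?)", (host, plugin, value, now)
        )
        conn.execute("DELETE FROM metrics WHERE created_at < ?", (cutoff,))


conn = open_store(":memory:")
save_metric(conn, "web01", "disk_usage", 10.0, expiration_days=7, now=0)
save_metric(conn, "web01", "disk_usage", 20.0, expiration_days=7, now=8 * SECONDS_PER_DAY)
print(conn.execute("SELECT value FROM metrics").fetchall())  # [(20.0,)]
```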
Saves the metrics to a Redis database
Name: redis
Parameters:
"handlers": {
"redis": {
"host": "<redis host>",
"port": <redis port>,
"max_list_length": <number> #maximum number of metrics stored per instance and plugin
}
}
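A capped list such as the one ''max_list_length'' describes is commonly kept in Redis by pushing the new item with LPUSH and then trimming with LTRIM. The sketch below shows that pattern against any client exposing `lpush(name, *values)` and `ltrim(name, start, end)`, as the redis-py client does. The key format, `store_metric` and the in-memory stand-in `InMemoryClient` are hypothetical and exist only so the example runs without a Redis server; whether om uses this exact pattern is not stated in this README.

```python
import json


def store_metric(client, host, plugin, metric, max_list_length):
    """Push the newest metric to the front and keep only max_list_length items."""
    key = f"{host}:{plugin}"  # hypothetical key format
    client.lpush(key, json.dumps(metric))
    client.ltrim(key, 0, max_list_length - 1)


class InMemoryClient:
    """Tiny stand-in that mimics LPUSH and LTRIM for non-negative indexes."""

    def __init__(self):
        self.lists = {}

    def lpush(self, name, *values):
        items = self.lists.setdefault(name, [])
        for value in values:
            items.insert(0, value)
        return len(items)

    def ltrim(self, name, start, end):
        # Redis treats 'end' as inclusive, hence the + 1.
        self.lists[name] = self.lists.get(name, [])[start:end + 1]
        return True


client = InMemoryClient()
for value in range(5):
    store_metric(client, "web01", "disk_usage", {"value": value}, max_list_length=3)
print(client.lists["web01:disk_usage"])
```

After five writes with a limit of three, only the three newest entries remain, newest first.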
You're encouraged to submit issues, PRs and weigh in with your opinion anywhere. If you want to know how to get started, feel free to contact the authors either directly or through a new issue. We also love documentation so feel free to extend this README.
Hacking locally is really easy. First clone the repository:
$ git clone https://github.com/overseer-monitoring/om.git
Install the requirements (we provide a quick makefile for that):
$ cd om
$ make
Run the daemon on a host:
$ PYTHONPATH=. ./bin/om <host>
Run tests:
$ make test
LGPLv3 License. See LICENSE for details.
Copyright (c) 2014 André Dieb, Thiago Sousa Santos