forked from douban/Kenshin

Kenshin: A time-series database alternative to Graphite Whisper with 40x improvement in IOPS

Kenshin (るろうに剣心)

The Kenshin project consists of two major components:

  • Kenshin is a fixed-size time-series database format, similar in design to Whisper, and an alternative to Whisper as Graphite's storage component. Whisper performs lots of tiny I/O operations on lots of different files; Kenshin aims to improve that I/O performance. For more design details, please refer to the design docs (Chinese) and the QCon 2016 presentation slides.

  • Rurouni-cache is a storage agent that sits in front of Kenshin and batches up writes to files to make them more sequential. Rurouni-cache is to Kenshin as carbon-cache is to Whisper.
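The batching idea behind a cache agent like this can be sketched in a few lines of Python. This is only an illustration of the general technique, not rurouni-cache's actual code: the `BatchingCache` class, its `flush_threshold` parameter, and the `write_batch` callback are all hypothetical names.

```python
from collections import defaultdict


class BatchingCache:
    """Buffer datapoints in memory and hand them to a writer in batches.

    Illustrative only: the names here are assumptions, not rurouni-cache's API.
    """

    def __init__(self, flush_threshold, write_batch):
        self.flush_threshold = flush_threshold
        self.write_batch = write_batch    # callable(metric, points)
        self.buffers = defaultdict(list)  # metric -> [(timestamp, value), ...]

    def add(self, metric, timestamp, value):
        points = self.buffers[metric]
        points.append((timestamp, value))
        # One larger write replaces many tiny ones once enough points pile up.
        if len(points) >= self.flush_threshold:
            self.flush(metric)

    def flush(self, metric):
        points = self.buffers.pop(metric, [])
        if points:
            # Sort by timestamp so the writer sees points in time order.
            self.write_batch(metric, sorted(points))


writes = []
cache = BatchingCache(
    flush_threshold=3,
    write_batch=lambda metric, points: writes.append((metric, points)),
)
for ts in (30, 10, 20, 40):
    cache.add("system.loadavg.min_1", ts, 0.5)
```

Here four incoming datapoints result in a single call to the writer, with the fourth point still waiting in memory for the next flush.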

Kenshin is developed and maintained by Douban Inc. It currently runs in production, powering all metrics (host, service, DAE app, user-defined) at douban.com.

What's the performance of Kenshin?

In our environment, after switching to Kenshin, IOPS decreased by 97.5%, and query latency is not significantly slower than with Whisper.
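The 97.5% figure and the "40x improvement" in the project description are the same number expressed two ways: if only 2.5% of the original IOPS remain, the load has dropped by a factor of about 1 / 0.025. The helper below is a hypothetical one, written only to show the arithmetic.

```python
def reduction_to_factor(percent_reduction):
    """Convert a percentage reduction into an 'N times fewer' factor."""
    remaining = 1.0 - percent_reduction / 100.0  # fraction of IOPS still issued
    return 1.0 / remaining


factor = reduction_to_factor(97.5)
print(round(factor, 2))
```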

Quick Start

We recommend using virtualenv when installing dependencies:

$ git clone https://github.com/douban/Kenshin.git
$ cd Kenshin
$ virtualenv venv
$ source venv/bin/activate
$ pip install -r requirements.txt

Tests can be run using nosetests:

$ nosetests -v

Set up the configuration

$ misc/init_setup_demo.sh

Install Kenshin

$ python setup.py install  # Or `export PYTHONPATH=.`

Start a Kenshin instance

$ python bin/rurouni-cache.py --debug --config=conf/rurouni.conf --instance=0 start

Send metrics to the Kenshin instance

$ python examples/rurouni-pickle-client.py 1
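For readers writing their own sender, the sketch below builds a message in the framing used by Graphite's carbon pickle receiver: a pickled list of `(metric, (timestamp, value))` tuples prefixed with a 4-byte big-endian length. That rurouni-cache accepts exactly this framing is an assumption based on the client's name; `examples/rurouni-pickle-client.py` is the authoritative reference. The function name `build_pickle_message` is hypothetical.

```python
import pickle
import struct
import time


def build_pickle_message(datapoints):
    """Frame datapoints as a length-prefixed pickle (carbon pickle style).

    Assumption: rurouni-cache uses the same framing as carbon-cache.
    """
    payload = pickle.dumps(datapoints, protocol=2)
    header = struct.pack("!L", len(payload))  # 4-byte big-endian payload length
    return header + payload


now = int(time.time())
datapoints = [("system.loadavg.min_1.metric_test", (now, 0.42))]
message = build_pickle_message(datapoints)
```

The resulting bytes would then be sent over a TCP socket to whatever host and port the instance's pickle receiver is configured to listen on.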

Query data in cache

$ python bin/kenshin-cache-query.py system.loadavg.min_1.metric_test

Query data in file

$ python bin/kenshin-fetch.py storage/link/0/system/loadavg/min_1/metric_test.hs --from <timestamp>
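A value for `<timestamp>` can be computed in Python. This sketch assumes `--from` expects Unix epoch seconds, as Whisper's fetch tool does; check the script's help output to confirm. The helper `seconds_ago` is a hypothetical name.

```python
import time


def seconds_ago(seconds, now=None):
    """Return the Unix timestamp (whole seconds) for `seconds` before `now`."""
    now = time.time() if now is None else now
    return int(now - seconds)


# Timestamp for one hour ago, assuming --from takes epoch seconds.
from_ts = seconds_ago(3600)
print(from_ts)
```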

Get kenshin file info

$ python bin/kenshin-info.py storage/link/0/system/loadavg/min_1/metric_test.hs

FAQ

Why don't you just use Whisper?

Whisper is great, and initially we did use it. Over time though, we ran into several issues:

  1. Whisper uses a lot of I/O, for several reasons:
    • It uses one file per metric.
    • The real-time downsampling feature (different data resolutions based on age) causes a lot of extra I/O.
  2. Carbon-cache and carbon-relay are inefficient and can even be CPU-bound. We didn't write our own relay; instead, we replaced carbon-relay with carbon-c-relay.
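To make the downsampling point concrete, the sketch below shows what rolling fine-grained points up into a coarser resolution computes. When this happens on every write, each incoming datapoint can also touch the lower-resolution data, which is where the extra I/O comes from. The `downsample` function and the choice of averaging are illustrative assumptions, not code from Whisper or Kenshin.

```python
def downsample(points, step):
    """Average (timestamp, value) points into buckets of `step` seconds."""
    buckets = {}
    for ts, value in points:
        bucket_start = ts - (ts % step)  # align to the coarser resolution
        buckets.setdefault(bucket_start, []).append(value)
    return [
        (start, sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]


# Four 10-second-resolution points rolled up to 60-second resolution.
raw = [(0, 1.0), (10, 3.0), (60, 5.0), (70, 7.0)]
coarse = downsample(raw, 60)
```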

Why did you totally rewrite Whisper? Couldn't you just submit a patch?

The reason I didn't simply submit a patch for Whisper is that Kenshin's design is incompatible with Whisper's design. Whisper uses one file per metric; Kenshin, on the other hand, merges N metrics into one file.
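The difference can be pictured with a toy row layout in which one record holds a timestamp followed by the values of N metrics, so a single write covers all N of them. This is a sketch of the general idea only; it is not Kenshin's actual on-disk format, and `pack_row` and `unpack_row` are hypothetical helpers.

```python
import struct


def pack_row(timestamp, values):
    """Pack one timestamp plus N metric values into a single fixed-size row."""
    # "!" = network byte order without padding, "L" = 4-byte unsigned int,
    # "d" = 8-byte float, repeated once per metric.
    return struct.pack("!L%dd" % len(values), timestamp, *values)


def unpack_row(data, n_metrics):
    """Inverse of pack_row: return (timestamp, [values])."""
    fields = struct.unpack("!L%dd" % n_metrics, data)
    return fields[0], list(fields[1:])


# Three metrics sharing one row: one write instead of three.
row = pack_row(1700000000, [0.5, 1.5, 2.5])
timestamp, values = unpack_row(row, 3)
```

With one file per metric, the same three datapoints would need three separate writes to three different files.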

How do I integrate Kenshin with Graphite-Web?

For now, you need to modify the Graphite-Web source code to support the Kenshin format; here is an example. We plan to write a plugin for using Graphite-Web and/or Graphite-API with a Kenshin-based storage backend.

Acknowledgments

Contributors

License

Kenshin is licensed under version 2.0 of the Apache License. See the LICENSE file for details.
