Stream-Framework-Bench

A real-life benchmark for NoSQL databases. It simulates the use case of powering the newsfeed for a social app. The benchmark is open source and easy to replicate. It uses Stream-Framework, the most widely used open source package for building scalable newsfeeds and activity streams.

CloudFormation, Fabric, Boto and Cloud-init are used to automatically set up the infrastructure and run the benchmark.

Running the benchmark

Note that running a benchmark can be expensive.

Set up your development environment

git clone https://github.com/GetStream/Stream-Framework-Bench.git

Create a new Python virtual environment

mkvirtualenv bench

Install the dependencies

pip install -r requirements.txt

Ensure the AWS CLI is installed and configured.

Configure your credentials file

Change the key pair name used for starting the instances by editing stack.template

Start the cluster

Start the cluster on AWS. Warning: this is expensive. By default the stack is created in the us-west-2 region.

fab create_stack:stack=cassandra,key_name=yourkeyname

Optionally, you can use Datadog to track the benchmark metrics:

fab create_stack:stack=cassandra,datadog=yourapikeyhere

You can view the progress in your CloudFormation dashboard. Note that cloud-init takes a while to run, mainly because cassandra-driver is slow to install.

Running the benchmark using Stream-Framework

fab run_bench:stack=cassandra

Note: This step will fail if your stack has not yet completed the cloud-init configuration step.

The benchmark will slowly increase the number of users in the graph and measure:

The time it takes to read a feed
The fanout delay for feed updates
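
The two measurements above can be illustrated with a minimal sketch. It is not the benchmark's actual code: `time_call`, `fanout_delay` and `read_feed` are hypothetical names, and the feed read is a stand-in for a real database query.

```python
import time


def time_call(func, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to func."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def fanout_delay(created_at, delivered_at):
    """Seconds between creating an activity and it reaching a follower's feed."""
    return delivered_at - created_at


def read_feed(feed, limit=10):
    # Hypothetical stand-in for a real feed read against the database.
    return feed[:limit]


feed = list(range(100))
items, read_seconds = time_call(read_feed, feed, limit=10)
delay = fanout_delay(created_at=100.0, delivered_at=100.25)
print(len(items), read_seconds, delay)
```

In a real run both values would be collected repeatedly while the number of users grows, so that the trend is visible rather than a single sample.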

Stopping the stack

fab delete_stack:stack=cassandra

Testing another NoSQL database

Fork Stream-Framework https://github.com/tschellenbach/stream-framework

Implement your own storage backend

Fork Stream-Framework-Bench

Update requirements.txt and reference your Stream-Framework fork

Copy the cassandra.json CloudFormation file and make the required changes
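
To give a rough idea of what a storage backend has to do, here is a minimal in-memory sketch. It does not subclass Stream-Framework's storage base classes, and every class and method name in it is hypothetical. A real backend should follow the interface defined in your Stream-Framework fork.

```python
import bisect


class InMemoryTimelineStorage:
    """Hypothetical backend: keeps each feed as a list sorted newest first."""

    def __init__(self, max_length=1000):
        self.max_length = max_length
        self._feeds = {}  # feed key -> sorted list of (negated score, activity id)

    def add(self, key, score, activity_id):
        entries = self._feeds.setdefault(key, [])
        entry = (-score, activity_id)  # negate so the highest score sorts first
        index = bisect.bisect_left(entries, entry)
        if index < len(entries) and entries[index] == entry:
            return  # already stored, so repeated inserts are harmless
        entries.insert(index, entry)
        del entries[self.max_length:]  # trim the oldest entries

    def get_slice(self, key, start=0, stop=None):
        entries = self._feeds.get(key, [])
        return [activity_id for _, activity_id in entries[start:stop]]

    def remove(self, key, score, activity_id):
        entries = self._feeds.get(key, [])
        entry = (-score, activity_id)
        if entry in entries:
            entries.remove(entry)

    def count(self, key):
        return len(self._feeds.get(key, []))


storage = InMemoryTimelineStorage(max_length=3)
for score in (1, 2, 3, 4):
    storage.add("feed:1", score, "activity-%d" % score)
print(storage.get_slice("feed:1"), storage.count("feed:1"))
```

The sketch only covers the operations a feed needs most often: insert, read a slice, remove, count and trim. A database-backed version would map the same operations onto queries.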

Benchmarks using Stream-Framework-Bench

HighScalability post

A typical stack

A stack will typically start several components:

RabbitMQ (message queue) & Admin instance - 1 large node
Task workers (Celery) - An autoscaling group of task workers
A cluster of your database instances - 3 by default

Development tips

Running a celery worker locally

celery -A benchmark worker -l debug

Set CELERY_ALWAYS_EAGER to False
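
When `CELERY_ALWAYS_EAGER` is true, Celery runs tasks synchronously in the calling process instead of sending them to the broker, so a local worker never receives anything. A minimal settings fragment might look like this. Which settings module it belongs in is an assumption, so check where the project defines its Celery configuration.

```python
# Hypothetical settings fragment; its location in the project is an assumption.
# False means tasks are sent to the broker, so the local worker picks them up.
CELERY_ALWAYS_EAGER = False
```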
