A real-life benchmark for NoSQL databases. It simulates the use case of powering the newsfeed for a social app. The benchmark is open source and easy to replicate. It uses Stream-Framework, the most widely used open source package for building scalable newsfeeds and activity streams.
CloudFormation, Fabric, Boto and Cloud-init are used to set up the infrastructure and run the benchmark automatically.
Note that running a benchmark can be expensive.
git clone https://github.com/GetStream/Stream-Framework-Bench.git
Create a new Python virtual environment
mkvirtualenv bench
Install the dependencies
pip install -r requirements.txt
Ensure the AWS CLI is installed and configured.
Configure your credentials file
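A quick sanity check before starting any instances can catch a missing key early. The sketch below assumes the credentials file is the standard AWS CLI one at ~/.aws/credentials, with a `[default]` profile; the helper name `missing_credential_keys` is made up for this example and is not part of the benchmark.

```python
import configparser
from pathlib import Path

# Keys the AWS CLI credentials file is expected to define per profile.
REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_credential_keys(path, profile="default"):
    """Return the required keys that are absent or empty for `profile`."""
    # Interpolation is disabled so a '%' inside a secret is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is skipped silently
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [
        key
        for key in REQUIRED_KEYS
        if not parser.get(profile, key, fallback="").strip()
    ]


if __name__ == "__main__":
    # Assumption: the default AWS CLI location is in use.
    credentials_path = Path.home() / ".aws" / "credentials"
    missing = missing_credential_keys(credentials_path)
    if missing:
        print(f"missing in {credentials_path}: {missing}")
    else:
        print("credentials look complete")
```

This only checks that the keys are present; it does not verify that AWS accepts them.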
Change the key pair name used for starting the instances by editing stack.template
Start the cluster on AWS (warning: this is expensive). By default the stack is created in the us-west-2 region.
fab create_stack:stack=cassandra,key_name=yourkeyname
Optionally, you can use Datadog to track the benchmark metrics:
fab create_stack:stack=cassandra,datadog=yourapikeyhere
You can view the progress in your CloudFormation dashboard. Note that cloud-init will take a while to run, mostly because cassandra-driver is slow to install.
fab run_bench:stack=cassandra
Note: This step will fail if your stack didn't complete the cloud-init configuration step.
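If you would rather script the wait than watch the dashboard, stack creation can be polled. This is a minimal sketch, not part of the benchmark: `wait_for_stack` is a hypothetical helper, and the client is expected to behave like a boto3 CloudFormation client (the benchmark itself uses Boto). Keep in mind that a stack reporting CREATE_COMPLETE does not necessarily mean cloud-init has finished on the instances, so the benchmark can still fail shortly after the stack comes up.

```python
import time

# States after which stack creation will not succeed.
FAILED_STATES = {
    "CREATE_FAILED",
    "ROLLBACK_IN_PROGRESS",
    "ROLLBACK_COMPLETE",
    "ROLLBACK_FAILED",
}


def wait_for_stack(client, stack_name, poll_seconds=30, max_polls=60,
                   sleep=time.sleep):
    """Poll until the stack is created.

    Returns True on CREATE_COMPLETE, False if max_polls is exhausted,
    and raises RuntimeError if creation failed or rolled back.
    """
    for _ in range(max_polls):
        response = client.describe_stacks(StackName=stack_name)
        status = response["Stacks"][0]["StackStatus"]
        if status == "CREATE_COMPLETE":
            return True
        if status in FAILED_STATES:
            raise RuntimeError(f"stack {stack_name} ended in state {status}")
        sleep(poll_seconds)
    return False
```

With boto3 installed, the client would be `boto3.client("cloudformation", region_name="us-west-2")`. The actual stack name is whatever appears in your CloudFormation dashboard; it is not assumed here.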
The benchmark will slowly increase the number of users in the graph and measure:
- The time it takes to read a feed
- The fanout delay for feed updates
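To make the two metrics concrete, here is a toy illustration. It is not the benchmark's code and does not use Stream-Framework's API: `ToyFeedStore`, `time_feed_read` and `fanout_delay` are invented for this example, and everything runs in memory in a single process, so the numbers say nothing about a real database.

```python
import time
from collections import defaultdict


class ToyFeedStore:
    """In-memory stand-in for a feed backend (illustrative only)."""

    def __init__(self):
        self.followers = defaultdict(set)  # user -> set of followers
        self.feeds = defaultdict(list)     # user -> [(activity, delivered_at)]

    def follow(self, follower, target):
        self.followers[target].add(follower)

    def post(self, author, activity):
        """Fan the activity out to every follower; return the publish time."""
        published_at = time.perf_counter()
        for follower in self.followers[author]:
            # Newest activity goes to the front of the follower's feed.
            self.feeds[follower].insert(0, (activity, time.perf_counter()))
        return published_at

    def read(self, user, limit=25):
        return [activity for activity, _ in self.feeds[user][:limit]]


def time_feed_read(store, user):
    """Seconds taken to read one user's feed."""
    start = time.perf_counter()
    store.read(user)
    return time.perf_counter() - start


def fanout_delay(store, author, activity):
    """Seconds between publishing and the last follower's feed update."""
    published_at = store.post(author, activity)
    delivered = [store.feeds[f][0][1] for f in store.followers[author]]
    return max(delivered) - published_at if delivered else 0.0


if __name__ == "__main__":
    store = ToyFeedStore()
    # Grow the follower graph step by step and measure at each size.
    for n_users in (10, 100, 1000):
        for i in range(n_users):
            store.follow(f"user{i}", "author")
        read_s = time_feed_read(store, "user0")
        delay_s = fanout_delay(store, "author", f"post-{n_users}")
        print(f"{n_users} followers: read {read_s:.6f}s, fanout {delay_s:.6f}s")
```

In the real benchmark the fanout runs through the task workers and the database cluster, so the delay also includes queueing and network time that this sketch leaves out.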
fab delete_stack:stack=cassandra
Fork Stream-Framework https://github.com/tschellenbach/stream-framework
Implement your own storage backend
Fork Stream-Framework-Bench
Update requirements.txt and reference your Stream-Framework fork
Copy the cassandra.json CloudFormation file and make the required changes
- HighScalability post
A stack will typically start several components
- RabbitMQ (message queue) & Admin instance - 1 large node
- Task workers (Celery) - an autoscaling group of task workers
- A cluster of your database instances - 3 by default
- Running a Celery worker locally
celery -A benchmark worker -l debug
- Set CELERY_ALWAYS_EAGER to False