django-zipkin is a middleware and API for recording and sending messages to Zipkin. Why use it? From http://twitter.github.io/zipkin/:
"Collecting traces helps developers gain deeper knowledge about how certain requests perform in a distributed system. Let's say we're having problems with user requests timing out. We can look up traced requests that timed out and display it in the web UI. We'll be able to quickly find the service responsible for adding the unexpected response time. If the service has been annotated adequately we can also find out where in that service the issue is happening."
Python: 2.6, 2.7 (the current Python Thrift release doesn't support Python 3)
Django: 1.3 - 1.8
There is a sample Django app in the example folder, set up to use this middleware. To see a working demo, you can run the provided Docker images locally. Make sure you have docker-compose set up, and then:
cd example
docker-compose up
curl -v localhost:8000 -H 'X-B3-Flags: 1'
Opening http://localhost:8080 should bring up the Zipkin web interface, where you should be able to see your request trace.
Install the library:
pip install django-zipkin
Add the middleware to the list of installed middlewares:
MIDDLEWARE_CLASSES = ('...',
'django_zipkin.middleware.ZipkinMiddleware',
'...')
Set the name your service will use to identify itself. This will appear as the service name in Zipkin.
ZIPKIN_SERVICE_NAME = 'awesome-service'
django-zipkin is now logging data compatible with the Zipkin collector to the logger called zipkin.
From here it's up to you to get the messages to Zipkin. Here's how we do it at Prezi:
- We configure logging in each service using django-zipkin to send log messages from the zipkin logger to the locally running Scribe instance, into the category zipkin.
- The Scribe instances are configured to forward the zipkin category directly to the Zipkin collector. This is useful because Scribe buffers messages in case the collector (or the network to it) is down.
You can see an example of this in the example folder.
Another alternative may be logging to syslog and using scribe_apache, shipped with Scribe, to send data to Zipkin (possibly via a local Scribe server).
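The routing step described above can be sketched with the standard library's dictionary-based logging configuration. This is a minimal illustration, not the setup used at Prezi: ForwardingHandler is a hypothetical stand-in for whatever transport handler you use (Scribe, syslog, ...), and the DEBUG level is an assumption chosen so that nothing from the zipkin logger is filtered out.

```python
import logging
import logging.config


class ForwardingHandler(logging.Handler):
    """Hypothetical stand-in for a real transport handler (Scribe, syslog, ...)."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.sent = []  # a real handler would write to the transport instead

    def emit(self, record):
        self.sent.append(self.format(record))


LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        # '()' tells dictConfig to build the handler by calling this factory.
        'zipkin_transport': {'()': ForwardingHandler},
    },
    'loggers': {
        # 'zipkin' is the default logger name used by django-zipkin.
        'zipkin': {
            'handlers': ['zipkin_transport'],
            'level': 'DEBUG',    # assumption: let every message through
            'propagate': False,  # keep trace payloads out of other logs
        },
    },
}

# Django applies settings.LOGGING for you; the explicit call keeps the
# sketch runnable on its own.
logging.config.dictConfig(LOGGING)
```

In a Django project the LOGGING dictionary would live in settings.py, with the stand-in handler replaced by one that actually ships the messages.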
django-zipkin creates a single span per served request. It automatically adds a number of annotations (see below). You can also add your own annotations from anywhere in your code:
from django_zipkin.api import api as zipkin_api
zipkin_api.record_event('MySQL: "SELECT * FROM auth_users"', duration=15000) # Note duration is in microseconds, as defined by Zipkin
zipkin_api.record_key_value('Cache misses', 15) # You can use string, int, long and bool values
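Because record_event expects the duration in microseconds, it can be convenient to measure a block of code and convert the elapsed time in one place. The sketch below is one possible helper, not part of django-zipkin: timed_event and fake_record_event are hypothetical names, and the fake recorder only exists so the example runs without the library installed.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_event(record_event, name):
    """Time the wrapped block and report it with a duration in microseconds."""
    start = time.time()
    try:
        yield
    finally:
        elapsed_us = int((time.time() - start) * 1000000)
        record_event(name, duration=elapsed_us)


# Stand-in recorder so the sketch runs without django-zipkin installed;
# in a Django view you would pass zipkin_api.record_event instead.
recorded = []


def fake_record_event(name, duration=None):
    recorded.append((name, duration))


with timed_event(fake_record_event, 'MySQL: "SELECT * FROM auth_users"'):
    time.sleep(0.01)  # placeholder for the work being measured
```

The event is recorded in a finally block, so the annotation is still emitted when the measured code raises.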
To identify which spans belong to the same trace, some information must be passed on with inter-service calls. django-zipkin provides facilities for this on both the client and the server side. The middleware automatically reads the trace propagation HTTP headers described in the Zipkin documentation. For propagating data to outgoing requests, a function returning a dict of the correct HTTP headers is provided:
from django_zipkin.api import api as zipkin_api
headers = zipkin_api.get_headers_for_downstream_request()
# During a request returns something like this:
{'X-B3-Sampled': 'false', 'X-B3-TraceId': 'b059fb34103a46f7', 'X-B3-Flags': '0', 'X-B3-SpanId': 'a42f4f3a045c54a5'}
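How the returned dict reaches the downstream service depends on your HTTP client. As a rough sketch using only the standard library, the headers can be merged into an outgoing request like this. build_downstream_request is a hypothetical helper, and the sample values are copied from the output shown above so the example runs without django-zipkin.

```python
try:
    from urllib.request import Request  # Python 3
except ImportError:
    from urllib2 import Request  # Python 2


def build_downstream_request(url, trace_headers, extra_headers=None):
    """Return a Request carrying the caller's headers plus the trace headers."""
    headers = dict(extra_headers or {})
    headers.update(trace_headers)  # trace headers win on conflict
    return Request(url, headers=headers)


# In a view you would use zipkin_api.get_headers_for_downstream_request()
# instead of these hard-coded sample values.
trace_headers = {
    'X-B3-Sampled': 'false',
    'X-B3-TraceId': 'b059fb34103a46f7',
    'X-B3-Flags': '0',
    'X-B3-SpanId': 'a42f4f3a045c54a5',
}
request = build_downstream_request(
    'http://localhost:8000/',
    trace_headers,
    extra_headers={'Accept': 'application/json'},
)
```

Other clients that accept a headers dict can take the same merged dictionary directly.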
sr and ss (server receive and server send) annotations are automatically added by the middleware. The following binary (key-value) annotations are also added:
Annotation | Example value | Added if |
---|---|---|
http.uri | /api/v1/login | Always |
http.statuscode | 200 | Always |
django.view.func_name | login | Always |
django.view.class | AuthView | If the view function is a method of a class-based view |
django.view.args | ('oauth') | Always |
django.view.kwargs | {"next": "/index"} | Always |
django.url_name | myapp.views.login | Always |
django.tastypie.resource_name | user | If the request is served by Tastypie (specifically, when the view gets a kwarg resource_name) |
It's up to you to add cs and cr (client send and client receive) annotations in whatever client you use.
If a middleware above django-zipkin returns a response, then the request processing part of django-zipkin will never be called, resulting in an inconsistent internal state. In this case your custom annotations and most of the automatically added annotations will be lost, and timing information will be incorrect. An extra annotation will be added with the following value:
No ZipkinData in thread local store. This can happen if process_request didn't run due to a previous middleware returning a response. Timing information is invalid.
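One way to reduce the chance of this is to list the Zipkin middleware before any middleware that may return a response early. The fragment below is only a sketch of such an ordering; the two Django middlewares after it are placeholders for whatever your project actually uses.

```python
# settings.py (fragment)
MIDDLEWARE_CLASSES = (
    # Listed first so its process_request runs before any other middleware
    # gets a chance to return a response early.
    'django_zipkin.middleware.ZipkinMiddleware',
    # Placeholder entries; keep your project's own middleware here.
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
)
```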
If your view is wrapped (for example with a decorator) without using the functools.wraps decorator, then django-zipkin has no way of retrieving the name of the view. In this case django.view.func_name will be the function name of the wrapper function. This is something you'll want to avoid in your own code.
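The difference is easy to see in plain Python. The sketch below assumes the annotation is derived from the function's __name__ attribute, which the description above suggests but does not spell out; careless, careful, login and logout are illustrative names.

```python
import functools


def careless(view):
    """Decorator that hides the wrapped view's name."""
    def wrapper(*args, **kwargs):
        return view(*args, **kwargs)
    return wrapper


def careful(view):
    """Decorator that preserves the wrapped view's name."""
    @functools.wraps(view)  # copies __name__, __doc__, etc. from the view
    def wrapper(*args, **kwargs):
        return view(*args, **kwargs)
    return wrapper


@careless
def login(request):
    return 'ok'


@careful
def logout(request):
    return 'ok'


print(login.__name__)   # wrapper
print(logout.__name__)  # logout
```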
One offender is Tastypie: django.view.func_name will always be wrapper. On requests served by Tastypie the annotation django.tastypie.resource_name will be added with the name of the Tastypie resource, and django.url_name will be something useful like api_dispatch_list.
The django.view.kwargs annotation has a JSON string as its value for easier automated processing. Unfortunately this makes the UI display the value as [object Object]. See Zipkin issue #410 for any progress on this. If you want to find the value on the web UI, you can open the page source and search for django.view.kwargs.
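For automated processing, the annotation value can be decoded with any JSON parser. A minimal sketch, using the example value from the annotation table (how you obtain the raw string from Zipkin is not covered here):

```python
import json

raw_value = '{"next": "/index"}'  # example value from the annotation table
view_kwargs = json.loads(raw_value)
print(view_kwargs['next'])  # /index
```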
You can customize the way django-zipkin works with the following settings values. They are defined in django_zipkin/defaults.py.
ZIPKIN_SERVICE_NAME: Default None. The service name that will appear on Zipkin (the service_name value in the sent Thrift objects).
ZIPKIN_LOGGER_NAME: Default 'zipkin'. The name of the logger to use when sending Zipkin messages through the Python logging system.
ZIPKIN_DATA_STORE_CLASS: Default 'django_zipkin.data_store.ThreadLocalDataStore'. django-zipkin needs to pass some data from the request processor to the response processor. This same data needs to be accessible from anywhere in the user's code. The default implementation uses thread-local storage. gevent and greenlet monkey-patch it, so this implementation works fine even under gunicorn and friends. You can provide your own implementation; it needs to implement the methods of django_zipkin.data_store.BaseDataStore.
ZIPKIN_ID_GENERATOR_CLASS: Default 'django_zipkin.id_generator.SimpleIdGenerator'. The class used to generate span and trace ids if we don't get one from the incoming request.
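Put together, a settings fragment spelling out all four values might look like the following. Apart from the service name, which reuses the earlier example, each value is the documented default, so only the ones you want to change actually need to appear in your settings.

```python
# settings.py (fragment)
ZIPKIN_SERVICE_NAME = 'awesome-service'  # default is None
ZIPKIN_LOGGER_NAME = 'zipkin'
ZIPKIN_DATA_STORE_CLASS = 'django_zipkin.data_store.ThreadLocalDataStore'
ZIPKIN_ID_GENERATOR_CLASS = 'django_zipkin.id_generator.SimpleIdGenerator'
```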
configglue support is provided via django_zipkin.schema; you can include it into your own schema like this:
from django_zipkin.schema import DjangoZipkinSection
class MySchema(...):
    ...
    class zipkin(DjangoZipkinSection):
        pass
See CONTRIBUTING.md for guidelines.
You can start hacking on django-zipkin with:
git clone https://github.com/prezi/django-zipkin.git
cd django-zipkin
git remote rename origin upstream
virtualenv virtualenv
. virtualenv/bin/activate
pip install django
python setup.py test