ramonvanalteren/p2ptracker
Introduction

This tracker is part of our deployment pipeline, which consists of several moving parts:

  • Jenkins build servers that build our source code
  • custom scripting for starting deploys
  • an rtorrent client running on every node participating in a deploy
  • a servermanagement database that holds metadata on the nodes in our environment
  • a bittorrent tracker

The tracker uses the knowledge we have about our network topology to build two-tier swarms of bittorrent clients. It returns peers from the global swarm to the first two clients in a rack that request tracker information. Any additional clients from the SAME rack requesting peer information from the tracker will only get peers in that rack, thus building a second-tier swarm that spans a single rack.
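As a rough illustration, here is a sketch of that selection rule (not the actual tracker code); it assumes a redis-py connection and the set-valued keys from the storage model described further down:

```python
# Sketch of the two-tier selection rule. MAX_REPR_RACK is the config
# value capping global-swarm participants per rack (2 in our setup).
MAX_REPR_RACK = 2

def select_peers(r, info_hash, rack, max_peers):
    representants = r.smembers("%s:rack:%s:R" % (info_hash, rack))
    if len(representants) < MAX_REPR_RACK:
        # The first announcers from this rack join the global swarm.
        candidates = r.smembers("%s:peers:N" % info_hash)
    else:
        # Later announcers from the same rack only see rack-local peers.
        candidates = r.smembers("%s:rack:%s:N" % (info_hash, rack))
    return list(candidates)[:max_peers]
```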

This setup was chosen because the uplink bandwidth of a rack is a critical resource for us. If many clients in a rack start downloading pieces from peers randomly distributed across our network, they will saturate the rack uplink, causing serious bandwidth starvation and failing requests to production services.

By limiting the number of peers from a single rack that participate in the global bittorrent swarm to 2, and capping each bittorrent client at ~100 Mbit/s, we can guarantee that bittorrent traffic uses at most ~20% of the rack uplink: two representants at 100 Mbit/s each is 200 Mbit/s on a 1 Gbit/s uplink.

You could probably do something similar with QoS on your rack switches, but we deploy dumb switches that do not easily support QoS, and since we deploy a lot of them, this saves us the overhead of building a complex rack-switch configuration management system.

Disclaimer for general use

This tracker works in our environment with our setup and specifically exploits our knowledge of our network topology.

YOUR MILEAGE WILL VARY !!

Your network topology is almost certainly different from ours, your nodes will be different, and the servermanagement metadata REST service that we operate is not (yet) open source.

HYVES, THE AUTHOR, AND ANY CONTRIBUTORS ACCEPT NO RESPONSIBILITY OR LIABILITY, AND MAKE NO GUARANTEE THAT THIS SOFTWARE WILL WORK FOR YOU. IT MAY EAT YOUR CAT, YOUR LUNCH, OR YOUR ENTIRE DATACENTER BANDWIDTH WITHOUT ANY PRIOR NOTICE OR WARNING.

Or in legalese:

The Software is provided "AS IS", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the Software or the use or other dealings in the Software.

Grouping Nodes

The call to our servermanagement metadata service is extremely simple and can easily be replaced by other logic (a subnet-based sketch follows the list below), such as:

  • DNS lookups for TXT or LOC records
  • subnet logic (swarms grouped by subnet)
  • any form of key logic that groups clients by a key derived from their IP address
  • any external service that returns a group key
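For example, a minimal sketch of the subnet option, assuming Python's ipaddress module; the /24 prefix is an arbitrary placeholder:

```python
# Group peers by subnet instead of querying servermanagement.
# Purely illustrative; pick a prefix length that matches your racks.
import ipaddress

def group_key(peer_ip, prefix=24):
    """Return a group key (the enclosing subnet) for a peer's IP address."""
    net = ipaddress.ip_network("%s/%d" % (peer_ip, prefix), strict=False)
    return str(net)

# group_key("10.1.2.34") -> "10.1.2.0/24"
```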

There is currently NO support for multiple concurrent transfers, as the groups are organized by transfer hash, which differs per transfer.

There is no code to generate torrent files; the torrent files we use are generated by running mktorrent from http://mktorrent.sourceforge.net
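If you want to script that step, something like the following works (a hedged example; the announce URL and paths are placeholders):

```python
# Generate a torrent by shelling out to mktorrent:
# "-a" sets the announce URL, "-o" the output file.
import subprocess

subprocess.check_call([
    "mktorrent",
    "-a", "http://tracker.example.com:8080/announce",  # placeholder tracker
    "-o", "deploy-1234.torrent",                       # placeholder output
    "/srv/builds/deploy-1234",                         # placeholder payload
])
```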

We run Gentoo, and you can find the two relevant ebuilds for the rtorrent client and the libtorrent library in the client directory. These are lightly modified versions of the mainline ebuilds from the Gentoo project: http://www.gentoo.org

The tracker should work with any bittorrent client. I'd be interested in reports of other (non)working clients. Please file issues on github for non-working clients.

We use rtorrent, a high-performance C++ bittorrent client built on libtorrent. Both are developed at http://libtorrent.rakshasa.no/

I've included the patch we apply to rtorrent/libtorrent to speed it up. This patch removes features that you'd probably want if you deploy the client in a hostile environment such as the internet. It also removes the minimum timeout between tracker requests; hammering a public tracker like that will get you (rightfully) blocked.

Configfiles

The tracker configfile has the following recognized values:

HOST: hostname the tracker runs on
PORT: port the tracker responds on
REDISHOST: where to contact the redis backing store
REDISPORT: port to connect to redis on
SMDB_URL: where to contact the REST servermanagement metadata service
MAX_REPR_RACK: how many peers from a single rack can participate in a global swarm
ACTIVE_INTERVAL: how often to contact the tracker if a transfer is active
PASSIVE_INTERVAL: how often to contact the tracker if a transfer is passive
MAXPEERS: how many peers to return to nodes
PROXYPASS: whether the tracker runs behind a proxy and should fix up client vars
DEBUG: whether the tracker should log debug statements/run debug code
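As an illustration, a configfile could look like this (every value is a placeholder, and treating the file as plain Python is an assumption):

```python
# Hypothetical tracker configfile; all values below are placeholders.
HOST = "tracker.example.com"  # hostname the tracker runs on
PORT = 8080                   # port the tracker responds on
REDISHOST = "localhost"       # redis backing store
REDISPORT = 6379
SMDB_URL = "http://smdb.example.com/api"  # servermanagement REST service
MAX_REPR_RACK = 2       # peers per rack allowed in the global swarm
ACTIVE_INTERVAL = 30    # announce interval, active transfer (units assumed seconds)
PASSIVE_INTERVAL = 300  # announce interval, passive transfer
MAXPEERS = 50           # peers returned per announce
PROXYPASS = False       # fix up client vars when behind a proxy
DEBUG = False           # log debug statements / run debug code
```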

In addition to the tracker configfile there is a sample rtorrent configfile included in the etc directory, which is a puppet template that we use to deploy the client in our environment. The variables should be fairly self-evident.

Redis storage model

Transfer information is stored in redis; at least redis 2.0.x and redis-py 2.0.x are required. The redis model is described below.

racks = all racks we've seen  
rack:rackname = all hosts we've ever seen in this rack  
transfers = all seen info_hashes  
active_transfers = all active info_hashes  
hash:peers:N =  all seen peers for the hash  
hash:peers:R =  all representants for the hash  
hash:peers:S =  all seeders for the hash  
hash:peers:L =  all leechers for the hash  
hash:rack:rackname:N =  all peers for the hash in a rack  
hash:rack:rackname:R =  all representants for the hash in a rack  
hash:rack:rackname:S =  all seeders for the hash in a rack  
hash:peer:peeripaddress:compact = True/False  
hash:peer:peeripaddress:port = port where the client is operating on  
hash:peer:peeripaddress:peer_id = peer id from client  
hash:peer:peeripaddress:key = peer key  
hash:peer:peeripaddress:last_event = last seen event  
hash:peer:peeripaddress:event: = datetime event was seen  
hash:peer:peeripaddress:seeder = True/False  
hash:peer:peeripaddress:downloaded = bytes downloaded  
hash:peer:peeripaddress:left = bytes left to download  
hash:peer:peeripaddress:uploaded = bytes uploaded to other clients  
hash:peer:peeripaddress:rack = rack where the peer is located  
hash:peer:peeripaddress:hostname = hostname reported for ipaddress  
hash:length = length of the torrent payload  
hash:name = name of the torrent  
hash:registered = datetime transfer was activated  
hash:deregistered = datetime transfer was deactivated  
hash:first_started = datetime first peer started downloading  
hash:last_started = datetime last peer started downloading  
hash:first_completed = datetime first peer completed downloading  
hash:last_completed = datetime last peer completed downloading  
peer = ipaddress:port of the peer  
rack = rackname  
hash = uppercase hash for the torrent  

ALL VALUES ARE STRINGS !!!!!!
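To make the model concrete, here is a read-only sketch using redis-py (the hash and rack name are placeholders, and treating the peer/rack keys as redis sets is inferred from the listing; under Python 2 with redis-py 2.0.x the replies come back as plain strings):

```python
# Walk the seeders of one rack for one transfer.
import redis

r = redis.Redis(host="localhost", port=6379)
info_hash = "0123456789ABCDEF0123456789ABCDEF01234567"  # placeholder
rack = "rack42"                                         # placeholder

# hash:rack:rackname:S holds "ip:port" strings for seeders in that rack.
for peer in r.smembers("%s:rack:%s:S" % (info_hash, rack)):
    ip = peer.split(":")[0]
    left = r.get("%s:peer:%s:left" % (info_hash, ip))
    print(peer, "bytes left:", left)
```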

Deactivation renames all keys that start with the hash to datetime:hash, where datetime is the datetime of deactivation.
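In redis-py terms that amounts to something like this sketch (the datetime format is an assumption, and removing the hash from active_transfers is implied by the model rather than stated):

```python
# Archive a finished transfer by prefixing its keys with a timestamp.
import redis
from datetime import datetime

def deactivate(r, info_hash):
    stamp = datetime.now().isoformat()    # actual format is an assumption
    for key in r.keys(info_hash + ":*"):  # note: KEYS scans the whole keyspace
        r.rename(key, "%s:%s" % (stamp, key))
    r.srem("active_transfers", info_hash)  # implied by the model above
```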

Contributing

We love patches, bug reports, and anything related to getting this to work in an environment different from ours.

Please use github's excellent issue system for bug reports, and its even more awesome pull-request system to contribute.

Contact me: ramon at hyves dot nl with any feedback

You can also find me on IRC; I usually hang out on Freenode in one of the gentoo-* channels and/or #vagrant, #pocoo, #fabric, and #openstack.
