Skip to content

A very simple python peer-to-peer implementation using consistent hashing (and an example distributed key-value store layered atop it) using the Thrift protocol and its basic RPC capabilities.

License

atl/thrifty-p2p

Repository files navigation

thrifty-p2p

A very simple peer-to-peer implementation using consistent hashing, the Thrift protocol and its basic RPC capabilities. It is not expected to be generically usable by arbitrary projects: thrifty-p2p implements a flat network, which is justifiable for the author's purposes (distributing processing amongst dozens of data-rich nodes), but will encounter trouble scaling.

Basic usage

python location.py [-h peer_node] [-p port_num] [--help]

Initiates and/or joins a simple peer-to-peer network. Default port_num is 9900. Absent a peer_node (which is the peer initially contacted for joining the network), the command initiates a network.

Example usage, in different terminal windows:

python location.py

python location.py --host localhost:9900 --port 9901

python location.py -h localhost:9901 -p 9902

... etc. ...

Dependencies & Requirements

The basic server uses consistent hashing via hash_ring.py, developed by Amir Salihefendic. For better and for worse, I modified the code to allow for easier node set manipulations (as if it were a list) and different node resolution behavior so that the internal keys are more directly related to md5 hashes.

The library requires the Apache Thrift Python libraries to be installed, but -- because the thrift compiler has been run already -- does not currently require the Thrift compiler or any other language libraries.

Design priorities

I'm using Thrift as much for its simple RPC underpinnings as the message format. I wanted to consistently distribute some processing and bother as little as possible with node lookup. This is the simplest thing that I came up with that sort-of works: a flat peer-to-peer network with each node keeping up with the state of all the nodes as well as it can. This necessarily limits the ring's ability to scale beyond dozens of nodes.

diststore

Diststore is an example of an application that can be layered atop the thrifty-p2p location service. It's a toy version of a distributed key-value store of the sort that all the cool kids are doing these days. The underlying peer-to-peer implementation makes no attempt to optimize for large scales, so don't get too excited about deploying this in production. Rather, it's provided for instruction and testing.

Although diststore.thrift uses locator.thrift for a lot of its interfaces, the implementation overloads some of the base methods as an illustration of how this simple library might be used in other projects.

Example usage, run this in a couple terminal windows:

python storeserver.py

Note how the second and following servers auto-discover the other servers you've initiated, join them as peers on the network, and find an open port. Then in another window start populating the store:

python storeput.py a apple
python storeput.py b banana
python storeput.py c crayon
python storeput.py d dinosaur
python storeprimer.py

You can then query the store:

python storeget.py c
python storetest.py

or start a new server, and see how a minimum of the existing keys redistribute themselves and an example of consistent hashing:

python storeserver.py

or even stop a server with ctrl-C or a kill -INT signal and see how the server hands its items off gracefully. However, the system is not robust to any more aggressive termination: keys will go missing.

What's happening here? Because every node has a full model of the network, it knows which node to forward a get() request to, or where to hand off its items when it leaves the network. Key methods here are overridden from location.LocatorHandler: add() and cleanup().

Programming usage

There are two primary thrift interfaces exposed in locator.thrift: locator.Base and locator.Locator. Locator.Base supports primal methods such as ping(), service\_type(), service\_types(), and die(), that don't rely on any of the locator services. Locator.Locator implements basic node location and management: propagating and dropping services on the network. When creating your own thrift definition, your service should import locator.thrift and extend one of the two service classes.

When creating a thrift handler implementation, the python class should inherit from location.BaseHandler or location.LocatorHandler and the Iface stub from your own class generated from your thrift file.

About

A very simple python peer-to-peer implementation using consistent hashing (and an example distributed key-value store layered atop it) using the Thrift protocol and its basic RPC capabilities.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages