pycassa is a Python client library for Apache Cassandra with the following features:
- Automatic failover and operation retries
- Connection pooling
- Multithreading support
- A batch interface
- A class for mapping Python classes to Cassandra column families
The latest release is compatible with Cassandra 0.7 and 0.8.
pycassa is open source under the MIT license.
Documentation can be found here:
http://pycassa.github.com/pycassa/
It includes installation instructions, a tutorial, API documentation, and a change log.
IRC:
- Use channel #cassandra on irc.freenode.net. If you don't have an IRC client, you can use freenode's web-based client.
Mailing List:
- User list: http://groups.google.com/group/pycassa-discuss
- Developer list: http://groups.google.com/group/pycassa-devel
If easy_install is available, you can use:
easy_install pycassa
The simplest way to install manually is to copy the pycassa directories into your program. If you would rather install the package, make sure thrift is installed first, then run setup.py as a superuser:
easy_install thrift
python setup.py install
All functions are documented with docstrings. To read the usage documentation for a function, use help():
>>> import pycassa
>>> help(pycassa.ColumnFamily.get)
To get a connection pool, pass a keyspace name and an optional list of servers:
>>> pool = pycassa.ConnectionPool('Keyspace1') # Defaults to connecting to the server at 'localhost:9160'
>>> pool = pycassa.ConnectionPool('Keyspace1', server_list=['192.168.2.10'])
See the tutorial for more details.
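The default above is written as 'localhost:9160', that is, a host followed by a port. As a minimal sketch, assuming you want every entry in your server list to name its port explicitly, a small helper can fill in the default. Note that with_default_port and the host cass2.example.com are hypothetical names used for illustration; they are not part of pycassa.

```python
DEFAULT_PORT = 9160  # port taken from the default 'localhost:9160' shown above


def with_default_port(servers, port=DEFAULT_PORT):
    """Append ':port' to entries that do not already name a port.

    Hypothetical helper, not part of pycassa. It only handles hostnames
    and IPv4 addresses (an IPv6 address contains colons of its own).
    """
    result = []
    for server in servers:
        if ':' in server:
            # The entry already carries a port; keep it unchanged.
            result.append(server)
        else:
            result.append('%s:%d' % (server, port))
    return result


server_list = with_default_port(['192.168.2.10', 'cass2.example.com:9161'])
print(server_list)
```

The resulting list can then be passed as the server_list argument of ConnectionPool.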
To use the standard interface, create a ColumnFamily instance.
>>> pool = pycassa.ConnectionPool('Keyspace1')
>>> cf = pycassa.ColumnFamily(pool, 'Standard1')
>>> cf.insert('foo', {'column1': 'val1'})
1261349837816957
>>> cf.get('foo')
{'column1': 'val1'}
insert() also updates existing values:
>>> cf.insert('foo', {'column1': 'val2'})
1261349910511572
>>> cf.get('foo')
{'column1': 'val2'}
You may insert multiple columns at once:
>>> cf.insert('bar', {'column1': 'val3', 'column2': 'val4'})
1261350013606860
>>> cf.multiget(['foo', 'bar'])
{'foo': {'column1': 'val2'}, 'bar': {'column1': 'val3', 'column2': 'val4'}}
>>> cf.get_count('bar')
2
get_range() returns an iterable. Pass it to list() to convert it to a list.
>>> list(cf.get_range())
[('bar', {'column1': 'val3', 'column2': 'val4'}), ('foo', {'column1': 'val2'})]
>>> list(cf.get_range(row_count=1))
[('bar', {'column1': 'val3', 'column2': 'val4'})]
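Because get_range() returns an iterable of (key, columns) pairs, you can also loop over it directly instead of building a list. The following is a minimal sketch of that pattern. To keep it runnable without a Cassandra server, it uses a plain generator, fake_get_range, as a hypothetical stand-in for cf.get_range(); first_rows is likewise an illustrative helper, not a pycassa function.

```python
from itertools import islice


def fake_get_range():
    # Hypothetical stand-in for cf.get_range(): yields (key, columns) pairs.
    yield ('bar', {'column1': 'val3', 'column2': 'val4'})
    yield ('foo', {'column1': 'val2'})


def first_rows(rows, n):
    """Take at most n (key, columns) pairs from any iterable of rows."""
    return list(islice(rows, n))


# Loop over the rows one at a time, keeping only the keys.
keys = [key for key, columns in fake_get_range()]

# Stop after the first row without consuming the rest of the iterable.
head = first_rows(fake_get_range(), 1)

print(keys)
print(head)
```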
You can remove an entire key or only specific columns.
>>> cf.remove('bar', columns=['column1'])
1261350220106863
>>> cf.get('bar')
{'column2': 'val4'}
>>> cf.remove('bar')
1261350226926859
>>> cf.get('bar')
Traceback (most recent call last):
...
cassandra.ttypes.NotFoundException: NotFoundException()
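As the traceback shows, get() raises an exception for a missing key rather than returning None. If you would prefer a default value, you can wrap the call. This is a minimal sketch, not part of pycassa: get_or_default, FakeNotFound, and FakeColumnFamily are hypothetical names, and the fake classes exist only so the example runs without a server. With a real column family you would pass the NotFoundException class from the traceback above as not_found_exc.

```python
class FakeNotFound(Exception):
    """Stand-in for NotFoundException (hypothetical, for illustration)."""


class FakeColumnFamily(object):
    """In-memory stand-in that mimics only get(); not a pycassa class."""

    def __init__(self, rows):
        self._rows = rows

    def get(self, key):
        if key not in self._rows:
            raise FakeNotFound()
        return self._rows[key]


def get_or_default(cf, key, not_found_exc, default=None):
    """Return cf.get(key), or default if the key does not exist."""
    try:
        return cf.get(key)
    except not_found_exc:
        return default


cf = FakeColumnFamily({'foo': {'column1': 'val2'}})
found = get_or_default(cf, 'foo', FakeNotFound)
missing = get_or_default(cf, 'bar', FakeNotFound, default={})
print(found)
print(missing)
```

Passing the exception class as an argument keeps the helper independent of where that class is imported from.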