Skip to content

imlzg/aioredis_bloom

 
 

Repository files navigation

aioredis_bloom

|Build status|

|Coverage status|

A simple Bloom filter written in Python 3 with asyncio using, redis ( aioredis ) as storage and the Murmur (mmh3) hash. Bloom filter is a space-efficient probabilistic data structure, is used to test whether an element is a member of a set.

Basic api:

import asyncio
import aioredis
from aioredis_bloom import BloomFilter


loop = asyncio.get_event_loop()


@asyncio.coroutine
def go():
    redis = yield from aioredis.create_redis(
        ('localhost', 6379), loop=loop)
    capacity = 100000  # expected capacity of bloom filter
    error_rate = 0.0001  # expected error rate
    # size of underlying array is calculated from capacity and error_rate
    bloom = BloomFilter(redis, 100000, 0.0001)
    yield from bloom.add('python')
    yield from bloom.add('asyncio')
    result = yield from bloom.contains('tornado')
    assert result==False

    redis.close()
loop.run_until_complete(go())

Intersection and union of two bloom filters:

Intersection and union of filters requires both filters to have both the same capacity and error rate.

import asyncio
import aioredis
from aioredis_bloom import BloomFilter


loop = asyncio.get_event_loop()


@asyncio.coroutine
def go():
    redis = yield from aioredis.create_redis(
        ('localhost', 6379), loop=loop)

    bloom1 = BloomFilter(redis, 1000, 0.001, 'bloom:1')
    bloom2 = BloomFilter(redis, 1000, 0.001, 'bloom:2')

    # init data in both bloom filters
    yield from bloom1.add('tornado')
    yield from bloom1.add('python')

    yield from bloom2.add('asyncio')
    yield from bloom2.add('python')

    # intersection
    inter_bloom = yield from bloom1.intersection(bloom2)

    in_bloom1 = yield from inter_bloom.contains('python')
    print(in_bloom1)  # True
    in_bloom2 = yield from inter_bloom.contains('asyncio')
    print(in_bloom2)  # False

    # union
    union_bloom = yield from bloom1.intersection(bloom2)

    in_bloom1 = yield from union_bloom.contains('python')
    print(in_bloom1)  # True
    in_bloom2 = yield from union_bloom.contains('asyncio')
    print(in_bloom2)  # True

    redis.close()

loop.run_until_complete(go())

Requirements

Thanks

I've learned a lot from following projects:

License

The aioredis_bloom is offered under MIT license.

About

Bloom filter for aioredis

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 96.0%
  • Makefile 4.0%