def test_map(resources):
    with Pool(name='test-pool') as pool:
        results = pool.map(lambda tpl: tpl[0] + tpl[1],
                           zip(range(3), range(3)), resources=resources)
        expected = [i + i for i in range(3)]
        assert results == expected


def test_apply():
    with Pool(name='test-pool') as pool:
        result = pool.apply(lambda a, b: a + b, [1, 3])
        assert result == 4

        result = pool.apply(lambda a, b: a + b, ['a', 'b'])
        assert result == 'ab'

        pool.wait(seconds=20)


def test_map_async(resources):
    with Pool(name='test-pool') as pool:
        results = pool.map_async(lambda tpl: tpl[0] + tpl[1],
                                 zip(range(5), range(5)), resources=resources)
        assert all(isinstance(res, AsyncResult) for res in results)

        values = [res.get(timeout=30) for res in results]
        assert values == [i + i for i in range(5)]


def test_multiple_apply_async(resources):
    with Pool(name='test-pool') as pool:
        results = [pool.apply_async(lambda a, b: a + b, [1, i],
                                    resources=resources)
                   for i in range(5)]
        values = [res.get(timeout=20) for res in results]
        assert values == [i + 1 for i in range(5)]


def test_apply_async():
    with Pool(name='test-pool') as pool:
        res1 = pool.apply_async(lambda a, b: a + b, [1, 2])
        res2 = pool.apply_async(lambda a, b: a * b, [3, 5])
        pool.wait()

        assert isinstance(res1, AsyncResult)
        assert isinstance(res2, AsyncResult)
        assert res1.get(timeout=10) == 3
        assert res2.get(timeout=10) == 15


def test_multiple_apply_async(resources):
    def fn(a, b):
        return a + b

    with Pool(name='test-pool') as pool:
        results = [pool.apply_async(fn, [1, i], resources=resources)
                   for i in range(10)]
        values = [res.get(timeout=20) for res in results]
        assert values == [i + 1 for i in range(10)]


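def test_queue_apply_async(zk, resources):
    def feed(i, queue):
        # Each remote task pushes its pickled index onto the shared queue.
        queue.put(cp.dumps(i))

    queue = Queue(zk, '/satyr/test-pool')
    with Pool(name='test-pool') as pool:
        results = [pool.apply_async(feed, [i, queue], resources=resources)
                   for i in range(5)]
        pool.wait(seconds=30)

    # Give the queue a moment to settle before draining it.
    time.sleep(1)
    results = [cp.loads(queue.get()) for _ in range(5)]
    # Compare against a list: on Python 3 a list never equals a range object.
    assert sorted(results) == list(range(5))

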
def get(dsk, keys, optimizations=(), num_workers=None,
        docker='lensa/dask.mesos',
        zk=os.getenv('ZOOKEEPER_HOST', '127.0.0.1:2181'),
        mesos=os.getenv('MESOS_MASTER', '127.0.0.1:5050'),
        **kwargs):
    """Mesos get function appropriate for Bags

    Parameters
    ----------
    dsk: dict
        dask graph
    keys: object or list
        Desired results from graph
    optimizations: list of functions
        optimizations to perform on graph before execution
    num_workers: int
        Number of worker processes (defaults to number of cores)
    docker: string
        Default docker image name to run the dask in
    zk: string
        Zookeeper host and port the distributed Queue should connect to
    mesos: string
        Mesos Master hostname and port the Satyr framework should connect to
    """
    pool, kazoo = _globals['pool'], _globals['kazoo']

    # Start a pool only if no global one exists, and remember to stop it.
    if pool is None:
        pool = Pool(name='dask-pool', master=mesos, processes=num_workers)
        pool.start()
        cleanup_pool = True
    else:
        cleanup_pool = False

    # Same for the Zookeeper client backing the distributed queue.
    if kazoo is None:
        kazoo = KazooClient(hosts=zk)
        kazoo.start()
        cleanup_kazoo = True
    else:
        cleanup_kazoo = False

    # Optimize Dask
    dsk2, dependencies = cull(dsk, keys)
    dsk3, dependencies = fuse(dsk2, keys, dependencies)
    dsk4 = pipe(dsk3, *optimizations)

    def apply_async(execute_task, args):
        key = args[0]
        func = args[1][0]
        # Copy the params so the SatyrPack instance itself is not mutated.
        params = dict(func.params) if isinstance(func, SatyrPack) else {}
        params['id'] = key
        if 'docker' not in params:
            params['docker'] = docker
        return pool.apply_async(execute_task, args, **params)

    try:
        # Run the fully optimized graph (dsk4), so that the user-supplied
        # optimizations are actually applied.
        queue = Queue(kazoo, str(uuid4()))
        result = get_async(apply_async, 1e4, dsk4, keys, queue=queue, **kwargs)
    finally:
        if cleanup_kazoo:
            kazoo.stop()
        if cleanup_pool:
            pool.stop()

    return result


def test_apply_async_timeout():
    with pytest.raises(TimeoutError):
        with Pool(name='test-pool') as pool:
            res = pool.apply_async(time.sleep, (3,))
            res.get(timeout=1)