def certain_kind_tap(data_items):
    """
    As the stream of data items goes by, get different kinds of information
    from them. In this case, collect the things that are fruit and the things
    that are metal, each kind with a different spigot.

    stream_tap doesn't consume the data_items iterator by itself; it's a
    generator and must be consumed by something else. Here the items are
    consumed by casting the iterator to a tuple, but in batches. Since each
    batch is not referenced by anything, its memory can be freed by the
    garbage collector, so no matter the size of data_items, only a little
    memory is needed. The only things retained are the results, which should
    be just a subset of the items; in this case, the getter functions return
    only a portion of each item they match.

    :param data_items: A sequence of unicode strings
    """
    fruit_spigot = Bucket(get_fruit)
    metal_spigot = Bucket(get_metal)

    items = stream_tap((fruit_spigot, metal_spigot), data_items)

    # Drive the generator in batches of 100. Each tuple is discarded right
    # away, so only the spigot contents are retained.
    for batch in i_batch(100, items):
        tuple(batch)

    return fruit_spigot.contents(), metal_spigot.contents()
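

# Illustrative sketch only, not the library's code. The docstring above relies
# on batches being lazy and unreferenced once consumed. One way such a batching
# helper could work is shown here under the hypothetical name sketch_i_batch;
# the real i_batch may be implemented differently. The sketch assumes the
# caller fully consumes each batch (for example with tuple(batch)) before
# asking for the next one, which is what certain_kind_tap does.
from itertools import chain, islice


def sketch_i_batch(batch_size, iterable):
    """
    Hypothetical stand-in for i_batch: yield lazy batches of at most
    batch_size items each, without materializing the whole input.
    """
    iterator = iter(iterable)
    # Pull one item to learn whether anything is left, then lazily attach up
    # to batch_size - 1 further items from the same underlying iterator.
    for first in iterator:
        yield chain((first,), islice(iterator, batch_size - 1))


# Possible usage: each batch is consumed and dropped before the next is made.
#     for batch in sketch_i_batch(100, items):
#         tuple(batch)
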
def test_accumulation_handler(self):
    """
    Ensure accumulation_handler returns the contents of a Bucket instance
    and leaves that Bucket drained.
    """
    # The Bucket applies unicode() to every item it is called with.
    spigot = Bucket(unicode)
    spigot(1)
    spigot(2)

    result = accumulation_handler(None, spigot)

    # The handler hands back the accumulated items...
    self.assertEqual(result, deque([u"1", u"2"]))
    # ...and the Bucket is empty afterwards.
    self.assertEqual(spigot.contents(), deque([]))