A collection of my MapReduce jobs for various quick analyses
This package is a collection of the quick MapReduce jobs that I tend to run over unstructured text. The repository doubles as my MapReduce development environment: I write my jobs in Python using Dumbo, and this gives me a quick way to get set up, run MR jobs, and deploy them to my cluster.
These jobs are provided as open source, so please feel free to use them. Note, however, that there are no tests and no real API documentation other than what's in the code files themselves.
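For anyone unfamiliar with the Dumbo style of job, here is a minimal word-count sketch. It is not one of the jobs in this repository, just an illustration of the mapper/reducer generator shape that Dumbo jobs typically use. The `run_local` helper is hypothetical: it simulates the map, sort, and reduce phases in-process so the example runs without Hadoop or Dumbo installed.

```python
from itertools import groupby
from operator import itemgetter


def mapper(key, value):
    # key: record identifier (ignored here); value: one line of text.
    for word in value.lower().split():
        yield word, 1


def reducer(key, values):
    # values: an iterable of the counts emitted for this word.
    yield key, sum(values)


def run_local(lines):
    """Hypothetical helper: simulate map -> sort/shuffle -> reduce locally."""
    mapped = [pair for i, line in enumerate(lines) for pair in mapper(i, line)]
    mapped.sort(key=itemgetter(0))  # stands in for the shuffle/sort phase
    results = {}
    for word, group in groupby(mapped, key=itemgetter(0)):
        for out_key, total in reducer(word, (count for _, count in group)):
            results[out_key] = total
    return results


if __name__ == "__main__":
    # With Dumbo installed, the usual entry point would instead be:
    #   import dumbo
    #   dumbo.run(mapper, reducer)
    print(run_local(["the quick brown fox", "the lazy dog"]))
```

Keeping the mapper and reducer as plain generator functions means the same pair can be exercised locally like this before being handed to Dumbo on a cluster.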
-Ben