JohnEmhoff/disco


Disco - Massive data, Minimal code

Disco is an implementation of the [Map-Reduce](http://en.wikipedia.org/wiki/MapReduce) framework for distributed computing. Like the original framework, which was publicized by Google, Disco supports parallel computations over large data sets on an unreliable cluster of computers. This makes it well suited to analyzing and processing large datasets without having to worry about difficult technical questions related to distributed computing, such as communication protocols, load balancing, locking, job scheduling, or fault tolerance, all of which are taken care of by Disco.
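
To make the programming model concrete, here is a minimal single-process sketch of a Map-Reduce word count in plain Python. It does not use Disco's API; the function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) are hypothetical and chosen only for illustration. In a real cluster, a framework such as Disco would run the map and reduce steps in parallel and handle the grouping step between them.

```python
from collections import defaultdict
from itertools import chain


def map_words(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reduce phase: collapse all counts for one word into a single total."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce sequentially over an iterable of lines."""
    mapped = chain.from_iterable(map_words(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    # Illustrative input only; a real job would read from distributed storage.
    print(word_count(["massive data", "minimal code", "Massive clusters"]))
```

The user supplies only the map and reduce functions; everything else (here, `shuffle` and `word_count`) stands in for work the framework does on the user's behalf.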

See [discoproject.org](http://discoproject.org) for more information.

Note: To install Disco, you cannot use the zip or tar.gz packages generated by GitHub; clone this repository instead.

The develop branch contains the newest features and is not recommended for use in production. The master branch is the latest stable release and is tested in production. Important bug fixes are first merged into the develop branch and then backported to the master branch.

