Blaze is the next generation of NumPy. It is designed as a foundational set of abstractions on which to build out-of-core and distributed algorithms over a wide variety of data sources, and to extend the structure of NumPy itself.
Our goal is to allow easy composition of low-level computation kernels (C, Fortran, Numba) to form complex data transformations on large datasets.
In Blaze, computations are described in a high-level language (Python) but executed on a low-level runtime outside of Python. This allows high-level expertise to be mapped onto data easily without sacrificing low-level performance. Blaze aims to bring Python and NumPy into the massively multicore arena, allowing them to leverage many CPU and GPU cores across computers, virtual machines and cloud services.
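The split between describing a computation and executing it can be illustrated with a toy deferred-expression graph. This is only a sketch of the general idea: the `Expr`, `Leaf`, `BinOp`, `wrap` and `evaluate` names are hypothetical and are not Blaze's API, and the "runtime" here is plain Python rather than a low-level backend.

```python
# Toy illustration only: every name below is hypothetical, not Blaze's API.
import operator


class Expr:
    """A node in a deferred expression graph; building it computes nothing."""

    def __add__(self, other):
        return BinOp(operator.add, self, wrap(other))

    def __mul__(self, other):
        return BinOp(operator.mul, self, wrap(other))


class Leaf(Expr):
    """Holds a reference to data (a Python list or a scalar)."""

    def __init__(self, data):
        self.data = data


class BinOp(Expr):
    """Records an elementwise operation and its two operands."""

    def __init__(self, op, left, right):
        self.op = op
        self.left = left
        self.right = right


def wrap(value):
    """Promote plain values to leaves so they can appear in the graph."""
    return value if isinstance(value, Expr) else Leaf(value)


def evaluate(expr):
    """Stand-in for a runtime: walk the graph and compute the result."""
    if isinstance(expr, Leaf):
        return expr.data
    left = evaluate(expr.left)
    right = evaluate(expr.right)
    if not isinstance(left, list) and not isinstance(right, list):
        return expr.op(left, right)
    # Broadcast a scalar operand against the list operand.
    if not isinstance(left, list):
        left = [left] * len(right)
    if not isinstance(right, list):
        right = [right] * len(left)
    return [expr.op(a, b) for a, b in zip(left, right)]


a = Leaf([1, 2, 3])
b = Leaf([10, 20, 30])
graph = a * 2 + b          # builds a graph; nothing is computed yet
result = evaluate(graph)   # the "runtime" does the work here
```

Because `graph` is just a description, a different `evaluate` could in principle hand the same graph to compiled kernels or to several machines without changing the user's code.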
The general parallelization and distributed scheduling problem is extremely difficult and under active research, so we do not aim to solve it in its full generality. Instead, we aim to provide a compact set of abstractions and types to express general transformations between code and data, along with a framework for exploring distributed computations.
At the same time, most analysts and scientific-computing users spend a large portion of their time combating practical, operational issues such as cleaning data, matching data formats, and navigating heterogeneous technology environments. Blaze aims to tackle this problem in its entirety and become a "glue project" that allows users of other PyData projects (Pandas, Theano, Numba, SciPy, Scikit-Learn) to interoperate.
Blaze is currently a work in progress, and the code is far from feature complete. It is being released to start a public discussion with our end users and community.
If you are interested in the development version of Blaze, you can obtain the source from GitHub.
$ git clone git@github.com:ContinuumIO/blaze.git
Many of the dependencies (llvm, numba, ...) are non-trivial to install. It is highly recommended that you build Blaze using the Anaconda Python distribution.
Anaconda CE is available for free here: http://continuum.io/anacondace.html
Install the remaining dependencies using Anaconda's package manager:
$ conda install ply
$ conda install blosc
To build the project inside of Anaconda:
$ make build
To build documentation:
$ make docs
To run tests:
$ python setup.py test
If you prefer not to use Anaconda, it is possible to build Blaze using standard Python tools. This method is not recommended.
- After you have checked out the Blaze source, create a virtualenv under the root of the Blaze repo.
$ virtualenv venv --distribute --no-site-packages
$ . venv/bin/activate
- Pull the Conda package manager for use inside of your virtualenv.
$ git clone git@github.com:ContinuumIO/conda.git
- Build and install conda.
$ cd conda
$ python setup.py install
$ cd ..
- Create a directory in your virtualenv to mimic the behavior of Anaconda and allow Continuum signed packages to be installed.
$ mkdir venv/pkgs
- Add `conda` to your path.
$ PATH=venv/bin:$PATH
- Use conda to resolve the Blaze dependencies.
$ conda install ply
$ conda install blosc
$ conda install numpy
$ conda install cython
- From inside the Blaze directory run the Makefile.
$ make build
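Before running the build, it can help to confirm that the dependencies installed in the steps above are visible to the active interpreter. The following is a minimal sketch; the `missing_modules` helper and the `REQUIRED` list are illustrative and not part of Blaze, and the import names are assumptions (for example, the `cython` package is imported as `Cython`).

```python
import importlib.util

# Assumed import names for the packages installed above; adjust as needed.
REQUIRED = ["ply", "blosc", "numpy", "Cython"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
    else:
        print("All listed dependencies were found.")
```

Run it with the same `python` you will use for the build, so that the check reflects the environment the Makefile will see.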
Anyone wishing to discuss Blaze should join the [blaze-dev](https://groups.google.com/a/continuum.io/forum/#!forum/blaze-dev) mailing list at: blaze-dev@continuum.io
Blaze development is sponsored by Continuum Analytics.
Released under the BSD license. See LICENSE for details.