Blaze is the next generation of NumPy. It is designed as a foundational set of abstractions on which to build out-of-core and distributed algorithms over a wide variety of data sources, and to extend the structure of NumPy itself.

Our goal is to allow easy composition of low-level computation kernels (C, Fortran, Numba) to form complex data transformations on large datasets.

In Blaze, computations are described in a high-level language (Python) but executed on a low-level runtime outside of Python, which allows high-level expertise to be mapped easily onto data without sacrificing low-level performance. Blaze aims to bring Python and NumPy into the massively multicore arena, allowing them to leverage many CPU and GPU cores across computers, virtual machines, and cloud services.
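As a rough illustration of describing a computation in Python while deferring its execution, the sketch below builds an expression graph and then evaluates it one chunk at a time. It is not Blaze's API: `Expr`, `Source`, `Scalar`, `BinOp`, `as_expr`, and `run` are hypothetical names chosen for this example, and plain Python lists stand in for the low-level runtime and its kernels.

```python
# Hypothetical illustration of deferred, chunked evaluation.
# None of these names come from Blaze itself.
import operator


class Expr:
    """Base class: arithmetic builds graph nodes instead of computing."""

    def __add__(self, other):
        return BinOp(operator.add, self, as_expr(other))

    def __mul__(self, other):
        return BinOp(operator.mul, self, as_expr(other))


class Source(Expr):
    """A leaf that wraps some data (here, an in-memory list)."""

    def __init__(self, values):
        self.values = list(values)

    def length(self):
        return len(self.values)

    def evaluate(self, start, stop):
        return self.values[start:stop]


class Scalar(Expr):
    """A constant that is repeated to match the size of each chunk."""

    def __init__(self, value):
        self.value = value

    def length(self):
        return None  # scalars have no length of their own

    def evaluate(self, start, stop):
        return [self.value] * (stop - start)


class BinOp(Expr):
    """An interior node: applies an elementwise kernel to two operands."""

    def __init__(self, kernel, left, right):
        self.kernel, self.left, self.right = kernel, left, right

    def length(self):
        sizes = {n for n in (self.left.length(), self.right.length())
                 if n is not None}
        if len(sizes) > 1:
            raise ValueError("operands have different lengths")
        return sizes.pop() if sizes else None

    def evaluate(self, start, stop):
        lhs = self.left.evaluate(start, stop)
        rhs = self.right.evaluate(start, stop)
        return [self.kernel(a, b) for a, b in zip(lhs, rhs)]


def as_expr(value):
    """Wrap plain Python numbers so they can take part in the graph."""
    return value if isinstance(value, Expr) else Scalar(value)


def run(expr, chunk_size=2):
    """Evaluate the graph chunk by chunk, never touching all data at once."""
    n = expr.length()
    if n is None:
        raise ValueError("expression contains no array operand")
    out = []
    for start in range(0, n, chunk_size):
        out.extend(expr.evaluate(start, min(start + chunk_size, n)))
    return out


a = Source([1, 2, 3, 4, 5])
b = Source([10, 20, 30, 40, 50])
expr = (a + b) * 2   # builds a graph; nothing is computed yet
result = run(expr)   # evaluation happens here, two elements at a time
print(result)        # [22, 44, 66, 88, 110]
```

Separating the description (`expr`) from the evaluation (`run`) is what would let a real runtime choose where and how each chunk is processed; in this toy version the "runtime" is just a loop.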

The general parallelization and distributed scheduling problem is extremely difficult and under active research, so we do not aim to solve it in its full generality. Instead, we aim to provide a compact set of abstractions and types to express general transformations between code and data, in addition to a framework for exploring distributed computations.

At the same time, most analysts and scientific-computing users spend a large portion of their time combating practical, operational issues such as cleaning data, matching data formats, and navigating heterogeneous technology environments. Blaze aims to tackle this problem in its entirety and become a "glue project" that allows users of other PyData projects (Pandas, Theano, Numba, SciPy, Scikit-Learn) to interoperate.

Status

Blaze is a work in progress at the moment. The code is quite a distance from feature complete. The code is released in an effort to start a public discussion with our end users and community.

Documentation

Installing

If you are interested in the development version of Blaze, you can obtain the source from GitHub.

$ git clone git@github.com:ContinuumIO/blaze.git

Many of the dependencies (llvm, numba, ...) are non-trivial to install. It is highly recommended that you build Blaze using the Anaconda Python distribution.

Free Anaconda CE is available here: http://continuum.io/anacondace.html

Using Anaconda's package manager:

$ conda install ply
$ conda install blosc

Building and Testing

To build the project inside Anaconda:

$ make build

To build documentation:

$ make docs

To run tests:

$ python setup.py test

Alternative Installation

If you prefer not to use Anaconda, it is possible to build Blaze using standard Python tools. This method is not recommended.

  1. After you have checked out the Blaze source, create a virtualenv under the root of the Blaze repo.
$ virtualenv venv --distribute --no-site-packages
$ . venv/bin/activate
  2. Pull the conda package manager for use inside of your virtualenv.
$ git clone git@github.com:ContinuumIO/conda.git
  3. Build and install conda.
$ cd conda
$ python setup.py install
$ cd ..
  4. Create a directory in your virtualenv to mimic the behavior of Anaconda and allow Continuum signed packages to be installed.
$ mkdir venv/pkgs
  5. Add conda to your path.
$ PATH=venv/bin:$PATH
  6. Use conda to resolve Blaze dependencies.
$ conda install ply
$ conda install blosc
$ conda install numpy
$ conda install cython
  7. From inside the Blaze directory, run the Makefile.
$ make build
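
Before or after running the build, it can help to confirm that the packages named in the steps above (ply, blosc, numpy, cython) are visible to the active interpreter. The script below is a minimal sketch and not part of the Blaze repository: `REQUIRED_MODULES` and `missing_modules` are hypothetical names, the import names are assumed from the package names and may differ in your environment, and it assumes Python 3.4 or later for `importlib.util.find_spec`. It does not check llvm or numba.

```python
# Hypothetical dependency check; not shipped with Blaze.
import importlib.util

# Import names assumed from the packages installed above. Cython's import
# name is capitalised even though the package is installed as "cython".
REQUIRED_MODULES = ["ply", "blosc", "numpy", "Cython"]


def missing_modules(names):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

Run it with the virtualenv activated so that it inspects the same interpreter that `make build` will use.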

Contributing

Anyone wishing to discuss Blaze should join the [blaze-dev](https://groups.google.com/a/continuum.io/forum/#!forum/blaze-dev) mailing list at: blaze-dev@continuum.io

License

Blaze development is sponsored by Continuum Analytics.

Released under the BSD license. See LICENSE for details.
