
ActiveData

Provides high-speed filtering and aggregation over data. See the ActiveData Wiki Page for project details.

Use it now!

ActiveData is a service! You can certainly set up your own instance, but it is easier to use Mozilla's:

curl -XPOST -d "{\"from\":\"unittest\"}" http://activedata.allizom.org/query
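
The same request can be sent from Python. The following is a minimal client-side sketch using only the standard library (Python 3 here, independent of the Python 2.7 the service itself needs). The helper names `build_query_request` and `run_query` are hypothetical, and the sketch assumes the service answers with a JSON body:

```python
import json
from urllib.request import Request, urlopen

# Public endpoint taken from the curl example above.
ACTIVEDATA_URL = "http://activedata.allizom.org/query"


def build_query_request(query, url=ACTIVEDATA_URL):
    """Build (but do not send) a POST request carrying the query as JSON."""
    body = json.dumps(query).encode("utf-8")
    return Request(url, data=body, method="POST")


def run_query(query, url=ACTIVEDATA_URL, timeout=30):
    """Send the query and decode the response.

    Requires network access; assumes the response body is JSON.
    """
    with urlopen(build_query_request(query, url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_query({"from": "unittest"})` sends the same payload as the curl command. Building the request separately from sending it makes the payload easy to inspect without touching the network.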

Requirements

  • Python 2.7
  • Access to an Elasticsearch cluster, or a local Elasticsearch installation

Installation

It is still too early for a PyPI install, so please clone master from GitHub:

git clone https://github.com/klahnakoski/ActiveData.git
cd ActiveData
git checkout master

and install your requirements:

pip install -r requirements.txt

Configuration

The ActiveData service requires a configuration file that points to the default Elasticsearch index. You can find a few sample config files in resources/config. simple_settings.json is the simplest one:

    {
        "flask":{
             "host":"0.0.0.0",
             "port":5000,
             "debug":false,
             "threaded":true,
             "processes":1
         },
        "constants":{
            "pyLibrary.env.http.default_headers":{"From":"https://wiki.mozilla.org/Auto-tools/Projects/ActiveData"}
        },
        "elasticsearch":{
            "host":"http://localhost",
            "port":9200,
            "index":"unittest",
            "type":"test_result",
            "debug":true
        }
    }

The elasticsearch property must be updated to point to a specific cluster, index and type. It is used as a default, and to find other indexes by name.
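
Before starting the service, it can help to confirm that the settings file actually names a cluster, index, and type. The following is a rough sketch, not part of ActiveData: the helper names are hypothetical, and treating the four keys `host`, `port`, `index`, and `type` from the sample file as mandatory is an assumption.

```python
import json

# Keys taken from the elasticsearch block of simple_settings.json.
# Treating all four as mandatory is an assumption of this sketch.
REQUIRED_ES_KEYS = ("host", "port", "index", "type")


def missing_elasticsearch_keys(settings):
    """Return the expected keys that are absent from settings["elasticsearch"]."""
    es = settings.get("elasticsearch", {})
    return [key for key in REQUIRED_ES_KEYS if key not in es]


def load_settings(path):
    """Read a settings file and fail early if the elasticsearch block is incomplete."""
    with open(path) as f:
        settings = json.load(f)
    missing = missing_elasticsearch_keys(settings)
    if missing:
        raise ValueError("elasticsearch settings missing: " + ", ".join(missing))
    return settings
```

For example, `load_settings("resources/config/simple_settings.json")` should return the parsed settings when the file matches the sample above.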

Run

Jump to your git project directory, set your PYTHONPATH and run:

    cd ~/ActiveData
    export PYTHONPATH=.
    python active_data/app.py --settings=resources/config/simple_settings.json

Verify

Assuming you used the defaults, you can verify the service is up if you can access the Query Tool at http://localhost:5000/tools/query.html. You may use it to send queries to your instance of the service. For example:

    {"from":"unittest"}

This query can be used on Engineering Productivity's public ActiveData instance, and you can use a similar query to get a few sample lines from your cluster.
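
The availability check described above can also be scripted. This is a minimal sketch under the default host and port. `service_is_up` is a hypothetical helper, and the `opener` parameter exists only so the function can be exercised without a running service:

```python
from urllib.error import URLError
from urllib.request import urlopen

# Default Query Tool address from the text above (flask defaults: port 5000).
QUERY_TOOL_URL = "http://localhost:5000/tools/query.html"


def service_is_up(url=QUERY_TOOL_URL, opener=urlopen, timeout=5):
    """Return True if the page answers with HTTP 200, False if it cannot be reached."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

A `True` result only shows that the Query Tool page is being served. Sending `{"from":"unittest"}` through the tool is still the better test that the Elasticsearch connection works.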

Tests

The GitHub repo also includes the test suite, and you can run it against your service if you wish. The tests create indexes on your cluster, which are filled, queried, and destroyed:

    cd ~/ActiveData
    export PYTHONPATH=.
    # OPTIONAL, TEST_SETTINGS already defaults to this file
    export TEST_SETTINGS=tests/config/test_simple_settings.json
    python -m unittest discover -v -s tests
