This is a very basic example of how to use Test-Driven Development (TDD) in the context of PySpark, Spark's Python API.
Author: Dat Tran
License: See LICENSE.txt
- Use brew to install Apache Spark:
brew install apache-spark
- Change logging settings:
cd /usr/local/Cellar/apache-spark/1.6.1/libexec/conf
cp log4j.properties.template log4j.properties
- In log4j.properties, change the root logger level from INFO to ERROR so Spark's console output doesn't drown out the test results:
log4j.rootCategory=ERROR, console
- Add this to your bash profile:
export SPARK_HOME="/usr/local/Cellar/apache-spark/1.6.1/libexec/"
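If your test runner imports pyspark directly, the Spark Python libraries usually also need to be on PYTHONPATH. A possible sketch of the relevant profile lines (the py4j zip name is an assumption for Spark 1.6.1 — check the python/lib directory of your install):

```shell
# Point at the Homebrew Spark install (adjust the version to match yours):
export SPARK_HOME="/usr/local/Cellar/apache-spark/1.6.1/libexec/"
# Make pyspark importable from test runners such as nosetests
# (py4j-0.9-src.zip is the bundled version for Spark 1.6.x):
export PYTHONPATH="$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH"
```

After editing the profile, open a new shell or run `source ~/.bash_profile` so the variables take effect.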
- Use nosetests to run the test:
nosetests -vs test_clustering.py
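As a minimal sketch of the TDD style used here (the parse_point helper is a hypothetical example, not the repo's actual code): keeping per-record logic in a plain Python function means nosetests can exercise it directly, without starting a SparkContext.

```python
import unittest

# Hypothetical helper: turns a CSV line into a list of floats, as a
# clustering job might do before feeding points to KMeans. The Spark job
# would apply it with rdd.map(parse_point); the test below needs no Spark.
def parse_point(line):
    return [float(x) for x in line.split(",")]

class ParsePointTest(unittest.TestCase):
    # Written first, TDD style: the test pins down the expected behaviour,
    # then parse_point is implemented to make it pass. nosetests discovers
    # unittest.TestCase subclasses automatically.
    def test_parses_a_csv_line_into_floats(self):
        self.assertEqual(parse_point("1.0,2.5,3"), [1.0, 2.5, 3.0])
```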