datitran/spark-tdd-example

A simple PySpark example using TDD

This is a very basic example of how to use Test Driven Development (TDD) in the context of PySpark, Spark's Python API.

Getting Started

  1. Use Homebrew to install Apache Spark: brew install apache-spark
  2. Reduce the logging verbosity:
  • cd /usr/local/Cellar/apache-spark/2.1.0/libexec/conf
  • cp log4j.properties.template log4j.properties
  • In log4j.properties, change the log level from INFO to ERROR: log4j.rootCategory=ERROR, console
  3. Add this to your bash profile: export SPARK_HOME="/usr/local/Cellar/apache-spark/2.1.0/libexec/"
  4. Use nosetests to run the test: nosetests -vs test_clustering.py

Dependencies

Copyright

See LICENSE for details. Copyright (c) 2017 Dat Tran.