Spark eQTL

This code enables eQTL analysis in Apache Spark using Spark's Python API. It has been tested with Spark 1.3.1 and 1.4.0.

Click here for the corresponding report, which explains the motivation, design, and outline of the algorithm.

Requirements:

Start a Spark master and attach some workers to it.
Set up your Spark context within a Python shell (see spark_context.py for an example; the number of cores and the amount of memory are defined there).
Define the paths to your data inside trans_analysis.py.
Logging behaviour is defined in your Spark directory (very verbose by default).
Within the Python shell, call trans_analysis.py. Command line arguments define the output name and the chromosome to be analyzed. For example: $run trans_analysis.py 'full_analysis_chrom_1' 'chr1'
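
The context setup step might look roughly like the sketch below. The actual spark_context.py is not reproduced here, so the master URL, core count, memory size, and the helper names `spark_settings` and `make_context` are illustrative assumptions. Only the `SparkConf` and `SparkContext` calls and the property names `spark.master`, `spark.cores.max`, and `spark.executor.memory` are standard Spark.

```python
def spark_settings(master_url, cores, memory):
    """Return Spark properties as (key, value) pairs (hypothetical helper)."""
    return [
        ("spark.master", master_url),
        ("spark.cores.max", str(cores)),
        ("spark.executor.memory", memory),
    ]


def make_context(settings, app_name="spark_eqtl"):
    """Create a SparkContext; needs pyspark installed and a running master."""
    # Imported lazily so spark_settings() can be used without Spark installed.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName(app_name).setAll(settings)
    return SparkContext(conf=conf)


# Assumed values: adjust the URL, cores, and memory to your own cluster.
settings = spark_settings("spark://localhost:7077", cores=4, memory="2g")
# sc = make_context(settings)  # uncomment once a master and workers are running
```

Keeping the settings in one place makes it easy to see which resources a run will request before the context is created.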

A shell script that automates all these tasks and analyzes the whole genome is provided: run_full_example.sh
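
The contents of run_full_example.sh are not shown here. As a rough illustration of what a whole-genome run involves, the sketch below builds one (output name, chromosome) argument pair per chromosome, following the naming used in the single-chromosome example. It assumes human autosomes chr1 to chr22 and a hypothetical helper name `chromosome_jobs`; the real script may cover a different set of chromosomes.

```python
def chromosome_jobs(chromosomes=range(1, 23), prefix="full_analysis_chrom_"):
    """Return (output_name, chromosome) pairs, one per chromosome (hypothetical helper)."""
    return [("%s%d" % (prefix, n), "chr%d" % n) for n in chromosomes]


jobs = chromosome_jobs()
for output_name, chrom in jobs:
    # Each pair corresponds to the two arguments of one trans_analysis.py call.
    print("trans_analysis.py '%s' '%s'" % (output_name, chrom))
```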
