A pipeline for human GWAS analysis that accommodates both Affymetrix (raw .CEL files) and Illumina (PLINK binaries) data

magosil86/witsGWAS

Background

witsGWAS is a simple human GWAS analysis workflow for data quality control (QC) and basic association testing, built at the Sydney Brenner Institute. Instead of requiring each command to be entered by hand at the Unix prompt, it organizes GWAS tasks into a sequence (using Ruffus) and submits them to a distributed PBS Torque cluster (managed via Rubra). witsGWAS uses flag files to monitor the progress of the jobs it submits on the user's behalf, waiting for one job to finish before sending the next.
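The flag-file idea described above can be illustrated with a minimal sketch in plain Python. It is not how Ruffus or Rubra implement it, and it runs commands locally rather than submitting them to a cluster. The names `run_stage` and `run_pipeline`, the `.done` suffix and the placeholder commands are all hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_stage(name, command, flag_dir):
    """Run `command` unless a completion flag for `name` already exists."""
    flag = os.path.join(flag_dir, name + ".done")  # hypothetical flag name
    if os.path.exists(flag):
        return "skipped"
    # Blocks until the command exits; raises CalledProcessError on failure,
    # so no flag is written for a failed stage.
    subprocess.run(command, check=True)
    with open(flag, "w"):
        pass  # an empty file marks the stage as finished
    return "ran"


def run_pipeline(stages, flag_dir):
    """Run stages in order; each starts only after the previous one finished."""
    return [run_stage(name, command, flag_dir) for name, command in stages]


if __name__ == "__main__":
    # Placeholder commands; a real stage would launch a QC or association job.
    noop = [sys.executable, "-c", "pass"]
    stages = [("sample_qc", noop), ("snp_qc", noop), ("association", noop)]
    with tempfile.TemporaryDirectory() as flag_dir:
        print(run_pipeline(stages, flag_dir))  # first pass runs every stage
        print(run_pipeline(stages, flag_dir))  # second pass skips them all
```

Because a flag is written only after a stage succeeds, rerunning the pipeline after a failure resumes at the first stage that has no flag.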

Documentation

Installation instructions, examples and tutorials for witsGWAS are available on the witsGWAS_wiki.

Features

QC of Affymetrix array data (SNP6 raw .CEL files)

  • genotype calling
  • converting birdseed calls to PLINK format

Sample and SNP QC of PLINK Binaries

Sample QC tasks cover:

  • discordant sex information
  • sample missingness
  • heterozygosity scores
  • relatedness
  • divergent ancestry
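As a rough illustration of two of these metrics, the sketch below computes a per-sample missing-call rate and heterozygosity rate from a list of genotype calls. It assumes genotypes are coded as 0, 1 or 2 copies of the minor allele, with `None` for a missing call. The function name `sample_qc_metrics` and this coding are assumptions made for the example; the pipeline itself runs its QC on PLINK binaries.

```python
def sample_qc_metrics(genotypes):
    """Return (missing_rate, het_rate) for one sample.

    `genotypes` holds one entry per SNP: 0, 1 or 2 copies of the minor
    allele, or None when the call is missing (assumed coding).
    """
    total = len(genotypes)
    called = [g for g in genotypes if g is not None]
    missing_rate = (total - len(called)) / total
    if called:
        # Share of non-missing calls that are heterozygous.
        het_rate = sum(1 for g in called if g == 1) / len(called)
    else:
        het_rate = float("nan")  # undefined when nothing was called
    return missing_rate, het_rate


if __name__ == "__main__":
    calls = [0, 1, 2, None, 1, 0, 1, None]
    print(sample_qc_metrics(calls))  # (0.25, 0.5)
```

Samples with unusually high missingness or outlying heterozygosity are the usual candidates for exclusion. The cut-offs depend on the study and are not given here.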

SNP QC tasks cover:

  • minor allele frequencies
  • SNP missingness
  • differential missingness
  • deviations from Hardy-Weinberg equilibrium
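To make two of the SNP-level checks concrete, here is a small sketch that derives the minor allele frequency and a chi-square statistic for deviation from Hardy-Weinberg equilibrium from the three genotype counts of a single SNP. The chi-square form is used for brevity and is not necessarily the test the pipeline runs. The function name `maf_and_hwe_chi2` is hypothetical.

```python
def maf_and_hwe_chi2(n_aa, n_ab, n_bb):
    """Return (maf, chi2) for one SNP given its genotype counts.

    n_aa, n_ab, n_bb are the numbers of AA, AB and BB genotypes among
    samples with a non-missing call.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    maf = min(p, q)
    # Genotype counts expected under Hardy-Weinberg equilibrium.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return maf, chi2


if __name__ == "__main__":
    print(maf_and_hwe_chi2(25, 50, 25))  # (0.5, 0.0): exactly in equilibrium
    print(maf_and_hwe_chi2(40, 20, 40))  # (0.5, 36.0): heterozygote deficit
```

The statistic has one degree of freedom. A large value, especially among controls, often points to genotyping error rather than biology.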

Association testing

  • basic PLINK association tests, producing Manhattan and QQ plots
  • Cochran-Mantel-Haenszel (CMH) association test, which accounts for clusters
  • permutation testing
  • logistic regression
  • EMMAX association testing
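For orientation, the simplest kind of case-control association test can be sketched as a 1-degree-of-freedom chi-square test on a 2x2 table of allele counts. The example below is a standalone illustration under that assumption, not the pipeline's code. The function name `allelic_association` is hypothetical, and all four counts are assumed to be non-zero.

```python
import math


def allelic_association(case_a, case_b, control_a, control_b):
    """Return (chi2, p_value, odds_ratio) for a 2x2 allele-count table.

    Rows are cases and controls; columns are counts of alleles A and B.
    All four counts are assumed to be non-zero.
    """
    observed = ((case_a, case_b), (control_a, control_b))
    rows = (case_a + case_b, control_a + control_b)
    cols = (case_a + control_a, case_b + control_b)
    n = rows[0] + rows[1]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n  # count expected under no association
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_a * control_b) / (case_b * control_a)
    return chi2, p_value, odds_ratio


if __name__ == "__main__":
    chi2, p_value, odds_ratio = allelic_association(60, 40, 40, 60)
    print(chi2, odds_ratio)  # 8.0 2.25
    print(p_value)
```

Repeating this test for every SNP yields the p-values that a Manhattan plot shows by genomic position and a QQ plot compares against the expected null distribution.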

Dockerized Pipeline

The pipeline has been dockerized, simplifying its use. See the Dockerized section of the witsGWAS wiki for more information.

Authors

Lerato E. Magosi, Scott Hazelhurst, Rob Clucas and the WITS Bioinformatics team

License

witsGWAS is offered under the MIT license. See LICENSE.txt.

Download

witsGWAS-0.1.0

References

Anderson, C. et al. (2010): Data quality control in genetic case-control association studies. Nature Protocols 5, 1564-1573.

Sloggett, Clare; Wakefield, Matthew; Philip, Gayle; Pope, Bernard (2014): Rubra - flexible distributed pipelines. figshare. http://dx.doi.org/10.6084/m9.figshare.895626
