Hail

Hail is an open-source, scalable framework for exploring and analyzing genetic data. Starting from sequencing or microarray data in VCF and other formats, Hail can, for example:

generate variant annotations like call rate, Hardy-Weinberg equilibrium p-value, and population-specific allele count
generate sample annotations like mean depth, imputed sex, and TiTv ratio
load variant and sample annotations from text tables, JSON, VCF, VEP, and locus interval files
generate new annotations from existing annotations and the genotypes, and use these to filter samples, variants, and genotypes
find Mendelian violations in trios, analyze genetic similarity between samples via the GRM and IBD matrix, and compute sample scores and variant loadings using PCA
perform association analyses using linear, logistic, and linear mixed regression, and estimate heritability

All of this functionality is exposed through Python and backed by distributed algorithms built on Apache Spark, which lets Hail efficiently analyze gigabyte-scale data on a laptop or terabyte-scale data on an on-premises cluster or in the cloud.

Hail is used in published research and as the core analysis platform of large-scale genomics efforts including ExAC v2 and gnomAD. The project began in fall 2015 and is under very active development as we work toward a stable release, so we do not guarantee forward compatibility of formats and interfaces. Want to get involved in development? Check out the GitHub repo, chat with us in the Gitter dev room, and view our keynote at Spark Summit East 2017.

Interactive Tutorial

We've partnered with Databricks so you can explore Hail's functionality on data from the 1000 Genomes Project in just a few clicks.

Sign up for the free Community Edition of the Databricks platform at: https://accounts.cloud.databricks.com/registration.html#signup
Import the Hail tutorial notebook into your Workspace using this URL: https://docs.databricks.com/_static/notebooks/hail-tutorial-sse-2017.html
Follow the instructions in the notebook.

You can also view the notebook here.

Getting Started

To get started using Hail on your data:

follow the installation instructions in Getting Started
check out the Overview, Tutorial, and Python API
chat with the Hail team in the Hail Gitter room

We encourage use of the Discussion Forum for user and dev support, feature requests, and sharing your Hail-powered science. Follow Hail on Twitter @hailgenetics. Please report any suspected bugs on GitHub issues.

Hail Team

The Hail team is based in the Neale lab at the Stanley Center for Psychiatric Research of the Broad Institute of MIT and Harvard and the Analytic and Translational Genetics Unit of Massachusetts General Hospital.

Contact the Hail team at hail@broadinstitute.org.

Citing Hail

If you use Hail for published work, please cite the software:

Hail, https://github.com/hail-is/hail

and either the forthcoming manuscript describing Hail (if possible):

Cotton Seed, Alex Bloemendal, Jonathan M Bloom, Jacqueline I Goldstein, Daniel King, Timothy Poterba, Benjamin M. Neale. Hail: An Open-Source Framework for Scalable Genetic Data Analysis. In preparation.

or the following paper, which includes a brief introduction to Hail in the online methods:

Andrea Ganna, Giulio Genovese, et al. Ultra-rare disruptive and damaging mutations influence educational attainment in the general population. Nature Neuroscience

And we'd love to hear about your work in the Science category of the discussion forum!
