seqr

seqr is a web-based tool for rare disease genomics. This repository contains code that underlies the Broad seqr instance and other seqr deployments.

Technical Overview

seqr consists of the following components:

postgres - SQL database used by seqr and phenotips to store project metadata and user-generated content such as variant notes.
elasticsearch - NoSQL database used to store variant callsets.
redis - in-memory cache used to speed up request handling.
phenotips - third-party web-based tool for entering structured phenotype data.
seqr - the main client-server application, built using React.js, Python and Django.
pipeline-runner - optional container for running Hail pipelines to annotate and load new datasets into elasticsearch. If seqr is hosted on Google Cloud (GKE or GCE), Dataproc Spark clusters can be used instead.
kibana - optional dashboard and visual interface for elasticsearch.
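
As a rough illustration of how the components above fit together, the following Python sketch lists the network-facing services and probes whether each one accepts TCP connections. It is not part of seqr: the `Component` class and the helper functions are hypothetical, and the ports are assumptions. Those for postgres, elasticsearch, redis and kibana are the usual upstream defaults; those for phenotips and seqr are guesses that a given deployment may well change. pipeline-runner is left out because it runs batch jobs rather than serving a port.

```python
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    role: str
    default_port: int  # assumed default, not read from seqr's configuration
    optional: bool = False


# Hypothetical inventory based on the component list above.
COMPONENTS = [
    Component("postgres", "project metadata and user-generated content", 5432),
    Component("elasticsearch", "variant callsets", 9200),
    Component("redis", "in-memory request cache", 6379),
    Component("phenotips", "structured phenotype entry", 8080),  # assumed port
    Component("seqr", "main web application", 8000),  # assumed port
    Component("kibana", "elasticsearch dashboard", 5601, optional=True),
]


def is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_stack(host="localhost", probe=is_reachable):
    """Map each component name to whether its assumed port is reachable."""
    return {c.name: probe(host, c.default_port) for c in COMPONENTS}


def missing_required(status):
    """List required components that are absent or unreachable in status."""
    return [
        c.name
        for c in COMPONENTS
        if not c.optional and not status.get(c.name, False)
    ]


if __name__ == "__main__":
    status = check_stack()
    for name, up in status.items():
        print(f"{name}: {'up' in str(up).lower() and 'up' or ('reachable' if up else 'unreachable')}")
    print("missing required components:", missing_required(status))
```

An open port only shows that something is listening there, not that the service behind it is healthy, so a check like this is best treated as a first smoke test after bringing the stack up.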

The seqr production instance runs on Google Kubernetes Engine (GKE), and its data is loaded using Google Dataproc Spark clusters.

On-prem installs can be created using docker-compose; see "Local installs using docker-compose".

For notes on how to update an older instance, see
