Getting Started

PQuery is a Flask-based web application enabling fast interactive visualisation, querying, flexible filtering and alignment inspection of multi-sample, annotated, genetic variant data.

Without programming skills required, users can automatically convert their input VCF file (compressed or uncompressed, annotated or not annotated) to an indexed and structured SQLite database, fully adjusted to the imported VCF file. Once the database is created and stored, PQuery can repeatedly open in a browser window and instantly run on it with no need for re-importing every time it runs. The application allows users to query the genetic variants for specific samples and genomic regions and it then instantly returns a table with the queried variants (one variant per row) and all fields present in the original VCF file. It also dynamically calculates the allele counts and frequencies for the selected samples.

PQuery can run in two modes: a. The 'cohort mode' in which no sample-specific information is included in the returned table b. the 'sample-specific' mode in which the generated table will fully include the sample names and all the related information of the original VCF file.

Users can efficiently filter all columns in both modes by using a customisable filter pane and inspect the variant calls through the integrated IGV viewer.

Getting Started

Execute PQuery

From a linux machine:

sh run_app.sh

From a Windows machine:
Double-click the PQuery.bat file or the PQuery.lnk shortcup (you can copy the shortcut to your Desktop).

How to Guide

Preprocess an annotated VCF file so you can use in PQuery

Check the preprocessing directory.

Further Information

PQuery is currently operating on POlab hg19 data.

To do:

Update the DB with the latest hg38 data.
Update the back-end DB so can be faster. This scheme used right now (VCFdbR) is currently slow for querying whole exome for many samples. The major limitation is the time needed to calculate on-the-fly AC,AN and AF (Allele Count, Allele Number, Allele Frequency) for the queried cohort of samples and then filter these numbers to only include variants which have at least ONE alternative allele in at least ONE queried sample-set. Using NOSQL db management or GATK GenomicDBI might be good solutions.

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
app		app
preprocessing		preprocessing
PQuery.bat		PQuery.bat
PQuery.lnk		PQuery.lnk
PQuery.py		PQuery.py
PQuery_Manual.pdf		PQuery_Manual.pdf
README.md		README.md
config.py		config.py
run_app.sh		run_app.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

app

app

preprocessing

preprocessing

PQuery.bat

PQuery.bat

PQuery.lnk

PQuery.lnk

PQuery.py

PQuery.py

PQuery_Manual.pdf

PQuery_Manual.pdf

README.md

README.md

config.py

config.py

run_app.sh

run_app.sh

Repository files navigation

Getting Started

Execute PQuery

How to Guide

Preprocess an annotated VCF file so you can use in PQuery

Further Information

PQuery Architecture

About

Releases

Packages

Languages

digrigor/PQuery

Folders and files

Latest commit

History

Repository files navigation

Getting Started

Execute PQuery

How to Guide

Preprocess an annotated VCF file so you can use in PQuery

Further Information

PQuery Architecture

About

Resources

Stars

Watchers

Forks

Languages