ngannguyen/referenceViz

# Reference MHC Project Analysis Codebase

This is a set of scripts developed by Ngan Nguyen and Benedict Paten to generate Tables and Figures for the Reference MHC project.

## Dependencies

## Installation

  1. Download the package
  2. cd into referenceViz/src/
  3. Run make
  4. Add the referenceViz/bin/ directory to your PATH

## Run

  1. getPlot.py: a wrapper to run various analyses, including:

     * Contiguity
     * Coverage
     * N50
     * SNP rate
     * Indel rate
     * Indel length distribution
     * CNV
     * Validation of SNPs and short indels (<= 10bp) against dbSNP and the 1000 Genomes Project

     Inputs include:

     * The location of the output directory produced by running the referenceScript pipeline.
     * If dbSNP and 1000 Genomes Project SNP validation is included, the dbSNP file with the known SNPs.
     * If dbSNP and 1000 Genomes Project indel validation is included, the dbSNP file with the known indels.

     The SNP files must be tab-separated, with the fields specified here, excluding the "bin" field.

  2. mapReadsToRef.py: script to compare the mapping of short reads to two different reference sequences (i.e., the C. Ref. sequence and the GRCh37 sequence)

  3. mapLargeIndels.py: script to extract the large-indel sequences and map them to the reference.

    largeIndelTab.py: generates the summary table of the large indel mapping.

  4. uniqMapStats.py: script to compare reads that map uniquely to one reference but not uniquely to the other.

    uniqMapTab.py: generates the summary table for uniquely-mapping comparisons.
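
The SNP and indel input files for getPlot.py must omit the "bin" field. The following is a minimal sketch of one way to drop that column. It assumes the source file is a tab-separated table dump whose first column is "bin" and whose header lines, if any, start with "#". The function name `strip_bin_column`, the `bin_index` parameter, and the sample row are hypothetical and not part of referenceViz.

```python
def strip_bin_column(lines, bin_index=0):
    """Yield tab-separated rows with one column removed.

    Assumption: the "bin" field sits at position `bin_index` (first
    column by default). Blank lines and lines starting with "#" are
    skipped rather than passed through.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        del fields[bin_index]  # drop the "bin" field
        yield "\t".join(fields)


if __name__ == "__main__":
    # Made-up row for illustration only; not real dbSNP data.
    sample = ["#bin\tchrom\tchromStart\tchromEnd\tname\n",
              "585\tchr6\t100\t101\trsExample\n"]
    for row in strip_bin_column(sample):
        print(row)
```

To convert a file on disk, open it for reading, pass the file object to `strip_bin_column`, and write each yielded row followed by a newline to the output file.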
