gmarcais/pavfinder

 
 


Post-Assembly Variant Finder (PAVFinder)

PAVFinder is a Python package that detects structural variants from de novo assemblies (e.g. ABySS, Trans-ABySS). As such, it is able to analyse both genome and transcriptome assemblies:

Genomic structural variants (pavfinder genome):

  • translocations
  • inversions
  • duplications
  • insertions
  • deletions
  • simple-repeat expansions/contractions

Transcriptomic structural variants (pavfinder fusion):

  • gene fusions
  • internal tandem duplications (ITD)
  • partial tandem duplications (PTD)
  • small indels
  • simple-repeat expansions/contractions

Transcriptomic splice variants (pavfinder splice):

  • skipped exons
  • novel exons
  • novel introns
  • retained introns
  • novel splice acceptors/donors
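As a rough illustration of how two of the splice events listed above could be recognised, the sketch below compares the aligned blocks of a contig with the exons of one annotated transcript. It is not PAVFinder's own code: the function names, the interval representation (0-based, half-open) and the matching rules are assumptions made for this example.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based and half-open; assumed convention


def find_skipped_exons(exons: List[Interval], blocks: List[Interval]) -> List[Interval]:
    """Return annotated exons jumped over by a junction between adjacent contig blocks.

    Both lists must be sorted by start coordinate.
    """
    exon_index_by_end = {end: i for i, (_, end) in enumerate(exons)}
    exon_index_by_start = {start: i for i, (start, _) in enumerate(exons)}
    skipped: List[Interval] = []
    for (_, left_end), (right_start, _) in zip(blocks, blocks[1:]):
        i = exon_index_by_end.get(left_end)       # exon donating the junction
        j = exon_index_by_start.get(right_start)  # exon accepting the junction
        if i is not None and j is not None and j > i + 1:
            skipped.extend(exons[i + 1:j])
    return skipped


def find_retained_introns(exons: List[Interval], blocks: List[Interval]) -> List[Interval]:
    """Return annotated introns that a single contig block covers end to end."""
    retained: List[Interval] = []
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        # The block must reach into both flanking exons to count as retention.
        if any(b_start < left_end and b_end > right_start for b_start, b_end in blocks):
            retained.append((left_end, right_start))
    return retained
```

For a transcript with exons (100, 200), (300, 400) and (500, 600), a contig aligned as blocks (100, 200) and (500, 600) would be reported as skipping the middle exon, while a contig aligned as (100, 400) and (500, 600) would be reported as retaining the first intron.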

PAVFinder infers variants from non-contiguous (split or gapped) alignments of contig sequences to the reference genome. Assemblies can be aligned to the reference genome (c2g alignment) using bwa mem (genome) or gmap (transcriptome). Read support for events can be gathered by aligning reads to the assembly (r2c alignment) using bwa mem.
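To make the idea of inferring an event from a split alignment concrete, here is a minimal sketch that labels the event implied by two aligned segments of the same contig. The `Segment` structure, the function name and the decision rules are hypothetical simplifications chosen for illustration; PAVFinder's actual logic is more involved and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a contig (hypothetical structure; 0-based, half-open)."""
    chrom: str
    ref_start: int
    ref_end: int
    strand: str       # "+" or "-"
    query_start: int  # coordinates along the contig
    query_end: int


def classify_split(a: Segment, b: Segment) -> str:
    """Label the event implied by two segments of one contig (simplified rules)."""
    # Order the segments by their position along the contig.
    first, second = sorted((a, b), key=lambda s: s.query_start)

    if first.chrom != second.chrom:
        return "translocation"
    if first.strand != second.strand:
        return "inversion"

    # Contig bases between the segments (negative if the segments overlap).
    query_gap = second.query_start - first.query_end
    # Reference distance between the segments, following contig order.
    if first.strand == "+":
        ref_gap = second.ref_start - first.ref_end
    else:
        ref_gap = first.ref_start - second.ref_end

    # Bases aligned by both segments (e.g. microhomology) are assigned to the
    # first segment, which shifts the second segment's reference start too.
    if query_gap < 0:
        ref_gap += -query_gap
        query_gap = 0

    if ref_gap < 0:
        return "duplication"   # second segment re-uses reference already covered
    if ref_gap > 0 and query_gap == 0:
        return "deletion"      # reference bases missing from the contig
    if ref_gap == 0 and query_gap > 0:
        return "insertion"     # contig bases absent from the reference
    return "unclassified"
```

For example, a contig whose first 100 bases align to chr1:100-200 and whose next 100 bases align to chr1:700-800 on the same strand would be labelled a deletion under these rules.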

A pipeline called TAP (Transabyss-Alignment-PAVFinder), which bundles the three analysis steps (assembly, alignment and variant detection), is provided to facilitate whole-transcriptome analysis. TAP can also be run in a targeted mode on selected genes, which requires a Bloom filter of the targeted gene sequences to be created beforehand. Whereas the full assembly of a single RNA-seq library with over 100 million read pairs requires more than 24 hours to complete, a targeted assembly and analysis of a list of several hundred genes (e.g. COSMIC) can be completed within half an hour.
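The targeted mode relies on a Bloom filter built from the sequences of the selected genes. The sketch below shows the general idea with a small k-mer Bloom filter written in plain Python. It is only an illustration: the class name, default sizes, hashing scheme and the `matches` threshold are assumptions, not the filter format or tool that TAP actually uses, and reverse-complement k-mers are ignored for brevity.

```python
import hashlib
from typing import Iterator


class KmerBloomFilter:
    """Tiny k-mer Bloom filter (illustrative only; parameters are assumptions)."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 3, k: int = 25):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.k = k
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer: str) -> Iterator[int]:
        # Derive several hash functions by salting BLAKE2b with the hash index.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                kmer.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def _kmers(self, seq: str) -> Iterator[str]:
        seq = seq.upper()
        for i in range(len(seq) - self.k + 1):
            yield seq[i:i + self.k]

    def add_sequence(self, seq: str) -> None:
        """Insert every k-mer of a target gene sequence."""
        for kmer in self._kmers(seq):
            for pos in self._positions(kmer):
                self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(kmer.upper()))

    def matches(self, read: str, min_hits: int = 1) -> bool:
        """Return True if at least min_hits k-mers of the read are in the filter."""
        hits = sum(1 for kmer in self._kmers(read) if kmer in self)
        return hits >= min_hits
```

A read would be kept for targeted assembly when `matches` returns True. Because a Bloom filter can report false positives but never false negatives, reads that truly share k-mers with a target gene are always retained, and a small number of unrelated reads may slip through.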

A new pipeline named fusion-bloom, which couples PAVFinder with our latest RNA-seq assembler RNA-Bloom, has been added to the repository. We demonstrated that it has higher sensitivity and specificity than most state-of-the-art fusion callers.

TAP2, the next version of TAP using RNA-Bloom instead of Trans-ABySS for better transcriptome assembly, has been released.

Publication

Readman Chiu, Ka Ming Nip, Justin Chu and Inanc Birol. TAP: a targeted clinical genomics pipeline for detecting transcript variants using RNA-seq data. BMC Med Genomics (2018) 11:79 https://doi.org/10.1186/s12920-018-0402-6

Readman Chiu, Ka Ming Nip, Inanc Birol. Fusion-Bloom: fusion detection in assembled transcriptomes. Bioinformatics (2019) btz902 https://doi.org/10.1093/bioinformatics/btz902
