
nfarzaneh/vcftidy

 
 


vcftidy

bring some order to the fruit salad that is VCF.

About

VCF is the standard format for representing variants, but different variant callers use different conventions for key pieces of information, e.g. alt depths in the sample fields. vcftidy aims to improve this by:

  1. putting ref and alt read depths into the AD field for each genotype (à la GATK) and setting Number=A in the header.
  2. splitting multiple alts into separate records.
  3. normalizing (trimming and left-aligning) the variants.
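
As a rough illustration of step 1, the sketch below builds a GATK-style AD value from the RO (ref count) and AO (alt count) sample fields that freebayes emits. The function name `add_ad` is hypothetical and is not taken from vcftidy's source; the choice of freebayes as the input caller is also an assumption, and header rewriting is left out.

```python
def add_ad(format_keys, sample):
    """Append an AD entry (ref,alt depths) to one sample column.

    format_keys: the VCF FORMAT column, e.g. "GT:DP:RO:AO"
    sample:      one sample column,     e.g. "0/1:20:12:8"
    Assumes freebayes-style RO/AO counts (an assumption for this sketch).
    Returns the pair unchanged if AD already exists or RO/AO are absent.
    """
    keys = format_keys.split(":")
    vals = sample.split(":")
    # VCF allows trailing sample fields to be dropped, so zip may be short.
    fields = dict(zip(keys, vals))
    if "AD" in fields or "RO" not in fields or "AO" not in fields:
        return format_keys, sample
    # AO is already comma-separated when there are several alts.
    fields["AD"] = fields["RO"] + "," + fields["AO"]
    keys.append("AD")
    # Fill any dropped trailing fields with the VCF missing value ".".
    return ":".join(keys), ":".join(fields.get(k, ".") for k in keys)
```

Calling `add_ad("GT:DP:RO:AO", "0/1:20:12:8")` yields a FORMAT of `GT:DP:RO:AO:AD` and a sample column ending in `12,8`.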

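Steps 2 and 3 should greatly reduce false negatives caused by incorrect annotations, since annotation lookups generally depend on a variant's position and alleles matching exactly. If you encounter a common error in VCF files, please open an issue so that we can address it in vcftidy.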
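
To make splitting and normalization concrete, here is a minimal sketch of the usual approach: right-trim shared bases, extend left from the reference when an allele becomes empty, then left-trim. It is not vcftidy's actual code. The names `normalize` and `tidy` are hypothetical, the reference is a plain string rather than an indexed FASTA, and variants that would need to shift past position 1 are not handled.

```python
def normalize(pos, ref, alt, seq):
    """Trim and left-align one biallelic variant.

    pos is 1-based; seq is the chromosome sequence as a plain str
    (a stand-in for an indexed FASTA lookup in this sketch).
    """
    if ref == alt:
        return pos, ref, alt  # not a variant; nothing to do
    changed = True
    while changed:
        changed = False
        # Drop a shared trailing base.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # An empty allele is padded with the preceding reference base.
        if (not ref or not alt) and pos > 1:
            pos -= 1
            base = seq[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
    # Drop shared leading bases while both alleles keep at least one.
    while len(ref) >= 2 and len(alt) >= 2 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def tidy(pos, ref, alts, seq):
    """Split a multi-alt site into one normalized variant per alt."""
    return [normalize(pos, ref, alt, seq) for alt in alts]


# Toy reference with a CA repeat at positions 4-11.
SEQ = "GGGCACACACAGGG"
```

With this toy reference, the same two-base deletion written as `(8, "CAC", "C")` or as `(5, "ACA", "A")` normalizes to `(3, "GCA", "G")` in both cases, so a lookup keyed on position and alleles finds the same entry either way.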

Use

$ python vcftidy.py $VCF $REFERENCE_FASTA > $TIDY_VCF

See Also

  • vt and its associated paper do a nice job of decomposing and normalizing variants.

About

normalize, left-align, trim, validate and clean VCF files
