tomasbelusky/gataca

gataca

Detects genome variations by combining read-pair and split-read methods. Depth of coverage is also used to increase accuracy, and variations are grouped into clusters for better sensitivity.

manual

Usage: gataca.py [OPTIONS] <sample.bam> <reference.fasta>

Options:
  -h, --help                          show this help message and exit

  Input/output:
    -r STR, --region=STR              specify region (chr:from-to) of your interest, default: whole genome
    -o STR, --output=STR              name of output VCF file, default: standard output

  Reads:
    -p STR, --policy=STR              set how reads were sequenced (fr, rf) [fr]
    -q INT, --min_quality=INT         minimal mapping Phred quality score of read [30]
    -l INT, --min_length=INT          minimal length of split part [10]

  Depth of coverage:
    -w INT, --window_size=INT         size of window for getting coverage [100]
    -c STR, --coverage=STR            interval (min,max) of accepted coverage in windows, default: estimate from reads
    -a FLOAT, --coverage_core=FLOAT   core of windows from which min and max allowed coverage will be estimated [0.1]
    -u INT, --min_coverage_count=INT  minimal number of windows in core [1000]

  Insert size:
    -i STR, --insert_size=STR         interval (min,max) of accepted size between reads, default: estimate from reads
    -n INT, --insert_reads=INT        number of reads from which insert size will be estimated [50000]
    -e FLOAT, --insert_core=FLOAT     core of reads_num from which min and max insert size will be estimated [0.1]
    -m INT, --min_insert_count=INT    minimal number of reads in core [1000]

  Variations:
    -v FLOAT, --min_confidence=FLOAT  minimal confidence about variation [0.3]
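
The options above can be assembled programmatically when gataca is run from another script. The following is a minimal sketch, not part of gataca itself: the helper names `parse_region` and `build_command` are hypothetical, the script is assumed to be reachable as `gataca.py`, and only the `chr:from-to` region format and the long option names are taken from the help text.

```python
def parse_region(region):
    """Split a 'chr:from-to' string into (chromosome, start, end)."""
    # rpartition keeps any ':' inside the chromosome name intact
    chrom, sep, span = region.rpartition(":")
    start_text, dash, end_text = span.partition("-")
    if not (chrom and sep and dash):
        raise ValueError("expected chr:from-to, got %r" % region)
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError("region start %d is greater than end %d" % (start, end))
    return chrom, start, end


def build_command(bam, reference, region=None, output=None,
                  min_quality=None, script="gataca.py"):
    """Assemble an argument list using the long options from the help text."""
    command = [script]
    if region is not None:
        parse_region(region)  # validate the format before passing it on
        command.append("--region=" + region)
    if output is not None:
        command.append("--output=" + output)
    if min_quality is not None:
        command.append("--min_quality=%d" % min_quality)
    # positional arguments come last: <sample.bam> <reference.fasta>
    command += [bam, reference]
    return command


if __name__ == "__main__":
    print(build_command("sample.bam", "reference.fasta",
                        region="chr1:100-200", output="out.vcf"))
```

The resulting list can be passed to `subprocess.run`, prefixed with whichever Python interpreter gataca is installed for. Options that are left as `None` are omitted so that gataca falls back to its own defaults.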

requirements

  • Python libraries
    • pysam - manipulating reads and the reference genome
    • bx-python - fast tree-based search for overlapping intervals
  • Tools
    • bwa - remapping clipped sequences from reads
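
Before running gataca it can help to confirm that the requirements listed above are available. The sketch below is an illustration written for Python 3, not something shipped with gataca: the function name `missing_requirements` is hypothetical, and it assumes that bx-python is importable under the module name `bx` and that `bwa` is expected on the `PATH`.

```python
import importlib.util
import shutil


def missing_requirements(modules=("pysam", "bx"), tools=("bwa",)):
    """Return the names of modules and command-line tools that cannot be found."""
    # find_spec returns None when a top-level module is not installed
    missing = [name for name in modules
               if importlib.util.find_spec(name) is None]
    # shutil.which returns None when an executable is not on the PATH
    missing += [name for name in tools if shutil.which(name) is None]
    return missing


if __name__ == "__main__":
    missing = missing_requirements()
    if missing:
        print("missing: " + ", ".join(missing))
    else:
        print("all requirements found")
```

The check only tests for presence. It does not verify versions or that the installed libraries work with the interpreter used to run gataca.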
