ampend/jumps-map

jumps-map

This repository contains scripts for the initial processing of reads from jumping libraries prepared in the style of Talkowski et al.

See http://www.ncbi.nlm.nih.gov/pubmed/21473983 and http://www.ncbi.nlm.nih.gov/pubmed/24789519 for library details.

Required Components

Python: genutils, fastqstats, Bio

Other: PEAR (http://www.ncbi.nlm.nih.gov/pubmed/24142950, http://www.exelixis-lab.org/web/software/pear)

Background

The Talkowski jumping library method is based on circularization of long DNA fragments using a biotinylated linker that carries a pair of EcoP15I recognition sites. EcoP15I cuts 25/27 nucleotides away from its recognition site, resulting in double-stranded fragments that look like:

5' XXXXXXXCTGCTGTACCGTTCTCCGTACAGCAGXXXXXXXX 3'
3' XXXXXXXGACGACATGGCAAGAGGCATGTCGTCXXXXXXXX 5'

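Here each run of X stands for 27 nucleotides of DNA from one of the two opposite ends of the original fragment, and the sequence between them is the 26 bp linker. We thus expect a fragment that is 27 + 27 + 26 = 80 bp long. These fragments are typically sequenced from both ends.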
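The arithmetic above can be checked with a short sketch. The names `LINKER`, `TAG_LEN`, `reverse_complement` and `expected_fragment_length` are illustrative only and are not taken from the scripts in this repository.

```python
# Top-strand linker sequence as drawn in the diagram above.
LINKER = "CTGCTGTACCGTTCTCCGTACAGCAG"
# Length of the genomic tag on each side of the linker (from the text above).
TAG_LEN = 27

# Translation table for complementing DNA bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def expected_fragment_length(tag_len=TAG_LEN, linker=LINKER):
    """Two genomic tags plus the linker between them."""
    return 2 * tag_len + len(linker)


if __name__ == "__main__":
    print(len(LINKER))                  # linker length in bp
    print(expected_fragment_length())   # full fragment length in bp
    print(reverse_complement(LINKER))   # linker as read from the other strand
```

Because a fragment can be read from either strand, the linker may appear in a read either as drawn or as its reverse complement.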

Here, we merge overlapping read pairs, look for the linker sequence (which may appear as CTGCTGTACCGTTCTCCGTACAGCAG or as its reverse complement, CTGCTGTACGGAGAACGGTACAGCAG), and write out the resulting paired-end sequences, taking care to reverse complement read 1 to match standard library orientation.
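As a rough illustration of the linker search described above, the following sketch splits an already merged read at the linker and returns the two flanking tags. It is not the code in process-jump-fastq.py: the function name `split_on_linker`, the `min_tag_len` parameter, the exact-match search, and the choice to treat the left flank as read 1 are all assumptions made for this example.

```python
# Both orientations of the linker, as listed in the text above.
LINKERS = (
    "CTGCTGTACCGTTCTCCGTACAGCAG",
    "CTGCTGTACGGAGAACGGTACAGCAG",
)

# Translation table for complementing DNA bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_on_linker(merged, min_tag_len=1):
    """Split a merged read at the first exact linker match.

    Returns (read1, read2), where read1 is the reverse complement of the
    left flank (an assumption about orientation) and read2 is the right
    flank. Returns None if no linker is found or a flank is too short.
    """
    for linker in LINKERS:
        idx = merged.find(linker)
        if idx == -1:
            continue
        left = merged[:idx]
        right = merged[idx + len(linker):]
        if len(left) < min_tag_len or len(right) < min_tag_len:
            return None
        return reverse_complement(left), right
    return None


if __name__ == "__main__":
    # Synthetic 80 bp fragment: 27 bp tag, 26 bp linker, 27 bp tag.
    fragment = "A" * 27 + LINKERS[0] + "C" * 27
    print(split_on_linker(fragment))
```

An exact string match is the simplest option; reads containing sequencing errors inside the linker would be missed by this sketch, so a mismatch-tolerant search may be preferable in practice.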

Example Usage

python process-jump-fastq.py \
--r1fq miseq-runs/150601_M03079_0012_000000000-AG0UJ/Zoey_jump_R1.fastq.gz \
--r2fq miseq-runs/150601_M03079_0012_000000000-AG0UJ/Zoey_jump_R2.fastq.gz  \
--sample Zoey_miseq_jump \
--outdir ../results/2015-06-05/ 
