UMIpipe.py

This Python script converts FASTQ files of Drop-seq sequencing data into an expression matrix in which each column corresponds to a cell and each row corresponds to a gene.
The pipeline depends on:
- picard-tools
- dropseq-tools (included in the repository)
- STAR
- samtools
- reference files: a FASTA file, a Picard .dict file in the same directory as the FASTA file, a STAR index, and a GTF file. The paths to these files are pre-specified, so the user only needs to specify the species with the --ref option. For now, mm10 and hg38 are supported.
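
As a rough illustration of how a species name could map to pre-specified reference files, here is a minimal sketch. It is not taken from UMIpipe.py: the `REFERENCES` table, the `/path/to/...` placeholders, and the helper names `resolve_reference` and `build_parser` are all hypothetical. It also assumes the .dict file shares the FASTA file's base name (the usual Picard convention), which the description above does not state explicitly.

```python
import argparse
from pathlib import PurePosixPath

# Hypothetical placeholder paths; the real pipeline pre-specifies its own.
REFERENCES = {
    "mm10": {
        "fasta": "/path/to/mm10/genome.fa",
        "star_index": "/path/to/mm10/STAR",
        "gtf": "/path/to/mm10/genes.gtf",
    },
    "hg38": {
        "fasta": "/path/to/hg38/genome.fa",
        "star_index": "/path/to/hg38/STAR",
        "gtf": "/path/to/hg38/genes.gtf",
    },
}


def resolve_reference(species):
    """Return the reference file paths for a supported species."""
    if species not in REFERENCES:
        raise ValueError(f"unsupported species: {species!r}")
    ref = dict(REFERENCES[species])
    # Assumption: the .dict file sits next to the FASTA file and has the
    # same base name, e.g. genome.fa -> genome.dict.
    ref["dict"] = str(PurePosixPath(ref["fasta"]).with_suffix(".dict"))
    return ref


def build_parser():
    """Command-line parser exposing only the --ref option."""
    parser = argparse.ArgumentParser(description="reference lookup sketch")
    parser.add_argument("--ref", choices=sorted(REFERENCES), required=True,
                        help="species whose reference files should be used")
    return parser
```

With this layout, supporting another species would only require a new entry in the table.
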
The script takes a number of arguments (run it with -h to list them), most of which have default values adapted to running on the yosef2 queue. An example command is in runUMIpipe.sh. The pipeline has 5 basic parts:
- Convert fastq to sam: requires 3 arguments
  - --fq1: fastq read 1
  - --fq2: fastq read 2
  - --samplename: output name
- Tag barcode: adds barcode tags to the reads in the BAM file. The cell barcode is attached to the non-barcode read as an optional SAM field, XC. The molecular barcode is attached as field XM. By default, read 1 is assumed to contain the barcode sequence, with the cell barcode at bases 1-12 and the molecular barcode at bases 13-20. At the end of the tagging, the first read is discarded. The default
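
The two steps listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in UMIpipe.py: the function names, the `picard.jar` location, and the `<samplename>.bam` output name are hypothetical. The `FASTQ`, `FASTQ2`, `OUTPUT`, and `SAMPLE_NAME` arguments are standard Picard FastqToSam options. The barcode split only mimics the default base ranges with plain string slicing; the actual tagging is done by the bundled Drop-seq tools.

```python
def fastq_to_sam_cmd(fq1, fq2, samplename, picard_jar="picard.jar"):
    """Build (but do not run) a Picard FastqToSam command line."""
    return [
        "java", "-jar", picard_jar, "FastqToSam",
        f"FASTQ={fq1}",                 # --fq1: read 1 (barcode read)
        f"FASTQ2={fq2}",                # --fq2: read 2 (non-barcode read)
        f"OUTPUT={samplename}.bam",     # hypothetical output naming
        f"SAMPLE_NAME={samplename}",    # --samplename
    ]


def split_barcodes(read1_seq, cell_range=(1, 12), umi_range=(13, 20)):
    """Extract the XC (cell) and XM (molecular) barcodes from read 1.

    Ranges are 1-based and inclusive, matching the defaults described above.
    """
    needed = max(cell_range[1], umi_range[1])
    if len(read1_seq) < needed:
        raise ValueError(f"read 1 is shorter than {needed} bases")

    def cut(base_range):
        start, end = base_range
        # Convert the 1-based inclusive range to a Python slice.
        return read1_seq[start - 1:end]

    return {"XC": cut(cell_range), "XM": cut(umi_range)}
```

The list returned by `fastq_to_sam_cmd` can be passed to `subprocess.run` once the jar path is adjusted for the local installation.
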

Auxiliary scripts
