######################################################
Spatially varying cis-regulatory divergence in
Drosophila embryos elucidates cis-regulatory logic
######################################################

Peter A. Combs and Hunter B. Fraser

Department of Biology, Stanford University

This repository contains the analysis code for Combs and Fraser 2017 (bioRxiv preprint; currently submitted for review). Raw and processed data files are available from the Gene Expression Omnibus.

Very briefly, the data analyses in these scripts do two things:

1. Process RNA-seq data from cryosliced D. melanogaster x simulans hybrid embryos. We are looking for genes with spatially varying allele-specific expression (svASE), that is, genes whose allele-specific expression differs between one part of the embryo and another.
2. Model the cis-regulatory input functions of genes, using an approach inspired in large part by Ilsley et al. (2013). Using genome alignments and motif searches, we can then infer which cis-regulatory changes actually produced the spatially varying ASE that we observed in part 1.

Almost all of the code is written in either Python 3 or Snakemake. Known dependencies include:

SRA Tools
Snakemake
Bowtie2 (Reference gDNA alignment)
STAR (RNA-seq alignment)
Cufflinks (RNA-seq quantification)
Bedtools
Samtools
ASEr (https://github.com/thefraserlab/aser)
Hornet (https://github.com/thefraserlab/hornet)
Various Python modules:
- pysam
- progressbar
- numpy/scipy/pandas/matplotlib
- svgwrite
- pyemd
- BioPython
- statsmodels
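
Before launching the pipeline, it can help to confirm that the tools listed above are actually installed. The following is a minimal sketch, not part of the repository: the executable names (for example `fastq-dump` for SRA Tools) and the helper `find_missing` are assumptions made for illustration. ASEr and Hornet are left out because how they are installed is not specified here.

```python
import importlib.util
import shutil

# Executable names are assumptions; adjust them to match your installation.
COMMAND_LINE_TOOLS = [
    "fastq-dump",  # SRA Tools
    "snakemake",
    "bowtie2",
    "STAR",
    "cufflinks",
    "bedtools",
    "samtools",
]

# Import names of the Python modules listed above (BioPython imports as "Bio").
PYTHON_MODULES = [
    "pysam", "progressbar", "numpy", "scipy", "pandas",
    "matplotlib", "svgwrite", "pyemd", "Bio", "statsmodels",
]


def find_missing(tools=COMMAND_LINE_TOOLS, modules=PYTHON_MODULES):
    """Return (missing_tools, missing_modules) for the current environment."""
    # shutil.which returns None when an executable is not on the PATH.
    missing_tools = [t for t in tools if shutil.which(t) is None]
    # find_spec returns None when a top-level module cannot be located.
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    return missing_tools, missing_modules


if __name__ == "__main__":
    missing_tools, missing_modules = find_missing()
    print("Missing executables:", ", ".join(missing_tools) or "none")
    print("Missing Python modules:", ", ".join(missing_modules) or "none")
```

Running the script prints whichever executables and modules could not be found, so gaps can be fixed before a long Snakemake run fails partway through.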

RNA-seq and finding svASE

You should be able to go from raw reads to summary data tables by doing:

snakemake

You will probably want to run this on a fairly high-powered machine or, ideally, a compute cluster. The basic steps for a single sample are:

1. Download reads from SRA
2. Map reads to the D. melanogaster genome (the resulting file is called assigned_dmelR.bam for various historical reasons)
3. Calculate absolute abundances using Cufflinks
4. Filter out potentially ambiguous/mismapped reads using Hornet to implement the WASP pipeline
5. Count allele-specific reads for each gene using ASEr

Then, all of the expression and ASE data are combined into summary tables.
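
To make the last two steps concrete, here is a rough sketch of how per-gene allele-specific counts from several slices might be turned into a single genes-by-slices ASE table. Everything specific in it is an assumption rather than the repository's actual format: the column names `mel` and `sim`, the `min_reads` cutoff, the helper names, and the ASE convention `(sim - mel) / (sim + mel)`.

```python
import numpy as np
import pandas as pd


def ase_index(mel_counts, sim_counts, min_reads=10):
    """ASE index in [-1, 1]; NaN where allele-specific reads total < min_reads.

    Assumed convention: (sim - mel) / (sim + mel).
    """
    mel = np.asarray(mel_counts, dtype=float)
    sim = np.asarray(sim_counts, dtype=float)
    total = mel + sim
    # Suppress warnings from 0/0; those entries are masked to NaN below.
    with np.errstate(divide="ignore", invalid="ignore"):
        ase = (sim - mel) / total
    return np.where(total >= min_reads, ase, np.nan)


def combine_slices(per_slice_counts, min_reads=10):
    """Build a genes x slices ASE table from {slice_name: DataFrame}.

    Each DataFrame is indexed by gene and has the hypothetical columns
    'mel' and 'sim' holding allele-specific read counts.
    """
    columns = {
        name: pd.Series(ase_index(df["mel"], df["sim"], min_reads), index=df.index)
        for name, df in per_slice_counts.items()
    }
    # Series are aligned on gene name; genes absent from a slice become NaN.
    return pd.DataFrame(columns)


if __name__ == "__main__":
    counts = {
        "slice01": pd.DataFrame({"mel": [30, 2], "sim": [10, 1]}, index=["geneA", "geneB"]),
        "slice02": pd.DataFrame({"mel": [10, 0], "sim": [30, 0]}, index=["geneA", "geneB"]),
    }
    print(combine_slices(counts))
```

Masking low-coverage entries as NaN, rather than reporting a ratio from a handful of reads, keeps noisy genes from looking like strong allelic imbalance in downstream steps.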

We found that fitting either a logistic or a Gaussian function to the data identified all of the genes whose allele-specific expression had spatial patterns.
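
As an illustration of the idea, the sketch below fits both function families to one gene's ASE values across slice positions and keeps whichever explains more variance. It is not the repository's implementation: the four-parameter forms, the starting guesses, the use of R² to compare fits, and the function names are all assumptions chosen for the example.

```python
import numpy as np
from scipy.optimize import curve_fit


def logistic(x, amplitude, midpoint, steepness, offset):
    """Step-like profile: ASE shifts from one level to another along x."""
    return amplitude / (1.0 + np.exp(-steepness * (x - midpoint))) + offset


def gaussian(x, amplitude, center, width, offset):
    """Peak-like (or dip-like) profile: ASE deviates in a localized stripe."""
    return amplitude * np.exp(-((x - center) ** 2) / (2.0 * width ** 2)) + offset


def fit_r_squared(func, x, y, p0):
    """Least-squares fit of func to (x, y). Returns (params, R^2)."""
    try:
        params, _ = curve_fit(func, x, y, p0=p0, maxfev=10000)
    except RuntimeError:  # optimizer did not converge
        return None, -np.inf
    ss_res = np.sum((y - func(x, *params)) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return params, 1.0 - ss_res / ss_tot


def best_spatial_fit(x, y):
    """Fit both shapes and return (name, params, R^2) of the better one."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    baseline = np.median(y)
    # Slice that deviates most from the baseline seeds the Gaussian center.
    peak = np.argmax(np.abs(y - baseline))
    fits = {
        "logistic": fit_r_squared(
            logistic, x, y, p0=[np.ptp(y), 0.5, 10.0, np.min(y)]),
        "gaussian": fit_r_squared(
            gaussian, x, y, p0=[y[peak] - baseline, x[peak], 0.2, baseline]),
    }
    name = max(fits, key=lambda k: fits[k][1])
    return name, fits[name][0], fits[name][1]


if __name__ == "__main__":
    # Synthetic example: 27 slices with a step in ASE at 40% of the axis.
    x = np.linspace(0.0, 1.0, 27)
    y = logistic(x, 0.8, 0.4, 20.0, -0.4)
    name, params, r2 = best_spatial_fit(x, y)
    print(name, np.round(params, 3), round(r2, 4))
```

A gene with no spatial pattern should be fit poorly by both shapes, so in practice some cutoff on the goodness of fit is needed to call a gene svASE. The cutoff itself is not specified here and is left to the user.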

petercombs/HybridSliceSeq

Folders and files

Latest commit

History

Repository files navigation

Spatially varying cis-regulatory divergence in

Drosophila embryos elucidates cis-regulatory logic

RNA-seq and finding svASE

About

Resources

Stars

Watchers

Forks