GitHub - svembu/phylosub

This Python code is the accompanying software for the paper:
Wei Jiao, Shankar Vembu, Amit G. Deshwar, Lincoln Stein and Quaid Morris,
Inferring clonal evolution of tumors from single nucleotide somatic mutations,
BMC Bioinformatics 15:35, 2014.

It is built on top of the code associated with the paper:
Ryan Prescott Adams, Zoubin Ghahramani and Michael I. Jordan,
Tree-Structured Stick Breaking for Hierarchical Data,  
Advances in Neural Information Processing Systems (NIPS) 23, 2010
http://hips.seas.harvard.edu/files/tssb.tgz

######################################################################

DEPENDENCIES:
SciPy - http://www.scipy.org/
ETE2 - http://ete.cgenomics.org/
CVXOPT - http://cvxopt.org/
PyGraphviz - http://networkx.lanl.gov/pygraphviz/

#######################################################################

PREPARING THE INPUT FILE:
Input file format:
The input to evolve.py should be a tab-delimited text file with 7 columns. The required column headers are:
-- gene: identifier for each somatic variant (each row must have a unique identifier)
-- a: number of reads supporting the reference allele at the variant locus (if the read counts come from multiple tumor samples, separate the per-sample numbers with a comma ',', e.g. for three samples: 500, 342, 423)
-- d: total number of reads at the locus (if the read counts come from multiple tumor samples, separate them with a comma ',')
-- mu_r: expected fraction of reference alleles when sampling from the reference population (e.g. for an A->T somatic mutation at the locus, the genotype of the reference population is AA, so mu_r should be 1 minus the sequencing error rate)
-- mu_v: expected fraction of reference alleles when sampling from the variant population (e.g. for an A->T somatic mutation at the locus with copy number 2 and expected genotype AT in the variant population, mu_v should be 0.5)
-- delta_r and delta_v: pseudo-counts for the Dirichlet prior on genotype probabilities (the recommended value is 1 if no other prior information is available)
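
As an illustration of the format described above, the following is a minimal sketch (not part of the PhyloSub code) that builds rows for such a file and writes them tab-delimited. The variant names, read counts, the 0.001 sequencing error rate, the helper names format_row and write_input, and the column order (taken from the order of the list above) are all assumptions made for the example.

```python
import csv

# Column headers in the order they are listed in this README (assumed order).
HEADER = ["gene", "a", "d", "mu_r", "mu_v", "delta_r", "delta_v"]


def format_row(gene, ref_counts, total_counts, mu_r, mu_v, delta_r=1, delta_v=1):
    """Return one input row; per-sample counts are joined with commas."""
    if len(ref_counts) != len(total_counts):
        raise ValueError("a and d need one value per tumor sample")
    for a, d in zip(ref_counts, total_counts):
        # Reference reads are a subset of all reads at the locus.
        if not 0 <= a <= d:
            raise ValueError("reference count must be between 0 and d")
    return [
        gene,
        ",".join(str(a) for a in ref_counts),
        ",".join(str(d) for d in total_counts),
        str(mu_r),
        str(mu_v),
        str(delta_r),
        str(delta_v),
    ]


def write_input(path, rows):
    """Write the header and rows as a tab-delimited text file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(HEADER)
        writer.writerows(rows)


# Hypothetical variants measured in three tumor samples.
# mu_r = 0.999 assumes a sequencing error rate of 0.001 (1 - 0.001);
# mu_v = 0.5 assumes copy number 2 with a heterozygous variant genotype.
rows = [
    format_row("var1", [500, 342, 423], [1000, 700, 850], 0.999, 0.5),
    format_row("var2", [610, 655, 590], [800, 820, 790], 0.999, 0.5),
]
```

Calling write_input("my_data.txt", rows) would then produce a file with one header line and one line per variant; "my_data.txt" is a placeholder name.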

#######################################################################

RUNNING THE SOFTWARE:
1. To run the code on the sample data set "data.txt", execute the following command:

python evolve.py 'data.txt' 'trees' 'top_k_trees' 'clonal_frequencies' 'llh_trace' 100 100 1

# 'data.txt': file name of the input data
# 'trees': output folder name where the MCMC trees/samples are saved
# 'top_k_trees': output file name to save the top-k trees in text format
# 'clonal_frequencies': output file name to save the clonal frequencies
# 'llh_trace': output file name to save the log likelihood trace
The last three arguments are the number of MCMC samples, the number of Metropolis-Hastings iterations, and a random seed for initializing the MCMC sampler.

2. To generate the partial order plot for the trees (MCMC samples) generated by the above code, execute the following command:

python porder.py 'trees' 'data.txt' 'partial_order.pdf'

# 'trees': the folder name where the MCMC trees/samples are located (written by evolve.py in step 1)
# 'data.txt': file name of the input data
# 'partial_order.pdf': partial order plot output in pdf format
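
The two commands above can also be assembled from a script, which makes it harder to mix up the order of the positional arguments. The following is a sketch under assumptions: the helper names evolve_command, porder_command and run_pipeline are made up for this example, and "python" stands for whichever interpreter has the dependencies installed. The block only prints the commands; run_pipeline would execute them from the directory containing evolve.py and porder.py.

```python
import shlex
import subprocess


def evolve_command(data, trees_dir, top_k, clonal_freqs, llh_trace,
                   n_samples, n_mh_iters, seed):
    """Argument list for step 1, in the positional order documented above."""
    return ["python", "evolve.py", data, trees_dir, top_k, clonal_freqs,
            llh_trace, str(n_samples), str(n_mh_iters), str(seed)]


def porder_command(trees_dir, data, out_pdf):
    """Argument list for step 2 (partial order plot)."""
    return ["python", "porder.py", trees_dir, data, out_pdf]


def run_pipeline(commands):
    """Run each command in turn and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Same values as in the two example commands of this README.
commands = [
    evolve_command("data.txt", "trees", "top_k_trees", "clonal_frequencies",
                   "llh_trace", 100, 100, 1),
    porder_command("trees", "data.txt", "partial_order.pdf"),
]

for cmd in commands:
    print(shlex.join(cmd))  # show the shell command without running it
```

Passing the same list to run_pipeline(commands) would run step 1 and then step 2, reusing the 'trees' folder name so that porder.py reads what evolve.py wrote.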

#######################################################################
