
xulijunji/TCGA

 
 


Software Overview

This repository contains instructions for reproducing and extending "Multi-tiered genomic analysis of head and neck cancer ties TP53 mutation to 3p loss" by Gross et al. In general, code for data processing and computation is enclosed in standard Python modules, while high-level analysis was recorded in IPython Notebooks. The analysis for this project was relatively non-linear and has thus been split into a number of notebooks, as described in Analysis Notebooks. The results should be reproducible by running these notebooks.

As of July 1, 2014, all error bars are off due to a Pandas bug. They now use the difference between the mean and the lower bound as the uncertainty for both the upper and lower bounds, rather than showing the true 95% confidence interval. Hopefully this will be addressed soon.
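To make the distinction concrete, here is a minimal sketch of how asymmetric error-bar lengths can be computed from a mean and its confidence bounds. It is not the author's code or a fix for the Pandas bug. The function name `asymmetric_yerr` and the sample numbers are hypothetical. The two-row layout it returns (lower lengths first, upper lengths second) matches the shape that matplotlib's `errorbar` accepts for `yerr`.

```python
def asymmetric_yerr(means, lowers, uppers):
    """Return [below, above] error-bar lengths for each point.

    means, lowers, uppers are equal-length sequences holding the point
    estimate and the lower/upper confidence bounds (hypothetical inputs).
    """
    # Distance from each mean down to its lower bound.
    below = [m - lo for m, lo in zip(means, lowers)]
    # Distance from each mean up to its upper bound.
    above = [hi - m for m, hi in zip(means, uppers)]
    return [below, above]


# Illustrative values only: the interval around the first mean is asymmetric.
yerr = asymmetric_yerr([1.0, 2.0], [0.5, 1.0], [2.0, 2.5])
print(yerr)
```

With the bug described above, both rows would effectively equal the first row, so an asymmetric interval is drawn as if it were symmetric.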

Dependencies

This code uses a number of features of the scientific Python stack as well as a small set of standard R libraries. Thus far, it has only been tested in a Linux environment, so it may take some modification to run on other operating systems.

I highly recommend installing a scientific Python distribution such as Anaconda or Enthought to handle the majority of the Python dependencies in this project (all except rPy2 and matplotlib_venn). Both distributions are free for academic use.

Python Dependencies

  • Numpy and Scipy, numeric calculations and statistics in Python
  • matplotlib, plotting in Python
  • Pandas, data-frames for Python, handles the majority of data-structures
  • statsmodels, used for statistics
  • scikit-learn, used for supervised learning
  • rPy2, communication between R and Python
    • NOT IN DISTRIBUTIONS
    • I recommend installing with pip install rpy2
    • Needs R to be compiled with shared libraries
  • matplotlib_venn
    • NOT IN DISTRIBUTIONS
    • I recommend installing with pip install matplotlib_venn
    • Only used for Venn diagrams, not essential
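A quick way to see which of the packages listed above are available is to ask Python whether each one can be located. The sketch below is not part of the repository and assumes a Python 3 interpreter. The helper name `missing_modules` is hypothetical, and the names in the lists are the import names of the packages above (`sklearn` is the import name of scikit-learn).

```python
import importlib.util

# Import names for the dependencies listed above.
REQUIRED = ["numpy", "scipy", "matplotlib", "pandas",
            "statsmodels", "sklearn", "rpy2"]
# Only needed for Venn diagrams, so treated as optional here.
OPTIONAL = ["matplotlib_venn"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing required modules:", missing_modules(REQUIRED))
print("Missing optional modules:", missing_modules(OPTIONAL))
```

An empty list means every module in that group was found. This only checks that the modules can be located, not that their versions are compatible with the notebooks.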

R Dependencies

  • R needs to be compiled with shared libraries to communicate with Python (this can be tricky)
  • Packages
    • base
    • survival
    • MASS
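Since the shared-library requirement is the part most likely to cause trouble, the sketch below shows one way to look for R's shared library before installing rPy2. It is an illustration, not part of the repository. It assumes Python 3.7 or later, an `R` executable on the `PATH`, and a typical Linux layout in which the library sits at `lib/libR.so` under the directory printed by `R RHOME`. Other platforms and custom builds may place it elsewhere. The function name `r_shared_library_path` is hypothetical.

```python
import os
import shutil
import subprocess


def r_shared_library_path():
    """Return the path to libR.so if it is found, otherwise None.

    Assumes a Linux-style layout: <R home>/lib/libR.so.
    """
    r_exe = shutil.which("R")
    if r_exe is None:
        return None  # R is not installed or not on PATH.
    try:
        # `R RHOME` prints the R home directory.
        result = subprocess.run([r_exe, "RHOME"], capture_output=True,
                                text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    candidate = os.path.join(result.stdout.strip(), "lib", "libR.so")
    return candidate if os.path.exists(candidate) else None


path = r_shared_library_path()
print("R shared library:", path if path else "not found")
```

If nothing is found, R may have been built without shared-library support, or the library may simply live somewhere this sketch does not look.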

Command Line Dependencies
