WenchaoLin/goenrich

goenrich

Convenient GO enrichment analysis from Python, for use in Python projects.

  1. Builds the GO-ontology graph
  2. Propagates GO-annotations up the graph
  3. Performs enrichment test for all categories
  4. Performs multiple testing correction
  5. Allows for export to pandas for processing and graphviz for visualization
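
To make steps 2 and 3 concrete, here is a minimal pure-Python sketch on a toy ontology. It is not goenrich's implementation: the term ids, the gene names, and the helpers `propagate` and `hypergeom_sf` are hypothetical and exist only for illustration, and the choice of a one-sided hypergeometric test for over-representation is an assumption.

```python
from math import comb

# Toy ontology (hypothetical ids): term -> list of parent terms
parents = {
    'T:root': [],
    'T:binding': ['T:root'],
    'T:atp_binding': ['T:binding'],
    'T:metal_binding': ['T:binding'],
}

# Direct annotations: term -> genes annotated to exactly that term
direct = {
    'T:atp_binding': {'g1', 'g2', 'g3'},
    'T:metal_binding': {'g4', 'g5'},
    'T:root': {'g6', 'g7', 'g8'},
}

def propagate(parents, direct):
    """Give every term the genes annotated to any of its descendants."""
    full = {term: set(direct.get(term, ())) for term in parents}

    def push_up(term, genes):
        for parent in parents[term]:
            full[parent] |= genes
            push_up(parent, genes)

    for term, genes in direct.items():
        push_up(term, genes)
    return full

def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N genes out of M, of which n are annotated."""
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)

background = propagate(parents, direct)
all_genes = set().union(*background.values())
query = {'g1', 'g2', 'g4'}

# One p-value per term: is the query over-represented in that term?
pvalues = {}
for term, genes in background.items():
    overlap = len(genes & query)
    pvalues[term] = hypergeom_sf(overlap, len(all_genes), len(genes), len(query))
```

After propagation, `T:binding` holds the genes of both of its children, so a query can be enriched for a parent term even when no gene is annotated to it directly.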

Installation

Install the package from PyPI, then download the ontology
and the annotations you need.
pip install goenrich
mkdir db
# Ontology
wget http://purl.obolibrary.org/obo/go/go-basic.obo -O db/go-basic.obo
# UniprotACC
wget http://geneontology.org/gene-associations/gene_association.goa_ref_human.gz -O db/gene_association.goa_ref_human.gz
# Entrez GeneID
wget ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2go.gz -O db/gene2go.gz
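
For orientation, the ontology file is plain text in the OBO format, organised into `[Term]` stanzas with `id`, `name`, and `is_a` lines. The sketch below parses a small inline sample using only the standard library. The sample ids are made up and the function name `parse_obo_terms` is hypothetical; in practice goenrich's own reader, `goenrich.obo.ontology`, should be used.

```python
# Made-up sample in OBO layout; real GO ids look like GO:0005524
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: EX:0000001
name: example root term

[Term]
id: EX:0000002
name: example child term
is_a: EX:0000001 ! example root term

[Typedef]
id: part_of
name: part of
"""

def parse_obo_terms(text):
    """Return {term_id: {'name': str, 'is_a': [parent ids]}} for [Term] stanzas."""
    terms = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith('['):
            # A new stanza starts; only [Term] stanzas are collected
            current = {'name': None, 'is_a': []} if line == '[Term]' else None
        elif current is not None and ': ' in line:
            key, value = line.split(': ', 1)
            if key == 'id':
                terms[value] = current
            elif key == 'name':
                current['name'] = value
            elif key == 'is_a':
                # Drop the trailing "! comment" part of the line
                current['is_a'].append(value.split(' ! ')[0])
    return terms

terms = parse_obo_terms(SAMPLE_OBO)
```

The `is_a` lines are what define the edges of the ontology graph: each one points from a term to one of its parents.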

Run GO enrichment

import goenrich

# build the ontology
O = goenrich.obo.ontology('db/go-basic.obo')

# use all entrez geneid associations from gene2go as background
# use goenrich.read.goa('db/gene_association.goa_ref_human.gz') for uniprot
gene2go = goenrich.read.gene2go('db/gene2go.gz')
values = {k: set(v) for k,v in gene2go.groupby('GO_ID')['GeneID']}

# propagate the background through the ontology
background_attribute = 'gene2go'
goenrich.enrich.propagate(O, values, background_attribute)

# extract some list of entries as example query
query = gene2go['GeneID'].unique()[:20]

# for additional export to graphviz just specify the gvfile argument
# the show argument keeps the graph reasonably small
df = goenrich.enrich.analyze(O, query, background_attribute, gvfile='test.dot')

# generate html
df.dropna().head().to_html('example.html')

# call to graphviz
import subprocess
subprocess.check_call(['dot', '-Tpng', 'test.dot', '-o', 'test.png'])

      name                                          namespace           p             q             rejected  term
1245  response to organic cyclic compound           biological_process  2.856257e-06  6.732606e-06  1         GO:0014070
1668  ATP binding                                   molecular_function  8.821334e-09  3.325412e-07  1         GO:0005524
1988  phosphorylation                               biological_process  1.101491e-03  1.118437e-03  1         GO:0016310
3319  cellular response to organonitrogen compound  biological_process  2.639774e-05  5.084590e-05  1         GO:0071417
3422  metal ion binding                             molecular_function  1.719726e-05  3.439452e-05  1         GO:0046872
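
The `q` column holds p-values adjusted for multiple testing, and `rejected` marks the terms whose null hypothesis is rejected. As an illustration of how such an adjustment can work, the sketch below implements the Benjamini-Hochberg procedure in plain Python. That goenrich applies exactly this method is an assumption, not something stated here; the function name `benjamini_hochberg`, the `alpha` default, and the input p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return (qvalues, rejected), both in the order of the input p-values."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    rejected = [q <= alpha for q in qvalues]
    return qvalues, rejected

# Made-up p-values, only to exercise the function
qvalues, rejected = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Each q-value is at least as large as its p-value, which matches the relation between the `p` and `q` columns in the table.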

Generate a PNG image from the exported file using Graphviz:

dot -Tpng test.dot > test.png

or directly from Python:

import subprocess
subprocess.check_call(['dot', '-Tpng', 'test.dot', '-o', 'test.png'])
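
Both variants depend on the Graphviz `dot` executable being installed and on the search path. A small guard can make a missing installation easier to diagnose. This is only a sketch: the helper name `render_dot` and its `executable` parameter are hypothetical and not part of goenrich.

```python
import shutil
import subprocess

def render_dot(dot_file, png_file, executable='dot'):
    """Render a Graphviz file to PNG; return False if Graphviz is not found."""
    if shutil.which(executable) is None:
        # Graphviz is not installed or not on PATH
        return False
    subprocess.check_call([executable, '-Tpng', dot_file, '-o', png_file])
    return True
```

The function returns `True` once the image has been written, and `subprocess.check_call` still raises `CalledProcessError` if `dot` itself fails, for example on a malformed input file.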

Check the documentation for all available parameters.

Licence

This work is licenced under the MIT licence.

Contributions are welcome!

Building the documentation

sphinx-apidoc -f -o docs goenrich goenrich/tests

About

GO enrichment with python -- pandas meets networkx