MAFA - Massive automatic functional annotation

The easy way to do functional annotation of genomes and transcriptomes.

Setup

Dependencies

Python 2.7
MySQL database

Optional Dependencies

Graphing tools:

CairoSVG
Cairo
Tinycss
Cssselect
Pygal
Pycha

How to install the optional dependencies

easy_install CairoSVG tinycss cssselect pygal
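After installing, it can help to check which graphing packages are actually importable. The sketch below is not part of the repository; the module names are assumptions based on each package's usual import name, and `check_optional_modules` is a hypothetical helper.

```python
import importlib

# Assumed import names for the optional graphing packages listed above.
OPTIONAL_MODULES = ["cairosvg", "cairo", "tinycss", "cssselect", "pygal", "pycha"]


def check_optional_modules(names=OPTIONAL_MODULES):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except (ImportError, OSError):
            # OSError covers bindings whose native library is missing.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in sorted(check_optional_modules().items()):
        print("{0:<10} {1}".format(name, "OK" if ok else "missing"))
```

A missing module only affects chart generation; the dependencies listed as required are Python 2.7 and MySQL.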

Clone the git repository

git clone https://github.com/alejo0317/GeneOntology-Python.git

Configure database

Edit the Config.py file with your database settings (user, password, and database name).
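As a rough illustration of the three settings mentioned above, the configuration might look like the fragment below. The variable names and values are hypothetical placeholders, not the names used in the repository; check the shipped file for the real ones.

```python
# Hypothetical configuration values; the real variable names may differ,
# so compare against the file shipped in the repository before editing.
DB_USER = "mafa_user"      # placeholder MySQL user name
DB_PASSWORD = "change-me"  # placeholder password
DB_NAME = "mafa"           # placeholder database name
```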

Download the mappings file and populate the MySQL database

Download the mapping file idmapping.tab from ftp://ftp.pir.georgetown.edu/databases/idmapping/idmapping.tb and use it to populate your local database. The Gene Ontology associations are generated from this data.

To populate the database from idmapping.tab, run the MappingsToDB.py script. For example:
python2 MappingsToDB.py /path/to/idmapping.tab
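To make the population step more concrete, here is a minimal sketch of how one tab-separated mapping row could be split into an accession and its GO identifiers before being stored. It is not the repository's loader: the column positions (`accession_col`, `go_col`), the `;` separator inside the GO field, and the sample rows are all assumptions, so adjust them to the real file layout.

```python
def parse_mapping_line(line, accession_col=0, go_col=7, sep="\t"):
    """Split one mapping row into (accession, [GO ids]).

    Column positions are assumptions; change them to match the real file.
    """
    fields = line.rstrip("\n").split(sep)
    accession = fields[accession_col].strip()
    go_field = fields[go_col] if go_col < len(fields) else ""
    go_ids = [g.strip() for g in go_field.split(";") if g.strip()]
    return accession, go_ids


def iter_mappings(lines, **kwargs):
    """Yield (accession, go_ids) for rows that carry at least one GO id."""
    for line in lines:
        if not line.strip():
            continue
        accession, go_ids = parse_mapping_line(line, **kwargs)
        if accession and go_ids:
            yield accession, go_ids


# Synthetic rows for illustration only (not real database entries).
sample = [
    "ACC0001\tEXAMPLE_ONE\t\t\t\t\t\tGO:0005575; GO:0008150\n",
    "ACC0002\tEXAMPLE_TWO\t\t\t\t\t\t\n",
]

for accession, go_ids in iter_mappings(sample):
    print("{0} -> {1}".format(accession, ", ".join(go_ids)))
```

Rows without any GO identifier are skipped, since they cannot contribute an association.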

How to use

You can run the scripts one by one, or use the wrapper that runs the whole process.

Scripts on this repository

GoDistribution.py

Generates two files: one containing the relation between GO terms and sequences, and another containing the tab-separated counts of the requested GO terms.
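The two outputs described above can be sketched as follows. This is an illustration, not the script's actual code: the function names are hypothetical, and the sketch counts only direct annotations without walking the GO hierarchy.

```python
from collections import defaultdict


def invert_annotations(seq_to_gos):
    """Turn {sequence: [GO ids]} into {GO id: sorted list of sequences}."""
    go_to_seqs = defaultdict(set)
    for seq, gos in seq_to_gos.items():
        for go in gos:
            go_to_seqs[go].add(seq)
    return {go: sorted(seqs) for go, seqs in go_to_seqs.items()}


def count_wanted(go_to_seqs, wanted):
    """Return [(GO id, number of sequences)] for the wanted terms, in order."""
    return [(go, len(go_to_seqs.get(go, []))) for go in wanted]


def format_counts(counts):
    """Render counts as tab-separated lines, one GO term per line."""
    return "\n".join("{0}\t{1}".format(go, n) for go, n in counts)


# Toy annotations for illustration only.
seq_to_gos = {"seq1": ["GO:0005575", "GO:0008150"], "seq2": ["GO:0005575"]}
go_to_seqs = invert_annotations(seq_to_gos)
print(format_counts(count_wanted(go_to_seqs, ["GO:0005575", "GO:0003674"])))
```

Terms in the wanted list that have no annotated sequence are reported with a count of zero rather than dropped, which keeps the rows aligned across runs.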

Hits2Go.py

Associates UniProt, RefSeq, and GI accessions with GO identifiers using a mappings table.
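Conceptually, this association is a lookup of each hit's subject accession in the mappings table. The sketch below uses an in-memory dictionary as a stand-in for the MySQL table; `hits_to_go` and the sample identifiers are hypothetical.

```python
def hits_to_go(hits, mappings):
    """Map each query sequence to the GO ids of its subject accessions.

    hits:     iterable of (query, subject_accession) pairs
    mappings: dict of accession -> list of GO ids (stand-in for the DB table)
    """
    result = {}
    for query, subject in hits:
        gos = result.setdefault(query, [])
        for go in mappings.get(subject, []):
            if go not in gos:  # keep each GO id once per query
                gos.append(go)
    return result


# Toy data for illustration only.
mappings = {"ACC1": ["GO:0005575"], "ACC2": ["GO:0005575", "GO:0008150"]}
hits = [("seq1", "ACC1"), ("seq1", "ACC2"), ("seq2", "ACC3")]
print(hits_to_go(hits, mappings))
```

A query whose subjects are all missing from the table ends up with an empty list, which makes unannotated sequences easy to spot.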

GraphPie.py

Generates a pie chart from a file with the GO counts.
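A minimal sketch of this step, assuming the counts file holds one `label<TAB>count` pair per line and that Pygal (one of the optional dependencies) is installed. The function names and the default title are hypothetical; the script in the repository may work differently.

```python
def read_counts(lines):
    """Parse 'label<TAB>count' lines into a list of (label, count) pairs."""
    counts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        label, value = line.split("\t")[:2]
        counts.append((label, int(value)))
    return counts


def render_pie(counts, out_path, title="GO term distribution"):
    """Write an SVG pie chart with Pygal, one slice per label."""
    import pygal  # imported here so parsing works without Pygal installed

    chart = pygal.Pie()
    chart.title = title
    for label, value in counts:
        chart.add(label, value)
    chart.render_to_file(out_path)
```

For example, `render_pie(read_counts(open("gosCounts.tab")), "gos.svg")` would write the chart to `gos.svg`, provided the file matches the assumed layout.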

Test data included

We have included a set of files for testing these scripts. You will find them in the test_data folder.

sequences2hits.csv

A BLAST-generated file with the queries and subjects (the NR and UniProt databases were used).

sequences2Gos.csv

A mapping from sequences to their associated GO terms.

gos2Sequences.csv

A mapping from GO terms to their associated sequences.

gosCounts.tab

A ready-to-graph file of GO term counts.

Other files included

go.obo

This file contains the descriptions of the available GO terms and the relations between them. It was downloaded from http://purl.obolibrary.org/obo/go.obo
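For readers unfamiliar with the OBO layout, the sketch below shows one way to read the `[Term]` stanzas and keep each term's name, namespace, and `is_a` parents. It is a simplified reader written for illustration, not the parser used by these scripts, and it ignores obsolete flags and other relation types. The child term in the sample is made up.

```python
def parse_obo_terms(lines):
    """Collect name, namespace and is_a parents for each [Term] stanza."""
    terms = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are kept.
            current = {"is_a": []} if line == "[Term]" else None
            continue
        if current is None or ": " not in line:
            continue
        key, value = line.split(": ", 1)
        if key == "id":
            terms[value] = current
        elif key in ("name", "namespace"):
            current[key] = value
        elif key == "is_a":
            # "is_a: GO:0005575 ! cellular_component" -> keep only the id
            current["is_a"].append(value.split(" ! ")[0].strip())
    return terms


# Tiny hand-written sample; GO:9999999 is a made-up child term.
sample_obo = [
    "format-version: 1.2",
    "",
    "[Term]",
    "id: GO:0005575",
    "name: cellular_component",
    "namespace: cellular_component",
    "",
    "[Term]",
    "id: GO:9999999",
    "name: made-up child",
    "namespace: cellular_component",
    "is_a: GO:0005575 ! cellular_component",
    "",
    "[Typedef]",
    "id: part_of",
    "name: part of",
]

terms = parse_obo_terms(sample_obo)
print(sorted(terms))
```

Header lines and non-term stanzas such as `[Typedef]` are skipped, so only GO term ids appear as keys.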

Celullar_Component

A list of "2nd level" GO terms from the GO category Cellular Component (GO:0005575)

Biological_Process

A list of "2nd level" GO terms from the GO category Biological Process (GO:0008150)

Mollecular_Function

A list of "2nd level" GO terms from the GO category Molecular Function (GO:0003674)
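One plausible reading of "2nd level" is the set of terms whose direct parent is the category root. Under that assumption, such a list could be derived from `is_a` relations as sketched below; the helper name and the toy term ids are hypothetical, and other relation types such as `part_of` are ignored.

```python
def second_level_terms(parents_by_term, root):
    """Return the sorted term ids whose is_a parents include `root`."""
    return sorted(
        term for term, parents in parents_by_term.items() if root in parents
    )


# Toy hierarchy for illustration; T:A, T:B and T:C are made-up ids.
parents_by_term = {
    "GO:0005575": [],
    "T:A": ["GO:0005575"],
    "T:B": ["T:A"],
    "T:C": ["GO:0005575", "T:A"],
}

print(second_level_terms(parents_by_term, "GO:0005575"))
```

A term with several parents is included as long as one of them is the root, so it can sit at the second level and deeper at the same time.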
