BioinfUD/MAFA

MAFA - Massive automatic functional annotation

The easy way to do functional annotation of genomes and transcriptomes.

Setup

Dependencies

  • Python 2.7
  • MySQL database

Optional Dependencies

Graphing tools:

  • CairoSVG
  • Cairo
  • Tinycss
  • Cssselect
  • Pygal
  • Pycha

How to install the optional dependencies

easy_install CairoSVG tinycss cssselect pygal

Clone the git repository

git clone https://github.com/alejo0317/GeneOntology-Python.git

Configure database

Edit the config.py file with your database settings (user, password, database).

Download the mappings file and populate the MySQL database

Download the file idmapping.tab from ftp://ftp.pir.georgetown.edu/databases/idmapping/idmapping.tb and load it into your local database. The Gene Ontology associations are generated from this data.

To populate the database from idmapping.tab, use the script mappingsToDB.py. For example:

python2 mappingsToDb.py /path/to/idmapping.tab
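To illustrate what loading the mappings involves, here is a minimal sketch that extracts (accession, GO identifier) pairs from tab-separated mapping lines. It is not the code of the repository script. The function name, the column positions (`accession_col`, `go_col`) and the separator between GO identifiers are assumptions, so check them against the documentation shipped with the mapping file before relying on them.

```python
def iter_accession_go_pairs(lines, accession_col=0, go_col=7, go_separator=";"):
    """Yield (accession, go_id) pairs from tab-separated mapping lines.

    Column positions and the GO separator are assumptions for this sketch,
    not values taken from the mapping file specification.
    """
    for line in lines:
        row = line.rstrip("\n").split("\t")
        if len(row) <= max(accession_col, go_col):
            continue  # skip short or malformed rows
        accession = row[accession_col].strip()
        for go_id in row[go_col].split(go_separator):
            go_id = go_id.strip()
            if accession and go_id.startswith("GO:"):
                yield accession, go_id


# Hypothetical sample row: accession in column 0, GO terms in column 7.
fields = ["P12345", "AATM_RABIT", "", "", "", "", "", "GO:0005739; GO:0006520"]
sample_lines = ["\t".join(fields) + "\n"]
pairs = list(iter_accession_go_pairs(sample_lines))
```

Each pair could then be written to the database with a parameterized INSERT statement.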

How to use

You can run the scripts one by one, or use the wrapper that runs the whole process.

Scripts on this repository

GoDistribution.py

Generates two files: one containing the relation between GO terms and sequences, and another containing tab-separated counts of the requested GO terms.
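The two outputs described above can be sketched as follows. This is an illustration only, assuming the input is a mapping from sequence identifiers to GO identifiers. The function names are hypothetical, and GoDistribution.py may count terms differently (for example by rolling descendants up to their parent terms).

```python
from collections import defaultdict


def invert_and_count(seq_to_gos, wanted_terms=None):
    """Return (go_to_seqs, counts) from a {sequence: [go_id, ...]} mapping.

    counts is a list of (go_id, number_of_sequences) for the wanted terms,
    or for every observed term when wanted_terms is None.
    """
    go_to_seqs = defaultdict(set)
    for seq_id, go_ids in seq_to_gos.items():
        for go_id in go_ids:
            go_to_seqs[go_id].add(seq_id)
    terms = wanted_terms if wanted_terms is not None else sorted(go_to_seqs)
    counts = [(term, len(go_to_seqs.get(term, ()))) for term in terms]
    return dict(go_to_seqs), counts


def format_counts(counts):
    """Format the counts as tab-separated lines, one GO term per line."""
    return "\n".join("{0}\t{1}".format(term, n) for term, n in counts)


seq_to_gos = {"seq1": ["GO:0005575", "GO:0008150"], "seq2": ["GO:0008150"]}
go_to_seqs, counts = invert_and_count(seq_to_gos, ["GO:0008150", "GO:0003674"])
table = format_counts(counts)
```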

hits2go.py

Associates UniProt, RefSeq and GI accessions with GO identifiers using a mappings table.
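The association step can be pictured with a small sketch in which an in-memory dictionary stands in for the MySQL mappings table. The function name and the data layout (pairs of query and subject accession) are assumptions made for illustration, not the interface of hits2go.py.

```python
def hits_to_go(hits, accession_to_gos):
    """Map each query sequence to the GO ids of its BLAST subjects.

    hits: iterable of (query_id, subject_accession) pairs.
    accession_to_gos: dict standing in for the mappings table.
    """
    seq_to_gos = {}
    for query_id, subject in hits:
        go_ids = accession_to_gos.get(subject, ())
        # Every query gets an entry, even when no GO term is found.
        seq_to_gos.setdefault(query_id, set()).update(go_ids)
    return seq_to_gos


# Hypothetical data for the sketch.
mappings = {"P12345": ["GO:0005739", "GO:0006520"], "Q99999": ["GO:0005739"]}
hits = [("seq1", "P12345"), ("seq1", "Q99999"), ("seq2", "UNKNOWN")]
result = hits_to_go(hits, mappings)
```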

GraphPie.py

Generates a pie chart from a file with the GO counts.
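As a rough sketch of this step, the code below reads counts and draws a pie chart with Pygal, one of the optional dependencies. It assumes each line of the counts file has the form label, tab, integer count. The function names and the chart title are hypothetical, and GraphPie.py itself may work differently.

```python
def read_counts(lines):
    """Parse 'label<TAB>count' lines into a list of (label, count) tuples."""
    counts = []
    for line in lines:
        line = line.strip()
        if "\t" not in line:
            continue  # skip blank lines and lines without a count column
        label, value = line.rsplit("\t", 1)
        try:
            counts.append((label, int(value)))
        except ValueError:
            continue  # skip header-like lines whose count is not an integer
    return counts


def render_pie(counts, out_path, title="GO term distribution"):
    """Render the counts as an SVG pie chart (requires the pygal package)."""
    import pygal  # imported here because it is an optional dependency

    chart = pygal.Pie()
    chart.title = title
    for label, value in counts:
        chart.add(label, value)
    chart.render_to_file(out_path)


parsed = read_counts(["GO:0008150\t2\n", "\n", "GO:0003674\t0\n"])
```

Calling `render_pie(parsed, "gos.svg")` would then write the chart to a file named `gos.svg`.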

Test data included

We have included a set of files to test these scripts. You will find them in the folder test_data.

sequences2hits.csv

A BLAST-generated file with the queries and subjects (the NR and UniProt databases were used).

sequences2Gos.csv

A relation between sequences and their associated GO terms.

gos2Sequences.csv

A relation between GO terms and their associated sequences.

gosCounts.tab

A ready-to-graph file with the GO term counts.

Other files included

go.obo

This file contains the descriptions of the available GO terms and the relations between them. It was downloaded from http://purl.obolibrary.org/obo/go.obo
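For readers unfamiliar with the OBO format, here is a minimal sketch of reading the [Term] stanzas of such a file. It keeps only the `id`, `name`, `namespace` and `is_a` tags and ignores everything else. The function name is hypothetical and this is not the parser used by the scripts in this repository.

```python
def parse_obo_terms(lines):
    """Return {term_id: {"id", "name", "namespace", "is_a"}} from OBO lines."""
    terms = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are kept.
            current = {"is_a": []} if line == "[Term]" else None
            continue
        if current is None or ": " not in line:
            continue  # header lines, blank lines, or non-Term stanzas
        key, value = line.split(": ", 1)
        if key == "id":
            current["id"] = value
            terms[value] = current
        elif key in ("name", "namespace"):
            current[key] = value
        elif key == "is_a":
            # Keep only the identifier, dropping the "! comment" part.
            current["is_a"].append(value.split()[0])
    return terms


sample = """format-version: 1.2

[Term]
id: GO:0000001
name: mitochondrion inheritance
namespace: biological_process
is_a: GO:0048308 ! organelle inheritance

[Typedef]
id: part_of
name: part of
""".splitlines()
terms = parse_obo_terms(sample)
```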

Celullar_Component

A list of "2nd level" GO terms from the GO category Cellular Component (GO:0005575)

Biological_Process

A list of "2nd level" GO terms from the GO category Biological Process (GO:0008150)

Mollecular_Function

A list of "2nd level" GO terms from the GO category Molecular Function (GO:0003674)
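One way to read "2nd level" is as the direct is_a children of a category root. The sketch below follows that interpretation, which is an assumption; the lists shipped in the repository may have been built differently. The function name is hypothetical and the parent lists are a toy example, not data from go.obo.

```python
def direct_children(parents_by_term, root_id):
    """Return the sorted ids of terms that list root_id as an is_a parent."""
    return sorted(
        term_id
        for term_id, parents in parents_by_term.items()
        if root_id in parents
    )


# Toy is_a relations for illustration only.
parents_by_term = {
    "GO:A": ["GO:0008150"],
    "GO:B": ["GO:0008150"],
    "GO:C": ["GO:A"],
}
second_level = direct_children(parents_by_term, "GO:0008150")
```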
