Shabhonam/Mikado

# Mikado - pick your transcript: a pipeline to determine and select the best RNA-Seq prediction

Mikado is a lightweight Python3 pipeline to identify the most useful or “best” set of transcripts from multiple transcript assemblies. Our approach leverages transcript assemblies generated by multiple methods to define expressed loci, assign a representative transcript to each locus, and return a set of gene models that selects against transcripts that are chimeric, fragmented, or have a short or disrupted CDS. Loci are first defined based on overlap criteria, and each transcript therein is scored on up to 50 available metrics relating to ORF and cDNA size, relative position of the ORF within the transcript, UTR length and presence of multiple ORFs. Mikado can also use BLAST data to score transcripts based on protein similarity and to identify and split chimeric transcripts. Optionally, junction confidence data as provided by Portcullis [Portcullis] can be used to improve the assessment. The best-scoring transcripts are selected as the primary transcripts of their respective gene loci; additionally, Mikado can bring back other valid splice variants that are compatible with the primary isoform.
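
To make the idea of "define loci by overlap, score each transcript, keep the best one" concrete, the following is a minimal sketch and not Mikado's actual implementation. The `Transcript` record, the `toy_score` function and the single-metric scoring are all assumptions made for illustration; Mikado itself combines many metrics and uses more elaborate overlap criteria.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    # Hypothetical minimal record; real transcript objects carry far more detail.
    tid: str
    chrom: str
    strand: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    cdna_length: int
    cds_length: int


def toy_score(t: Transcript) -> float:
    # Illustrative score only: the fraction of the cDNA that is coding.
    return t.cds_length / t.cdna_length if t.cdna_length else 0.0


def group_into_loci(transcripts):
    """Cluster transcripts that overlap on the same chromosome and strand."""
    loci = []
    ordered = sorted(transcripts, key=lambda t: (t.chrom, t.strand, t.start, t.end))
    for t in ordered:
        last = loci[-1] if loci else None
        if (last and last["chrom"] == t.chrom and last["strand"] == t.strand
                and t.start <= last["end"]):
            # Overlaps the current locus: extend it.
            last["members"].append(t)
            last["end"] = max(last["end"], t.end)
        else:
            # No overlap: open a new locus.
            loci.append({"chrom": t.chrom, "strand": t.strand,
                         "end": t.end, "members": [t]})
    return [locus["members"] for locus in loci]


def pick_primary(transcripts, score=toy_score):
    """Return the best-scoring transcript of each locus."""
    return [max(members, key=score) for members in group_into_loci(transcripts)]


example = [
    Transcript("t1", "chr1", "+", 100, 900, 800, 600),
    Transcript("t2", "chr1", "+", 500, 1200, 700, 210),
    Transcript("t3", "chr1", "+", 5000, 6000, 1000, 300),
    Transcript("t4", "chr1", "-", 150, 800, 650, 0),
]
primary_ids = [t.tid for t in pick_primary(example)]
print(primary_ids)
```

In this toy input, `t1` and `t2` overlap on the same strand and form one locus, while `t3` and `t4` each form their own; the score function is passed as a parameter so that a different metric can be swapped in.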

Mikado uses GTF or GFF files as mandatory input. Optional but highly recommended input data can be generated by obtaining a set of reliable splice junctions with Portcullis, by locating coding ORFs on the transcripts using TransDecoder, and by obtaining homology information through BLASTX [Blastplus].
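
As an illustration of what the mandatory GTF input contains, the sketch below groups `exon` records by their `transcript_id` attribute using only the standard library. It is not Mikado's parser: the function name `exons_by_transcript` and the sample records are hypothetical, and the code only assumes the usual nine tab-separated GTF columns with `key "value";` attributes.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value";
ATTRIBUTE = re.compile(r'(\S+) "([^"]*)"')


def exons_by_transcript(gtf_lines):
    """Map each transcript_id to its sorted list of (start, end) exon intervals."""
    exons = defaultdict(list)
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Keep only well-formed exon records.
        if len(fields) != 9 or fields[2] != "exon":
            continue
        attributes = dict(ATTRIBUTE.findall(fields[8]))
        tid = attributes.get("transcript_id")
        if tid is not None:
            exons[tid].append((int(fields[3]), int(fields[4])))
    return {tid: sorted(intervals) for tid, intervals in exons.items()}


# Hypothetical records, deliberately listed with the exons out of order.
sample = [
    "# example annotation\n",
    'chr1\tassembler\ttranscript\t100\t900\t.\t+\t.\tgene_id "g1"; transcript_id "g1.1";\n',
    'chr1\tassembler\texon\t600\t900\t.\t+\t.\tgene_id "g1"; transcript_id "g1.1";\n',
    'chr1\tassembler\texon\t100\t300\t.\t+\t.\tgene_id "g1"; transcript_id "g1.1";\n',
]
parsed = exons_by_transcript(sample)
print(parsed)
```

The same function accepts an open file handle in place of the list, since it only iterates over lines.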

Our approach can also incorporate sequences generated by de novo Illumina assemblers or reads generated by long-read technologies such as PacBio.

Extended documentation is hosted on ReadTheDocs: http://mikado.readthedocs.org/

Installation

Mikado can be installed from PyPI with pip:

pip3.5 install mikado

Alternatively, you can clone the repository and, from within it, build and install from source with:

python3 setup.py test;
python3 setup.py bdist_wheel;
pip3.5 install dist/*whl

You can verify the correctness of the installation with the unit tests:

python3 setup.py test
