A Python package for creating an annotation file for a pseudoscaffold

mojaveazure/Pseudoscaffold_Annotator

pseudoscaffold_annotator

A program to annotate an assembled pseudoscaffold



If this program fails for you, revert to a previous release with the following commands:

git checkout -f
git reset --hard 25543ca66a9a1a7cc1ab8dc72f4c6e237e797478

This program annotates an assembled pseudoscaffold using a reference genome and its annotation file. Currently, only GFF3 files are supported as input; output can be written as either GFF3 or 3-column BED. Broader support for the BED format will come later.
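To illustrate how the two formats relate, here is a minimal sketch of turning one GFF3 feature line into a 3-column BED line. This is not the code used by pseudoscaffold_annotator.py; the function name `gff3_to_bed3` and the example feature are hypothetical. The only format facts it relies on are that GFF3 coordinates are 1-based and inclusive, while BED coordinates are 0-based and half-open.

```python
def gff3_to_bed3(gff_line):
    """Return a 3-column BED line for one GFF3 feature line.

    Returns None for blank lines and for comment/directive lines.
    Hypothetical helper for illustration only.
    """
    line = gff_line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated GFF3 columns, got %d" % len(fields))
    seqid = fields[0]
    start = int(fields[3])  # GFF3 start: 1-based, inclusive
    end = int(fields[4])    # GFF3 end: 1-based, inclusive
    # BED start is 0-based, so subtract 1; BED end is exclusive, so it stays the same
    return "%s\t%d\t%d" % (seqid, start - 1, end)


# Made-up feature line, not taken from any real annotation
example = "chr1H\t.\tgene\t101\t200\t.\t+\t.\tID=gene0001"
print(gff3_to_bed3(example))
```

Note that the 3-column form keeps only the sequence name and coordinates; the feature type, strand, and attributes from the GFF3 line are dropped.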

Running this program to annotate a pseudoscaffold is done using the following command:

./pseudoscaffold_annotator.py annotate -r REFERENCE_FASTA -a ORIGINAL_ANNOTATION -p PSEUDOSCAFFOLD_FASTA -o OUTFILE_NAME -c BLAST_CONFIG_FILE

The BLAST configuration file can be generated using the following command:

./pseudoscaffold_annotator.py blast-config

Use the -h flag to see all configuration options.

IMPORTANT

pseudoscaffold_annotator.py requires that each sequence in the pseudoscaffold FASTA file sit on a single line, with no line breaks inside the sequence. The following is not an allowed sequence:

    >pseudoscaffold
    ACTGTCAG
    GCTATCGA

The 'fix' subroutine removes new lines between sequence data, creating a FASTA file that reads like:

    >pseudoscaffold
    ACTGTCAGGCTATCGA

To fix a pseudoscaffold, run the following command:

./pseudoscaffold_annotator.py fix -p PSEUDOSCAFFOLD_FASTA -n FIXED_FASTA
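For readers who want to see what this kind of cleanup involves, the following is a minimal sketch of joining wrapped sequence lines under each FASTA header. It is not the implementation of the 'fix' subroutine; the function name `unwrap_fasta` is hypothetical, and the sketch assumes that every record starts with a `>` header line.

```python
def unwrap_fasta(lines):
    """Join wrapped sequence lines so each record has one header and one sequence.

    Takes an iterable of FASTA lines and returns a list of
    (header, sequence) tuples. Hypothetical helper for illustration only.
    """
    records = []
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line
            chunks = []
        else:
            chunks.append(line)
    # Close the final record
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


wrapped = [">pseudoscaffold\n", "ACTGTCAG\n", "GCTATCGA\n"]
for header, sequence in unwrap_fasta(wrapped):
    print(header)
    print(sequence)
```

Run on the disallowed example above, this prints the header followed by the sequence on a single line.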

This program requires Python 2.7 or higher, or Python 2.6 with the argparse module installed.

NOTE: This program has NOT been tested on Python 3.x.

Other dependencies include:

NOTE: This program has only been tested with the Morex (Barley) genome; please use with caution.

TODO

  • Add support for extracting information from BED files
  • Add parallelization support
  • Add BED to GFF annotating capabilities
  • Add GFF to BED annotating capabilities
  • Finish GFF to GFF annotation capabilities DONE!
  • Add BED to BED annotating capabilities
