buchanae/odetta

Odetta

Odetta is a set of tools for discovering and analyzing novel transcript isoforms using paired-end RNA-Seq data.

External software is used at various points in the pipeline:

  • CASHX is used for sequence alignment.
  • multisplat is used to discover splice junctions.
  • GMB is used to discover novel gene models.

Installation

TODO: note about rtree

Usage

Using Odetta might look like this...

TODO

Configuration

You can set up a mrjob configuration. For example...

mrjob.conf

runners:
  local:
    base_tmp_dir: /path/to/tmp/dir
    jobconf:
      mapreduce.job.maps: 8
      mapreduce.job.reduces: 7

Use the configuration with...

python example.py --conf-path ./mrjob.conf input_file > output_file

Setting mapreduce.job.maps and mapreduce.job.reduces is particularly useful for using all available processors when running locally (i.e. not on Hadoop).
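If you would rather derive those two values from the machine than hard-code them, a small helper script can write the file for you. The following is only a sketch: `render_mrjob_conf` is a hypothetical helper, not part of Odetta or mrjob, and the default of one fewer reducer than mappers is an assumption that simply mirrors the 8/7 split in the example above.

```python
import os


def render_mrjob_conf(tmp_dir, maps=None, reduces=None):
    """Return the text of an mrjob.conf for the local runner.

    Hypothetical helper, not part of Odetta or mrjob. The option names
    are copied from the example configuration in this README.
    """
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    if maps is None:
        maps = cpus
    if reduces is None:
        # Assumption: mirror the 8 maps / 7 reduces split shown above.
        reduces = max(1, cpus - 1)
    lines = [
        "runners:",
        "  local:",
        "    base_tmp_dir: {}".format(tmp_dir),
        "    jobconf:",
        "      mapreduce.job.maps: {}".format(maps),
        "      mapreduce.job.reduces: {}".format(reduces),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the configuration so it can be redirected into mrjob.conf.
    print(render_mrjob_conf("/path/to/tmp/dir"), end="")
```

Redirecting the script's output into ./mrjob.conf produces a file with the same layout as the example configuration.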

The mrjob docs describe all available options.

Development Notes

Odetta uses mrjob for map/reduce processing. mrjob makes developing and running map/reduce easy, both locally and on Hadoop.

Odetta has not been tested on Hadoop.
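For readers new to map/reduce, the flow that an mrjob job follows can be sketched in plain Python. This is an illustration of the pattern only: it does not use mrjob, it is not Odetta's code, and the tab-separated input format (read ID, then reference name) is invented for the example.

```python
from collections import defaultdict


def mapper(line):
    """Emit (reference_name, 1) for one alignment line.

    Assumes a hypothetical tab-separated format: read_id<TAB>reference.
    """
    fields = line.rstrip("\n").split("\t")
    yield fields[1], 1


def reducer(key, values):
    """Sum the counts collected for one reference name."""
    yield key, sum(values)


def run_local(lines):
    """Run map, group-by-key, and reduce in a single process."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)

    results = {}
    for key in sorted(groups):
        for out_key, out_value in reducer(key, groups[key]):
            results[out_key] = out_value
    return results


if __name__ == "__main__":
    sample = ["r1\tchr1", "r2\tchr1", "r3\tchr2"]
    print(run_local(sample))
```

In a real job, the framework takes care of the grouping step and of distributing mappers and reducers across processes, so only the mapper and reducer logic needs to be written.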

nose is used for unit testing. You can run the tests using nosetests tests/.

About

Tools for filtering paired-end Splat alignments
