<<<<<<< HEAD
Explanation of directory content
================================

README       - this file
Makefile     - to create release archives
src          - shogun source code
doc          - documentation (to be built using doxygen)
examples     - example files for all interfaces
applications - applications of shogun
testsuite    - the shogun test suite

The following table lists the status of each interface available in shogun:

+==================+===========================================================+
|    interface     |     status                                                |
+==================+===========================================================+
|python_modular    | mature (no known problems)                                |
|octave_modular    | mature (no known problems)                                |
|java_modular      | beta quality - not all examples are ported and working    |
|lua_modular       | alpha work in progress quality - some examples work       |
|ruby_modular      | pre-alpha work in progress quality - proof-of-concept only|
|csharp_modular    | pre-alpha work in progress quality - proof-of-concept only|
|r_modular         | pre-alpha quality: swig does not properly handle reference|
|                  | counting, so it is only for the brave: use                |
|                  | --disable-reference-counting to get it to work, but beware|
|                  | that it will leak memory; disabled by default.            |
+------------------+-----------------------------------------------------------+
|octave_static     | mature (no known problems)                                |
|matlab_static     | mature (no known problems)                                |
|python_static     | mature (no known problems)                                |
|r_static          | mature (no known problems)                                |
|libshogun_static  | mature (no known problems)                                |
|cmdline_static    | stable but some data types incomplete                     |
|                  |                                                           |
|elwms_static      | the eierlegendewollmilchsau interface: a chimera that in  |
|                  | one file interfaces with python, octave, r and matlab and |
|                  | provides the run_python command to run python code using  |
|                  | the variables available in octave, r, matlab, etc.        |
+==================+===========================================================+

Visit src/README and http://www.shogun-toolbox.org/doc/en/current for further information.
=======
This is the accurate splicer (asp) program accompanying the paper 
"Accurate Splice Site Prediction Using Support Vector Machines"
by Soeren Sonnenburg, Gabriele Schweikert, Petra Philips,
Jonas Behr and Gunnar Raetsch [1].


ASP PROGRAM REQUIREMENTS:

Asp requires a working Python installation (2.4 or later) with NumPy
(version 1.0 or later) and the shogun toolbox (version 0.7.3 or later).
Shogun is available from http://www.shogun-toolbox.org for Linux, Mac OS X
and cygwin/win32. On Debian GNU/Linux, shogun 0.7.3 is available in
Debian unstable:
http://packages.debian.org/unstable/science/shogun-python-modular

ASP PROGRAM RUNNING TIME AND MEMORY REQUIREMENTS:

Asp requires about 100 MB of memory for short sequences. Memory requirements
grow only slowly with input size (an additional term linear in the length of
the input sequence). On the first run with a new model (see the --model option
below), asp loads and decompresses the .bz2-compressed model file and stores
it as a native Python pickle dump, which increases the startup time of that
run considerably. Due to the optimizations in [2], splice form prediction
(layer 1) times do not change much for many or long sequences.
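
The first-run behaviour described above (decompress the .bz2 model once, keep
a pickle dump next to it) can be sketched roughly as follows. This is an
illustration only and not asp's actual code: the function name load_model, the
".pickle" cache suffix and the parse callback are hypothetical, the real model
format is not described here, and the sketch assumes a current Python 3 rather
than Python 2.4. It also assumes that the pickle is reused on later runs.

```python
import bz2
import os
import pickle


def load_model(bz2_path, parse):
    """Return the model stored in bz2_path, caching it as a pickle dump.

    `parse` is a hypothetical callback that turns the decompressed bytes
    into a Python object; it stands in for the unknown model format.
    """
    cache_path = bz2_path + ".pickle"  # assumed cache location
    if os.path.exists(cache_path):
        # Later runs: read the existing pickle and skip decompression.
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    # First run: decompress and parse the model, then store the result.
    with bz2.open(bz2_path, "rb") as fh:
        model = parse(fh.read())
    with open(cache_path, "wb") as fh:
        pickle.dump(model, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return model
```

With such a scheme only the first call for a given model pays for
decompression and parsing; every later call reads the pickle directly.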

ASP PROGRAM USAGE:

./asp fasta_file.fa

This reads all entries in the .fa file and prints the predictions for each
entry to stdout in .gff format. Optionally, specify the start and stop of the
transcript via --start <basenum> / --stop <basenum>, and the model via
--model <name>, where <name> is one of worm, fly, cress, fish, human.
<basenum> is zero-based.
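
When asp is driven from a script, the options above can be assembled
programmatically. The following is a minimal sketch, not part of asp itself:
the helper names build_asp_command and run_asp are hypothetical, and placing
the options before the FASTA file is an assumption, since the required
argument order is not stated here. It is written for a current Python 3.

```python
import subprocess

# Model names as listed for the --model option.
MODELS = ("worm", "fly", "cress", "fish", "human")


def build_asp_command(fasta, model=None, start=None, stop=None, asp="./asp"):
    """Assemble an asp command line (hypothetical helper).

    start and stop are zero-based base positions.
    """
    cmd = [asp]
    if model is not None:
        if model not in MODELS:
            raise ValueError("unknown model: %r" % (model,))
        cmd += ["--model", model]
    for flag, value in (("--start", start), ("--stop", stop)):
        if value is not None:
            if value < 0:
                raise ValueError("%s must be >= 0 (zero-based)" % flag)
            cmd += [flag, str(value)]
    cmd.append(fasta)  # assumed: options precede the FASTA file
    return cmd


def run_asp(fasta, gff_out, **options):
    """Run asp and redirect its stdout (the .gff output) to gff_out."""
    with open(gff_out, "w") as fh:
        subprocess.run(build_asp_command(fasta, **options),
                       stdout=fh, check=True)
```

For example, build_asp_command("fasta_file.fa", model="worm", start=0, stop=100)
returns ['./asp', '--model', 'worm', '--start', '0', '--stop', '100',
'fasta_file.fa'].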


REFERENCES:

[1] S. Sonnenburg, G. Schweikert, P. Philips, J. Behr and G. Raetsch,
	Accurate Splice Site Prediction Using Support Vector Machines,
	BMC Bioinformatics, Special Issue from NIPS workshop on New Problems and
	Methods in Computational Biology, Whistler, Canada, 18 December 2006,
	8(Suppl. 10):S7, December 2007.

[2] S. Sonnenburg, G. Rätsch, C. Schäfer and B. Schölkopf,
	Large Scale Multiple Kernel Learning,
	Journal of Machine Learning Research, 7:1531-1565, July 2006,
	K. Bennett and E.P.-Hernandez, editors.
>>>>>>> 50b22bac2693593beb11ecd477c9bb5330a1aff8