Skip to content

The DRIP Toolkit (DTK) identifies tandem mass spectra utilizing a Dynamic Bayesian network for Rapid Identification of Peptides (DRIP). DTK performs efficient dynamic Bayesian network (DBN) inference using the Graphical Models Toolkit (GMTK).

License

melodi-lab/dripToolkit

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

50 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

The DRIP Toolkit utilizes a dynamic Bayesian network (DBN) 
for Rapid Identification of Peptides (DRIP) in tandem mass spectra.  
Given an observed spectrum, DRIP scores a peptide by aligning the 
peptide's theoretical spectrum and the observed spectrum, i.e., 
computing the most probable sequence of insertions (spurious 
observed peaks) and deletions (missing theoretical peaks).  
DBN inference is efficiently performed utilizing the Graphical Models 
Toolkit (GMTK), which allows easy alterations to the model. If you use 
the DRIP toolkit in your research, please cite:

John T. Halloran, Jeff A. Bilmes, and William S. Noble. "Learning 
Peptide-Spectrum Alignment Models for Tandem Mass Spectrometry". 
Thirtieth Conference on Uncertainty in Artificial Intelligence 
(UAI 2014). AUAI, Quebic City, Quebec Canada, July 2014.

Written by John T. Halloran (halloj3@uw.edu), with contributing files from
Ajit Singh (ajit@ee.washington.edu) and Jeff Bilmes (bilmes@ee.washington.edu)

---------------------------------------------------
----------------- Installation
---------------------------------------------------
The toolkit requires the following be installed:
Cygwin (if using Windows)
g++
the Graphical Models Toolkit (GMTK) - https://melodi.ee.washington.edu/gmtk/
Python 2.7
argparse and numpy python packages
SWIG

After installing the above, perform the following in the unzipped toolkit directory:
cd pyFiles/pfile
swig -c++ -python libpfile.i
CC=g++ python setup.py build_ext -i

Assuming no errors were output, the DRIP Toolkit 
is now ready for use!  To test that the above was
compiled and linked correctly, run:
./test.py

---------------------------------------------------
----------------- Searching ms2 files
---------------------------------------------------
For convenience, sample data is included in directory data and
example DRIP Toolkit search commands are provided in test.sh.

To search an MS2 dataset data/test.ms2 given FASTA file data/yeast.fasta, perform the following steps:
1.) Digest the FASTA file using dripDigest.py.  Example:
python dripDigest.py \
    --digest-dir dripDigest-output \
    --min-length 6 \
    --fasta data/yeast.fasta \
    --enzyme 'trypsin/p' \
    --monoisotopic-precursor true \
    --missed-cleavages 0 \
    --digestion 'full-digest'

The digested peptide database will be written to directory dripDigest-output

2.) Search using dripSearch.py.  Example:
python dripSearch.py \
    --digest-dir 'dripDigest-output' \
    --spectra data/test.ms2 \
    --precursor-window 3.0 \
    --learned-means dripLearned.means \
    --learned-covars dripLearned.covars \
    --num-threads 8 \
    --top-match 1 \
    --high-res-ms2 F \
    --output dripSearch-test-output

If the data was collected utilizing high-resolution fragment ions, set --high-res-ms2 T.  The output PSMs
will be written to file dripSearch-test-output.txt.

For a detailed explanation of using the toolkit, including training DRIP, 
preparing data for cluster usage, and speeding up a search using approximate 
inference, please consult:
http://melodi-lab.github.io/dripToolkit/documentation.html

For a full list of allowable toolkit options, please consult:
http://melodi-lab.github.io/dripToolkit/dripDigest.html
http://melodi-lab.github.io/dripToolkit/dripSearch.html
http://melodi-lab.github.io/dripToolkit/dripTrain.html

---------------------------------------------------
----------------- Contaact
---------------------------------------------------
Please send all questions and bug reports to:
halloj3@uw.edu

About

The DRIP Toolkit (DTK) identifies tandem mass spectra utilizing a Dynamic Bayesian network for Rapid Identification of Peptides (DRIP). DTK performs efficient dynamic Bayesian network (DBN) inference using the Graphical Models Toolkit (GMTK).

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages