The DRIP Toolkit (DTK) identifies tandem mass spectra utilizing a Dynamic Bayesian network for Rapid Identification of Peptides (DRIP). DTK performs efficient dynamic Bayesian network (DBN) inference using the Graphical Models Toolkit (GMTK).
License
melodi-lab/dripToolkit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
The DRIP Toolkit utilizes a dynamic Bayesian network (DBN) for Rapid Identification of Peptides (DRIP) in tandem mass spectra. Given an observed spectrum, DRIP scores a peptide by aligning the peptide's theoretical spectrum and the observed spectrum, i.e., computing the most probable sequence of insertions (spurious observed peaks) and deletions (missing theoretical peaks). DBN inference is efficiently performed utilizing the Graphical Models Toolkit (GMTK), which allows easy alterations to the model. If you use the DRIP toolkit in your research, please cite: John T. Halloran, Jeff A. Bilmes, and William S. Noble. "Learning Peptide-Spectrum Alignment Models for Tandem Mass Spectrometry". Thirtieth Conference on Uncertainty in Artificial Intelligence (UAI 2014). AUAI, Quebic City, Quebec Canada, July 2014. Written by John T. Halloran (halloj3@uw.edu), with contributing files from Ajit Singh (ajit@ee.washington.edu) and Jeff Bilmes (bilmes@ee.washington.edu) --------------------------------------------------- ----------------- Installation --------------------------------------------------- The toolkit requires the following be installed: Cygwin (if using Windows) g++ the Graphical Models Toolkit (GMTK) - https://melodi.ee.washington.edu/gmtk/ Python 2.7 argparse and numpy python packages SWIG After installing the above, perform the following in the unzipped toolkit directory: cd pyFiles/pfile swig -c++ -python libpfile.i CC=g++ python setup.py build_ext -i Assuming no errors were output, the DRIP Toolkit is now ready for use! To test that the above was compiled and linked correctly, run: ./test.py --------------------------------------------------- ----------------- Searching ms2 files --------------------------------------------------- For convenience, sample data is included in directory data and example DRIP Toolkit search commands are provided in test.sh. To search an MS2 dataset data/test.ms2 given FASTA file data/yeast.fasta, perform the following steps: 1.) Digest the FASTA file using dripDigest.py. Example: python dripDigest.py \ --digest-dir dripDigest-output \ --min-length 6 \ --fasta data/yeast.fasta \ --enzyme 'trypsin/p' \ --monoisotopic-precursor true \ --missed-cleavages 0 \ --digestion 'full-digest' The digested peptide database will be written to directory dripDigest-output 2.) Search using dripSearch.py. Example: python dripSearch.py \ --digest-dir 'dripDigest-output' \ --spectra data/test.ms2 \ --precursor-window 3.0 \ --learned-means dripLearned.means \ --learned-covars dripLearned.covars \ --num-threads 8 \ --top-match 1 \ --high-res-ms2 F \ --output dripSearch-test-output If the data was collected utilizing high-resolution fragment ions, set --high-res-ms2 T. The output PSMs will be written to file dripSearch-test-output.txt. For a detailed explanation of using the toolkit, including training DRIP, preparing data for cluster usage, and speeding up a search using approximate inference, please consult: http://melodi-lab.github.io/dripToolkit/documentation.html For a full list of allowable toolkit options, please consult: http://melodi-lab.github.io/dripToolkit/dripDigest.html http://melodi-lab.github.io/dripToolkit/dripSearch.html http://melodi-lab.github.io/dripToolkit/dripTrain.html --------------------------------------------------- ----------------- Contaact --------------------------------------------------- Please send all questions and bug reports to: halloj3@uw.edu
About
The DRIP Toolkit (DTK) identifies tandem mass spectra utilizing a Dynamic Bayesian network for Rapid Identification of Peptides (DRIP). DTK performs efficient dynamic Bayesian network (DBN) inference using the Graphical Models Toolkit (GMTK).
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published