A powerful tool for identifying both cis-spliced and trans-spliced peptides
For more information about hypedsearch, installation and usage instructions, please see the documentation found here.
hypedsearch is a tool for identifying both hybrid and non-hybrid protein sequences from mass spectrometry data. hypedsearch takes MS/MS files (currently mzML only) and FASTA files as inputs and identifies peptides, both hybrid and non-hybrid.
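To make the protein database input concrete, here is a minimal sketch of reading FASTA records into a dictionary. It is not hypedsearch's own loader; the function name `read_fasta` and the example records are hypothetical, and only the standard library is used.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each record's identifier (first word after '>') to its sequence."""
    proteins: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            parts = line[1:].split()
            # Fall back to a placeholder name if the header is empty.
            name = parts[0] if parts else f"unnamed_{len(proteins)}"
            proteins[name] = ""
        elif name is not None:
            # Sequence lines may be wrapped, so concatenate them.
            proteins[name] += line
    return proteins


# Hypothetical two-record database, given as text lines.
example = """>protA example protein
MKTAYIAK
QRQISFVK
>protB
GSHMLE
""".splitlines()

proteins = read_fasta(example)
```

For a real file, the same function can be called with an open file handle, since iterating over a file yields its lines.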
hypedsearch identifies sequences using a method called "k-mer extension". A "k-mer" is a string of length k, or in this case, a sequence of k amino acids. The process, at a high level, works like this:
- Pre-process the protein database to identify all k-mers for k in the range min_peptide_len to max_peptide_len. These k-mers are generated from both the N and C terminus
- Using the k-mers generated from the N terminus side, attempt to identify a sequence of amino acids that describes the b ions in the observed spectrum
- Repeat the previous step, but from the C terminus side, and try to describe the y ions in the observed spectrum
- Filter out the poor scoring sequences
- For the rest of the sequences, attempt to align and overlap the two sequences to the spectrum
- If two sequences have no overlap, or do overlap but are from different proteins, report the alignment as hybrid
- Save all alignments
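The steps above can be sketched in a few small functions. This is an illustration under stated assumptions, not hypedsearch's implementation: every function name is hypothetical, scoring is a plain count of matched singly charged ions within a fixed tolerance, and the hybrid rule is the one stated in the list, applied literally.

```python
from typing import Dict, List, Set

PROTON = 1.00728  # proton mass (Da)
WATER = 18.01056  # H2O mass (Da)

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS: Dict[str, float] = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def kmers(sequence: str, min_len: int, max_len: int) -> Set[str]:
    """All substrings of length k for min_len <= k <= max_len."""
    return {
        sequence[i:i + k]
        for k in range(min_len, max_len + 1)
        for i in range(len(sequence) - k + 1)
    }


def b_ions(peptide: str) -> List[float]:
    """Singly charged b ion m/z values, built from the N terminus."""
    total, ions = PROTON, []
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        ions.append(total)
    return ions


def y_ions(peptide: str) -> List[float]:
    """Singly charged y ion m/z values, built from the C terminus."""
    total, ions = WATER + PROTON, []
    for aa in reversed(peptide):
        total += RESIDUE_MASS[aa]
        ions.append(total)
    return ions


def count_matches(theoretical: List[float], observed: List[float],
                  tol: float = 0.02) -> int:
    """Number of theoretical ions with an observed peak within tol (assumed score)."""
    return sum(any(abs(t - o) <= tol for o in observed) for t in theoretical)


def overlap_len(left: str, right: str) -> int:
    """Length of the longest suffix of left that is also a prefix of right."""
    for n in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def is_hybrid(b_seq: str, y_seq: str, b_protein: str, y_protein: str) -> bool:
    """Hybrid if the two sequences do not overlap or come from different proteins."""
    return overlap_len(b_seq, y_seq) == 0 or b_protein != y_protein


# Toy usage: score every k-mer of a made-up protein against a made-up spectrum.
candidates = kmers("MPEPTIDEK", 3, 5)
observed = b_ions("PEPT")  # pretend these peaks were observed
scores = {seq: count_matches(b_ions(seq), observed) for seq in candidates}
```

A real search would also handle multiple charge states, modifications, and a scoring scheme that penalizes unmatched peaks; those are left out to keep the sketch short.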
Currently hypedsearch works on Unix and macOS but has yet to be tested on Windows.
hypedsearch uses Python 3. If you don't have Python 3, you can find instructions to download it here.
First clone the repository
$> git clone https://github.com/zmcgrath96/hypedsearch.git
Goloborodko, A.A.; Levitsky, L.I.; Ivanov, M.V.; and Gorshkov, M.V. (2013) “Pyteomics - a Python Framework for Exploratory Data Analysis and Rapid Software Prototyping in Proteomics”, Journal of The American Society for Mass Spectrometry, 24(2), 301–304. DOI: 10.1007/s13361-012-0516-6
Levitsky, L.I.; Klein, J.; Ivanov, M.V.; and Gorshkov, M.V. (2018) “Pyteomics 4.0: five years of development of a Python proteomics framework”, Journal of Proteome Research. DOI: 10.1021/acs.jproteome.8b00717