Theseus-PP

Protein superposition using probabilistic programming in Pyro

Publication: https://ieeexplore.ieee.org/document/8791469

Citation:

@INPROCEEDINGS{theseusPP2019,
author={L. S. {Moreta} and A. S. {Al-Sibahi} and D. {Theobald} and W. {Bullock} and B. N. {Rommes} and A. {Manoukian} and T. {Hamelryck}},
booktitle={2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB)},
title={A Probabilistic Programming Approach to Protein Structure Superposition},
year={2019},
volume={},
number={},
pages={1-5},
keywords={Bayes methods;biology computing;expectation-maximisation algorithm;iterative methods;maximum likelihood estimation;molecular biophysics;proteins;Bayesian model;THESEUS model;probabilistic model;protein superposition;latent mean structure;deep probabilistic programming language Pyro;maximum likelihood estimation;Bayesian probabilistic models;biomolecular structure;Bayesian protein structure prediction;probabilistic programming approach;protein structure superposition;correlated atom positions;THESEUS-PP model;MAP estimation;Proteins;Probabilistic logic;Covariance matrices;Bayes methods;Biological system modeling;Quaternions;Programming;protein superposition;Bayesian modelling;deep probabilistic programming;protein structure prediction},
doi={10.1109/CIBCB.2019.8791469},
ISSN={null},
month={July},}

The Calling_SUPERPOSITION.py script runs the Theseus-PP model n times, each time with a different seed.
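The repeated-run pattern can be sketched as follows. This is a minimal illustration, not the code in Calling_SUPERPOSITION.py: `run_superposition` and `run_with_seeds` are hypothetical names, and the body of `run_superposition` is a placeholder that only returns a seed-dependent number. In a Pyro script the seed would typically be fixed with `pyro.set_rng_seed(seed)` before each run.

```python
import random


def run_superposition(seed):
    """Hypothetical stand-in for one model run; returns a placeholder score."""
    rng = random.Random(seed)  # private generator, so each run is reproducible
    return rng.random()


def run_with_seeds(n_runs, base_seed=0):
    """Run the stand-in n_runs times with seeds base_seed, base_seed + 1, ..."""
    results = {}
    for i in range(n_runs):
        seed = base_seed + i
        results[seed] = run_superposition(seed)
    return results


if __name__ == "__main__":
    for seed, score in run_with_seeds(3).items():
        print(seed, score)
```

Keeping the results keyed by seed makes it easy to rerun a single seed later and compare it with the earlier outcome.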

Animation_1adz0T.mp4

The input data is loaded on this line of Calling_SUPERPOSITION.py: https://github.com/LysSanzMoreta/Theseus-PP/blob/master/Calling_SUPERPOSITION.py#L31

DataManagement.Read_Data('../PDB_files/{}.pdb'.format(name1), '../PDB_files/{}.pdb'.format(name2), type='all', models=(0, 100), RMSD=True)

  • The PDB files are expected in a PDB_files folder; the call above reads them through the relative path '../PDB_files/'
  • name1 and name2 are usually the same file name, but they are kept separate because the NMR coordinates of some proteins are split across 2 different files (1adz1T and 1adz0T)
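As a small sketch of the two points above, the paths can be built once and checked before the data is read. The helper names `pdb_paths` and `missing_files` are hypothetical and not part of the repository; the default folder mirrors the relative path used in the call above.

```python
from pathlib import Path


def pdb_paths(name1, name2, folder="../PDB_files"):
    """Build the two PDB paths in the same form as the Read_Data call."""
    base = Path(folder)
    return base / f"{name1}.pdb", base / f"{name2}.pdb"


def missing_files(paths):
    """Return the paths (as strings) that do not point to an existing file."""
    return [str(p) for p in paths if not p.is_file()]


if __name__ == "__main__":
    # Example with the split NMR files mentioned above.
    paths = pdb_paths("1adz0T", "1adz1T")
    print("missing:", missing_files(paths))
```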

    a) PDB files containing a single sequence/model

  • type = "all": for a PDB file with a single sequence/model
  • models = (0, len(seq)): for a PDB file with a single sequence; the second number is the number of amino acids to include

    b) PDB files containing more than one model

  • type = "models"
  • models = (0, 3): for PDB files with several models in the same file; the numbers select the models to compare
  • RMSD = True: indicates whether the superposition is initialized with the RMSD (Kabsch) superposition
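The two cases can be summarized in a short helper that picks the keyword arguments for each one. This is only a sketch: `read_data_kwargs` is a hypothetical name, it does not import or call the repository's DataManagement module, and the range check on `span` is an assumption rather than documented behaviour.

```python
def read_data_kwargs(multi_model, span, rmsd=True):
    """Build keyword arguments matching the options listed above.

    multi_model: False for a PDB file with a single sequence/model (case a),
                 True for a PDB file with several models (case b).
    span:        (start, stop) tuple; amino acids in case a, models in case b.
    rmsd:        whether to initialize with the RMSD (Kabsch) superposition.
    """
    start, stop = span
    # Assumed sanity check, not taken from the repository.
    if start < 0 or stop <= start:
        raise ValueError("span must satisfy 0 <= start < stop")
    return {
        "type": "models" if multi_model else "all",
        "models": (start, stop),
        "RMSD": rmsd,
    }


if __name__ == "__main__":
    print(read_data_kwargs(False, (0, 100)))  # case a
    print(read_data_kwargs(True, (0, 3)))     # case b
```

The resulting dictionary could then be unpacked into the call shown earlier, for example `DataManagement.Read_Data(path1, path2, **read_data_kwargs(True, (0, 3)))`, assuming the function accepts these keywords exactly as in that example.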
