Protein-Ligand Binding Affinity Prediction

Dataset preparation

Training Data

Place the training data in a directory named training_data/ in the project root directory.

The training data files, in both naming and content, should follow the format in which they were given to us.

Testing Data

Similarly, place the testing data in a directory named testing_data/ in the project root directory.

The testing data files should also follow the format in which they were given to us.
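Before training, it can help to confirm that both data directories are in place. The following is a minimal sketch, not part of the repository: the directory names come from the text above, while the helper name `missing_data_dirs` is hypothetical. It only checks that each directory exists and is non-empty; it does not validate the file format.

```python
from pathlib import Path

# Directory names are taken from the README; everything else is illustrative.
REQUIRED_DIRS = ("training_data", "testing_data")


def missing_data_dirs(project_root="."):
    """Return the required data directories that are absent or empty."""
    root = Path(project_root)
    missing = []
    for name in REQUIRED_DIRS:
        path = root / name
        # A directory counts as missing if it does not exist or has no entries.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_data_dirs()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("Data directories look ready.")
```

Run it from the project root so that the relative paths resolve to the directories described above.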

Training

The following assumes that you are using Python 3 and have libraries such as numpy and keras installed. You may also need to install GraphViz, because the training step also tries to save a visualization of the model.
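The prerequisites above can be checked with a short script before starting a long training run. This is a hedged sketch rather than anything shipped with the repository: the function name `check_environment` is hypothetical, and it assumes that GraphViz is detected by finding its `dot` executable on the PATH. It reports only whether each item can be found, not whether the installed versions are compatible.

```python
import importlib.util
import shutil

# Package names are taken from the README; the list may not be exhaustive.
PYTHON_PACKAGES = ("numpy", "keras")


def check_environment(packages=PYTHON_PACKAGES):
    """Map each prerequisite to True if it can be located, else False."""
    report = {
        name: importlib.util.find_spec(name) is not None for name in packages
    }
    # GraphViz is a system program, so look for its `dot` executable.
    report["dot (GraphViz)"] = shutil.which("dot") is not None
    return report


if __name__ == "__main__":
    for item, found in check_environment().items():
        print(f"{item}: {'found' if found else 'NOT found'}")
```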

For both models, you should expect the following output:

A graph with loss and accuracy plots saved as a .png file
Model weights saved as a .h5 file
Model visualization saved as a .png file
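To see which of these files a training run produced, the outputs can be listed by extension. The sketch below assumes the files are written to the directory you pass in (the README does not say where they are saved), and the helper name `find_training_outputs` is hypothetical.

```python
from pathlib import Path


def find_training_outputs(directory="."):
    """Group file names in `directory` by the extensions listed above."""
    root = Path(directory)
    return {
        # Loss/accuracy plots and the model visualization are both .png files.
        ".png": sorted(p.name for p in root.glob("*.png")),
        # Model weights are saved as .h5 files.
        ".h5": sorted(p.name for p in root.glob("*.h5")),
    }


if __name__ == "__main__":
    for extension, names in find_training_outputs().items():
        print(extension, names if names else "(none found)")
```

The .h5 names reported here are the candidates for the weights file used in the prediction step.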

Dual-stream 3D Convolutional Neural Network

To start training the Dual-stream 3D CNN, run the following command:

python train.py

Baseline 5x25 MLP

To start training the MLP model, run the following command:

python dist_train.py

Prediction

First, set the WEIGHTS_FILENAME variable in predict_utils.py to the weights file you want to use for prediction. Then run the following command:

python predict.py

For each protein, the top ten predicted ligand candidates for binding are written to the file test_predictions.txt.
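The ranking step can be illustrated with a generic top-k selection. This is only a sketch under stated assumptions: the code of predict.py is not shown here, so the function name `top_k_ligands`, the dictionary of scores, and the convention that a higher score means stronger predicted binding are all hypothetical.

```python
import heapq


def top_k_ligands(scores, k=10):
    """Return up to k ligand IDs with the highest scores, best first.

    `scores` maps a ligand ID to its predicted score for one protein.
    Assumes (hypothetically) that a higher score means a better candidate.
    """
    return heapq.nlargest(k, scores, key=scores.get)


if __name__ == "__main__":
    # Made-up scores for illustration only; not real model output.
    example_scores = {f"ligand_{i}": i / 20 for i in range(20)}
    print(top_k_ligands(example_scores))
```

If fewer than k ligands are scored, the function returns all of them in ranked order.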

szenius/protein-ligand-model

Folders and files

Latest commit

History

Repository files navigation

Protein-Ligand Binding Affinity Prediction

Dataset preparation

Training Data

Testing Data

Training

Dual-stream 3D Convolution Neural Network

Baseline 5x25 MLP

Prediction

About

Resources

Stars

Watchers

Forks

Languages