Neural networks for predicting chemical properties, and a UI for graphing the predicted properties of datasets. A generative adversarial network is under development that learns to create molecules based on configured predicted properties and other constants such as molecular weight.
conda create -n Chemistry
conda activate Chemistry
conda install -c rdkit rdkit
conda install scikit-learn
conda install matplotlib
Requires the models to be compiled and saved first. Molecules can be entered manually as SMILES strings or loaded from a file.
- Table containing all loaded molecules
- Search bar to search through molecules
- Property display window to show predicted properties of a selected molecule
Select a graph type from the View>Graph tab. A window for configuring the graph then appears.
Data: 24652 compounds scraped from PubChem
A regression model predicts LogP from a compound's Atom Pair fingerprint.
See models/octanol_water_partition_coefficient/regression.py
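As a rough illustration of the model input, the sketch below computes a hashed Atom Pair fingerprint for one SMILES string with RDKit. The function name `atom_pair_fingerprint` and the default of 2048 bits are assumptions made for this example; the repository's actual fingerprint settings are defined in regression.py and may differ.

```python
try:
    from rdkit import Chem
    from rdkit.Chem import rdMolDescriptors
except ImportError:  # RDKit is installed by the conda steps in this README
    Chem = None


def atom_pair_fingerprint(smiles, n_bits=2048):
    """Return the hashed Atom Pair fingerprint of a SMILES string as a list of 0/1.

    Hypothetical helper: the name and the n_bits default are assumptions.
    """
    if Chem is None:
        raise RuntimeError("RDKit is required to compute fingerprints")
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # MolFromSmiles returns None when the string cannot be parsed
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    fp = rdMolDescriptors.GetHashedAtomPairFingerprintAsBitVect(mol, nBits=n_bits)
    return list(fp)
```

The resulting fixed-length vector can be used as one row of the feature matrix for a regression model, with the measured LogP as the target.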
Data: 9982 compounds from AqSolDB
Two regression models predict LogS: one from the Atom Pair fingerprint and the other from the predicted LogP. A further neural network combines both predictions to improve accuracy.
LogP: see models/solubility_logP/regression.py
Atom Pair: see models/water_solubility/regression.py
Combined: see models/combined_water_solubility/regression.py
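Combining two predictions in a further model is a form of stacking. The sketch below shows only the data flow, assuming models that expose `fit` and `predict` methods. `MeanRegressor` is a placeholder invented for this example so the code runs without extra libraries; it stands in for the neural networks, and the function names are hypothetical as well.

```python
class MeanRegressor:
    """Placeholder model (hypothetical): always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def stack(pred_a, pred_b):
    """Pair up two prediction lists into two-column rows for the combining model."""
    return [[a, b] for a, b in zip(pred_a, pred_b)]


def fit_combined(fp_model, logp_model, meta_model, fingerprints, logp_values, log_s):
    """Fit both base models, then fit the combining model on their predictions."""
    fp_model.fit(fingerprints, log_s)
    logp_model.fit(logp_values, log_s)
    meta_inputs = stack(fp_model.predict(fingerprints), logp_model.predict(logp_values))
    meta_model.fit(meta_inputs, log_s)
    return meta_model


def predict_combined(fp_model, logp_model, meta_model, fingerprints, logp_values):
    """Predict LogS by feeding both base predictions into the combining model."""
    meta_inputs = stack(fp_model.predict(fingerprints), logp_model.predict(logp_values))
    return meta_model.predict(meta_inputs)
```

In practice the combining model is usually fitted on predictions for compounds the base models did not see during training, since fitting on in-sample predictions tends to overstate accuracy.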
Data: 7952 compounds from PubChem
A regression model predicts the melting point from the LogP model's prediction and the water solubility.
See models/melting_point/reverse_gse_regression.py
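The script name suggests a link to the General Solubility Equation (GSE), log S = 0.5 - 0.01 (MP - 25) - log P, with the melting point MP in °C. That link is an assumption drawn from the file name only. The repository's model is a trained regression, not this formula. As a minimal sketch of the relationship, the functions below (hypothetical names) evaluate the GSE and solve it for the melting point.

```python
def gse_log_s(log_p, melting_point_c):
    """General Solubility Equation: log S = 0.5 - 0.01 * (MP - 25) - log P."""
    # The melting-point term is dropped for compounds that are liquid at 25 °C.
    mp_term = 0.01 * max(melting_point_c - 25.0, 0.0)
    return 0.5 - mp_term - log_p


def reverse_gse_melting_point(log_p, log_s):
    """Solve the GSE for the melting point in °C (only meaningful for MP >= 25)."""
    return 25.0 + 100.0 * (0.5 - log_p - log_s)
```

A learned regression over the same two inputs can correct for compounds where the GSE fits poorly, which may be why a model is trained here instead of applying the closed form directly.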
Data: 2694 compounds scraped from PubChem and sorted
A classification model used to determine the best fingerprint for this application. It predicts the boiling point range of a compound.
See models/boiling_point/simple_ia.py
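Predicting a range rather than a value means each measured boiling point has to be mapped to a class label before training. The sketch below shows one way to do that with `bisect`. The bin edges are hypothetical, since the ranges the repository actually uses are not stated here.

```python
from bisect import bisect_right

# Hypothetical bin edges in °C; the repository's actual ranges are not stated.
BIN_EDGES_C = [0.0, 100.0, 200.0, 300.0]


def boiling_point_class(bp_c, edges=BIN_EDGES_C):
    """Return the class index of a boiling point; bins are [low, high)."""
    return bisect_right(edges, bp_c)


def class_label(index, edges=BIN_EDGES_C):
    """Human-readable label for a class index."""
    if index == 0:
        return f"< {edges[0]:g} °C"
    if index == len(edges):
        return f">= {edges[-1]:g} °C"
    return f"{edges[index - 1]:g} to {edges[index]:g} °C"
```

With these edges a boiling point of 78.4 °C falls into class 1, labeled "0 to 100 °C". Training the same classifier on different fingerprints and comparing accuracy on these classes is one way to rank the fingerprints.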
To run the models, you first need data to train them.
For data collected from PubChem, go to: https://pubchem.ncbi.nlm.nih.gov/classification/
- Under select classification, select PubChem>PubChem Compound TOC.
- In the tree, select Chemical Properties>Experimental Properties.
- Click the number next to the property to open a new page.
- On the right, click structure download.
- Select the SMILES format, leave the other options unchanged, and download.
- Use this file as the input list for data_collection.
Run data_collection/octanol_water_partition_coefficient.py and select the path to the PubChem list of compounds (see above).
Download https://www.amdlab.nl/database/AqSolDB/
Run data_collection/aqsoldb_transform.py on the CSV file to change its format.
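The exact output format of aqsoldb_transform.py is not described here. As an illustration only, the sketch below reduces an AqSolDB-style CSV to two columns. It assumes the input has `SMILES` and `Solubility` columns, and the output header `smiles,logS` is a hypothetical choice for this example.

```python
import csv


def transform_aqsoldb(src, dst, smiles_col="SMILES", target_col="Solubility"):
    """Copy SMILES and LogS from an AqSolDB-style CSV into a two-column CSV.

    src and dst are open text file objects. Column names are assumptions.
    Returns the number of rows written.
    """
    reader = csv.DictReader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(["smiles", "logS"])  # hypothetical output header
    kept = 0
    for row in reader:
        smiles = (row.get(smiles_col) or "").strip()
        value = (row.get(target_col) or "").strip()
        if not smiles or not value:
            continue  # skip rows with a missing structure or measurement
        try:
            log_s = float(value)
        except ValueError:
            continue  # skip rows whose solubility is not numeric
        writer.writerow([smiles, log_s])
        kept += 1
    return kept
```

When used with real files, open both with `newline=""`, as the `csv` module documentation recommends.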
Run data_collection/melting_point.py and select the path to the PubChem list of compounds (see above).