An attempt to use principal component analysis for reaction scoring.

mxk62/pca

rxnpca

This script is an attempt to use principal component analysis (PCA) to identify the key characteristics that distinguish 'efficient' (working) reactions from 'inefficient' (non-working) ones.

Description

For a random sample of chemical compounds, the script employs retrosynthetic transforms developed at NU to generate the space of all possible reactions. It then performs PCA on a predefined set of reaction descriptors.
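The PCA step can be illustrated with a minimal sketch. It is not the script's actual implementation: the function name `first_principal_component` is hypothetical, and power iteration on the covariance matrix stands in for whatever PCA routine the script really calls. Rows are assumed to be reactions and columns reaction descriptor values. Only the standard library is used, to keep the example self-contained.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit direction, variance along it) for the leading component.

    rows: list of equal-length lists, one row per reaction (assumed layout).
    """
    n, d = len(rows), len(rows[0])
    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication by cov converges to the
    # eigenvector with the largest eigenvalue, provided the start vector
    # is not orthogonal to it.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance in the data (or an unlucky start vector)
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the variance captured by v.
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                   for i in range(d))
    return v, variance
```

For real work a linear algebra library would normally replace the hand-written loops, and further components would be extracted as well.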

Currently about 100 molecular descriptors are available. For each reaction, the sum of the values of a given molecular descriptor is calculated separately for its reactants and for its products. Compounds whose H-depleted molecular graphs contain only a single atom (e.g. water, 'O') are excluded, as many topological descriptors are not defined for them.

Each such pair of sums (reactants, products) is treated as the reaction descriptor corresponding to the given molecular descriptor.
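The construction of a reaction descriptor can be sketched as follows. The data layout is an assumption made for illustration: each compound is a dict with a hypothetical `heavy_atoms` count and a `descriptors` mapping, and `heavy_bonds` (the number of bonds between non-hydrogen atoms) is an example descriptor name, not necessarily one the script provides.

```python
def reaction_descriptor(reactants, products, name):
    """Return (sum over reactants, sum over products) for one descriptor.

    Each compound is a dict with hypothetical keys:
      'heavy_atoms'  number of atoms in the H-depleted molecular graph
      'descriptors'  mapping of descriptor name to its value
    """
    def total(compounds):
        return sum(
            c["descriptors"][name]
            for c in compounds
            if c["heavy_atoms"] > 1  # skip single-atom graphs such as water
        )

    return total(reactants), total(products)


# Esterification: acetic acid + ethanol -> ethyl acetate + water
acetic_acid = {"heavy_atoms": 4, "descriptors": {"heavy_bonds": 3}}
ethanol = {"heavy_atoms": 3, "descriptors": {"heavy_bonds": 2}}
ethyl_acetate = {"heavy_atoms": 6, "descriptors": {"heavy_bonds": 5}}
water = {"heavy_atoms": 1, "descriptors": {}}  # excluded, so no values needed

pair = reaction_descriptor([acetic_acid, ethanol],
                           [ethyl_acetate, water],
                           "heavy_bonds")
print(pair)  # (5, 5)
```

Filtering before the lookup means an excluded compound never has to carry a value for a descriptor that is undefined for it.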

Usage

To run the script, type (all arguments are currently optional):

rxnpca.py --size <size> --seed <seed> --selection-type <type>

where

  • size

    Sample size, i.e. number of chemicals for which reactions will be generated. Defaults to 1000.

  • seed

    Pseudo-random number generator seed. Defaults to None, meaning that the current system time will be used.

  • type

    Method of selecting among reactions that share the same product. If absent, defaults to all. Available choices are:

    • all: all available reactions will be selected,
    • random: a random pair of published and unpublished reactions will be selected for each product,
    • popular: the pair of published and unpublished reactions with the highest popularity index will be picked for each product.
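The options above could be wired up roughly as in the following sketch. It is an illustration under stated assumptions rather than the script's actual code: `build_parser` and `select` are hypothetical names, each reaction is assumed to be a dict with `product`, `published` and `popularity` keys, and a product that lacks either a published or an unpublished reaction is assumed to contribute only the half it has.

```python
import argparse
import random
from collections import defaultdict

METHODS = ("all", "random", "popular")


def build_parser():
    """Command-line interface mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="rxnpca.py")
    parser.add_argument("--size", type=int, default=1000,
                        help="number of chemicals to sample")
    parser.add_argument("--seed", type=int, default=None,
                        help="PRNG seed; None lets Python seed itself")
    parser.add_argument("--selection-type", choices=METHODS, default="all",
                        help="how to select reactions sharing a product")
    return parser


def select(reactions, method="all", rng=random):
    """Select reactions per product according to the chosen method."""
    if method not in METHODS:
        raise ValueError(f"unknown selection method: {method}")
    if method == "all":
        return list(reactions)

    # Group reactions by product, split into published / unpublished.
    by_product = defaultdict(lambda: {True: [], False: []})
    for rxn in reactions:
        by_product[rxn["product"]][rxn["published"]].append(rxn)

    selected = []
    for groups in by_product.values():
        for candidates in (groups[True], groups[False]):
            if not candidates:
                continue  # assumption: keep whichever half of the pair exists
            if method == "random":
                selected.append(rng.choice(candidates))
            else:  # "popular"
                selected.append(max(candidates,
                                    key=lambda r: r["popularity"]))
    return selected
```

Passing `random.Random(args.seed)` as `rng` would make the random selection reproducible for a given `--seed`.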
