alvinwan/storm

Semi-Supervised Deep Learning for Molecular Structures

By Alvin Wan and Allen Guo

During clathrin-mediated endocytosis (CME), clathrin surrounds molecules awaiting transport, forming a spherical coat. Our goal is to pick out clathrin undergoing this process. This repository uses semi-supervised learning to classify "cup-like" clathrin structures in STORM microscopy data for the proteins of interest. For the problem formulation and details of the approach, see our presentation slides or full report.

The clathrin data was provided by the Ke Xu lab in UC Berkeley's College of Chemistry, whose research work we are supporting. If you find this work useful for your research, please consider citing:

@citation{storm,
    Author = {Alvin Wan and Allen Guo},
    Title = {Semi-Supervised Deep Learning for Molecular Structures},
    Year = {2017}
}

Install

This project requires Python 3. Begin by navigating to the root of the repository, which we will call $STORM.

cd $STORM

(Optional) We recommend setting up a virtual environment first.

virtualenv ../env --python=python3
source ../env/bin/activate

Install all Python requirements.

pip install -r requirements.txt

Train

You can experiment with various hyperparameters and train on your own data. We approach the problem with a two-step pipeline. First, find a latent representation of the data in a lower-dimensional space. Then, run a simple classifier on the encoded data.
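To make the two-step idea concrete, here is a minimal, dependency-free sketch on toy data. It is not the repository's implementation: it assumes a one-dimensional PCA-style projection (estimated by power iteration) as the encoder and swaps in a nearest-centroid classifier in place of the SVM used later. All function names and the toy data are hypothetical. The encoder is fit without labels; only the classifier uses them.

```python
import math
import random


def center(rows):
    """Subtract the per-feature mean; return (centered rows, mean vector)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - mean[j] for j in range(d)] for r in rows], mean


def first_principal_axis(centered, iters=200, seed=0):
    """Estimate the top principal axis by power iteration on X^T X."""
    d = len(centered[0])
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        # w = X^T (X v), computed without forming the covariance matrix.
        proj = [sum(r[j] * v[j] for j in range(d)) for r in centered]
        w = [sum(p * r[j] for p, r in zip(proj, centered)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def encode(rows, mean, axis):
    """Step 1: project each sample onto the learned 1-D latent axis."""
    return [sum((r[j] - mean[j]) * axis[j] for j in range(len(axis))) for r in rows]


def fit_centroids(codes, labels):
    """Step 2: fit a nearest-centroid classifier on the encoded values."""
    groups = {}
    for c, y in zip(codes, labels):
        groups.setdefault(y, []).append(c)
    return {y: sum(cs) / len(cs) for y, cs in groups.items()}


def predict(codes, centroids):
    """Assign each encoded value to the label with the closest centroid."""
    return [min(centroids, key=lambda y: abs(c - centroids[y])) for c in codes]


# Toy data (hypothetical): two clusters separated mostly along the first feature.
train_x = [[0.0, 0.1], [0.2, -0.1], [0.1, 0.0], [5.0, 0.1], [5.2, -0.1], [4.9, 0.0]]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [[0.3, 0.05], [4.8, -0.05]]

centered, mean = center(train_x)            # the encoder never sees labels
axis = first_principal_axis(centered)
centroids = fit_centroids(encode(train_x, mean, axis), train_y)
predictions = predict(encode(test_x, mean, axis), centroids)
```

The sign of the learned axis is arbitrary, which is harmless here because the classifier only compares distances in the encoded space.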

If your data is located at data/train_molecules.mat and data/test_molecules.mat, the <data_class> mentioned below would be molecules.
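As a quick sanity check before running anything, the naming convention above can be sketched as a small helper. This assumes the pattern data/train_<data_class>.mat and data/test_<data_class>.mat generalizes from the single example given; the function names are hypothetical and not part of the repository.

```python
from pathlib import Path


def dataset_paths(data_class, data_dir="data"):
    """Return the train/test .mat paths implied by a <data_class> name."""
    root = Path(data_dir)
    return {
        split: root / f"{split}_{data_class}.mat"
        for split in ("train", "test")
    }


def missing_files(data_class, data_dir="data"):
    """List the expected dataset files that are not present on disk."""
    return [p for p in dataset_paths(data_class, data_dir).values() if not p.is_file()]


# For the example above, <data_class> is "molecules".
paths = dataset_paths("molecules")
```

Calling missing_files("molecules") from $STORM returns an empty list when both files are in place.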

Featurize

Start by picking a featurization technique.

cd $STORM
bash storm.sh encode_(ae|kmeans|pca) <data_class>

Classify

We then train a support vector machine (SVM) on the featurizations. Before running the command below, make sure you have featurized both the train and test datasets described above (for example, data/train_molecules.mat and data/test_molecules.mat).

bash storm.sh svm <data_class>
