EM Mitochondria Pipeline

Pipeline Overview

This pipeline is derived from zudi-lin/pytorch_connectomics, with the following modifications and new implementations:

Pre-process and Post-process

  • get_dataset.py: Build an .h5 volume dataset from .png proofreading files and EM images.
  • get_seg.py: Use connected components (CC) and watershed to obtain a mitochondria segmentation map from a distance-transform map, and compute precision and recall for the segmentation (a minimal sketch of this step follows the list). Derived from aarushgupta/NeuroG/post_process.py.
  • get_vast_deploy_HUMAN.py: Large-scale version of get_seg.py. After obtaining the heatmap results (20 slices each), merge them into 200-slice chunks and run post-processing.
  • trace.py: After running get_vast_deploy_HUMAN.py to get the segmentation, the labels are not consistent across the 1200 slices; this script traces the mitochondria and relabels them.
  • Upsample.py: Upsample the large volume for better visualization in Neuroglancer. For Neuroglancer visualization, please refer to aarushgupta/NeuroG/neuroG.py.
  • T_util.py: Miscellaneous utility functions.
  • torch_connectomics/run/genertae_slurm.py: Automatically generate Slurm scripts for running inference on the RC cluster.
  • torch_connectomics/run/deploy.py: Run a trained model on a large-scale dataset to produce heatmaps.
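
The post-processing in get_seg.py combines connected components and watershed; the sketch below shows the general idea using scipy and scikit-image. The function name, array shapes, and thresholds are illustrative assumptions, not the script's exact code.

# Minimal sketch of CC + watershed post-processing on a predicted distance map.
# Thresholds and names are illustrative assumptions.
import numpy as np
from scipy import ndimage
from skimage.segmentation import watershed

def distance_to_segmentation(distance_map, seed_thresh=0.9, fg_thresh=0.5):
    """Turn a predicted distance-transform map into an instance segmentation."""
    # Seeds: high-confidence interior regions, labeled by connected components.
    seeds, _ = ndimage.label(distance_map > seed_thresh)
    # Foreground mask: every voxel that is plausibly mitochondrion.
    foreground = distance_map > fg_thresh
    # Watershed grows the seeds over the foreground, guided by the inverted distance map.
    return watershed(-distance_map, markers=seeds, mask=foreground)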

Online Hard Negative Mining

  • scripts/train.py: Training script for OHEM; set args.aux and the criterion for OHEM.
  • torch_connectomics/run/train.py: Train function for OHEM.
  • torch_connectomics/model/model_zoo/unetv3.py: New UNetV3 version with auxiliary outputs.
  • torch_connectomics/model/loss/loss.py: Implementation of different OHEM losses (a generic sketch follows this list).
  • torch_connectomics/utils/vis/visualize.py: Also visualizes the auxiliary outputs and ground truths.
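
As a reference for what online hard example mining means here, the snippet below keeps only the hardest fraction of pixels when averaging a per-pixel loss. The keep_ratio parameter and the use of binary cross-entropy are assumptions; the actual variants in loss.py may differ.

# Illustrative OHEM loss: average only the hardest (highest-loss) pixels.
import torch
import torch.nn.functional as F

def ohem_bce_loss(logits, targets, keep_ratio=0.25):
    # Per-pixel binary cross-entropy, kept unreduced so pixels can be ranked.
    pixel_loss = F.binary_cross_entropy_with_logits(
        logits, targets, reduction='none').flatten()
    # Select the hardest fraction of pixels and average only over them.
    num_keep = max(1, int(keep_ratio * pixel_loss.numel()))
    hard_loss, _ = torch.topk(pixel_loss, num_keep)
    return hard_loss.mean()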

Nearest Neighbor Classification

  • scripts/train_NN.py: Training script for NNC
  • torch_connectomics/run/train_NN.py: Train function for NNC
  • torch_connectomics/run/test_NN.py: Test function for NNC
  • torch_connectomics/model/model_zoo/unetv3_ebd2.py: Model for 3D UNet+NNC; the embedding is taken from the penultimate layer (see the sketch below)
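
The classification step assigns each query voxel the label of its nearest reference embedding; a hedged sketch is below. The tensor shapes and the Euclidean metric are assumptions rather than the exact logic in test_NN.py.

# Nearest-neighbor classification on learned embeddings (illustrative only).
import torch

def nn_classify(query_emb, ref_emb, ref_labels):
    """query_emb: (Q, D), ref_emb: (R, D), ref_labels: (R,) -> (Q,) predicted labels."""
    # Pairwise Euclidean distances between query and reference embeddings.
    dists = torch.cdist(query_emb, ref_emb)   # (Q, R)
    nearest = dists.argmin(dim=1)             # index of the closest reference for each query
    return ref_labels[nearest]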

Discriminative Loss

  • scripts/train_dsc.py: Training script for the Discriminative Loss
  • torch_connectomics/run/train_dsc.py: Train function for the Discriminative Loss
  • torch_connectomics/run/test_dsc.py: Test function for the Discriminative Loss
  • torch_connectomics/model/loss/loss.py: Implementation of the Discriminative Loss (a generic sketch follows this list)
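
For reference, a discriminative loss pulls embeddings toward their instance mean and pushes different instance means apart. The sketch below follows that push-pull idea; the margins delta_v and delta_d, the background handling, and the omitted regularization term are assumptions and may differ from the version in loss.py.

# Push-pull discriminative loss on per-pixel embeddings (illustrative sketch).
import torch

def discriminative_loss(emb, labels, delta_v=0.5, delta_d=1.5):
    """emb: (N, D) embeddings, labels: (N,) instance ids (0 = background, ignored)."""
    ids = [i for i in labels.unique().tolist() if i != 0]
    if not ids:
        return emb.sum() * 0.0
    means = torch.stack([emb[labels == i].mean(dim=0) for i in ids])   # (C, D) cluster means
    # Variance (pull) term: hinge-pull embeddings toward their own instance mean.
    l_var = 0.0
    for k, i in enumerate(ids):
        dist = (emb[labels == i] - means[k]).norm(dim=1)
        l_var = l_var + torch.clamp(dist - delta_v, min=0).pow(2).mean()
    l_var = l_var / len(ids)
    # Distance (push) term: hinge-push different instance means apart.
    l_dist = 0.0
    if len(ids) > 1:
        pair = torch.cdist(means, means)
        off_diag = ~torch.eye(len(ids), dtype=torch.bool, device=emb.device)
        l_dist = torch.clamp(2 * delta_d - pair[off_diag], min=0).pow(2).mean()
    return l_var + l_dist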

Embedding

Implementation of an embedding-based paper, applied to mitochondria:

  • scripts/train_2D.py: Training script for 2D UNet+Embedding
  • torch_connectomics/run/train_2d.py: Train function for 2D UNet+Embedding
  • torch_connectomics/model/model_zoo/unet2d_ebd.py: Model for 2D UNet+Embedding
  • torch_connectomics/model/model_zoo/unetv3_ebd.py: Model for 3D UNet+Embedding (see the sketch below)
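
At a high level, these models replace the usual class-logit output with a D-dimensional embedding per pixel. The head below is a hypothetical illustration of that idea, not the code in unet2d_ebd.py or unetv3_ebd.py; the embedding dimension and the L2 normalization are assumptions.

# Hypothetical per-pixel embedding head for a 2D UNet backbone.
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingHead(nn.Module):
    def __init__(self, in_channels, embed_dim=16):
        super().__init__()
        # 1x1 convolution projects decoder features to the embedding dimension.
        self.proj = nn.Conv2d(in_channels, embed_dim, kernel_size=1)

    def forward(self, features):
        emb = self.proj(features)        # (B, D, H, W) per-pixel embeddings
        return F.normalize(emb, dim=1)   # unit-length embeddings along the channel axis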

Training & Testing in one file

The previous code requires scripts/test.py to evaluate a trained model separately; here the model is evaluated automatically every 1000 iterations during training (a sketch of this pattern follows the list below).

  • scripts/train_val.py: Scripts for training & testing
  • torch_connectomics/run/train_val.py: Function for training & testing
  • torch_connectomics/utils/net/arguments_val.py: Arguments for training & testing
  • torch_connectomics/utils/net/dataloader_val.py: Dataloader for training & testing
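
The snippet below sketches that train-then-validate loop; model, loaders, criterion, and the val_every interval are placeholder names, not the repository's actual train_val implementation.

# Sketch of training with periodic validation every `val_every` iterations.
import torch

def train_val(model, train_loader, val_loader, criterion, optimizer,
              device, max_iter=100000, val_every=1000):
    it = 0
    model.train()
    while it < max_iter:
        for volume, label in train_loader:
            volume, label = volume.to(device), label.to(device)
            optimizer.zero_grad()
            loss = criterion(model(volume), label)
            loss.backward()
            optimizer.step()
            it += 1
            if it % val_every == 0:
                # Switch to eval mode and measure the validation loss.
                model.eval()
                with torch.no_grad():
                    val_loss = sum(criterion(model(v.to(device)), l.to(device)).item()
                                   for v, l in val_loader) / max(len(val_loader), 1)
                print(f'iter {it}: val loss {val_loss:.4f}')
                model.train()
            if it >= max_iter:
                break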

Segmented Data in Neuroglancer

Please install the package as follows:

PyTorch Connectomics

Introduction

The field of connectomics aims to reconstruct the wiring diagram of the brain by mapping the neural connections at the level of individual synapses. Recent advances in electron microscopy (EM) have enabled the collection of large numbers of image stacks at nanometer resolution, but annotation requires expertise and is highly time-consuming. Here we provide a deep learning framework powered by PyTorch for automatic and semi-automatic data annotation in connectomics. This repository is actively under development by the Visual Computing Group (VCG) at Harvard University.

Key Features

  • Multitask Learning
  • Active Learning
  • CPU and GPU Parallelism

If you want new features that are relatively easy to implement (e.g. loss functions, models), please open a feature request in the issues or implement them yourself and submit a pull request. For other features that require a substantial amount of design and coding, please contact the author directly.

Environment

The code is developed and tested under the following configurations.

  • Hardware: 1-8 Nvidia GPUs with at least 12 GB of GPU memory each (change [-g NUM_GPU] accordingly)
  • Software: CentOS Linux 7.4 (Core), CUDA>=9.0, Python>=3.6, PyTorch>=1.0.0

Installation

Create a new conda environment:

conda create -n py3_torch python=3.6
source activate py3_torch
conda install pytorch torchvision cudatoolkit=9.0 -c pytorch

Download and install the package:

git clone git@github.com:zudi-lin/pytorch_connectomics.git
cd pytorch_connectomics
pip install -r requirements.txt
pip install --editable .

For more information and frequently asked questions about installation, please check the installation guide. If you meet compilation errors, please check TROUBLESHOOTING.md.

Visualization

Training

  • Visualize the training loss and validation images using tensorboardX (see the short example below).
  • Use TensorBoard with tensorboard --logdir runs (requires TensorFlow to be installed).
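
A minimal example of this logging pattern is shown below; the tag names and the runs/demo directory are illustrative.

# Minimal tensorboardX logging of a scalar loss and a validation image.
import torch
from tensorboardX import SummaryWriter

writer = SummaryWriter('runs/demo')
for step in range(100):
    loss = torch.rand(1).item()                                  # placeholder training loss
    writer.add_scalar('train/loss', loss, step)
writer.add_image('val/prediction', torch.rand(1, 64, 64), 100)   # (C, H, W) image
writer.close()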

Test

  • Visualize the affinity graph and segmentation using Neuroglancer.

Notes

Data Augmentation

We provide a data augmentation interface with several commonly used augmentation methods for EM images. The interface is pure Python; it operates on and outputs only numpy arrays, so it can be easily incorporated into any Python-based deep learning framework (e.g. TensorFlow). For more details about the design of the data augmentation module, please check the documentation.
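
To give a flavor of this numpy-in, numpy-out style, the hypothetical helper below applies random flips and 90-degree rotations to a (z, y, x) volume; it is not one of the module's actual augmentation classes.

# Hypothetical numpy-only augmentation: random flips and 90-degree xy rotations.
import numpy as np

def random_flip_rotate(volume, rng=None):
    """volume: (z, y, x) EM stack as a numpy array."""
    rng = rng or np.random.default_rng()
    if rng.random() < 0.5:
        volume = volume[:, :, ::-1]                  # flip along x
    if rng.random() < 0.5:
        volume = volume[:, ::-1, :]                  # flip along y
    k = int(rng.integers(4))
    return np.rot90(volume, k, axes=(1, 2)).copy()   # rotate in the xy-plane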

Model Zoo

We provide several encoder-decoder architectures, which can be found here. These models can be applied to any kind of semantic segmentation task on 3D image stacks. We also provide benchmark results on the SNEMI3D neuron segmentation challenge here, with detailed training specifications for users to reproduce.

Synchronized Batch Normalization on PyTorch

Previous work has suggested that a reasonably large batch size can improve the performance of detection and segmentation models. Here we use a synchronized batch normalization module that computes the mean and standard deviation across all devices during training. Please refer to Synchronized-BatchNorm-PyTorch for details. The implementation is pure Python, uses unbiased variance to update the moving average, and uses sqrt(max(var, eps)) instead of sqrt(var + eps).
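
The numerical detail in the last sentence can be shown in isolation; the snippet below is a standalone illustration of the sqrt(max(var, eps)) normalization, not the synchronized module itself.

# Normalize with sqrt(max(var, eps)) rather than the usual sqrt(var + eps).
import torch

def bn_normalize(x, mean, var, eps=1e-5):
    std = torch.sqrt(torch.clamp(var, min=eps))   # sqrt(max(var, eps))
    return (x - mean) / std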

Acknowledgement

This project is built upon numerous previous projects. In particular, we'd like to thank the contributors of the following GitHub repositories:

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

Zudi Lin
