GitHub repo

The repo is available at: https://github.com/fivebillionmph/be223c

To run with Docker

  • On the AWS server:
$ docker run project_lung

This will start a Docker container running the web server, listening on port 8085. Because the Docker container has its own private IP, you will need to look up that IP and proxy through the AWS server to reach it.

  • Find the container's ID by running:
$ docker ps
  • Then get the container's private IP with:
$ docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' CONTAINER_ID

Reference: https://stackoverflow.com/a/20686101/3704042

  • Then go to http://<CONTAINER_IP>:8085 and run the models through the interface. After submitting an image query, you may have to scroll down on the webpage to see the results.
  • Lung image files that can be used for testing submissions are found in the /data/segs2/PNG-v2 directory of the archive submitted to CCLE.
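The lookup steps above can be combined into one command. This is a sketch that assumes the container was started from the project_lung image exactly as shown:

```shell
# Find the running container started from the project_lung image,
# then print its private IP (combines the docker ps and docker inspect steps).
CONTAINER_ID=$(docker ps -q --filter ancestor=project_lung)
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$CONTAINER_ID"
```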

Architecture

Overview

This project is a webapp that serves a GUI for submitting lung images to a Flask server. The Flask server segments the lung region out from the surrounding tissue, returns the probability of disease progression from multiple deep learning classifiers, and returns a list of similar images.

Technologies used include: Flask for the web server; Keras, NumPy, and scikit-learn for deep learning; Vue.js and Bootstrap for frontend interactivity and styling. A full list of the Python libraries used can be found in the requirements.txt file.

The main entry point of the Flask web server is the /src/server.py script. Scripts in the /src/mod folder are modules loaded by the server used for reading and caching Keras models.
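The model-caching pattern described above can be sketched as follows. The function name and the stub loader are illustrative, not the actual code in /src/mod; in the real server the body would call keras.models.load_model(path):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_model(path):
    """Load a model from disk once and memoize it, so repeated web
    requests do not re-read the HDF5 file. A stub dict stands in for
    keras.models.load_model(path) so the sketch is self-contained."""
    return {"path": path}

# Repeated lookups return the same cached object; the file is read once.
m1 = get_model("/data/lesion_classification.model")
m2 = get_model("/data/lesion_classification.model")
assert m1 is m2
```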

Models and other data are mostly stored in /data. These models are generated by scripts that can be found in /src, /src/mod, /src2/amy, /src2/mohammad.

The /web directory holds the HTML, CSS and JavaScript used by the web server.

More detailed explanations of the directories and files are in the next section.

Directories and file descriptions

  • /data

    • This is where all the variable or generated data is kept. This includes the classification models, miniVGG model, automatic lung segmentation model, model test results and the lung images (original images, lesion segmentations, patches, etc).
    • This directory is not well organized; it contains unused models and images because data was swapped out and interchanged over the course of development.
    • It is gitignored, but the data is packaged in the zip file submitted to CCLE.
  • /data/miniVGG.h5

  • /data/segs2/patches-training

    • Directory of patches that are hashed and compared against for content-based image retrieval
  • /data/lesion_classification.model

    • HDF5 file for the UNET-encoder-based whole-image lesion classification model (model 1)
    • This was generated by /src/mod/classify_lesion.py
  • /data/model_lung_pro_cv_patch.h5

  • /data/model_lung_pro_cv_image1.h5

  • /data/lung_seg.model

  • /data/Train.csv

    • The training set images and their labels
  • /data/Test.csv

    • The test set images and their labels
  • /data/test-model1, /data/test-model2, /data/test-model3

    • The test result directories for the three models, containing ROC curves, AUC values, and other metrics
  • /scripts

    • Simple scripts for starting the web server and listing the installed Python modules
  • /src

    • Python scripts for running the server, segmenting, testing models, training the miniVGG, etc.
    • The description of each script and its purpose is in each file's main function
  • /src/server.py

    • The main Flask server script
  • /src/mod

    • Shared modules used by scripts in /src.
  • /src2

    • Other scripts for generating models or segmenting
  • /src2/amy

    • Scripts written by Amy for preprocessing and segmenting the lung images
  • /src2/mohammad

    • Scripts for generating the VGG16 based classifier models
  • /web

    • HTML templates and static CSS, JS and images

Project overview

This project attempts to predict if a lung cancer patient will respond to immunotherapy treatment. Lung CT scans were taken before and after treatment with anti-PD1 immunotherapy.

Three deep learning models were trained on the pre-treatment images to classify whether the disease will progress: an encoder-based model that classifies the entire image, a VGG16-based model that classifies a selected patch, and a second VGG16-based model that classifies the whole image.

A content-based retrieval system, built on miniVGG features and image similarity hashing, was also trained so that similar lesions can be returned.
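A minimal sketch of this kind of similarity-hash ranking, assuming mean-threshold binarization of a feature vector. In the real pipeline the features would come from the miniVGG model; all names and the random data here are illustrative:

```python
import numpy as np

def feature_hash(features):
    """Binarize a feature vector (e.g. miniVGG activations) into a bit
    hash: 1 where the value is above the vector's mean, else 0."""
    return (features > features.mean()).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(a != b))

def most_similar(query, database):
    """Return database keys ranked by Hamming distance to the query's hash."""
    qh = feature_hash(query)
    return sorted(database, key=lambda k: hamming(qh, feature_hash(database[k])))

rng = np.random.default_rng(0)
db = {f"patch_{i}": rng.normal(size=64) for i in range(5)}
# A query that is a slightly perturbed copy of patch_3 hashes almost
# identically to it, so patch_3 ranks first.
query = db["patch_3"] + rng.normal(scale=0.001, size=64)
ranking = most_similar(query, db)
assert ranking[0] == "patch_3"
```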

An automatic lung segmentation model was also created so that users do not need to segment the lungs out of surrounding tissue themselves.
