GitHub repo

The repo is available at: https://github.com/fivebillionmph/be223c

To run with Docker

  • On the AWS server:
$ docker run project_lung

This will start a Docker container running the web server, listening on port 8085. Because the Docker container has its own private IP, you will need to look up that IP and proxy through the AWS server to reach it.

  • Find the container's ID by running:
$ docker ps
  • Then get the container's private IP with:
$ docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' CONTAINER_ID

Reference: https://stackoverflow.com/a/20686101/3704042

  • Then go to http://<CONTAINER_IP>:8085 and run the models through the interface. After submitting an image query, you may have to scroll down on the webpage to see the results.
  • Lung image files that can be used for testing submissions are found in the /data/segs2/PNG-v2 directory of the archive submitted to CCLE.
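The lookup steps above can be combined into one command. This is a sketch that assumes the container was started from the project_lung image exactly as shown:

```shell
# Find the running container started from the project_lung image,
# then print its private IP (combines the docker ps and docker inspect steps).
CONTAINER_ID=$(docker ps -q --filter ancestor=project_lung)
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$CONTAINER_ID"
```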

Architecture

Overview

This project is a webapp that serves a GUI for submitting lung images to a Flask server. The Flask server segments the lung region out from the surrounding tissue, returns the probability of disease progression from multiple deep learning classifiers, and returns a list of similar images.

Technologies used include: Flask for the web server; Keras, NumPy, and scikit-learn for deep learning; Vue.js and Bootstrap for frontend interactivity and styling. A full list of the Python libraries used can be found in the requirements.txt file.

The main entry point of the Flask web server is the /src/server.py script. Scripts in the /src/mod folder are modules loaded by the server used for reading and caching Keras models.
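The model-caching pattern described above can be sketched as follows. The function name and the stub loader are illustrative, not the actual code in /src/mod; in the real server the body would call keras.models.load_model(path):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_model(path):
    """Load a model from disk once and memoize it, so repeated web
    requests do not re-read the HDF5 file. A stub dict stands in for
    keras.models.load_model(path) so the sketch is self-contained."""
    return {"path": path}

# Repeated lookups return the same cached object; the file is read once.
m1 = get_model("/data/lesion_classification.model")
m2 = get_model("/data/lesion_classification.model")
assert m1 is m2
```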

Models and other data are mostly stored in /data. These models are generated by scripts that can be found in /src, /src/mod, /src2/amy, /src2/mohammad.

The /web directory holds the HTML, CSS and JavaScript used by the web server.

More detailed explanations of the directories and files are in the next section.

Directories and file descriptions

  • /data

    • This is where all the variable or generated data is kept. This includes the classification models, miniVGG model, automatic lung segmentation model, model test results and the lung images (original images, lesion segmentations, patches, etc).
    • This directory is not well organized; it contains unused models and images because data was swapped out and interchanged over the course of development.
    • It is gitignored, but the data is packaged in the zip file submitted to CCLE.
  • /data/miniVGG.h5

  • /data/segs2/patches-training

    • Directory of patches that are hashed and compared against for content-based image retrieval
  • /data/lesion_classification.model

    • HDF5 file for the UNET-encoder-based whole-image lesion classification model (model 1)
    • This was generated by /src/mod/classify_lesion.py
  • /data/model_lung_pro_cv_patch.h5

  • /data/model_lung_pro_cv_image1.h5

  • /data/lung_seg.model

  • /data/Train.csv

    • The training set images and their labels
  • /data/Test.csv

    • The test set images and their labels
  • /data/test-model1, /data/test-model2, /data/test-model3

    • The test result directories for the three models, containing ROC curves, AUC values, and other metrics
  • /scripts

    • Simple scripts for starting the web server and listing the installed Python modules
  • /src

    • Python scripts for running the server, segmenting, testing models, training the miniVGG, etc.
    • The description of each script and its purpose is in each file's main function
  • /src/server.py

    • The main Flask server script
  • /src/mod

    • Shared modules used by scripts in /src.
  • /src2

    • Other scripts for generating models or segmenting
  • /src2/amy

    • Scripts written by Amy for preprocessing and segmenting the lung images
  • /src2/mohammad

    • Scripts for generating the VGG16 based classifier models
  • /web

    • HTML templates and static CSS, JS and images

Project overview

This project attempts to predict if a lung cancer patient will respond to immunotherapy treatment. Lung CT scans were taken before and after treatment with anti-PD1 immunotherapy.

Three deep learning models were trained on the pre-treatment images to classify whether the disease will progress: an encoder-based model that classifies the entire image, a VGG16-based model that classifies a selected patch, and a second VGG16-based model that classifies the whole image.

A content-based retrieval system, built on miniVGG features and image similarity hashing, was also trained so that similar lesions can be returned.
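A minimal sketch of this kind of similarity-hash ranking, assuming mean-threshold binarization of a feature vector. In the real pipeline the features would come from the miniVGG model; all names and the random data here are illustrative:

```python
import numpy as np

def feature_hash(features):
    """Binarize a feature vector (e.g. miniVGG activations) into a bit
    hash: 1 where the value is above the vector's mean, else 0."""
    return (features > features.mean()).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(a != b))

def most_similar(query, database):
    """Return database keys ranked by Hamming distance to the query's hash."""
    qh = feature_hash(query)
    return sorted(database, key=lambda k: hamming(qh, feature_hash(database[k])))

rng = np.random.default_rng(0)
db = {f"patch_{i}": rng.normal(size=64) for i in range(5)}
# A query that is a slightly perturbed copy of patch_3 hashes almost
# identically to it, so patch_3 ranks first.
query = db["patch_3"] + rng.normal(scale=0.001, size=64)
ranking = most_similar(query, db)
assert ranking[0] == "patch_3"
```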

An automatic lung segmentation model was also created so that users do not need to segment the lungs out of surrounding tissue themselves.
