Colorectal_Histopathology

This dataset consists of histological images of colorectal cancer, one of the most common cancers. A reliable tissue classification mechanism is important for this pathology. Because most datasets only separate disease from non-disease, having several tissue classes can help improve patient treatment.

Data:

The dataset is composed of 5000 RGB samples, each 150×150 pixels (height × width). The main objective is to correctly classify the tissue type shown in each colorectal cancer image. The 8 classes are:

  • Tumor;
  • Stroma;
  • Complex;
  • Lympho;
  • Debris;
  • Mucosa;
  • Adipose;
  • Empty;

Limitations:

The major limitation of this dataset is the low number of samples available. The dataset is balanced, so no sampling technique is required or used.

What this project offers

  • A Jupyter notebook with a preliminary analysis of the problem;
  • Data augmentation, used to increase the number of training samples available for model learning;
  • Implementations of four convolutional architectures to solve the problem: AlexNet, VGGNet, ResNet and DenseNet;
  • Use of the PSO (Particle Swarm Optimization) algorithm to optimize the structure and other hyperparameters of the convolutional architectures;
  • An ensemble technique that improves on the performance of the individual architectures by averaging their predicted probability distributions;
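The averaging ensemble mentioned in the last item can be sketched as follows. This is a minimal illustration in plain Python, not the repository's actual code: the function names `average_ensemble` and `predict_labels` are hypothetical, and the class index order is assumed to follow the class list above.

```python
from typing import List, Sequence

# Assumption: label indices follow the order of the class list in this README.
CLASSES = ["Tumor", "Stroma", "Complex", "Lympho",
           "Debris", "Mucosa", "Adipose", "Empty"]


def average_ensemble(
    model_probs: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Average class probabilities across models.

    model_probs[m][i][c] is the probability that model m assigns
    to class c for sample i.
    """
    n_models = len(model_probs)
    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [
            sum(model_probs[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        for i in range(n_samples)
    ]


def predict_labels(avg_probs: Sequence[Sequence[float]]) -> List[int]:
    """Pick the class with the highest averaged probability per sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in avg_probs]


# Hypothetical outputs of two models for a single sample.
model_a = [[0.6, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
model_b = [[0.1, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]

averaged = average_ensemble([model_a, model_b])
labels = predict_labels(averaged)
print([CLASSES[i] for i in labels])
```

In practice the per-model probabilities would come from each trained network's prediction output for the same batch of images. Averaging lets a confident model outweigh an uncertain one, which a simple majority vote would not capture.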

Results - Colorectal Histopathology:

| Model | Memory | Macro Average F1-Score | Accuracy | File |
|---|---|---|---|---|
| AlexNet | 19.0 MB | 94.2% | 94.3% | AlexNet h5 File |
| VGGNet | 15.5 MB | 94.5% | 94.6% | VGGNet h5 File |
| ResNet | 11.4 MB | 95.5% | 95.7% | ResNet h5 File |
| DenseNet | 17.9 MB | 96.0% | 96.1% | DenseNet h5 File |
| Ensemble Average All Models | 21.4 MB | 95.5% | 95.6% | Ensemble All Models h5 File |
| Ensemble Average Res + Dense | 9.9 MB | 96.6% | 96.6% | Ensemble Best Combination h5 File |
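For readers unfamiliar with the two metrics reported above, here is a minimal sketch of how macro-averaged F1 and accuracy are computed from true and predicted labels. The function name `macro_f1_and_accuracy` is hypothetical and the labels are toy values, not results from this project. The repository may compute these metrics differently (for example with a library).

```python
from typing import Sequence, Tuple


def macro_f1_and_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int], n_classes: int
) -> Tuple[float, float]:
    """Return (macro-averaged F1, accuracy) for integer class labels."""
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); a class that never occurs scores 0 here.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    # Macro average: every class counts equally, regardless of its size.
    macro_f1 = sum(f1_scores) / n_classes
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return macro_f1, accuracy


# Toy example with two classes.
macro_f1, accuracy = macro_f1_and_accuracy([0, 0, 1, 1], [0, 1, 1, 1], n_classes=2)
print(f"macro F1 = {macro_f1:.3f}, accuracy = {accuracy:.3f}")
```

Because the dataset is balanced, macro F1 and accuracy are expected to stay close to each other, which matches the table.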

How can I use it

  1. Clone the project: git clone https://github.com/bundasmanu/Colorectal_Histopathology.git;
  2. Install the requirements: pip install -r requirements.txt;
  3. Check the config.py file and adjust the configuration variables used to read, obtain and split the data, as well as the variables used to build, train and optimize the architectures:
    • Samples are read from ../input/images/LESION_NAME/*.tif, e.g. ../input/images/STROMA/image1.tif. This path is only an example: check it and adjust it to your setup before using the project;
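To check that your folder layout matches the pattern in step 3, a small sketch like the one below can list every image together with the label taken from its folder name. The function `list_samples` is hypothetical and is not part of the repository. The actual reading logic is controlled by config.py.

```python
from pathlib import Path
from typing import List, Tuple


def list_samples(images_root) -> List[Tuple[Path, str]]:
    """Collect (image_path, label) pairs from images_root/LESION_NAME/*.tif.

    The label is the name of the folder that contains the image.
    """
    root = Path(images_root)
    samples = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image_path in sorted(class_dir.glob("*.tif")):
            samples.append((image_path, class_dir.name))
    return samples


if __name__ == "__main__":
    # Path taken from the README example; adjust it to your setup.
    root = Path("../input/images")
    if root.is_dir():
        samples = list_samples(root)
        print(f"Found {len(samples)} images")
        for path, label in samples[:5]:
            print(label, path.name)
    else:
        print(f"{root} does not exist; adjust the path first")
```

If the count printed differs from the 5000 samples described above, the path or the folder names probably need adjusting.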

Data Access:

https://www.kaggle.com/kmader/colorectal-histology-mnist

License

GPL-3.0 License
I am open to new ideas and improvements to the current repository. However, until the defense of my master's thesis, I will not accept pull requests.
