AI-powered prostate cancer ISUP grading. Team rähmä.ai solution for the Prostate cANcer graDe Assessment (PANDA) Challenge. We ranked 24th (top 3%) in the competition. The models included in this repository achieve 0.930 QWK (quadratic weighted kappa) on the private test set and 0.904 QWK on the public test set.
- Download the PANDA dataset.
- Clone this repository.
- cd into the cloned repository and install dependencies:
python3 -m venv venv
source venv/bin/activate
pip3 install --upgrade pip
pip3 install -r requirements.txt
- (Optional) Install ResNeSt pre-trained models package
pip install git+https://github.com/zhanghang1989/ResNeSt
- Find all serial section replicates by running Detect_serial_sections.ipynb
The training data contains near-duplicate slides that come from serial sections. The same tissue sample is sliced multiple times, so these slides are physically different parts of the tissue rather than true duplicates, although they may look similar.
Replicate slides of the same tissue sample are physically different and shouldn't be removed. However, they can be very similar to each other and they share the same labels, which is problematic for evaluation. To avoid rewarding sample memorization, replicates should not be split between the training and validation sides of a cross-validation split.
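To keep replicates from ending up on both the training and the validation side of a split, folds can be assigned per replicate group rather than per slide. The sketch below is a minimal illustration and not the repository's code: it assumes each slide already has a replicate group ID (the `group_of` mapping is hypothetical, for example derived from the output of Detect_serial_sections.ipynb) and greedily places whole groups into the currently smallest fold.

```python
from collections import defaultdict


def assign_group_folds(group_of, n_folds=5):
    """Assign every slide to a fold so that slides sharing a group share a fold.

    group_of: dict mapping slide ID -> replicate group ID (hypothetical input).
    Returns a dict mapping slide ID -> fold index in [0, n_folds).
    """
    # Collect the slides that belong to each replicate group.
    members = defaultdict(list)
    for slide_id, group_id in group_of.items():
        members[group_id].append(slide_id)

    fold_sizes = [0] * n_folds
    fold_of = {}
    # Place the largest groups first, always into the currently smallest fold,
    # which keeps the fold sizes roughly balanced.
    for group_id in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        target = fold_sizes.index(min(fold_sizes))
        for slide_id in members[group_id]:
            fold_of[slide_id] = target
        fold_sizes[target] += len(members[group_id])
    return fold_of


# Toy example: "a1" and "a2" are serial sections of the same tissue sample.
example_groups = {"a1": "A", "a2": "A", "b1": "B", "c1": "C", "d1": "D"}
example_folds = assign_group_folds(example_groups, n_folds=2)
```

This sketch ignores label stratification. If scikit-learn is available, `GroupKFold` or `StratifiedGroupKFold` provide the same grouping guarantee.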
- Sample tissue parts of the training data to generate tile training sets.
- Level 1 6x6 256-tiles from 256 slide size
- Level 1 4x6 256-tiles from 384 slide size
- Level 1 5x5 299-tiles from 299 slide size
Our method samples tile images along the tissue skeleton. The first image shows the WSI (whole slide image) with the sampling locations marked in green, and the second image shows the cropped tiles. The figure below was sampled with 6x6 256-sized tiles using a 256 slide size.
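The idea of placing tiles along a skeleton can be sketched as follows. This is only an illustration under stated assumptions, not the repository's sampling code: it assumes the tissue skeleton has already been extracted as an ordered list of (x, y) pixel coordinates, and the function name `sample_tile_boxes` and its parameters are hypothetical. It spaces the tile centres evenly by arc length along that path and returns one square crop box per tile.

```python
import math


def sample_tile_boxes(path, n_tiles, tile_size):
    """Return n_tiles square crop boxes centred at evenly spaced points on a path.

    path: ordered list of (x, y) skeleton points (assumed to be precomputed).
    Boxes are (left, top, right, bottom) in the same pixel coordinates.
    """
    if len(path) < 2 or n_tiles < 1:
        raise ValueError("need at least two path points and one tile")

    # Cumulative arc length at every path point.
    cumulative = [0.0]
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        cumulative.append(cumulative[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cumulative[-1]

    half = tile_size / 2
    boxes = []
    segment = 0
    for i in range(n_tiles):
        # Target distance along the path; each centre sits mid-interval.
        target = total * (i + 0.5) / n_tiles
        # Advance to the path segment that contains the target distance.
        while segment < len(path) - 2 and cumulative[segment + 1] < target:
            segment += 1
        seg_len = cumulative[segment + 1] - cumulative[segment]
        t = 0.0 if seg_len == 0 else (target - cumulative[segment]) / seg_len
        (x0, y0), (x1, y1) = path[segment], path[segment + 1]
        cx, cy = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
        boxes.append((round(cx - half), round(cy - half),
                      round(cx + half), round(cy + half)))
    return boxes
```

For a 6x6 layout of 256-pixel tiles, this would correspond to `n_tiles=36` and `tile_size=256`. Boxes that extend past the slide border would still need clipping or padding before cropping.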
Train using the Train template notebook.
The training readme has additional instructions.
We trained the 256, 299 and 384 models that we used in the final submission with these scripts (in order).
256 model
train_256_ordinal_0.py
train_256_ordinal_1.py
384 model
train_384_ordinal_0.py
train_384_ordinal_1.py
train_384_ordinal_2.py
299 model
train_299_0.py
train_299_1.py
train_299_2.py
train_299_3.py
Please see our inference notebook, which uses the 256 and 384 models. It scored 0.930 QWK on the private test set and 0.904 QWK on the public test set.
The version that we used in the final competition submission scored 0.926 QWK on the private test set and 0.907 QWK on the public test set.
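For reference, the QWK (quadratic weighted kappa) metric quoted above can be computed as in the sketch below. This is a minimal pure-Python version of the standard definition, assuming integer ISUP grades from 0 to 5. It is not taken from the repository's code, and the function name is illustrative.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes=6):
    """Quadratic weighted kappa between two lists of integer grades 0..n_classes-1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    # Observed confusion matrix and the two marginal histograms.
    observed = [[0] * n_classes for _ in range(n_classes)]
    hist_true = [0] * n_classes
    hist_pred = [0] * n_classes
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
        hist_true[t] += 1
        hist_pred[p] += 1

    n = len(y_true)
    numerator = 0.0    # weighted observed disagreement
    denominator = 0.0  # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            expected = hist_true[i] * hist_pred[j] / n
            numerator += weight * observed[i][j]
            denominator += weight * expected
    if denominator == 0:
        # Degenerate case: labels and predictions all equal one single grade.
        return 1.0
    return 1.0 - numerator / denominator
```

A score of 1.0 means perfect agreement and 0.0 means chance-level agreement. If scikit-learn is installed, `cohen_kappa_score(y_true, y_pred, weights="quadratic")` computes the same metric.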