Code adapted from [Camelyon17](https://github.com/Camelyon17/camelyon17)
- torch
- torchvision
- openslide
- opencv
- matplotlib
- First download the label data `lesion_annotations.zip`, unzip it, and you will find the label data `lesion_annotations/patient###_node_#.xml` (where each `#` represents a digit) for the image data files `centre_#/patient###_node_#.tif` you will download.
- From the folders `centre_0` ... `centre_4`, only download the `patient###.zip` files that have label files in the `lesion_annotations` folder above; otherwise you would have to download a much larger number of huge image files that cannot be used in training because they lack labels.
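Selecting only the labeled archives can be scripted. This is a minimal sketch, assuming the annotation files follow the `patient###_node_#.xml` pattern shown above; the `wanted_patients` helper name is mine, not the repo's:

```python
import re

def wanted_patients(annotation_names):
    """Collect patient IDs that have at least one annotation XML,
    so only the matching patient###.zip archives need downloading."""
    patients = set()
    for name in annotation_names:
        m = re.match(r"patient(\d{3})_node_\d\.xml$", name)
        if m:
            patients.add("patient" + m.group(1))
    return sorted(patients)

# Two annotated nodes of patient 017 collapse into one zip to fetch
names = ["patient017_node_2.xml", "patient017_node_4.xml", "patient020_node_1.xml"]
print(wanted_patients(names))  # → ['patient017', 'patient020']
```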
```sh
python3 unzip_sh.py > unzip_all.sh
chmod +x unzip_all.sh
mkdir tif
./unzip_all.sh
```
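The actual `unzip_sh.py` lives in the repo; conceptually it just emits one `unzip` command per downloaded archive onto stdout, which the redirect above captures into `unzip_all.sh`. A rough sketch (the glob pattern and output directory are assumptions, not taken from the script):

```python
import glob

def emit_unzip_commands(zip_paths, out_dir="tif"):
    """Build one unzip line per patient archive, extracting
    everything into the tif/ folder created in the step above."""
    return ["unzip -o {} -d {}".format(p, out_dir) for p in sorted(zip_paths)]

if __name__ == "__main__":
    # Printed lines become the body of unzip_all.sh via redirection
    for line in emit_unzip_commands(glob.glob("centre_*/patient*.zip")):
        print(line)
```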
Cut each huge `.tif` image file into small `.png` patches, placed in a folder named after the image file, and create downsized `.png` masks according to the labels (tumor/normal) of the patches:
```sh
python3 make_patch.py
python3 get_thumbnail.py
```
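The patch cutting itself is done by `make_patch.py`. The core idea, tiling a huge slide into fixed-size patches and labeling each one by how much of it overlaps the annotated tumor region, can be sketched like this (the patch size and overlap threshold are illustrative, not the repo's values):

```python
def tile_coords(width, height, patch=304):
    """Top-left coordinates of non-overlapping patch-sized tiles
    that fit fully inside a (width x height) slide."""
    return [(x, y)
            for y in range(0, height - patch + 1, patch)
            for x in range(0, width - patch + 1, patch)]

def label_patch(tumor_fraction, threshold=0.5):
    """Call a patch 'tumor' when enough of its area lies inside
    the annotated tumor region, else 'normal'."""
    return "tumor" if tumor_fraction >= threshold else "normal"

coords = tile_coords(1000, 700, patch=304)
print(len(coords))  # → 6 (3 columns x 2 rows of full tiles)
```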
Create a manifest of the generated patches:
```sh
python3 make_manifest.py
```
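What `make_manifest.py` writes is repo-specific; a plausible minimal version pairs every patch path with a label so the training loader can iterate them. In this sketch the label is taken from the patch's parent folder name purely for illustration (the repo derives labels from the downsized masks):

```python
import os

def build_manifest(patch_dirs):
    """List (path, label) rows for every .png patch, labeling each
    row by its parent folder name (e.g. 'tumor' / 'normal')."""
    rows = []
    for d in patch_dirs:
        label = os.path.basename(os.path.normpath(d))
        for fname in sorted(os.listdir(d)):
            if fname.endswith(".png"):
                rows.append((os.path.join(d, fname), label))
    return rows
```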
```sh
python3 train.py --train_epoch=10
python3 main.py
```