Using Convolutional Neural Networks to generate water masks from SAR data.
Installing dependencies is straightforward with pipenv. First, install the GDAL development libraries:
$ sudo apt-get install libgdal-dev
Then install the python packages:
$ pipenv install --dev
Specifying the --dev flag will also install the dependencies you will need to run the training and unit tests.
NOTE: If you have trouble installing PyGDAL, make sure that the package version in the Pipfile corresponds to the version of your GDAL installation.
To tile your TIFF image, create a folder named prep_tiles in the same directory as main.py. Store the TIFF file within this folder, like below:
AI_Water
├── prep_tiles
└── name_of_img.tiff
Next, run this command in the terminal (note that 512 is the tile dimension and could be any value, but tiles must be 512 to be run through the provided neural network):
$ python3 scripts/prepare_data.py tile name_of_img.tiff 512
To get more help on tiling, run this command:
$ python3 scripts/prepare_data.py tile -h
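Conceptually, tiling just splits the raster into fixed-size square chips. The sketch below shows the idea with NumPy (the actual script reads the GeoTIFF via GDAL; this simplified version drops partial tiles at the edges, which may differ from the script's behavior):

```python
import numpy as np

def tile_image(img: np.ndarray, size: int = 512) -> list:
    """Split a 2-D array into non-overlapping size x size tiles,
    dropping any partial tiles at the right and bottom edges."""
    rows, cols = img.shape
    return [
        img[r:r + size, c:c + size]
        for r in range(0, rows - size + 1, size)
        for c in range(0, cols - size + 1, size)
    ]

# A 1024 x 1536 image yields a 2 x 3 grid of 512 x 512 tiles.
tiles = tile_image(np.zeros((1024, 1536)), size=512)
```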
In the terminal run the command:
$ python3 scripts/prepare_data.py classify prep_tiles
To get more help, run the command:
$ python3 scripts/prepare_data.py classify -h
To run the neural network, your data will first need to be prepared. This example has a binary output because it includes a labels.json file; a masked data set would not have a labels.json file.
Within the same directory that main.py resides in, create a new folder called datasets. Wrap all of your data and metadata into a folder, then move that folder into datasets. Below is an example of a tiled data set that is ready to be restructured.
AI_Water
└── datasets
└── example_rtc # Each data set gets a directory
├── labels.json # Your .json file needs to be named labels.json
├── img1.tif
└── img2.tif
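The exact schema of labels.json depends on how the classify step wrote it. Purely as an illustration, assuming it maps tile filenames to a label string, it could be inspected like this (both the schema and the label values here are hypothetical):

```python
import json

# Hypothetical labels.json contents; the real schema may differ.
labels = json.loads('{"img1.tif": "water", "img2.tif": "not_water"}')

# Collect the tiles labeled as water.
water_tiles = [name for name, label in labels.items() if label == "water"]
```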
Once your data is in the correct directory run the following command:
$ python3 scripts/prepare_data.py prepare datasets/example_rtc .3
This will move the image tiles into the directory structure expected by the training script using a holdout of 30%.
To get more information on preparing the data set run:
$ python3 scripts/prepare_data.py prepare -h
At this point your data set is ready and the directory should look like this:
AI_Water
└── datasets
└── example_rtc
├── labels.json
├── test
│ └── img1.tif
└── train
└── img2.tif
The project is organized into directories as follows.
AI_Water
├── datasets
│ └── example_rtc # Each data set gets a directory
│ ├── labels.json
│ ├── test
│ └── train
├── models
│ └── example_net # Each model gets a directory containing .h5 files
│ ├── epoch1.h5
│ ├── history.json
│ └── latest.h5
├── src # Neural network source code
├── tests # Unit and integration tests
│ ├── unit_tests
│ └── integration_tests
└── ...
This project uses pytest for unit testing. The easiest way to run the tests is with pipenv. Make sure you have installed the development dependencies with:
$ pipenv install --dev
Then you can run the tests and get the full report with:
$ pipenv run tests
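Tests follow standard pytest conventions: plain functions whose names start with test_ and that use bare assert statements. A hypothetical example (normalize is an invented helper, not a real project function):

```python
import numpy as np

def normalize(img):
    """Hypothetical helper: scale an array to the [0, 1] range."""
    lo, hi = float(img.min()), float(img.max())
    return (img - lo) / (hi - lo)

def test_normalize_range():
    out = normalize(np.array([[0.0, 5.0], [10.0, 2.5]]))
    assert out.min() == 0.0
    assert out.max() == 1.0
```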
- Move your data set (along with labels.json) to the datasets folder.
- If you're loading in weights, run main.py with the --continue option. If you're not loading them in and you're restarting the training of the CNN, you will need to run main.py with the --overwrite option.
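A simplified sketch of how a pair of --continue/--overwrite flags can be parsed with argparse (this is illustrative only, not the project's actual main.py; note the dest must be renamed because continue is a Python keyword, and treating the flags as mutually exclusive is an assumption):

```python
import argparse

parser = argparse.ArgumentParser(description="Train or resume a model (sketch).")
group = parser.add_mutually_exclusive_group()
group.add_argument("--continue", dest="resume", action="store_true",
                   help="load saved weights and keep training")
group.add_argument("--overwrite", action="store_true",
                   help="discard saved weights and restart training")

args = parser.parse_args(["--continue"])
```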
Start training a new network:
$ python3 main.py train awesome_net awesome_dataset --epochs 10
Evaluate the model's performance:
$ python3 main.py test awesome_net awesome_dataset
Train for an additional 20 epochs:
$ python3 main.py train awesome_net awesome_dataset --epochs 20 --continue
You can view information about a model's performance with model_info.py. This includes a summary of model parameters, a visualization of convolutional filters, a graph of training history, and more.
View the model's training history:
$ python3 scripts/model_info.py awesome_net history
For a list of available statistics run the help command:
$ python3 scripts/model_info.py -h
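The history.json file shown in the directory layout above typically holds per-epoch training metrics. Assuming a Keras-style mapping of metric names to lists (the keys and values here are hypothetical), it can also be summarized directly:

```python
import json

# Hypothetical history.json contents; real keys depend on the model's metrics.
history = json.loads('{"loss": [0.62, 0.41, 0.33], "acc": [0.71, 0.84, 0.90]}')

# Find the epoch (1-indexed) with the best accuracy.
best_epoch = max(range(len(history["acc"])), key=lambda i: history["acc"][i])
print(f"best accuracy {history['acc'][best_epoch]:.2f} at epoch {best_epoch + 1}")
```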