
Context-aware Deep Representation Learning for Geo-spatiotemporal Analysis

Code for ICDM 2020 paper Context-aware Deep Representation Learning for Geo-spatiotemporal Analysis.

Data Preprocessing

Data Sources

  1. County-level soybean yields (2003 to 2018) are downloaded from the USDA NASS Quick Stats Database.
  2. Landcover classes are from the MODIS product MCD12Q1 and are downloaded through Google Earth Engine. gee_county_lc.py and gee_landcover.py call the Google Earth Engine API to download the data, using county boundaries from Google's fusion tables to fetch the landcover classes for each county separately (see the sketch after this list).
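A minimal sketch of this download step with the Earth Engine Python API; it is an illustrative example, not the repository's gee_county_lc.py / gee_landcover.py, and the collection version, export parameters, and county bounding box below are assumptions:

```python
# Illustrative sketch only: export one year of MCD12Q1 landcover for a single
# county bounding box via the Earth Engine Python API. The repository scripts
# instead iterate over county boundaries; the bounding box here is a placeholder.
import ee

ee.Initialize()  # assumes `earthengine authenticate` has already been run

# MCD12Q1 yearly landcover; LC_Type1 is the IGBP classification band.
landcover = (
    ee.ImageCollection('MODIS/006/MCD12Q1')
    .filterDate('2018-01-01', '2018-12-31')
    .first()
    .select('LC_Type1')
)

# Placeholder county geometry (xmin, ymin, xmax, ymax).
county_bbox = ee.Geometry.Rectangle([-94.0, 41.0, -93.0, 42.0])

task = ee.batch.Export.image.toDrive(
    image=landcover.clip(county_bbox),
    description='mcd12q1_example_county_2018',
    region=county_bbox,
    scale=500,          # MCD12Q1 native resolution is 500 m
    maxPixels=1e9,
)
task.start()
```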

Input:

  1. Vegetation indices, including NDVI and EVI, are from the MODIS product MOD13A3 and downloaded from AppEEARS.
  2. Precipitation is from the PRISM dataset.
  3. Land surface temperature is from the MODIS product MOD11A1 and downloaded from AppEEARS.
  4. Elevation is from the NASA Shuttle Radar Topography Mission Global 30 m product and downloaded from AppEEARS.
  5. Soil properties, including sand, silt, and clay fractions, are from the STATSGO database.

Preprocessing

Data from the various sources are first converted to a unified netCDF4 format, keeping their original resolutions. They are then rescaled to the 1 km MODIS product grid, as sketched below.
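A minimal sketch of the rescaling step, assuming xarray and per-source netCDF files with plain latitude/longitude coordinates; the file names and variable layout are placeholders, not the repository's actual preprocessing code:

```python
# Illustrative sketch: interpolate a coarser source (e.g. PRISM precipitation at
# ~4 km) onto the 1 km MODIS grid defined by a MOD13A3 layer.
import xarray as xr

target = xr.open_dataset('mod13a3_ndvi_2018.nc')   # placeholder path (1 km MODIS grid)
prism = xr.open_dataset('prism_precip_2018.nc')    # placeholder path (native resolution)

# Linear interpolation onto the MODIS latitude/longitude coordinates.
prism_1km = prism.interp(
    lat=target['lat'],
    lon=target['lon'],
    method='linear',
)

prism_1km.to_netcdf('prism_precip_2018_1km.nc')
```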

Experiment Data Generation

Quadruplet sampling code is in the data_preprocessing/sample_quadruplets folder. Its functions are called by generate_experiment_data.py to generate the experiment data.
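The exact quadruplet definition lives in the sampling code and the paper; below is only an illustrative sketch of the idea suggested by the --neighborhood-radius and --distant-radius flags used in training: an anchor tile, two tiles within the neighborhood radius, and one tile beyond the distant radius. All names and the grid layout are assumptions for illustration.

```python
# Illustrative sketch only, not the repository's sampler.
import numpy as np

rng = np.random.default_rng(0)

def sample_offset(min_r, max_r):
    """Random (row, col) offset whose Chebyshev distance lies in [min_r, max_r]."""
    while True:
        dr, dc = rng.integers(-max_r, max_r + 1, size=2)
        if min_r <= max(abs(dr), abs(dc)) <= max_r:
            return int(dr), int(dc)

def shifted(pos, offset):
    return (pos[0] + offset[0], pos[1] + offset[1])

def sample_quadruplet(grid_h, grid_w, neighborhood_radius=25, distant_radius=100):
    # Keep the anchor far enough from the grid edge that all samples stay in bounds.
    margin = 2 * distant_radius
    anchor = (int(rng.integers(margin, grid_h - margin)),
              int(rng.integers(margin, grid_w - margin)))
    # Two tiles inside the neighborhood radius of the anchor.
    neighbor1 = shifted(anchor, sample_offset(1, neighborhood_radius))
    neighbor2 = shifted(anchor, sample_offset(1, neighborhood_radius))
    # One tile beyond the distant radius.
    distant = shifted(anchor, sample_offset(distant_radius + 1, 2 * distant_radius))
    return anchor, neighbor1, neighbor2, distant

print(sample_quadruplet(1000, 1000))
```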

Modeling

We provide code for the context-aware representation learning model and all baselines mentioned in the paper, including traditional models for scalar inputs, deep Gaussian models, CNN-LSTM, and C3D.

A few examples of commands to train the models:

  1. attention model - semi-supervised (see the loss sketch after this list):
python ./crop_yield_train_semi_transformer.py --neighborhood-radius 25 --distant-radius 100 --weight-decay 0.0 --tilenet-margin 50 --tilenet-l2 0.2 --tilenet-ltn 0.001 --tilenet-zdim 256 --attention-layer 1 --attention-dff 512 --sentence-embedding simple_average --dropout 0.2 --unsup-weight 0.2 --patience 9999 --feature all --feature-len 9 --year 2018 --ntsteps 7 --train-years 10 --query-type combine
  2. attention model - supervised:
python ./crop_yield_train_semi_transformer.py --neighborhood-radius 25 --distant-radius 100 --weight-decay 0.0 --tilenet-margin 50 --tilenet-l2 0.2 --tilenet-ltn 0.001 --tilenet-zdim 256 --attention-layer 1 --attention-dff 512 --sentence-embedding simple_average --dropout 0.2 --unsup-weight 0.0 --patience 9999 --feature all --feature-len 9 --year 2018 --ntsteps 7 --train-years 10 --query-type combine

  When the query type is set to "combine", the hybrid attention mechanism introduced in the ICDM 2020 paper is used. You can also test the other query types ("global", "fixed", "separate") on your data.

  3. C3D:
python ./crop_yield_train_c3d.py --patience 9999 --feature all --feature-len 9 --year 2018 --ntsteps 7 --train-years 10
  4. CNN-LSTM:
python ./crop_yield_train_cnn_lstm.py --patience 9999 --feature all --feature-len 9 --year 2018 --ntsteps 7 --train-years 10 --tilenet-zdim 256 --lstm-inner 512
  5. deep Gaussian:
python ./crop_yield_deep_gaussian.py --type cnn --time 7 --train-years 10
  6. traditional models:
python ./crop_yield_no_spatial.py --predict no_spatial --train-years 10
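The --unsup-weight flag controls how much the unsupervised representation loss contributes during semi-supervised training; the supervised run above simply sets it to 0.0. Below is a minimal sketch of one plausible way the two losses combine, assuming a margin-based embedding loss in the spirit of the --tilenet-margin and --tilenet-l2 flags; it is not the repository's training code.

```python
# Illustrative sketch only: combine a supervised yield-regression loss with an
# unsupervised margin-based embedding loss, weighted by unsup_weight.
import torch
import torch.nn.functional as F

def embedding_loss(z_anchor, z_neighbor, z_distant, margin=50.0, l2=0.2):
    # Neighbors should be closer to the anchor than distant tiles, by at least `margin`.
    d_near = ((z_anchor - z_neighbor) ** 2).sum(dim=1)
    d_far = ((z_anchor - z_distant) ** 2).sum(dim=1)
    triplet = F.relu(d_near - d_far + margin)
    # L2 regularization on the embeddings.
    reg = l2 * (z_anchor.norm(dim=1) + z_neighbor.norm(dim=1) + z_distant.norm(dim=1))
    return (triplet + reg).mean()

def total_loss(pred_yield, true_yield, z_anchor, z_neighbor, z_distant,
               unsup_weight=0.2, margin=50.0, l2=0.2):
    supervised = F.mse_loss(pred_yield, true_yield)
    unsupervised = embedding_loss(z_anchor, z_neighbor, z_distant, margin, l2)
    return supervised + unsup_weight * unsupervised
```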

Cite this work

License

MIT licensed. See the LICENSE file for details.
