Vadermit/TransPAI

TransPAI

MIT License · Python 3.7

Transportation data online Prediction And Imputation (TransPAI).

This is the code repository for the paper 'Real-time Spatiotemporal Prediction and Imputation of Traffic Status Based on LSTM and Graph Laplacian Regularized Matrix Factorization', which has been submitted to Transportation Research Part C: Emerging Technologies.

Strategic aim

Mine the spatiotemporal characteristics of transportation data to predict future traffic status, and impute missing entries as real-time data are collected.

Tasks and challenges

Tasks

  • Online traffic data prediction and imputation

    • Online prediction: Predict traffic status at the next time step using real-time observation data.
    • Online imputation: Impute incomplete observations as real-time data are collected.

Challenges

  • Incomplete observations

The data we acquire may be incomplete because of detector malfunction, data transmission errors, and so on. We need to mine the characteristics of the data and make predictions from insufficient information. Missing data takes two basic forms:

  • Point-wise missing (PM): Each sensor loses observations at individual time steps, completely at random.
  • Continuous missing (CM): Each sensor loses observations over continuous periods, e.g. a whole day.
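
The two missing patterns can be simulated with boolean masks. The snippet below is a minimal sketch, not code from this repository: the function names, the 20% missing rate, and the 288 time steps per day (5-minute intervals) are illustrative assumptions.

```python
import numpy as np

def pointwise_missing_mask(n_sensors, n_steps, missing_rate, rng):
    """PM: boolean mask (True = observed); every entry is dropped independently."""
    return rng.random((n_sensors, n_steps)) >= missing_rate

def continuous_missing_mask(n_sensors, n_days, steps_per_day, missing_rate, rng):
    """CM: boolean mask (True = observed); each sensor loses whole days at random."""
    day_observed = rng.random((n_sensors, n_days)) >= missing_rate
    # Expand each per-day flag to all time steps of that day.
    return np.repeat(day_observed, steps_per_day, axis=1)

rng = np.random.default_rng(0)
# Assumed setup: 4 sensors, 6 days, 288 steps per day, 20% missing.
pm = pointwise_missing_mask(4, 6 * 288, 0.2, rng)
cm = continuous_missing_mask(4, 6, 288, 0.2, rng)
print("PM missing fraction:", (~pm).mean())
print("CM missing fraction:", (~cm).mean())
```

Under PM the gaps are scattered over single time steps, while under CM a sensor is either fully observed or fully missing within a day, which leaves no same-day observations of that sensor to interpolate from.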

Overview

Accurate real-time prediction of traffic status is critical for advanced traffic management and travel navigation guidance. There have been many attempts to predict short-term traffic flows using various deep learning algorithms. Most existing prediction models are tested only on spatiotemporal data with no missing entries. However, this ideal situation rarely exists in the real world because of sensor or network transmission failures, so missing data is a non-negligible problem. Previous studies either remove time series with missing entries or impute missing data before building prediction models. The former may leave insufficient data for model training, while the latter adds computational burden and makes prediction performance depend directly on imputation accuracy.

Proposed method

We propose a framework based on matrix factorization (MF) that makes spatiotemporal predictions from raw incomplete data and performs online data imputation at the same time. Its key component is a spatially and temporally regularized matrix factorization model, LSTM-GL-ReMF.
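
To illustrate why matrix factorization can work directly on incomplete data, here is a minimal sketch of a plain masked factorization fitted by alternating least squares. It is not the LSTM-GL-ReMF model: it uses only a ridge penalty, and the function name, rank, and penalty weight are assumptions chosen for the example.

```python
import numpy as np

def masked_als(Y, mask, rank, lam=0.1, n_iters=50, seed=0):
    """Fit Y ~ W @ X.T using only the entries where mask is True.

    Y: (sensors, time steps) data; values at masked-out positions are ignored.
    W: (sensors, rank) spatial factors, X: (time steps, rank) temporal factors.
    """
    rng = np.random.default_rng(seed)
    n, t = Y.shape
    W = 0.1 * rng.standard_normal((n, rank))
    X = 0.1 * rng.standard_normal((t, rank))
    ridge = lam * np.eye(rank)
    for _ in range(n_iters):
        # Update each sensor's factor from the time steps it was observed at.
        for i in range(n):
            Xo = X[mask[i]]
            W[i] = np.linalg.solve(Xo.T @ Xo + ridge, Xo.T @ Y[i, mask[i]])
        # Update each time step's factor from the sensors observed at that step.
        for j in range(t):
            Wo = W[mask[:, j]]
            X[j] = np.linalg.solve(Wo.T @ Wo + ridge, Wo.T @ Y[mask[:, j], j])
    return W, X

# Synthetic rank-2 data with 30% of the entries removed at random.
rng = np.random.default_rng(1)
Y_true = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 60))
mask = rng.random(Y_true.shape) >= 0.3
W, X = masked_als(np.where(mask, Y_true, 0.0), mask, rank=2, lam=0.01)
Y_hat = W @ X.T
rmse_missing = np.sqrt(np.mean((Y_hat[~mask] - Y_true[~mask]) ** 2))
print(f"RMSE on the held-out entries: {rmse_missing:.4f}")
```

Because the loss only touches observed entries, no imputation step is needed before training, and the product of the factors fills in the missing entries as a by-product.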

  • LSTM Graph Laplacian Regularized Matrix Factorization (LSTM-GL-ReMF)

Building on Temporal Regularized Matrix Factorization (TRMF), we propose a novel LSTM and Graph Laplacian regularized matrix factorization (LSTM-GL-ReMF). Its temporal regularizer is based on the Long Short-Term Memory (LSTM) model, and its spatial regularizer is based on Graph Laplacian (GL) spatial regularization. These regularizers incorporate complex spatial and temporal dependencies into the matrix factorization process for more accurate prediction.

[Figure: illustration of LSTM-GL-ReMF]

The proposed MF model can easily be reduced to the LSTM Regularized Matrix Factorization (LSTM-ReMF) model by dropping the Graph Laplacian spatial regularizer. LSTM-ReMF and LSTM-GL-ReMF can also be extended to their tensor decomposition versions, LSTM-ReTF and LSTM-GL-ReTF respectively, by following the tensor Canonical Polyadic (CP) decomposition method.
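
The two building blocks mentioned above can be sketched in a few lines of NumPy. This is a hedged illustration rather than the repository's implementation: it assumes the common unnormalized Laplacian L = D - A with the penalty tr(WᵀLW), and a third-order tensor with one factor matrix per mode. The function names are hypothetical.

```python
import numpy as np

def graph_laplacian(A):
    """Unnormalized Laplacian L = D - A of a symmetric adjacency matrix A."""
    return np.diag(A.sum(axis=1)) - A

def spatial_penalty(W, L):
    """tr(W^T L W), which equals 0.5 * sum_ij A_ij * ||w_i - w_j||^2."""
    return float(np.trace(W.T @ L @ W))

def cp_reconstruct(U, V, X):
    """CP model of a 3-way tensor: T[i, j, k] = sum_r U[i, r] * V[j, r] * X[k, r]."""
    return np.einsum("ir,jr,kr->ijk", U, V, X)

# Three sensors on a line: 0 - 1 - 2.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
L = graph_laplacian(A)
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 2))          # one rank-2 spatial factor per sensor
print("spatial penalty:", spatial_penalty(W, L))

# Assumed tensor layout for the example: sensor x day x time of day.
T = cp_reconstruct(W, rng.standard_normal((7, 2)), rng.standard_normal((288, 2)))
print("tensor shape:", T.shape)
```

The penalty is small when neighbouring sensors have similar factors. Giving it zero weight in the objective removes the spatial coupling, which corresponds to dropping the GL regularizer as described above.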

  • An online prediction and imputation framework for spatiotemporal traffic status

We propose a framework based on the aforementioned LSTM-(GL-)ReMF/TF models that makes spatiotemporal predictions from raw incomplete data and performs online data imputation at the same time. The framework consists of two steps: static training, followed by dynamic prediction and imputation.

[Figure: the online prediction and imputation framework]
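
One way to picture the dynamic step is the loop sketched below. It assumes the spatial factor matrix W was fixed during static training, and it replaces the LSTM with a persistence placeholder so the example stays self-contained. The function names, the ridge-style update, and the weight lam are assumptions for illustration, not the update rules from the paper.

```python
import numpy as np

def predict_factor(history):
    """Placeholder temporal model (persistence). The framework uses an LSTM here."""
    return history[-1]

def update_factor(W, y, observed, x_pred, lam=1.0):
    """Fit the new temporal factor to the observed sensors, pulled toward x_pred."""
    Wo = W[observed]
    lhs = Wo.T @ Wo + lam * np.eye(W.shape[1])
    rhs = Wo.T @ y[observed] + lam * x_pred
    return np.linalg.solve(lhs, rhs)

def online_step(W, history, y, observed, lam=1.0):
    """One time step: forecast first, then impute once the partial reading arrives."""
    x_pred = predict_factor(history)
    y_forecast = W @ x_pred                       # available before y is seen
    x_new = update_factor(W, y, observed, x_pred, lam)
    y_imputed = np.where(observed, y, W @ x_new)  # keep real readings, fill the gaps
    history.append(x_new)
    return y_forecast, y_imputed

# Synthetic stream: 10 sensors, rank-3 factors, about 30% missing per step.
rng = np.random.default_rng(0)
W = rng.standard_normal((10, 3))
history = [np.zeros(3)]
x_true = rng.standard_normal(3)
for step in range(5):
    x_true = x_true + 0.1 * rng.standard_normal(3)
    y = W @ x_true
    observed = rng.random(10) >= 0.3
    y_seen = np.where(observed, y, np.nan)        # missing readings arrive as NaN
    y_forecast, y_imputed = online_step(W, history, y_seen, observed, lam=0.1)
    print(step, "imputation error:", np.abs(y_imputed - y).max())
```

The forecast is produced before the new observation is used, and the same factor update then yields the imputed values, so prediction and imputation share one model.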

Model Comparison

  • Our proposed models

Datasets: Seattle Speed Data, Shanghai Pollutant Data
Online tasks: Prediction, Imputation

Proposed Model    Code Format
LSTM-ReMF         Jupyter Notebook
LSTM-GL-ReMF      Jupyter Notebook
LSTM-ReTF         Jupyter Notebook
LSTM-GL-ReTF      Jupyter Notebook

  • Baseline models

Datasets: Seattle Speed Data, Shanghai Pollutant Data
Online tasks: Prediction, Imputation

Baseline Model    Code Format
TRMF              Jupyter Notebook
BTMF              Jupyter Notebook
LSTM              Jupyter Notebook
GRU-D             Jupyter Notebook
GCN-DDGF          Python Code
TGC-LSTM          Jupyter Notebook

License

This work is released under the MIT license.
