Field inversion and machine learning

Introduction

Machine learning has had a huge impact on fields where theory-based models have failed to perform satisfactorily. Examples of such fields are image processing, speech recognition, and machine translation. However, machine learning can also play an important role in improving methods in fields where physical theories have traditionally dominated. An important difference from the aforementioned fields is that in physics-dominated domains, the majority of the problem can be modeled using physical laws. Directly applying machine learning to map the input of the problem to some output of interest therefore usually does not work well. However, identifying the step where the largest modeling assumption is made and applying machine learning to that step can significantly improve the results of these methods.

Field inversion and machine learning

One way to incorporate machine learning into physical models is the paradigm of field inversion and machine learning [1]. An important advantage of this paradigm is that it enables incorporating prior knowledge into the model, and also allows the modeler to extract modeling knowledge from the results. The paradigm can be summarized as follows (a rough code sketch follows the list):

  1. Define some corrective term in the base model
  2. Extract the optimal corrective function from high fidelity data
  3. Train a machine learning model to estimate the corrective function, given a set of features
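
As a rough illustration of how these three steps fit together, the toy sketch below uses entirely synthetic data and placeholder models (it is not the code in this repository):

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.gaussian_process import GaussianProcessRegressor

z = np.linspace(0.0, 1.0, 50)
truth = np.sin(np.pi * z)              # stand-in for high-fidelity data

def base_model(beta):
    # Step 1: base model augmented with a spatially varying corrective field beta
    return beta * z * (1.0 - z)

def objective(beta):
    # Step 2: field inversion -- data misfit plus a simple regularization term
    return np.sum((base_model(beta) - truth) ** 2) + 1e-3 * np.sum((beta - 1.0) ** 2)

beta_map = minimize(objective, np.ones_like(z)).x

# Step 3: train a machine learning model mapping features to the corrective field
gp = GaussianProcessRegressor().fit(z.reshape(-1, 1), beta_map)
```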

Optimization problem

One way of defining the optimization step is as maximizing the probability of the corrective term given the data, i.e. finding the maximum a posteriori (MAP) solution. Assuming that the prior and the discrepancy between the model output and the high-fidelity data are normally distributed gives us the following posterior.
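
In standard MAP notation (the symbols here are assumptions, not necessarily those of the original), this posterior reads

$$
p(\beta \mid d) \propto \exp\left[ -\frac{1}{2}\left(h(\beta) - d\right)^T C_{obs}^{-1} \left(h(\beta) - d\right) - \frac{1}{2}\left(\beta - \beta_{prior}\right)^T C_\beta^{-1} \left(\beta - \beta_{prior}\right) \right] = \exp\left(-J\right),
$$

where $\beta$ is the corrective term, $h(\beta)$ the corresponding model output, $d$ the high-fidelity data, and $C_{obs}$ and $C_\beta$ the observational and prior covariance matrices.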

As we want to maximize the posterior, our optimization objective is to minimize $J$. If we choose our prior and observational covariance matrices to be simple identity matrices multiplied by some constant, the optimization problem reduces to minimizing the sum of the squared discrepancies plus a regularization term.
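
Under the notation assumed above, choosing $C_{obs} = \sigma_{obs}^2 I$ and $C_\beta = \sigma_\beta^2 I$ gives

$$
J(\beta) = \frac{1}{2\sigma_{obs}^2} \left\lVert h(\beta) - d \right\rVert_2^2 + \frac{1}{2\sigma_\beta^2} \left\lVert \beta - \beta_{prior} \right\rVert_2^2 ,
$$

i.e. a least-squares data misfit plus a Tikhonov-style regularization term.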

Gradients

If we want to use gradient-based optimization methods, we need some way to find the gradient of the objective function with respect to the corrective term. For small-scale problems, it is easy to find these gradients using a finite difference approximation. However, in applications where the modeling problem is discretized into a large number of cells (e.g. in computational fluid dynamics), this approach is computationally infeasible.

We can rewrite the gradient of the objective function using the chain rule.
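
With $T$ denoting the (discrete) primal solution, this chain-rule expansion presumably takes the standard form

$$
\frac{\mathrm{d} J}{\mathrm{d} \beta} = \frac{\partial J}{\partial \beta} + \frac{\partial J}{\partial T} \frac{\mathrm{d} T}{\mathrm{d} \beta} .
$$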

The explicit derivatives are easy to obtain: they can be derived directly from our definition of the objective function. However, the sensitivity of the state variables with respect to the corrective term cannot be obtained straightforwardly. In addition, we have a set of governing equations, which we can rewrite as
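
Presumably, this is the residual form, with $R$ denoting the discretized residual:

$$
R(T, \beta) = 0 .
$$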

As we don't want the validity of our governing equations to change if we change the corrective term, we can write
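
A plausible form, consistent with the chain rule above, is that the total derivative of the residual vanishes:

$$
\frac{\mathrm{d} R}{\mathrm{d} \beta} = \frac{\partial R}{\partial \beta} + \frac{\partial R}{\partial T} \frac{\mathrm{d} T}{\mathrm{d} \beta} = 0 .
$$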

Again, the explicit derivatives follow straightforwardly from the discretization of the governing equations. Introducing some new set of variables $\psi$, which we will determine later, we can write the gradient as
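
Presumably, this follows by adding $\psi^T$ times the (zero-valued) total derivative of the residual to the chain-rule expression:

$$
\frac{\mathrm{d} J}{\mathrm{d} \beta} = \frac{\partial J}{\partial \beta} + \psi^T \frac{\partial R}{\partial \beta} + \left( \frac{\partial J}{\partial T} + \psi^T \frac{\partial R}{\partial T} \right) \frac{\mathrm{d} T}{\mathrm{d} \beta} .
$$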

We will call this new set of variables the adjoint variables. Using the constraint that the term multiplying $\mathrm{d}T/\mathrm{d}\beta$ should be zero, they can be determined by solving a system of linear equations.
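
In the assumed notation, this adjoint system and the resulting gradient read

$$
\left( \frac{\partial R}{\partial T} \right)^T \psi = - \left( \frac{\partial J}{\partial T} \right)^T ,
\qquad
\frac{\mathrm{d} J}{\mathrm{d} \beta} = \frac{\partial J}{\partial \beta} + \psi^T \frac{\partial R}{\partial \beta} .
$$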

We now have an expression for the gradient which we can easily evaluate at the cost of one extra linear system solve. Note that, unlike the finite difference approach, the gradient calculation now requires a fixed number of system solves, practically independent of the number of points (and hence corrective-term components) in the simulation.
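
As a minimal sketch in plain NumPy (the names are illustrative, not the repository's actual functions), the adjoint-based gradient evaluation could look like:

```python
import numpy as np

def adjoint_gradient(dJdT, dJdbeta, dRdT, dRdbeta):
    """Gradient of J with respect to beta via the adjoint method.

    dJdT    : (n,)   explicit derivative of the objective w.r.t. the state
    dJdbeta : (m,)   explicit derivative of the objective w.r.t. beta
    dRdT    : (n, n) Jacobian of the residual w.r.t. the state
    dRdbeta : (n, m) Jacobian of the residual w.r.t. beta
    """
    # One extra linear solve: (dR/dT)^T psi = -(dJ/dT)^T
    psi = np.linalg.solve(dRdT.T, -dJdT)
    # Total gradient: explicit part plus the adjoint contribution psi^T dR/dbeta
    return dJdbeta + psi @ dRdbeta
```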

Example

To illustrate the paradigm, [1] uses the following scalar ordinary differential equation

where the coefficient in the source term can be a function of z, T is our primal variable, and

where h = 0.5. Let's say we want to model this process using

and want to enhance this model using a spatially varying corrective term,

The convenience of illustrating the paradigm using a simple model problem like this is that we can derive the true form of the corrective term.

Discretization

Forward problem/primal equation

The problem can be discretized using a finite volume method with homogeneous boundary conditions. Using a central difference scheme for the second-order derivative and rewriting the equation for the temperature in cell i gives
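
Assuming a governing equation of the form $\mathrm{d}^2 T / \mathrm{d}z^2 + S(T, z) = 0$ on a uniform grid with spacing $\Delta z$ (with $S$ standing in for the source term of the model problem above), the update for cell $i$ would read

$$
\frac{T_{i+1} - 2 T_i + T_{i-1}}{\Delta z^2} + S_i = 0
\quad \Longrightarrow \quad
T_i = \frac{1}{2} \left( T_{i+1} + T_{i-1} + \Delta z^2 S_i \right) .
$$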

Similarly, the base model and the augmented model can be solved as

and

These equations are then solved iteratively until convergence, using under-relaxation to stabilize the iterations, i.e.
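
presumably the standard under-relaxed update (with $T^{*}$ denoting the newly computed iterate; notation assumed)

$$
T^{(n+1)} = \alpha\, T^{*} + \left( 1 - \alpha \right) T^{(n)} ,
$$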

where $\alpha$ trades off stability (for low $\alpha$) against convergence speed (for high $\alpha$). The iterations are stopped once the L2-norm of the difference between two consecutive solutions drops below a specified criterion.
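
A minimal NumPy sketch of this under-relaxed fixed-point iteration, assuming the generic form $\mathrm{d}^2 T/\mathrm{d}z^2 + S(T, z) = 0$ on the unit interval with homogeneous boundary conditions (domain, defaults, and names are illustrative, not the repository's code):

```python
import numpy as np

def solve_primal(source, n=129, alpha=0.5, tol=1e-10, max_iter=100_000):
    """Under-relaxed fixed-point iteration for d^2T/dz^2 + S(T, z) = 0."""
    z = np.linspace(0.0, 1.0, n)
    dz = z[1] - z[0]
    T = np.zeros(n)                         # homogeneous boundary conditions
    for _ in range(max_iter):
        S = source(T, z)
        T_star = T.copy()
        # central-difference update for the interior cells
        T_star[1:-1] = 0.5 * (T[2:] + T[:-2] + dz**2 * S[1:-1])
        # under-relaxation
        T_new = alpha * T_star + (1.0 - alpha) * T
        if np.linalg.norm(T_new - T) < tol:
            return z, T_new
        T = T_new
    return z, T
```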

Adjoint equation

Setting up the adjoint equation requires two scalar-by-vector derivatives of the objective function and two vector-by-vector derivatives of the governing equations. These can be derived conveniently using the Einstein summation convention.

Making use of the fact that the prior and observational covariance matrices are symmetric, and using the identities
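
(presumably, in index notation)

$$
\frac{\partial T_i}{\partial T_j} = \delta_{ij} , \qquad \frac{\partial \beta_i}{\partial \beta_j} = \delta_{ij} ,
$$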

where $\delta_{ij}$ is the Kronecker delta, we can easily derive
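
Under the objective assumed above, and assuming the observations are the cell temperatures themselves, the scalar-by-vector derivatives would be

$$
\frac{\partial J}{\partial \beta_j} = \left[ C_\beta^{-1} \left( \beta - \beta_{prior} \right) \right]_j ,
\qquad
\frac{\partial J}{\partial T_j} = \left[ C_{obs}^{-1} \left( T - d \right) \right]_j ,
$$

while the vector-by-vector derivatives $\partial R_i / \partial T_j$ and $\partial R_i / \partial \beta_j$ follow directly from the discretized residual.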

Machine learning

In [1], Gaussian processes are used for the machine learning phase. Some preliminary investigations with random forests and neural networks show similar or slightly improved results. An important requirement for the machine learning phase is the capability to take into account the variance information coming from the field inversion phase. As a next step, I will look into using TensorFlow Probability [2] to implement a Bayesian neural network that takes the posterior variance of the field inversion phase into account.
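
As a generic illustration of this step (using scikit-learn rather than the Gaussian-process implementation of [1] or of this repository; the feature and target arrays below are placeholders), a regressor that also returns a predictive standard deviation could be set up as:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Placeholder inputs: features built from the primal solution, and the
# corrective field obtained from the field inversion phase.
features = np.random.rand(100, 2)
beta_map = np.random.rand(100)

kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-3)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(features, beta_map)

# Mean prediction and predictive standard deviation of the corrective field
beta_pred, beta_std = gp.predict(features, return_std=True)
```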

[1] Parish, E. J., & Duraisamy, K. (2016). A paradigm for data-driven predictive modeling using field inversion and machine learning. Journal of Computational Physics, 305, 758-774.

[2] https://www.tensorflow.org/probability/

About

Reproduction exercise for the paradigm of field inversion and machine learning [1]
