Inverse problem

From Wikipedia, the free encyclopedia

An inverse problem in science is the process of calculating from a set of observations the causal factors that produced them: for example, calculating an image in X-ray computed tomography, source reconstruction in acoustics, or calculating the density of the Earth from measurements of its gravity field.

It is called an inverse problem because it starts with the results and then calculates the causes. This is the inverse of a forward problem, which starts with the causes and then calculates the results.

Inverse problems are some of the most important mathematical problems in science because they tell us about parameters that we cannot directly observe. They have wide application in system identification, optics, radar, acoustics, communication theory, signal processing, medical imaging, computer vision, geophysics, oceanography, astronomy, remote sensing, natural language processing, machine learning, nondestructive testing, and many other fields.

History

One of the earliest examples of a solution to an inverse problem was discovered by Hermann Weyl and published in 1911, describing the asymptotic behavior of eigenvalues of the Laplace–Beltrami operator.[1] Today known as Weyl's law, it is perhaps most easily understood as an answer to the question of whether it is possible to hear the shape of a drum. Weyl conjectured that the eigenfrequencies of a drum would be related to the area and perimeter of the drum by a particular equation, a result that was later improved upon by many mathematicians.

The field of inverse problems was later touched on by the Soviet-Armenian physicist Viktor Ambartsumian.[2][3]

While still a student, Ambartsumian thoroughly studied the theory of atomic structure, the formation of energy levels, and the Schrödinger equation and its properties. Having mastered the theory of eigenvalues of differential equations, he pointed out the apparent analogy between discrete energy levels and the eigenvalues of differential equations. He then asked: given a family of eigenvalues, is it possible to find the form of the equations whose eigenvalues they are? Essentially, Ambartsumian was examining the inverse Sturm–Liouville problem, which deals with determining the equations of a vibrating string. His paper was published in 1929 in the German physics journal Zeitschrift für Physik and remained in obscurity for a rather long time. Describing this situation many decades later, Ambartsumian said, "If an astronomer publishes an article with a mathematical content in a physics journal, then the most likely thing that will happen to it is oblivion."

Nonetheless, toward the end of the Second World War, this article, written by the 20-year-old Ambartsumian, was found by Swedish mathematicians and formed the starting point for a whole area of research on inverse problems, becoming the foundation of an entire discipline.

Conceptual understanding

The inverse problem can be conceptually formulated as follows:

Data → Model parameters

The inverse problem is considered the "inverse" to the forward problem which relates the model parameters to the data that we observe:

Model parameters → Data

The transformation from data to model parameters (or vice versa) is a result of the interaction of a physical system with the object that we wish to infer properties about. In other words, the transformation is the physics that relates the physical quantity (i.e., the model parameters) to the observed data.

The table below shows some examples of physical systems, the governing physics, the physical quantity that we are interested in, and what we actually observe.

Physical system                         | Governing equations     | Physical quantity       | Observed data
----------------------------------------|-------------------------|-------------------------|--------------------
Earth's gravitational field             | Newton's law of gravity | Density                 | Gravitational field
Earth's magnetic field (at the surface) | Maxwell's equations     | Magnetic susceptibility | Magnetic field
Seismic waves (from earthquakes)        | Wave equation           | Wave-speed (density)    | Particle velocity

Linear algebra is useful in understanding the physical and mathematical construction of inverse problems, because of the presence of the transformation or "mapping" of data to the model parameters.

General statement of the problem

The objective of an inverse problem is to find the best model parameters $m$ such that (at least approximately)

$$d = G(m)$$

where $G$ is an operator describing the explicit relationship between the observed data, $d$, and the model parameters. In various contexts, the operator $G$ is called the forward operator, observation operator, or observation function. In the most general context, $G$ represents the governing equations that relate the model parameters to the observed data (i.e., the governing physics).

Linear inverse problems

In the case of a discrete linear inverse problem describing a linear system, $d$ (the data) and $m$ (the best model) are vectors, and the problem can be written as

$$d = Gm$$

where $G$ is a matrix (an operator), often called the observation matrix.

Examples

Earth's gravitational field

Only a few physical systems are actually linear with respect to the model parameters. One such system from geophysics is the Earth's gravitational field, which is determined by the density distribution of the Earth in the subsurface. Because the lithology of the Earth changes quite significantly, we are able to observe minute differences in the Earth's gravitational field on the surface of the Earth. From our understanding of gravity (Newton's law of gravitation), we know that the mathematical expression for gravity is:

$$d = \frac{\gamma M}{r^2}$$

where $d$ is a measure of the local gravitational acceleration, $\gamma$ is the universal gravitational constant (written $\gamma$ here to keep it distinct from the operator $G$), $M$ is the local mass (which is related to density) of the rock in the subsurface and $r$ is the distance from the mass to the observation point.

By discretizing the above expression, we are able to relate the discrete data observations on the surface of the Earth to the discrete model parameters (density) in the subsurface that we wish to know more about. For example, consider the case where we have five measurements on the surface of the Earth. In this case, our data vector, $d$, is a column vector of dimension (5×1). We also assume that there are only five unknown masses in the subsurface (unrealistic, but used to demonstrate the concept). Thus, we can construct the linear system relating the five unknown masses to the five data points as follows:

$$d = Gm, \qquad
d = \begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ d_4 \\ d_5 \end{bmatrix}, \qquad
m = \begin{bmatrix} M_1 \\ M_2 \\ M_3 \\ M_4 \\ M_5 \end{bmatrix}, \qquad
G = \begin{bmatrix}
\gamma / r_{11}^2 & \cdots & \gamma / r_{15}^2 \\
\vdots & \ddots & \vdots \\
\gamma / r_{51}^2 & \cdots & \gamma / r_{55}^2
\end{bmatrix}$$

where $r_{ij}$ is the distance from the $j$-th mass to the $i$-th observation point.

The system has five equations (the rows of $G$) with five unknowns (the entries of $m$). To solve for the model parameters that fit our data, we might be able to invert the matrix $G$ to directly convert the measurements into our model parameters. For example:

$$m = G^{-1} d$$

However, not all square matrices are invertible (in practice, $G$ is almost never invertible). We are not guaranteed to have enough information to uniquely determine the solution to the given equations unless the measurements are independent (i.e. each measurement adds unique information to the system). In most physical systems we do not have enough information to uniquely constrain our solutions, because the rows of the observation matrix are not linearly independent. From a linear algebra perspective, the matrix $G$ is rank deficient (i.e. has zero eigenvalues), meaning that $G$ is not invertible. Further, if we add additional observations to our matrix (i.e. more equations), then the matrix $G$ is no longer square. Even then, we are not guaranteed to have full rank in the observation matrix. Therefore, most inverse problems are considered to be underdetermined, meaning that we do not have unique solutions to the inverse problem. If we have a full-rank system, then our solution may be unique. Overdetermined systems (more equations than unknowns) have other issues: with noisy data there is typically no model that fits every equation exactly.
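As a minimal sketch of the five-measurement example above, the following pure-Python code builds an observation matrix with entries $\gamma / r_{ij}^2$, computes synthetic data from known masses, and recovers the masses by Gaussian elimination. The geometry (five surface points directly above five masses buried 100 m deep), the mass values, and all function names are assumptions made for illustration; the scalar formula is the simplified one used in the text, not a full gravity model.

```python
GAMMA = 6.674e-11  # universal gravitational constant in m^3 kg^-1 s^-2


def build_observation_matrix(obs_x, mass_x, depth):
    """Return G with G[i][j] = GAMMA / r_ij**2 (surface point i, buried mass j)."""
    return [[GAMMA / ((xo - xm) ** 2 + depth ** 2) for xm in mass_x]
            for xo in obs_x]


def forward(G, m):
    """Forward problem: predicted data d = G m."""
    return [sum(g_ij * m_j for g_ij, m_j in zip(row, m)) for row in G]


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular: the data do not determine the model")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


# Hypothetical geometry: positions in metres, masses in kilograms.
positions = [0.0, 100.0, 200.0, 300.0, 400.0]
true_masses = [1.0e9, 2.0e9, 1.5e9, 3.0e9, 2.5e9]

G = build_observation_matrix(positions, positions, depth=100.0)
d = forward(G, true_masses)            # synthetic, noise-free data
recovered = solve_linear_system(G, d)  # inverse problem: m = G^{-1} d
```

With exact data and independent measurements the masses are recovered. The exact-zero pivot test only catches exactly singular matrices, so nearly dependent rows would still pass it and give unstable results.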

Because we cannot directly invert the observation matrix, we use methods from optimization to solve the inverse problem. To do so, we define a goal, also known as an objective function, for the inverse problem. The goal is a functional that measures how closely the predicted data from the recovered model fit the observed data. In the case where we have perfect data (i.e. no noise) and perfect physical understanding (i.e. we know the physics), the recovered model should fit the observed data perfectly. The standard objective function, $\phi$, is usually of the form:

$$\phi = \|d - Gm\|_2^2$$

which represents the squared 2-norm of the misfit between the observed data and the predicted data from the model. We use the 2-norm here as a generic measurement of the distance between the predicted data and the observed data, but other norms can be used. The goal is to minimize the objective function, i.e. the difference between the predicted and observed data.

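To minimize the objective function $\phi$ (i.e. solve the inverse problem) we compute the gradient of the objective function and set it to zero, using the same rationale as we would to minimize a function of only one variable. The gradient of the objective function is:

$$\nabla_m \phi = 2\left(G^T G m - G^T d\right) = 0$$

where $G^T$ denotes the matrix transpose of $G$. This equation simplifies to:

$$G^T G m = G^T d$$

After rearrangement (assuming $G^T G$ is invertible), this becomes:

$$m = (G^T G)^{-1} G^T d$$

This expression is known as the normal equation and gives us a possible solution to the inverse problem. It is equivalent to ordinary least squares:

$$\hat{\beta} = (X^T X)^{-1} X^T y$$

with $G$, $m$ and $d$ playing the roles of $X$, $\hat{\beta}$ and $y$.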
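The normal equation can be illustrated with a small overdetermined problem. The sketch below fits a straight line $d_i = m_0 + m_1 t_i$ to four data points by forming $G^T G$ and $G^T d$ and solving the resulting 2×2 system. The data values and helper names are made up for illustration, and the solver is deliberately restricted to two unknowns; for larger or poorly conditioned problems a library least-squares routine would normally be a better choice than forming $G^T G$ explicitly.

```python
def matvec(A, x):
    """Matrix-vector product A x for a matrix stored as a list of rows."""
    return [sum(a * x_j for a, x_j in zip(row, x)) for row in A]


def misfit(G, m, d):
    """Objective function: squared 2-norm of d - G m."""
    return sum((d_i - p_i) ** 2 for d_i, p_i in zip(d, matvec(G, m)))


def solve_normal_equations(G, d):
    """Return m solving (G^T G) m = G^T d for a G with exactly two columns."""
    Gt = [list(col) for col in zip(*G)]  # transpose of G
    GtG = [[sum(a * b for a, b in zip(r, c)) for c in Gt] for r in Gt]
    Gtd = matvec(Gt, d)
    det = GtG[0][0] * GtG[1][1] - GtG[0][1] * GtG[1][0]
    if det == 0.0:
        raise ValueError("G^T G is singular")
    # Cramer's rule for the 2x2 system
    m0 = (Gtd[0] * GtG[1][1] - GtG[0][1] * Gtd[1]) / det
    m1 = (GtG[0][0] * Gtd[1] - GtG[1][0] * Gtd[0]) / det
    return [m0, m1]


times = [0.0, 1.0, 2.0, 3.0]
G = [[1.0, t] for t in times]      # columns: intercept and slope

exact_data = [1.0, 3.0, 5.0, 7.0]  # generated by the model [1, 2]
noisy_data = [1.1, 2.9, 5.2, 6.8]  # hypothetical noisy observations

fit_exact = solve_normal_equations(G, exact_data)
fit_noisy = solve_normal_equations(G, noisy_data)
```

For the noisy data the fitted model differs from the generating model `[1, 2]`, but its misfit is no larger than that of any other straight line, which is what the least-squares solution guarantees.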

Additionally, we usually know that our data contain random variations caused by random noise or, worse yet, coherent noise. In either case, errors in the observed data introduce errors in the recovered model parameters that we obtain by solving the inverse problem. To limit the effect of these errors, we may want to constrain possible solutions to emphasize certain features in our models. This type of constraint is known as regularization.
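One common form of regularization, given here only as an illustrative assumption since the text does not single out a method, is Tikhonov (ridge) regularization. With data vector $d$, model vector $m$, observation matrix $G$ and a user-chosen weight $\lambda > 0$, it adds a penalty on the size of the model to the least-squares objective:

```latex
\begin{align}
\phi_\lambda(m) &= \|d - Gm\|_2^2 + \lambda \|m\|_2^2 \\
\nabla_m \phi_\lambda &= 2\left(G^T G m - G^T d\right) + 2\lambda m = 0 \\
m_\lambda &= \left(G^T G + \lambda I\right)^{-1} G^T d
\end{align}
```

Because $G^T G + \lambda I$ is positive definite for any $\lambda > 0$, this system has a unique solution even when $G$ is rank deficient. The price is a bias toward small models that grows with $\lambda$.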

Fredholm integral

One central example of a linear inverse problem is provided by a Fredholm integral equation of the first kind:

$$d(x) = \int_a^b g(x, y)\, m(y)\, dy$$

For sufficiently smooth $g$ the operator defined above is compact on reasonable Banach spaces such as $L^p$ spaces. Even if the mapping is injective, its inverse will not be continuous. (By the bounded inverse theorem, a bijective bounded linear mapping between Banach spaces would have a bounded, i.e. continuous, inverse; a compact operator on an infinite-dimensional space therefore cannot be bijective.) Thus small errors in the data $d$ are greatly amplified in the solution $m$. In this sense the inverse problem of inferring $m$ from measured $d$ is ill-posed.

To obtain a numerical solution, the integral must be approximated using quadrature, and the data sampled at discrete points. The resulting system of linear equations will be ill-conditioned.
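The discretization step can be sketched as follows for an equation of the form $d(x) = \int_0^1 g(x, y)\, m(y)\, dy$. The interval $[0, 1]$, the Gaussian kernel, the midpoint quadrature rule, and all names are assumptions chosen for illustration. The example compares the data produced by a constant model with the data produced by the same model plus a rapid ±1 oscillation.

```python
import math


def gaussian_kernel(x, y, width=0.1):
    """Smooth kernel g(x, y); the Gaussian form is an assumption for illustration."""
    return math.exp(-((x - y) ** 2) / (2.0 * width ** 2))


def discretize(n, kernel=gaussian_kernel):
    """Midpoint-rule discretization of d(x) = integral_0^1 g(x, y) m(y) dy."""
    h = 1.0 / n
    nodes = [(j + 0.5) * h for j in range(n)]
    G = [[h * kernel(x, y) for y in nodes] for x in nodes]
    return nodes, G


def forward(G, m):
    """Predicted data d = G m for the discretized problem."""
    return [sum(g * m_j for g, m_j in zip(row, m)) for row in G]


n = 50
nodes, G = discretize(n)

smooth_model = [1.0] * n
rough_model = [1.0 + (-1.0) ** j for j in range(n)]  # adds a +/-1 oscillation

d_smooth = forward(G, smooth_model)
d_rough = forward(G, rough_model)

# Largest pointwise change in the model and in the resulting data
model_change = max(abs(a - b) for a, b in zip(smooth_model, rough_model))
data_change = max(abs(a - b) for a, b in zip(d_smooth, d_rough))
```

The model changes by 1 at every node, yet the data change by no more than about the quadrature weight $h = 0.02$, because the smooth kernel averages the oscillation away. Read in the inverse direction, data noise of that size is compatible with order-one changes in the recovered model, which is how the ill-conditioning of the linear system shows up.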

Computed tomography

Another example is the inversion of the Radon transform, essential to tomographic reconstruction for X-ray computed tomography. Here a function (initially of two variables) is deduced from its integrals along all possible lines. Although many linear inverse problems are well understood from a theoretical point of view, problems involving the Radon transform and its generalisations still present many theoretical challenges, with questions of sufficiency of data still unresolved. Such problems include incomplete data for the X-ray transform in three dimensions and problems involving the generalisation of the X-ray transform to tensor fields. Solutions explored include the Algebraic Reconstruction Technique, filtered backprojection, and, as computing power has increased, iterative reconstruction methods such as iterative Sparse Asymptotic Minimum Variance.[4]
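The Algebraic Reconstruction Technique is based on Kaczmarz's method, which cycles through the ray equations and projects the current image onto each one in turn. The following is a minimal sketch on a hypothetical 2×2 image with five ray sums (two rows, two columns, one diagonal). The ray geometry, pixel values, and function name are assumptions for illustration; practical implementations often add a relaxation factor and use many more rays.

```python
def kaczmarz(rays, ray_sums, n_pixels, sweeps=50):
    """Cyclic Kaczmarz iteration for the linear system (rays) x = ray_sums."""
    image = [0.0] * n_pixels
    for _ in range(sweeps):
        for row, b in zip(rays, ray_sums):
            residual = b - sum(a * x for a, x in zip(row, image))
            norm_sq = sum(a * a for a in row)
            # Orthogonal projection of the image onto the hyperplane row . x = b
            image = [x + residual * a / norm_sq for x, a in zip(image, row)]
    return image


# Pixels are ordered [top-left, top-right, bottom-left, bottom-right].
rays = [
    [1.0, 1.0, 0.0, 0.0],  # top row
    [0.0, 0.0, 1.0, 1.0],  # bottom row
    [1.0, 0.0, 1.0, 0.0],  # left column
    [0.0, 1.0, 0.0, 1.0],  # right column
    [1.0, 0.0, 0.0, 1.0],  # main diagonal
]
true_image = [1.0, 2.0, 3.0, 4.0]
ray_sums = [sum(a * x for a, x in zip(row, true_image)) for row in rays]

reconstruction = kaczmarz(rays, ray_sums, n_pixels=4)
```

Without the diagonal ray, the row and column sums alone would not determine the image: adding any multiple of `[1, -1, -1, 1]` leaves all four sums unchanged. This is a small instance of the sufficiency-of-data question raised above.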

Riemann hypothesis

A final example related to the Riemann hypothesis was given by Wu and Sprung. The idea is that, in the semiclassical old quantum theory, the inverse of the potential inside the Hamiltonian is proportional to the half-derivative of the eigenvalue (energy) counting function $n(x)$.

Permeability Matching in Shale-gas Reservoirs

To reproduce the permeability accurately, a method based on a combination of the Metropolis–Hastings algorithm and genetic algorithms has been proposed. The method learns from its own previously generated realizations of the shale and produces models that match the existing permeability data.[5]

Deconvolution

A classical example of an inverse problem is image (or signal) deblurring, i.e., a deconvolution problem in the plane. In such cases, the forward problem is a convolution with a smoothing convolution kernel. Consider the integral equation (a Fredholm equation of the first kind):

$$d(t) = \int_{\mathbb{R}^2} K(t - s)\, m(s)\, ds$$

where $K$ is the kernel, and $t, s \in \mathbb{R}^2$. The inverse problem is to reconstruct the original image $m$ based on a noisy and blurred image $d$.[6]
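A one-dimensional analogue makes the forward and inverse steps concrete. The sketch below blurs a signal with a three-point smoothing kernel and then partially undoes the blur with Landweber iteration, a simple gradient-type scheme that is not taken from the text and stands in for any iterative deblurring method. The kernel, the signal, the iteration count, and the function names are all assumptions for illustration.

```python
KERNEL = [0.25, 0.5, 0.25]  # symmetric smoothing kernel (assumed for illustration)


def blur(signal, kernel=KERNEL):
    """Forward problem: discrete convolution with zero padding, same output length."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + half - k  # index of the input sample weighted by kernel[k]
            if 0 <= j < n:
                acc += w * signal[j]
        out.append(acc)
    return out


def landweber_deblur(data, iterations, step=1.0):
    """Iterate m <- m + step * K^T (d - K m), starting from the zero signal."""
    estimate = [0.0] * len(data)
    for _ in range(iterations):
        residual = [d - p for d, p in zip(data, blur(estimate))]
        # For a symmetric kernel with zero padding, K^T equals K,
        # so the adjoint is applied by blurring the residual.
        correction = blur(residual)
        estimate = [e + step * c for e, c in zip(estimate, correction)]
    return estimate


def distance(a, b):
    """Euclidean distance between two signals of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


original = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
blurred = blur(original)                   # noise-free blurred observation
restored = landweber_deblur(blurred, 200)  # approximate deconvolution
```

With noise-free data the restored signal is closer to the original than the blurred one. With noisy data the iteration would have to be stopped early, since later iterations increasingly fit the noise; early stopping then plays the role of regularization.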

Non-linear inverse problems

An inherently more difficult family of inverse problems is collectively referred to as non-linear inverse problems.

Non-linear inverse problems have a more complex relationship between data and model, represented by the equation:

$$d = G(m)$$

Here $G$ is a non-linear operator and cannot be separated to represent a linear mapping of the model parameters that form $m$ into the data. In such research, the first priority is to understand the structure of the problem and to give a theoretical answer to the three Hadamard questions (existence, uniqueness, and stability), so that the problem is solved from the theoretical point of view. Only later in a study can regularization be addressed, together with the interpretation of how the solution (or solutions, depending upon conditions of uniqueness) depends on the parameters and on the data or measurements (probabilistic or otherwise). Hence the corresponding following sections do not really apply to these problems.

Whereas linear inverse problems were completely solved from the theoretical point of view at the end of the nineteenth century, only one class of nonlinear inverse problems was so before 1970: inverse spectral and (one space dimension) inverse scattering problems, after the seminal work of the Russian mathematical school (Krein, Gelfand, Levitan, Marchenko). A large review of the results has been given by Chadan and Sabatier in their book "Inverse Problems in Quantum Scattering Theory" (two editions in English, one in Russian).
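For a concrete, deliberately simple instance of a non-linear relationship between data and model, the sketch below recovers a single decay rate $m$ from samples $d_i = e^{-m t_i}$ using Gauss–Newton iteration, which repeatedly linearizes the forward operator around the current estimate. The forward model, sample times, starting guess, and function names are assumptions chosen for illustration. Local iterations of this kind presuppose that the questions of existence and uniqueness have a favourable answer, and they can fail from a poor starting guess.

```python
import math


def forward(m, times):
    """Non-linear forward operator G(m): exponential decay sampled at given times."""
    return [math.exp(-m * t) for t in times]


def gauss_newton(data, times, m_start, iterations=20):
    """Estimate the scalar m that minimizes the squared misfit ||data - G(m)||^2."""
    m = m_start
    for _ in range(iterations):
        predicted = forward(m, times)
        residual = [d - p for d, p in zip(data, predicted)]
        jacobian = [-t * p for t, p in zip(times, predicted)]  # dG_i/dm
        denom = sum(j * j for j in jacobian)
        if denom == 0.0:
            break  # the data are insensitive to m at this point
        # Least-squares solution of the linearized problem: jacobian * step = residual
        m += sum(j * r for j, r in zip(jacobian, residual)) / denom
    return m


times = [0.5, 1.0, 1.5, 2.0]
true_rate = 1.5
data = forward(true_rate, times)  # synthetic, noise-free observations

estimate = gauss_newton(data, times, m_start=1.0)
```

Each iteration solves a small linear least-squares problem, so the linear machinery of the earlier sections reappears as an inner step rather than as the whole solution.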

In this kind of problem, the data are properties of the spectrum of a linear operator that describes the scattering. The spectrum is made of eigenvalues and eigenfunctions, forming together the "discrete spectrum", and generalizations, called the continuous spectrum. The very remarkable physical point is that scattering experiments give information only on the continuous spectrum, whereas knowing the full spectrum is both necessary and sufficient for recovering the scattering operator. Hence we have invisible parameters, much more interesting than the null space which has a similar property in linear inverse problems. In addition, there are physical motions in which the spectrum of such an operator is conserved as a consequence of such motion. This phenomenon is governed by special nonlinear partial differential evolution equations, for example the Korteweg–de Vries equation. If the spectrum of the operator is reduced to one single eigenvalue, its corresponding motion is that of a single bump that propagates at constant velocity and without deformation, a solitary wave called a "soliton".

A perfect signal and its generalizations for the Korteweg–de Vries equation or other integrable nonlinear partial differential equations are of great interest, with many possible applications. This area has been studied as a branch of mathematical physics since the 1970s. Nonlinear inverse problems are also currently studied in many fields of applied science (acoustics, mechanics, quantum mechanics, electromagnetic scattering - in particular radar soundings, seismic soundings, and nearly all imaging modalities).

Applications

Inverse problem theory is used extensively in weather predictions, oceanography, hydrology, and petroleum engineering.[7][8]

Inverse problems are also found in the field of heat transfer, where a surface heat flux[9] is estimated from temperature data measured inside a rigid body. The linear inverse problem is also the foundation of spectral estimation and direction-of-arrival (DOA) estimation in signal processing.

Inverse, parameter, and crack identification problems have been studied using optimization and soft computing tools.[10][11]

Mathematical considerations

Inverse problems are typically ill-posed, as opposed to the well-posed problems more typical when modeling physical situations where the model parameters or material properties are known. Of the three conditions for a well-posed problem suggested by Jacques Hadamard (existence, uniqueness, and stability of the solution or solutions), the condition of stability is most often violated. In the sense of functional analysis, the inverse problem is represented by a mapping between metric spaces. While inverse problems are often formulated in infinite-dimensional spaces, limitations to a finite number of measurements, and the practical consideration of recovering only a finite number of unknown parameters, may lead to the problems being recast in discrete form. In this case the inverse problem will typically be ill-conditioned. In these cases, regularization may be used to introduce mild assumptions on the solution and prevent overfitting. Many instances of regularized inverse problems can be interpreted as special cases of Bayesian inference.[12]
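The loss of stability in a discrete, ill-conditioned problem can be seen in a system as small as 2×2. In the sketch below, the matrix, the data values, and the function name are made up for illustration; the two rows of the matrix are nearly identical, so the two measurements carry almost the same information.

```python
def solve_2x2(A, b):
    """Solve a 2x2 linear system with Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    x0 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    x1 = (A[0][0] * b[1] - A[1][0] * b[0]) / det
    return [x0, x1]


G = [[1.0, 1.0],
     [1.0, 1.0001]]         # nearly dependent rows: an ill-conditioned problem

clean_data = [2.0, 2.0001]  # produced exactly by the model [1, 1]
noisy_data = [2.0, 2.0002]  # second datum perturbed by 1e-4

model_clean = solve_2x2(G, clean_data)
model_noisy = solve_2x2(G, noisy_data)
```

A data change of 1e-4 moves the recovered model from approximately `[1, 1]` to approximately `[0, 2]`. A solution exists and is unique in both cases, so it is only the stability condition that fails here.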

See also

Academic journals

Four main academic journals cover inverse problems in general:

  • Inverse Problems
  • Journal of Inverse and Ill-posed Problems[13]
  • Inverse Problems in Science and Engineering[14]
  • Inverse Problems and Imaging[15]

Many journals on medical imaging, geophysics, non-destructive testing, etc. are dominated by inverse problems in those areas.

References

  1. Weyl, Hermann (1911). "Über die asymptotische Verteilung der Eigenwerte". Nachrichten der Königlichen Gesellschaft der Wissenschaften zu Göttingen: 110–117.
  2. "Epilogue: Ambartsumian's paper". Viktor Ambartsumian.
  3. Ambartsumian, Rouben V. (1998). "A life in astrophysics. Selected papers of Viktor A. Ambartsumian". Astrophysics. 41 (4): 328–330. doi:10.1007/BF02894658.
  4. Abeida, Habti; Zhang, Qilin; Li, Jian; Merabtine, Nadjim (2013). "Iterative Sparse Asymptotic Minimum Variance Based Approaches for Array Processing" (PDF). IEEE Transactions on Signal Processing. 61 (4): 933–944. arXiv:1802.03070. doi:10.1109/tsp.2012.2231676. ISSN 1053-587X.
  5. Tahmasebi, Pejman; Javadpour, Farzam; Sahimi, Muhammad (August 2016). "Stochastic shale permeability matching: Three-dimensional characterization and modeling". International Journal of Coal Geology. 165: 231–242. doi:10.1016/j.coal.2016.08.024.
  6. Kaipio, J., & Somersalo, E. (2010). Statistical and computational inverse problems. New York, NY: Springer.
  7. Carl Wunsch (13 June 1996). The Ocean Circulation Inverse Problem. Cambridge University Press. pp. 9–. ISBN 978-0-521-48090-1.
  8. Tahmasebi, Pejman; Javadpour, Farzam; Sahimi, Muhammad (August 2016). "Stochastic shale permeability matching: Three-dimensional characterization and modeling". International Journal of Coal Geology. 165: 231–242. doi:10.1016/j.coal.2016.08.024.
  9. Patric Figueiredo (December 2014). Development Of An Iterative Method For Solving Multidimensional Inverse Heat Conduction Problems. Lehrstuhl für Wärme- und Stoffübertragung RWTH Aachen.
  10. G.E. Stavroulakis (2001). Inverse and Crack Identification Problems in Engineering Mechanics. Springer. ISBN 978-0-7923-6690-4.
  11. Z. Mróz, G.E. Stavroulakis (2005-11-24). Parameter Identification of Materials and Structures. Springer. ISBN 978-3-211-30151-7.
  12. Tarantola, Albert (2005). "Front Matter" (PDF). Inverse Problem Theory and Methods for Model Parameter Estimation. SIAM. pp. i–xii. doi:10.1137/1.9780898717921.fm. ISBN 978-0-89871-572-9 – via epubs.siam.org.
  13. "Journal of Inverse and Ill-posed Problems".
  14. "Inverse Problems in Science and Engineering: Vol 25, No 4".
  15. "IPI". Archived from the original on 11 October 2006.

Further reading

  • C. W. Groetsch (1999). Inverse Problems: Activities for Undergraduates. Cambridge University Press. ISBN 978-0-88385-716-8.
