Deep Image Prior

From Wikipedia, the free encyclopedia

Deep image prior is a type of convolutional neural network used to enhance a given image with no prior training data other than the image itself. A neural network is randomly initialized and used as a prior to solve inverse problems such as noise reduction, super-resolution, and inpainting. Image statistics are captured by the structure of a convolutional image generator rather than by any previously learned capabilities.



Inverse problems such as noise reduction, super-resolution, and inpainting can be formulated as the optimization task $x^* = \arg\min_x E(x; x_0) + R(x)$, where $x$ is an image, $x_0$ is a corrupted representation of that image, $E(x; x_0)$ is a task-dependent data term, and $R(x)$ is the regularizer. This forms an energy minimization problem.

Deep neural networks learn a generator/decoder $x = f_\theta(z)$ which maps a random code vector $z$ to an image $x$.

The image corruption method used to generate $x_0$ is selected for the specific application.


In this approach, the prior $R(x)$ is replaced with the implicit prior captured by the neural network (where $R(x) = 0$ for images that can be produced by a deep neural network and $R(x) = +\infty$ otherwise). This yields the equation for the minimizer $\theta^* = \arg\min_\theta E(f_\theta(z); x_0)$ and the result of the optimization process $x^* = f_{\theta^*}(z)$.

The optimizer (typically gradient descent) starts from randomly initialized parameters and descends to a local minimum $\theta^*$, which yields the restored image $x^* = f_{\theta^*}(z)$.
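The optimization described above can be sketched in a few lines. This is only an illustration under strong assumptions: the generator here is a toy linear map `theta @ z`, not the convolutional network used in practice, so it carries no useful image prior. The function name `fit_generator` and the `steps` and `lr` parameters are hypothetical.

```python
import numpy as np

def fit_generator(x0, z, steps=200, lr=None, seed=0):
    """Gradient descent on E(f_theta(z); x0) = ||f_theta(z) - x0||^2.

    Assumption: f_theta(z) = theta @ z is a toy linear generator standing in
    for the convolutional network; z stays fixed, only theta is updated.
    """
    rng = np.random.default_rng(seed)
    # Random initialization of the parameters theta.
    theta = rng.normal(scale=0.1, size=(x0.size, z.size))
    if lr is None:
        # With this step size the residual halves at every iteration
        # for this particular quadratic problem.
        lr = 0.25 / float(z @ z)
    for _ in range(steps):
        residual = theta @ z - x0            # f_theta(z) - x0
        grad = 2.0 * np.outer(residual, z)   # dE/dtheta
        theta -= lr * grad
    return theta

# Hypothetical usage: a 4-pixel "image" and an 8-dimensional fixed code vector.
x0 = np.array([0.2, -1.0, 0.5, 3.0])
z = np.random.default_rng(1).normal(size=8)
theta_star = fit_generator(x0, z, steps=60)
x_star = theta_star @ z                      # x* = f_{theta*}(z)
```

Only `theta` changes during the loop; the code vector `z` is drawn once and held fixed, and `steps` acts as the iteration budget.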


In principle, the parameters $\theta$ can be fitted to reproduce any image, including its noise. However, the network is reluctant to pick up noise: the parametrization offers high impedance to noise and low impedance to useful signal. As a result, $\theta$ approaches a good-looking local optimum as long as the number of iterations in the optimization process remains low enough not to overfit the data.



The principle of denoising is to recover an image $x$ from a noisy observation $x_0$, where $x_0 = x + \epsilon$. The distribution of the noise $\epsilon$ is sometimes known (e.g. from profiling sensor and photon noise[1]) and may optionally be incorporated into the model, though this process works well in blind denoising.

The quadratic energy function $E(x; x_0) = \|x - x_0\|^2$ is used as the data term. Plugging it into the equation for $\theta^*$ yields the optimization problem $\min_\theta \|f_\theta(z) - x_0\|^2$.
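As a minimal sketch, the quadratic data term can be written as a sum of squared pixel differences. The function name `denoising_energy` is hypothetical, and the additive Gaussian noise in the usage lines is an assumption made only for illustration; the method itself does not require a known noise model.

```python
import numpy as np

def denoising_energy(x, x0):
    """Quadratic data term E(x; x0) = ||x - x0||^2."""
    diff = np.asarray(x, dtype=float) - np.asarray(x0, dtype=float)
    return float(np.sum(diff ** 2))

# Hypothetical example: a flat gray image corrupted by additive Gaussian noise.
rng = np.random.default_rng(0)
x = np.full((8, 8), 0.5)
x0 = x + rng.normal(scale=0.1, size=x.shape)   # x0 = x + epsilon
energy = denoising_energy(x, x0)               # distance of the clean image from the observation
```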


Super-resolution is used to generate a higher-resolution version $x$ of a low-resolution image $x_0$. The data term is set to $E(x; x_0) = \|d(x) - x_0\|^2$, where $d(\cdot)$ is a downsampling operator such as Lanczos that decimates the image by a factor $t$.
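The super-resolution data term can be sketched as follows. To keep the example short, the downsampling operator is a plain block average rather than Lanczos; that substitution, along with the names `box_downsample` and `super_resolution_energy`, is an assumption for illustration only.

```python
import numpy as np

def box_downsample(x, t):
    """Decimate a 2-D image by an integer factor t by averaging t-by-t blocks.

    Assumption: block averaging stands in for the Lanczos operator d(.).
    """
    h, w = x.shape
    if h % t or w % t:
        raise ValueError("image height and width must be divisible by t")
    return x.reshape(h // t, t, w // t, t).mean(axis=(1, 3))

def super_resolution_energy(x, x0, t):
    """Data term E(x; x0) = ||d(x) - x0||^2 for a high-resolution candidate x."""
    return float(np.sum((box_downsample(x, t) - x0) ** 2))

# Hypothetical usage: a 4x4 candidate compared against a 2x2 observation.
x = np.arange(16.0).reshape(4, 4)
x0 = box_downsample(x, 2)
energy = super_resolution_energy(x, x0, 2)
```

The energy compares images at the low resolution only, so many different high-resolution candidates can reach the same value; the network's implicit prior is what selects among them.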


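Inpainting is used to reconstruct a missing area in an image $x_0$. The missing pixels are marked by a binary mask $m \in \{0,1\}^{H \times W}$, which is 0 at missing pixels and 1 elsewhere. The data term is defined as $E(x; x_0) = \|(x - x_0) \odot m\|^2$ (where $\odot$ is the Hadamard product).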
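A minimal sketch of the masked data term, assuming the mask is 1 at known pixels and 0 at missing ones so that missing pixels contribute nothing to the energy. The function name `inpainting_energy` is hypothetical.

```python
import numpy as np

def inpainting_energy(x, x0, m):
    """Data term E(x; x0) = ||(x - x0) * m||^2 with an elementwise (Hadamard) product."""
    x, x0, m = (np.asarray(a, dtype=float) for a in (x, x0, m))
    return float(np.sum(((x - x0) * m) ** 2))

# Hypothetical usage: the off-diagonal pixels of x0 are missing (mask value 0).
x = np.array([[1.0, 2.0], [3.0, 4.0]])
x0 = np.array([[1.0, 0.0], [0.0, 4.0]])
m = np.array([[1, 0], [0, 1]])
energy = inpainting_energy(x, x0, m)   # only the known pixels are compared
```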

Flash-no-flash reconstruction

This approach may be extended to multiple images. A straightforward example mentioned by the authors is the reconstruction of an image with natural lighting and clarity from a flash-no-flash pair. Video reconstruction is possible, but it requires optimizations that take the spatial differences into account.



  1. ^ jo (2012-12-11). "profiling sensor and photon noise .. and how to get rid of it". darktable.
  2. ^ Ulyanov, Dmitry; Vedaldi, Andrea; Lempitsky, Victor (30 November 2017). "Deep Image Prior". arXiv:1711.10925v2.