# Kriging

Kriging is a group of statistical techniques to interpolate the value of a random field (e.g., the elevation, z, of the landscape as a function of the geographic location) at an unobserved location from observations of its value at nearby locations.

The theory behind interpolation and extrapolation by kriging was developed by the French mathematician Georges Matheron based on the Master's thesis of Daniel Gerhardus Krige, the pioneering plotter of distance-weighted average gold grades at the Witwatersrand reef complex in South Africa. The English verb is to krige and the most common noun is kriging; both are often pronounced with a hard "g", following the pronunciation of the name "Krige".

## Kriging interpolation

Figure 1. Example of one-dimensional data interpolation by kriging, with confidence intervals. Squares indicate the location of the data. The kriging interpolation is in red. The confidence intervals are in green.

Kriging belongs to the family of linear least squares estimation algorithms. As illustrated in Figure 1, the aim of kriging is to estimate the value of an unknown real-valued function, $f$, at a point, $x^*$, given the values of the function at some other points, $x_1,\ldots, x_n$. A kriging estimator is said to be linear because the predicted value $\hat f(x^*)$ is a linear combination that may be written as $\hat f(x^*) = \sum_{i=1}^n \lambda_i(x^*) f(x_i)$.

The weights $\lambda_i$ are solutions of a system of linear equations which is obtained by assuming that $f$ is a sample-path of a random process $F(x)$, and that the error of prediction $\varepsilon(x) = F(x) - \sum_{i=1}^n \lambda_i(x) F(x_i)$

is to be minimized in some sense. For instance, the so-called simple kriging assumption is that the mean and the covariance of $F(x)$ are known; the kriging predictor is then the one that minimizes the variance of the prediction error.

From the geological point of view, the practice of kriging is based on assuming continued mineralization between measured values. The assumed prior knowledge encapsulates how minerals co-occur as a function of space. Then, given an ordered set of measured grades, interpolation by kriging predicts mineral concentrations at unobserved points.

## Applications of kriging

The application of kriging to problems in geology and mining, as well as to hydrology, started in the mid-1960s and grew especially in the 1970s with the work of Georges Matheron. The connection between kriging and geostatistics remains prevalent today.

Kriging has been used in:

• Environmental science
• Black box modelling in computer experiments

## Controversy in climate change, mineral exploration, and mining

The question of whether spatial dependence may be assumed or ought to be verified (by applying Fisher's F-test to the variance of a set of measured values and the first variance term of the ordered set prior to interpolation by kriging) is of relevance in mineral exploration, mining, and the study of climate change.

An example from mineral exploration follows. Clark and the Kriging Game is explored in Clark's Practical Geostatistics. The ordered set of measured values does not display a significant degree of spatial dependence. Yet Clark reports a kriged estimate for some selected coordinates within this sample space anyway. Neither does the data set that underpins Figure 1 above display a significant degree of spatial dependence.

Interpolation by kriging makes no sense when applied to ordered sets of widely spaced measured values in sample spaces or sampling units without verifying spatial dependence. Spatial dependence between borehole grades or blasthole grades was assumed at Bre-X's Busang property, Hecla's Grouse Creek mine, and scores of others where mined grades were significantly lower than predicted grades. A significant degree of spatial dependence would justify interpolation between measured values in ordered sets. Failing the test for spatial dependence would imply that more measured values are required to derive unbiased confidence limits for metal grades and contents.

## Mathematical details

### General equations of kriging

Kriging is a group of statistical techniques to interpolate the value $Z(x_0)$ of a random field $Z(x)$ (e.g. the elevation $Z$ of the landscape as a function of the geographic location $x$) at an unobserved location $x_0$ from observations $z_i=Z(x_i),\;i=1,\ldots,n$ of the random field at nearby locations $x_1,\ldots,x_n$. Kriging computes the best linear unbiased estimator $\hat{Z}(x_0)$ of $Z(x_0)$ based on a stochastic model of the spatial dependence, quantified either by the variogram $\gamma(x,y)$ or by the expectation $\mu(x)=E[Z(x)]$ and the covariance function $c(x,y)$ of the random field.

The kriging estimator is given by a linear combination $\hat{Z}(x_0)=\sum_{i=1}^n w_i(x_0) Z(x_i)$

of the observed values $z_i=Z(x_i)$ with weights $w_i(x_0),\;i=1,\ldots,n$ chosen such that the variance (also called kriging variance or kriging error): $\sigma^2_k(x_0):=\mathrm{Var}\left(\hat{Z}(x_0)-Z(x_0)\right)=\sum_{i=1}^n\sum_{j=1}^n w_i(x_0) w_j(x_0) c(x_i,x_j) + \mathrm{Var}\left(Z(x_0)\right)-2\sum_{i=1}^nw_i(x_0)c(x_i,x_0)$

is minimized subject to the unbiasedness condition: $\mathrm{E}[\hat{Z}(x_0)-Z(x_0)]=\sum_{i=1}^n w_i(x_0)\mu(x_i) - \mu(x_0) =0$

The kriging variance must not be confused with the variance $\mathrm{Var}\left(\hat{Z}(x_0)\right)=\mathrm{Var}\left(\sum_{i=1}^n w_iZ(x_i)\right)=\sum_{i=1}^n\sum_{j=1}^n w_i w_j c(x_i,x_j)$

of the kriging predictor $\hat{Z}(x_0)$ itself.
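The distinction between the two variances can be checked numerically. The following is a minimal sketch, not a reference implementation: the exponential covariance `exp_cov`, the locations and the (deliberately non-optimal) weights are all assumptions made for illustration.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, length=2.0):
    """Exponential covariance c(x, y) = sill * exp(-|x - y| / length).

    The model and its parameters are illustrative assumptions."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return sill * np.exp(-np.abs(a[:, None] - b[None, :]) / length)

def kriging_variance(w, C, c0, var0):
    """Var(Z_hat(x0) - Z(x0)) for arbitrary weights w."""
    return w @ C @ w + var0 - 2.0 * (w @ c0)

def predictor_variance(w, C):
    """Var(Z_hat(x0)): the variance of the predictor itself."""
    return w @ C @ w

x = np.array([0.0, 1.0, 3.0])   # observation locations (made up)
x0 = np.array([2.0])            # prediction location (made up)
w = np.array([0.1, 0.4, 0.5])   # arbitrary weights, not optimised

C = exp_cov(x, x)               # c(x_i, x_j)
c0 = exp_cov(x, x0)[:, 0]       # c(x_i, x0)
var0 = exp_cov(x0, x0)[0, 0]    # Var(Z(x0))

sigma2_k = kriging_variance(w, C, c0, var0)
var_hat = predictor_variance(w, C)
print("kriging variance:", sigma2_k)
print("predictor variance:", var_hat)
```

The two quantities differ by $\mathrm{Var}(Z(x_0)) - 2\sum_i w_i c(x_i,x_0)$, which is why they should not be interchanged.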

### The types of kriging

Depending on the stochastic properties of the random field different types of kriging apply. The type of kriging determines the linear constraint on the weights $w_i$ implied by the unbiasedness condition; i.e. the linear constraint, and hence the method for calculating the weights, depends upon the type of kriging.

Classical types of kriging are

• Simple kriging assumes a known constant trend: $\mu(x)=0\,$
• Ordinary kriging assumes an unknown constant trend: $\mu(x)=\mu\,$
• Universal kriging assumes a general linear trend model $\mu(x)=\sum_{k=0}^p \beta_k f_k(x)$
• IRFk-kriging assumes $\mu(x)$ to be an unknown polynomial in $x$.
• Indicator kriging uses indicator functions instead of the process itself, in order to estimate transition probabilities.
• Multiple-indicator kriging is a version of indicator kriging working with a family of indicators. However, MIK has fallen out of favour as an interpolation technique in recent years. This is due to some inherent difficulties related to operation and model validation. Conditional simulation is fast becoming the accepted replacement technique in this case.
• Disjunctive kriging is a nonlinear generalisation of kriging.
• Lognormal kriging interpolates positive data by means of logarithms.
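
As a short worked illustration of how the trend model fixes the linear constraint, each of the first three trend assumptions can be substituted into the unbiasedness condition $\sum_{i=1}^n w_i\mu(x_i)-\mu(x_0)=0$. Here $f_k$ denotes the $k$-th trend basis function, a labelling assumed for this illustration:

• Simple kriging, $\mu(x)=0$: the condition reduces to $0=0$, so the weights are unconstrained.
• Ordinary kriging, $\mu(x)=\mu$: the condition becomes $\mu\left(\sum_{i=1}^n w_i-1\right)=0$, which must hold for every value of the unknown $\mu$, hence $\sum_{i=1}^n w_i=1$.
• Universal kriging, $\mu(x)=\sum_{k=0}^p\beta_k f_k(x)$: the condition becomes $\sum_{k=0}^p\beta_k\left(\sum_{i=1}^n w_i f_k(x_i)-f_k(x_0)\right)=0$ for all coefficients $\beta_k$, hence $\sum_{i=1}^n w_i f_k(x_i)=f_k(x_0)$ for $k=0,\ldots,p$.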

### Simple kriging

Simple kriging is mathematically the simplest, but the least general. It assumes the expectation of the random field to be known, and relies on a covariance function. However, in most applications neither the expectation nor the covariance is known beforehand.

#### Simple kriging assumptions

The practical assumptions for the application of simple kriging are:

• Wide-sense stationarity of the field.
• The expectation is zero everywhere: $\mu(x)=0\,$.
• Known covariance function $c(x,y)=\mathrm{Cov}(Z(x),Z(y))\,$

#### Simple kriging equation

The kriging weights of simple kriging have no unbiasedness condition and are given by the simple kriging equation system: $\begin{pmatrix}w_1 \\ \vdots \\ w_n \end{pmatrix}= \begin{pmatrix}c(x_1,x_1) & \cdots & c(x_1,x_n) \\ \vdots & \ddots & \vdots \\ c(x_n,x_1) & \cdots & c(x_n,x_n) \end{pmatrix}^{-1} \begin{pmatrix}c(x_1,x_0) \\ \vdots \\ c(x_n,x_0) \end{pmatrix}$

This is analogous to a linear regression of $Z(x_0)$ on the other $z_1 , \ldots, z_n$.

#### Simple kriging interpolation

The interpolation by simple kriging is given by: $\hat{Z}(x_0)=\begin{pmatrix}z_1 \\ \vdots \\ z_n \end{pmatrix}' \begin{pmatrix}c(x_1,x_1) & \cdots & c(x_1,x_n) \\ \vdots & \ddots & \vdots \\ c(x_n,x_1) & \cdots & c(x_n,x_n) \end{pmatrix}^{-1} \begin{pmatrix}c(x_1,x_0) \\ \vdots \\ c(x_n,x_0)\end{pmatrix}$
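
The weight system and the interpolation formula can be sketched in a few lines of NumPy. This is a hedged illustration only: the exponential covariance `exp_cov`, the helper name `simple_kriging` and the sample data are assumptions, and the data are taken to be zero-mean as simple kriging requires.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, length=2.0):
    """Exponential covariance (an assumed, illustrative model)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return sill * np.exp(-np.abs(a[:, None] - b[None, :]) / length)

def simple_kriging(x, z, x0, cov):
    """Return predictions at x0 and the weight matrix (one column per target)."""
    C = cov(x, x)                 # c(x_i, x_j), shape (n, n)
    c0 = cov(x, x0)               # c(x_i, x0), shape (n, m)
    w = np.linalg.solve(C, c0)    # solves C w = c0 without forming C^{-1}
    return w.T @ z, w

x = np.array([0.0, 1.0, 2.5, 4.0])     # observation locations (made up)
z = np.array([0.3, -0.5, 0.8, 0.1])    # zero-mean observations (made up)
x0 = np.array([1.0, 3.0])              # one observed and one new location

z_hat, w = simple_kriging(x, z, x0, exp_cov)
print("predictions:", z_hat)
print("weights for x0 = 3.0:", w[:, 1])
```

Solving the linear system directly is generally preferable to computing the matrix inverse explicitly; at an observed location the weight vector collapses to a unit vector, so the observed value is reproduced.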

#### Simple kriging error

The kriging error is given by: $\mathrm{Var}\left(\hat{Z}(x_0)-Z(x_0)\right)=\underbrace{c(x_0,x_0)}_{\mathrm{Var}(Z(x_0))}- \underbrace{\begin{pmatrix}c(x_1,x_0) \\ \vdots \\ c(x_n,x_0)\end{pmatrix}' \begin{pmatrix} c(x_1,x_1) & \cdots & c(x_1,x_n) \\ \vdots & \ddots & \vdots \\ c(x_n,x_1) & \cdots & c(x_n,x_n) \end{pmatrix}^{-1} \begin{pmatrix}c(x_1,x_0) \\ \vdots \\ c(x_n,x_0) \end{pmatrix}}_{\mathrm{Var}(\hat{Z}(x_0))}$

which leads to the generalised least squares version of the Gauss-Markov theorem (Chiles & Delfiner 1999, p. 159): $\mathrm{Var}(Z(x_0))=\mathrm{Var}(\hat{Z}(x_0))+\mathrm{Var}\left(\hat{Z}(x_0)-Z(x_0)\right).$
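
The variance decomposition can be verified numerically for the optimal simple kriging weights. The sketch below assumes an exponential covariance and made-up locations; it compares the closed-form simple kriging error with the general quadratic-form expression for the kriging variance.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, length=2.0):
    """Exponential covariance (an assumed, illustrative model)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return sill * np.exp(-np.abs(a[:, None] - b[None, :]) / length)

x = np.array([0.0, 1.0, 2.5, 4.0])   # observation locations (made up)
x0 = np.array([3.0])                 # prediction location (made up)

C = exp_cov(x, x)
c0 = exp_cov(x, x0)[:, 0]
var0 = exp_cov(x0, x0)[0, 0]         # Var(Z(x0)) = c(x0, x0)

w = np.linalg.solve(C, c0)           # simple kriging weights
var_hat = c0 @ w                     # c0' C^{-1} c0 = Var(Z_hat(x0))
sigma2_k = var0 - var_hat            # simple kriging error

# General expression for the kriging variance with the same weights
general = w @ C @ w + var0 - 2.0 * (w @ c0)

print("Var(Z(x0))     :", var0)
print("Var(Z_hat(x0)) :", var_hat)
print("kriging error  :", sigma2_k)
```

For the optimal weights the closed form and the general expression agree, and the kriging error never exceeds the prior variance $c(x_0,x_0)$.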

### Ordinary kriging

Ordinary kriging is the most commonly used type of kriging. It assumes a constant but unknown mean.

#### Typical ordinary kriging assumptions

The typical assumptions for the practical application of ordinary kriging are:

• Intrinsic stationarity or wide sense stationarity of the field
• Enough observations to estimate the variogram.

The mathematical conditions for applicability of ordinary kriging are:

• The mean $E[Z(x)]=\mu\,$ is unknown but constant
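• The variogram $\gamma(x,y)=\tfrac{1}{2}E[(Z(x)-Z(y))^2]$ of $Z(x)$ is known. The factor $\tfrac{1}{2}$ (strictly, the semivariogram) is the convention under which the ordinary kriging error formula below holds as written.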
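
In practice the variogram is estimated from the data before kriging. The following is a minimal sketch of a binned empirical estimator for one-dimensional locations. It assumes the common convention with the factor 1/2 (the semivariogram); the function name, bin edges and data are hypothetical.

```python
import numpy as np

def empirical_semivariogram(x, z, bin_edges):
    """Average 0.5 * (z_i - z_j)^2 over pairs whose separation falls in each bin.

    Returns the estimate per bin (NaN for empty bins) and the pair counts."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(x), k=1)       # each unordered pair once
    h = np.abs(x[i] - x[j])                   # pair separations
    half_sq = 0.5 * (z[i] - z[j]) ** 2
    which = np.digitize(h, bin_edges) - 1     # bin b covers [edges[b], edges[b+1])
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        mask = which == b
        counts[b] = mask.sum()
        if counts[b] > 0:
            gamma[b] = half_sq[mask].mean()
    return gamma, counts

x = np.arange(6.0)                                # regular transect (made up)
z = np.array([1.0, 2.0, 4.0, 3.0, 5.0, 6.0])      # measured values (made up)
edges = np.array([0.5, 1.5, 2.5])                 # bins around lags 1 and 2

gamma, counts = empirical_semivariogram(x, z, edges)
print(gamma, counts)
```

The pair counts matter: bins with few pairs give unreliable estimates, which is the practical meaning of needing enough observations. A parametric variogram model would normally be fitted to these binned values before solving the kriging system.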

#### Ordinary kriging equation

The kriging weights of ordinary kriging fulfill the unbiasedness condition $\sum_{i=1}^n \lambda_i = 1$

and are given by the ordinary kriging equation system: $\begin{pmatrix}\lambda_1 \\ \vdots \\ \lambda_n \\ \mu \end{pmatrix}= \begin{pmatrix}\gamma(x_1,x_1) & \cdots & \gamma(x_1,x_n) &1 \\ \vdots & \ddots & \vdots & \vdots \\ \gamma(x_n,x_1) & \cdots & \gamma(x_n,x_n) & 1 \\ 1 &\cdots& 1 & 0 \end{pmatrix}^{-1} \begin{pmatrix}\gamma(x_1,x^*) \\ \vdots \\ \gamma(x_n,x^*) \\ 1\end{pmatrix}$

The additional parameter $\mu$ is a Lagrange multiplier (not the mean of the field) used in the minimization of the kriging error $\sigma_k^2(x^*)$ to honor the unbiasedness condition.

#### Ordinary kriging interpolation

The interpolation by ordinary kriging is given by: $\hat{Z}(x^*)=\begin{pmatrix}\lambda_1 \\ \vdots \\ \lambda_n \end{pmatrix}' \begin{pmatrix}Z(x_1) \\ \vdots \\ Z(x_n) \end{pmatrix}$

#### Ordinary kriging error

The kriging error is given by: $\mathrm{Var}\left(\hat{Z}(x^*)-Z(x^*)\right)= \begin{pmatrix}\lambda_1 \\ \vdots \\ \lambda_n \\ \mu \end{pmatrix}' \begin{pmatrix}\gamma(x_1,x^*) \\ \vdots \\ \gamma(x_n,x^*) \\ 1\end{pmatrix}$
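
The ordinary kriging system, interpolation and error can be sketched together as follows. This is an illustration under stated assumptions: the exponential semivariogram model `exp_semivariogram`, the helper name `ordinary_kriging` and the data are not from the text, and $\gamma$ is taken to be the semivariogram (with the factor 1/2), depending only on the separation of the two points.

```python
import numpy as np

def exp_semivariogram(h, sill=1.0, length=2.0):
    """Exponential semivariogram gamma(h) = sill * (1 - exp(-|h| / length)) (assumed)."""
    return sill * (1.0 - np.exp(-np.abs(h) / length))

def ordinary_kriging(x, z, x_star, gamma):
    """Solve the bordered system; return prediction, kriging error, weights, multiplier."""
    n = len(x)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = gamma(x[:, None] - x[None, :])   # gamma(x_i, x_j)
    A[:n, n] = 1.0                               # column of ones
    A[n, :n] = 1.0                               # row of ones, corner stays 0
    b = np.append(gamma(x - x_star), 1.0)        # gamma(x_i, x*) and the 1
    sol = np.linalg.solve(A, b)
    lam, mu = sol[:n], sol[n]
    z_hat = lam @ z                              # linear combination of data
    sigma2 = sol @ b                             # sum_i lam_i gamma(x_i, x*) + mu
    return z_hat, sigma2, lam, mu

x = np.array([0.0, 1.0, 2.5, 4.0])      # observation locations (made up)
z = np.array([10.3, 9.5, 10.8, 10.1])   # observations with unknown mean (made up)

z_hat, sigma2, lam, mu = ordinary_kriging(x, z, 2.0, exp_semivariogram)
print("prediction:", z_hat)
print("kriging error:", sigma2)
print("sum of weights:", lam.sum())
```

Because the weights sum to one, the prediction does not depend on the unknown constant mean; no estimate of the mean is needed.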

### Properties of kriging

(Cressie 1993, Chiles & Delfiner 1999, Wackernagel 1995)

• The kriging estimation is unbiased: $E[\hat{Z}(x_i)]=E[Z(x_i)]$
• The kriging estimation honors the actually observed value: $\hat{Z}(x_i)=Z(x_i)$ (assuming no measurement error is incurred)
• The kriging estimation $\hat{Z}(x)$ is the best linear unbiased estimator of $Z(x)$ if the assumptions hold. However (e.g. Cressie 1993):
  • As with any method, if the assumptions do not hold, kriging may perform poorly.
  • There might be better nonlinear and/or biased methods.
  • No properties are guaranteed when the wrong variogram is used. However, a 'good' interpolation is typically still achieved.
  • Best is not necessarily good: e.g. in the case of no spatial dependence, the kriging interpolation is only as good as the arithmetic mean.
• Kriging provides $\sigma_k^2$ as a measure of precision. However, this measure relies on the correctness of the variogram.
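
Two of these properties can be illustrated with ordinary kriging under a pure-nugget semivariogram, i.e. a model with no spatial dependence at all. The sketch below is an assumption-laden example (the model, helper names and data are hypothetical): away from the data every weight equals $1/n$, so the prediction is the arithmetic mean, while at an observed location the observed value is reproduced.

```python
import numpy as np

def nugget_semivariogram(h, sill=1.0):
    """Pure nugget: gamma(0) = 0 and gamma(h) = sill for any h != 0 (assumed model)."""
    return np.where(np.abs(h) > 0, sill, 0.0)

def ordinary_kriging(x, z, x_star, gamma):
    """Ordinary kriging via the bordered semivariogram system."""
    n = len(x)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = gamma(x[:, None] - x[None, :])
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    b = np.append(gamma(x - x_star), 1.0)
    sol = np.linalg.solve(A, b)
    lam = sol[:n]
    return lam @ z, sol @ b, lam

x = np.array([0.0, 1.0, 2.5, 4.0])   # observation locations (made up)
z = np.array([3.0, 7.0, 4.0, 6.0])   # observations (made up)

# Unobserved location: weights are uniform, prediction is the sample mean
z_new, var_new, lam_new = ordinary_kriging(x, z, 2.2, nugget_semivariogram)
print("weights:", lam_new, "prediction:", z_new, "mean:", z.mean())

# Observed location: the data value is honored and the kriging error vanishes
z_obs, var_obs, lam_obs = ordinary_kriging(x, z, x[1], nugget_semivariogram)
print("prediction at x[1]:", z_obs, "observed:", z[1])
```

In this setting kriging is still the best linear unbiased estimator, yet it offers nothing beyond the arithmetic mean, which is the sense in which best is not necessarily good.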

## Related terms and techniques

### Kriging terms

A series of related terms were also named after Krige, including kriged estimate, kriged estimator, kriging variance, kriging covariance, zero kriging variance, unity kriging covariance, kriging matrix, kriging method, kriging model, kriging plan, kriging process, kriging system, block kriging, co-kriging, disjunctive kriging, linear kriging, ordinary kriging, point kriging, random kriging, regular grid kriging, simple kriging and universal kriging.

### Related methods

Kriging is mathematically closely related to regression analysis. Both theories derive a best linear unbiased estimator based on assumptions on covariances, make use of the Gauss-Markov theorem to prove independence of the estimate and error, and make use of very similar formulae. They are nevertheless useful in different frameworks: kriging is made for interpolation of a single realisation of a random field, while regression models are based on multiple observations of a multivariate dataset.

In the statistical community the same technique is also known as Gaussian process regression, Kolmogorov-Wiener prediction, or best linear unbiased prediction.

The kriging interpolation may also be seen as a spline in a reproducing kernel Hilbert space, with reproducing kernel given by the covariance function. The difference with the classical kriging approach is provided by the interpretation: while the spline is motivated by a minimum norm interpolation based on a Hilbert space structure, kriging is motivated by an expected squared prediction error based on a stochastic model.

Kriging with polynomial trend surfaces is mathematically identical to generalized least squares polynomial curve fitting.

Kriging can also be understood as a form of Bayesian inference. Kriging starts with a prior distribution over functions. This prior takes the form of a Gaussian process: $N$ samples from a function will be normally distributed, where the covariance between any two samples is the covariance function (or kernel) of the Gaussian process evaluated at the spatial location of two points. A set of values is then observed, each value associated with a spatial location. Now, a new value can be predicted at any new spatial location, by combining the Gaussian prior with a Gaussian likelihood function for each of the observed values. The resulting posterior distribution is also Gaussian, with a mean and covariance that can be simply computed from the observed values, their variance, and the kernel matrix derived from the prior.
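
The Bayesian computation described above can be sketched as a zero-mean Gaussian process posterior with independent Gaussian observation noise. Everything specific here is an assumption for illustration: the squared-exponential kernel `rbf_kernel`, its parameters, the noise variance and the data.

```python
import numpy as np

def rbf_kernel(a, b, variance=1.0, length=1.0):
    """Squared-exponential kernel (an assumed choice of covariance function)."""
    d = np.asarray(a, dtype=float)[:, None] - np.asarray(b, dtype=float)[None, :]
    return variance * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x, y, x_new, kernel, noise_var):
    """Posterior mean and covariance of a zero-mean GP at x_new."""
    K = kernel(x, x) + noise_var * np.eye(len(x))   # kernel matrix plus noise
    K_s = kernel(x, x_new)                          # cross-covariances
    K_ss = kernel(x_new, x_new)                     # prior covariance at x_new
    L = np.linalg.cholesky(K)                       # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K_s.T @ alpha                            # K_s^T K^{-1} y
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v                            # K_ss - K_s^T K^{-1} K_s
    return mean, cov

x = np.array([0.0, 1.0, 2.0, 3.0])       # observed locations (made up)
y = np.array([0.2, 0.9, 0.1, -0.7])      # observed values (made up)
x_new = np.array([0.5, 1.0, 5.0])        # prediction locations (made up)

mean, cov = gp_posterior(x, y, x_new, rbf_kernel, noise_var=0.01)
print("posterior mean:", mean)
print("posterior variance:", np.diag(cov))
```

With the noise variance set to zero, the posterior mean and variance coincide with the simple kriging predictor and the simple kriging error for the same covariance function.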

## History

The theory of kriging was developed by the French mathematician Georges Matheron based on the Master's thesis of Daniel Gerhardus Krige, the pioneering plotter of distance-weighted average gold grades at the Witwatersrand reef complex. The English verb is to krige and the most common adjective is kriging. The method was called krigeage for the first time in Matheron's 1960 "Krigeage d'un Panneau Rectangulaire par sa Périphérie". Matheron, in this Note Géostatistique No 28, derives k*, his 'estimateur' and a precursor to the kriged estimate or kriged estimator. In classical statistics, Matheron's k* is the length-weighted average grade of each of his panneaux in his set. What Matheron failed to derive was var(k*), the variance of his estimateur. Instead, he computed the length-weighted average grade of each panneau but did not compute the variance of its central value. In time, he replaced length-weighted average grades for three-dimensional sample spaces such as Matheronian blocks of ore with more abundant distance-weighted average grades for zero-dimensional sample spaces such as Matheronian points.

A central doctrine of geostatistics is that spatial dependence need not be verified but may be assumed to exist between two or more Matheronian points, determined in samples selected at positions with different coordinates. This doctrine of assumed causality is the quintessence of Matheron's new science of geostatistics. The question remains whether assumed causality makes sense in any other scientific discipline. The more so because central values such as distance- and length-weighted averages metamorphosed so smoothly into either kriged estimates or kriged estimators.

Matheron's 1967 "Kriging, or Polynomial Interpolation Procedures? A contribution to polemics in mathematical geology" praises the precise probabilistic background of kriging and finds least-squares polynomial interpolation wanting. In fact, Matheron preferred kriging because it gives infinite sets of kriged estimates or kriged estimators in finite three-dimensional sample spaces. Infinite sets of points on polynomials were rather restrictive for Matheron's new science of geostatistics.