
Computational Engineering and Technical Physics
Technomathematics

Aleksi Salo

SPATIOTEMPORAL MATÉRN FIELD MODELLING OF BALTIC SEA TURBIDITY

Master’s Thesis

Examiners: Associate Professor Lassi Roininen and Professor Heikki Haario

Supervisors: Associate Professor Lassi Roininen and M.Sc. Jarkko Suuronen


Lappeenranta-Lahti University of Technology LUT
School of Engineering Science

Computational Engineering and Technical Physics
Technomathematics

Aleksi Salo

Spatiotemporal Matérn field modelling of Baltic Sea turbidity

Master’s Thesis 2021

52 pages, 43 figures.

Examiners: Associate Professor Lassi Roininen and Professor Heikki Haario

Keywords: Bayesian inference, MCMC, parameter estimation, turbidity, Baltic Sea

The state of the Baltic Sea has been a prominent topic in recent years, at least in Finland. Algal blooms, eutrophication and environmental hypoxia have been a cause of concern for scientists. This thesis focuses on the modelling of satellite data of Baltic Sea turbidity as Matérn fields and on the estimation of the Matérn covariance parameters from these fields of turbidity using MCMC methods. Since we are dealing with full covariance matrices, the MCMC sampling part is computationally rather demanding and therefore quite time-consuming. Temporal evolution of the field's parameters is also studied. The method described above is demonstrated to be able to correctly characterise the turbidity fields.


Lappeenranta-Lahti University of Technology LUT
School of Engineering Science
Computational Engineering and Technical Physics
Technomathematics

Aleksi Salo

Spatiotemporal modelling of Baltic Sea turbidity using Matérn fields

Master's Thesis 2021

52 pages, 43 figures.

Examiners: Associate Professor Lassi Roininen and Professor Heikki Haario

Keywords: Bayesian statistics, MCMC, parameter estimation, turbidity, Baltic Sea

The state of the Baltic Sea has been a hot topic in recent years, at least in Finland. Algal blooms, eutrophication and hypoxia have worried researchers. This thesis focuses on modelling satellite data of Baltic Sea turbidity as Matérn fields and on estimating the Matérn covariance parameters from these turbidity fields using MCMC methods. Since full covariance matrices are handled, the MCMC sampling is computationally rather demanding and therefore quite time-consuming. The temporal evolution of the field parameters is also studied. The method described above is shown to be able to characterise the turbidity fields correctly.


I would like to take this opportunity to thank my supervisor and examiner Assoc. Prof. Lassi Roininen, my supervisor M.Sc. Jarkko Suuronen and my examiner Prof. Heikki Haario. I would also like to express my thanks to Prof. Marko Laine for his insight. And last but not least, I want to thank my family for their continued support throughout my studies.

Lappeenranta, June 17, 2021

Aleksi Salo


CONTENTS

1 INTRODUCTION
  1.1 Structure of the thesis

2 STATISTICAL RANDOM FIELD MODEL
  2.1 Matérn field
  2.2 Probability theory and statistics

3 MONTE CARLO SAMPLING METHODS
  3.1 Metropolis-Hastings algorithm
  3.2 Metropolis and Adaptive Metropolis algorithm
  3.3 Gibbs sampling

4 NUMERICAL SIMULATIONS
  4.1 Homogeneous target
    4.1.1 Drawing realisations
    4.1.2 Parameter estimation
    4.1.3 Parameter estimation with missing data
    4.1.4 Parameter estimation with missing and noisy data
  4.2 Inhomogeneous target with boundary
    4.2.1 Drawing realisations
    4.2.2 Parameter estimation with missing data
    4.2.3 Parameter estimation with missing and noisy data
  4.3 Sensitivity of the parameter estimation to added noise

5 EXPERIMENTS WITH REAL-WORLD DATA
  5.1 Spatial analysis
    5.1.1 Ströömi
    5.1.2 Porvoo
    5.1.3 Pakinainen
  5.2 Temporal evolution of the Matérn parameters
    5.2.1 Ströömi
    5.2.2 Porvoo
    5.2.3 Pakinainen

6 DISCUSSION AND CONCLUSIONS

REFERENCES


LIST OF SYMBOLS AND ABBREVIATIONS

∼ "is distributed as"

(·)T Matrix transpose (·)−1 Matrix inverse

| · | Matrix determinant k·k Euclidean norm

≡ "is defined as"

x A vector

0 A zero vector (that is, a vector containing only zeroes)

A A matrix

X A random variable

x A realisation of a random variable

X A collection of random variables (that is, a random vector) x A realisation of a random vector

A A set

µ A mean vector

N(µ,Σ) A Gaussian distribution with meanµand covariance matrixΣ N(x|µ,Σ) A Gaussian distribution ofX with meanµand covariance matrixΣ p(x) A probability density function of a random vectorX

p(y|x) A probability density function of a random vectorY givenX =x F(x) A spatial random field

E[·] The expected value of a random variable or vector C(x,x0) A covariance function

Γ(·) The gamma function

Kα(·) The modified Bessel function of the second kind of orderα θˆMAP Maximum a posteriori estimate

θˆCM Conditional mean estimate cov(·) The sample covariance function

2γ(·) The variogram of a spatial random field Var(·) The variance of a random variable or vector


AM  Adaptive Metropolis
CM  Conditional mean
i.i.d.  independent identically distributed
MAP  Maximum a posteriori
MCMC  Markov Chain Monte Carlo
MH  Metropolis-Hastings
MSI  MultiSpectral Instrument
C2RCC  Case-2 Regional CoastColour
OLS  Ordinary least squares
REML  Restricted maximum likelihood
WNLS  Weighted nonlinear least squares


1 INTRODUCTION

The current state of the Baltic Sea is a cause for concern. Many interconnected problems such as algal blooms, eutrophication, agricultural runoff and hypoxia threaten the inland sea. The focus of this thesis is on modelling satellite measurements of Baltic Sea turbidity using Matérn fields and estimating the smoothness and length-scale parameters using MCMC methods.

Owing to their parameters, Matérn fields are a versatile statistical modelling tool. The Matérn covariance parametric family includes as special cases both the exponential and the squared exponential covariances. The smoothness and length-scale parameters of Matérn fields can model physical, tangible attributes of the fields in question, and their values can be used as a basis to characterise the fields. Temporal behaviour and evolution can then be inferred by observing how the parameters change with time.

Matérn covariance and Matérn fields are named after the Swedish forestry statistician Bertil Matérn, whose contributions to the foundation of spatial statistics are significant (Matérn, 1960). Matérn fields are widely used in geostatistics and Bayesian statistical inversion. Examples of Matérn fields being utilised in the literature follow. Lindgren et al. (2011) establish a link between certain Matérn fields and Gaussian Markov random fields. This allows the modelling to be done intuitively with Gaussian fields and the computation to be done efficiently using Gaussian Markov random fields and sparse matrix algorithms.

Roininen et al. (2014) treat the use of Matérn fields as priors for Bayesian statistical inversion. The prior is then expressed as a solution to a stochastic matrix equation, resulting in a computationally efficient way of handling the prior, especially if the number of unknowns is great. The priors are then applied to the problem of two-dimensional electrical impedance tomography. Roininen et al. (2019) deal with hypermodels for Matérn field priors with applications in interpolation and numerical differentiation. On top of that, Bayesian inversion that supports both smoothness and preservation of edges is numerically demonstrated.

All calculations in this thesis have been executed on a computer with an AMD Ryzen Threadripper 3990X central processing unit and 256 gigabytes of random access memory.

The multiple cores of the central processing unit have been utilised in the construction of the full covariance matrices by employing the parallel for-loop parfor of MATLAB.

The computations are quite demanding, and the aim of this thesis is not to find the optimal way of estimating the parameters. For that reason, the method described in this thesis is more of a post-processing step than a real-time process.


1.1 Structure of the thesis

Section 2 of the thesis contains a treatment of the theory essential to the topic of this thesis. It starts with an introduction to Matérn random fields and then presents relevant items from probability theory and statistics.

Section 3 is dedicated to the treatment of Monte Carlo sampling methods. The Metropolis-Hastings algorithm is introduced, as well as its special case, the Metropolis algorithm. A presentation of the Adaptive Metropolis algorithm is given. Lastly, the Gibbs sampling algorithm is presented.

Section 4 comprises simulations performed using synthetic data. Estimation of Matérn field parameters is performed for various example cases. These include cases where there is missing data due to occlusion, cases with added noise and cases with inhomogeneous targets, where the parameters of the boundary curve are also estimated.

Section 5 contains experimentation with real-world data of Baltic Sea turbidity. The analysis is done for data from three different locations of the Baltic Sea. Temporal variation of the estimated parameters is studied.

Section 6 is a short concluding section with discussion of possible continuing research on this matter.


2 STATISTICAL RANDOM FIELD MODEL

Here we shall introduce the basic notation of Gaussian random fields and Matérn fields especially. We note that the literature on the topic is vast, and we guide interested readers to, for example, Hristopulos (2020) and Adler and Taylor (2007) for details on random field theory. Thus, in this section, we merely set the notational framework with the most important properties.

2.1 Matérn field

A spatial random field (Hristopulos, 2020) is a collection of random variables F indexed by a spatial domain D,

$$\{F(\mathbf{x});\ \mathbf{x} \in D \subset \mathbb{R}^k\}.$$

A Gaussian random field (Hristopulos, 2020) is a random field such that every random vector

$$\mathbf{F} = [F(\mathbf{x}_1), F(\mathbf{x}_2), \ldots, F(\mathbf{x}_n)]^T$$

follows a multivariate Gaussian distribution for any $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n \in D$, with mean vector $\boldsymbol{\mu} = E[\mathbf{F}]$ and covariance matrix $\boldsymbol{\Sigma} = E[(\mathbf{F} - E[\mathbf{F}])(\mathbf{F} - E[\mathbf{F}])^T]$.

The Matérn covariance function (Rasmussen and Williams, 2006) defines a stationary covariance between two points $\mathbf{x}, \mathbf{x}' \in \mathbb{R}^k$ as

$$C(\mathbf{x},\mathbf{x}') = \sigma^2 \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \frac{\sqrt{2\nu}\,\|\mathbf{x}-\mathbf{x}'\|}{\ell} \right)^{\nu} K_\nu\!\left( \frac{\sqrt{2\nu}\,\|\mathbf{x}-\mathbf{x}'\|}{\ell} \right),$$

where $\sigma^2 > 0$ is the variance parameter, $\nu > 0$ is the smoothness parameter, $\ell > 0$ is the length-scale parameter, $\Gamma$ is the gamma function, $K_\nu$ is the modified Bessel function of the second kind of order $\nu$ and $\|\mathbf{x}\| = \sqrt{x(1)^2 + x(2)^2 + \cdots + x(k)^2}$. Let us denote the parameter vector consisting of the smoothness $\nu$ and the length-scale $\ell$ parameters as $\boldsymbol{\theta} = [\nu, \ell]$.

Since the covariance between two points x and x′ only depends on the Euclidean distance between them, the Matérn covariance function is isotropic, that is, it behaves uniformly in all directions (Adler and Taylor, 2007), and stationary, that is, translation invariant (Hristopulos, 2020).

A Matérn field is a Gaussian random field whose covariance matrix Σ follows the Matérn covariance function C(x, x′) with some value of the parameter θ.
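To make the covariance function concrete, the following is a minimal MATLAB sketch of how a Matérn covariance matrix could be assembled for a set of points. The function name matern_cov and its interface are illustrative assumptions rather than code from the thesis; the outer loop is the kind of loop that could be parallelised with parfor, as mentioned in the Introduction.

function C = matern_cov(X, nu, ell, sigma2)
% X is an n-by-k matrix of point coordinates, one point per row.
n = size(X, 1);
C = zeros(n, n);
for i = 1:n
    for j = 1:n
        h = norm(X(i, :) - X(j, :));                     % Euclidean distance
        if h == 0
            C(i, j) = sigma2;                            % limiting value on the diagonal
        else
            r = sqrt(2 * nu) * h / ell;
            C(i, j) = sigma2 * 2^(1 - nu) / gamma(nu) * r^nu * besselk(nu, r);
        end
    end
end
end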

2.2 Probability theory and statistics

In Section 2.1 it was established that Matérn fields are Gaussian random fields, and as such they follow a multivariate Gaussian distribution. The multivariate Gaussian distribution of a $k$-dimensional random vector $\mathbf{F}$ (Bishop, 2006) is given by

$$\mathbf{F} \sim N(\boldsymbol{\mu},\boldsymbol{\Sigma}) = \frac{1}{\sqrt{(2\pi)^k |\boldsymbol{\Sigma}|}} \exp\left( -\frac{1}{2} (\mathbf{f}-\boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{f}-\boldsymbol{\mu}) \right),$$

where $\boldsymbol{\mu} \in \mathbb{R}^k$ is the mean vector, $\boldsymbol{\Sigma} \in \mathbb{R}^{k \times k}$ is the $k$-by-$k$ covariance matrix and $\mathbf{f}$ is the realisation of $\mathbf{F}$.

According to Bishop (2006), given a marginal Gaussian distribution for $\mathbf{f}$ and a conditional Gaussian distribution for $\hat{\mathbf{f}} = \mathbf{A}\mathbf{f} + \mathbf{b}$ as

$$p(\mathbf{f}) = N(\mathbf{f} \mid \boldsymbol{\mu}, \boldsymbol{\Lambda}^{-1}), \qquad p(\hat{\mathbf{f}} \mid \mathbf{f}) = N(\hat{\mathbf{f}} \mid \mathbf{A}\mathbf{f} + \mathbf{b}, \mathbf{L}^{-1}),$$

where $\boldsymbol{\Lambda}^{-1}$ is the covariance matrix of $\mathbf{f}$ and $\mathbf{L}^{-1}$ is the covariance matrix of the conditional distribution of $\hat{\mathbf{f}}$ given $\mathbf{f}$, the marginal distribution of $\hat{\mathbf{f}}$ and the conditional distribution of $\mathbf{f}$ given $\hat{\mathbf{f}}$ are given by

$$p(\hat{\mathbf{f}}) = N(\hat{\mathbf{f}} \mid \mathbf{A}\boldsymbol{\mu} + \mathbf{b},\ \mathbf{L}^{-1} + \mathbf{A}\boldsymbol{\Lambda}^{-1}\mathbf{A}^T), \quad (1)$$
$$p(\mathbf{f} \mid \hat{\mathbf{f}}) = N(\mathbf{f} \mid \boldsymbol{\Sigma}\{\mathbf{A}^T\mathbf{L}(\hat{\mathbf{f}} - \mathbf{b}) + \boldsymbol{\Lambda}\boldsymbol{\mu}\},\ \boldsymbol{\Sigma}), \quad (2)$$

where

$$\boldsymbol{\Sigma} = (\boldsymbol{\Lambda} + \mathbf{A}^T\mathbf{L}\mathbf{A})^{-1}.$$

Bayes’ theorem (Gelman et al., 2013) states that the conditional probability density function p(θ|f) (the posterior) is equal to the product of the conditional density function p(f|θ) (the likelihood) and the density function p(θ) (the prior), divided by the density function p(f):

$$p(\boldsymbol{\theta} \mid \mathbf{f}) = \frac{p(\mathbf{f} \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta})}{p(\mathbf{f})} \propto p(\mathbf{f} \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta}). \quad (3)$$

Since the denominator p(f) does not depend on θ, the posterior of θ is directly proportional to the numerator.

The maximum a posteriori (MAP) estimate (Kaipio and Somersalo, 2005) θ̂_MAP of a parameter θ is defined using Bayes’ theorem as

$$\hat{\boldsymbol{\theta}}_{\mathrm{MAP}} = \arg\max_{\boldsymbol{\theta}} p(\boldsymbol{\theta} \mid \mathbf{f}) = \arg\max_{\boldsymbol{\theta}} \frac{p(\mathbf{f} \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta})}{p(\mathbf{f})} = \arg\max_{\boldsymbol{\theta}} p(\mathbf{f} \mid \boldsymbol{\theta})\, p(\boldsymbol{\theta}).$$

The computation of the MAP estimate is an optimisation problem, and in this thesis this optimisation has been accomplished using the MATLAB function fminsearch, which utilises the Nelder-Mead method. The conditional mean (CM) estimate (Kaipio and Somersalo, 2005) θ̂_CM of a parameter θ conditioned on f is

$$\hat{\boldsymbol{\theta}}_{\mathrm{CM}} = E[\boldsymbol{\Theta} \mid \mathbf{F} = \mathbf{f}] = \int \boldsymbol{\theta}\, p(\boldsymbol{\theta} \mid \mathbf{f})\, d\boldsymbol{\theta}. \quad (4)$$

The MAP estimate is used as the starting point for the Monte Carlo sampling methods described in Section 3; the motivation for employing these sampling methods is the calculation of the CM estimate.
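As an illustration, the MAP optimisation with fminsearch could be set up along the following lines; log_post is a hypothetical helper that returns log p(f|θ) + log p(θ) for a given realisation f on the coordinates X, not a function from the thesis code.

% Negative log-posterior as a function of theta = [nu, ell].
neg_log_post = @(theta) -log_post(theta, f, X);
theta0 = [1, 1];                                 % initial guess for [nu, ell]
theta_map = fminsearch(neg_log_post, theta0);    % Nelder-Mead search for the MAP estimate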


3 MONTE CARLO SAMPLING METHODS

Monte Carlo sampling methods are needed in order to calculate means and variances of analytically intractable distributions numerically and to also generate histograms that approximate said distributions. All this needs to be done in order to access the information these complex distributions encode.

The main application of Monte Carlo sampling methods in this thesis is the numerical computation of the integral that appears in Equation (4) in the calculation of the CM estimate. This integral is most often over a very high-dimensional space and the support of the distribution p(θ|f) is largely unknown; therefore, standard quadrature methods are not fit for the purpose (Kaipio and Somersalo, 2005). The integral in the CM estimate can be approximated by the ergodic average, that is, the arithmetic mean of the drawn samples (Kaipio and Somersalo, 2005).

3.1 Metropolis-Hastings algorithm

The Metropolis-Hastings (MH) algorithm (Hastings, 1970) is a Markov Chain Monte Carlo (MCMC) method that can be utilised to draw random samples from a target distribution via a random walk, where a proposal distribution q is used to suggest the candidate samples to which the random walk may move.

The method is particularly useful because only a function proportional to the density of the target distribution is needed. This is significant, since the distribution is in many cases only realistically computable up to a normalisation factor. The empirical distribution of the drawn samples will then approach the sought-after distribution.


Algorithm 1: Metropolis-Hastings

Choose an initial point x_1;
for i ← 2 to n do
  1. Draw a candidate sample x from the proposal distribution q(x | x_{i-1});
  2. Calculate the acceptance probability
       α(x, x_{i-1}) = min{ 1, [π(x) q(x_{i-1} | x)] / [π(x_{i-1}) q(x | x_{i-1})] },
     where π(·) is a function proportional to the target distribution;
  3. Generate a uniform sample u from the interval [0, 1];
  if u ≤ α then
    x_i = x;
  else
    x_i = x_{i-1};
  end
end

3.2 Metropolis and Adaptive Metropolis algorithm

The Metropolis algorithm Metropolis et al. (1953) can be thought of as a special case of the Metropolis-Hastings algorithm, where the proposal distribution q is symmetric, that is, q(x|y) = q(y|x). A common choice for the symmetric proposal distribution q is a Gaussian distribution.

The adaptive Metropolis (AM) algorithm (Haario et al., 2001) offers a solution to the painstaking process of finding a suitable proposal distribution q for the Metropolis algorithm by continually adapting a Gaussian proposal distribution to the target distribution.

The algorithm is identical to the MH algorithm described previously, with the exception that the acceptance probability α is now calculated as

$$\alpha(\mathbf{x}, \mathbf{x}_{i-1}) = \min\left\{ 1, \frac{\pi(\mathbf{x})}{\pi(\mathbf{x}_{i-1})} \right\}$$

and the covariance matrix of the Gaussian proposal distribution centred on the previous sample is calculated as

$$\mathbf{C}_i = \begin{cases} \mathbf{C}_0, & i \leq i_0 \\ s_d\, \mathrm{cov}(\mathbf{x}_0, \ldots, \mathbf{x}_{i-1}) + s_d\, \varepsilon\, \mathbf{I}, & i > i_0, \end{cases}$$

where $\mathbf{C}_0$ is the initial covariance matrix used for the initial $i_0$ iterations, $s_d = 2.4^2/d$ ($d$ is the dimension of the sample vectors), $\mathrm{cov}(\cdot)$ denotes the computation of the sample covariance, $\varepsilon$ is a small positive constant and $\mathbf{I}$ is an identity matrix of size $d \times d$. The covariance matrix of the Gaussian proposal could also be updated only every $n$th iteration.
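A minimal MATLAB sketch of the Adaptive Metropolis sampler is given below, under the assumption that log_post(theta) evaluates the unnormalised log-posterior and that theta_map holds the MAP estimate used as the starting point (both hypothetical names rather than the thesis code).

n = 10000; d = 2; i0 = 100;                      % chain length, dimension, initial period
sd = 2.4^2 / d; epsilon = 1e-6;                  % scaling and regularisation constants
C = 0.01 * eye(d);                               % initial proposal covariance C0
chain = zeros(n, d);
chain(1, :) = theta_map;
lp = log_post(chain(1, :));
for i = 2:n
    prop = chain(i-1, :) + (chol(C)' * randn(d, 1))';    % Gaussian random-walk proposal
    lp_prop = log_post(prop);
    if log(rand) <= lp_prop - lp                 % Metropolis acceptance test
        chain(i, :) = prop; lp = lp_prop;
    else
        chain(i, :) = chain(i-1, :);
    end
    if i > i0                                    % adapt the proposal covariance
        C = sd * cov(chain(1:i, :)) + sd * epsilon * eye(d);
    end
end
theta_cm = mean(chain(101:end, :));              % CM estimate after removing burn-in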

3.3 Gibbs sampling

The Gibbs sampling algorithm (Geman and Geman, 1984) can be used to obtain samples from a joint target distribution that is difficult to sample from directly by utilising the conditional distributions of the random variables of the joint target distribution in question.

Algorithm 2: Systematic scan Gibbs sampling

Choose an initial point x_1;
for i ← 2 to n do
  1. Draw x_i(1) from π(x_i(1) | x_{i-1}(2), ..., x_{i-1}(k));
  2. Draw x_i(2) from π(x_i(2) | x_i(1), x_{i-1}(3), ..., x_{i-1}(k));
  ...
  k. Draw x_i(k) from π(x_i(k) | x_i(1), x_i(2), ..., x_i(k-1));
end

In the above algorithm, k denotes the dimension of the random vector x, that is, the number of variables. The variables do not need to be sampled one by one; they can also be aggregated together into blocks. Then the draws would be taken from the joint distributions of the block variables conditioned on the rest of the variables. Naturally, MCMC methods may be employed to perform these draws if needed; this is known as Metropolis-within-Gibbs (Monterrubio-Gómez et al., 2020) or variable-at-a-time Metropolis.


4 NUMERICAL SIMULATIONS

Satellite measurements of targets such as water turbidity or algae-related chlorophyll concentration on water can be modelled using Matérn random fields. Both homogeneous (the Matérn parameter θ value is the same throughout the field) targets and inhomogeneous (the θ value can differ at disparate parts of the field) targets can be modelled. In the case of an inhomogeneous target with well-defined regions with differing θ values, where the boundaries between regions can be expressed parametrically, the parameters of the boundary may also be estimated.

4.1 Homogeneous target

Let F(x) be a two-dimensional 10 × 10 (for example, kilometers) zero-mean Matérn field, which is discretised using a 128 × 128 grid, resulting in 16384 individual points. Furthermore, we shall fix σ = 1, as this will guarantee identifiability of the ν and ℓ parameters in the estimation algorithm (Monterrubio-Gómez et al., 2020). Let both ν = 1.5 and ℓ = 1.5 in the Matérn covariance function.

4.1.1 Drawing realisations

Realisations of this field can be drawn by computing the Cholesky decomposition of the covariance matrix Σ = L^T L of the field. Using the upper triangular matrix L acquired from the decomposition, a realisation of the random field can be generated as f = L^T z, where z is a realisation of a standard Gaussian random vector of appropriate size. Figure 1 shows two different realisations of the random field.
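A minimal sketch of this sampling step in MATLAB could look as follows, assuming the matern_cov helper sketched in Section 2.1 (which is not part of the thesis code).

m = 128;
[gx, gy] = meshgrid(linspace(0, 10, m));         % 10 x 10 km domain
X = [gx(:), gy(:)];                              % 16384 coordinate pairs
Sigma = matern_cov(X, 1.5, 1.5, 1);              % nu = 1.5, ell = 1.5, sigma^2 = 1
L = chol(Sigma + 1e-10 * eye(m^2));              % upper triangular, Sigma = L' * L (small jitter added)
f = L' * randn(m^2, 1);                          % realisation with covariance Sigma
imagesc(reshape(f, m, m)); axis equal tight; colorbar;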

4.1.2 Parameter estimation

The likelihood of a realisation f of the field given parameter value θ = [ν, ℓ], p(f|θ), is given by the distribution N(f | 0, Cθ), where 0 is a zero vector of suitable size and Cθ is the Matérn covariance matrix of the field F with parameter value θ. According to Equation (3), the posterior of θ is then proportional to the likelihood multiplied by the prior distribution of the parameters θ, p(θ). Let us use a uniform distribution from 0 to 10 for ν and a uniform distribution from 0 to 10 for ℓ as the prior.
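The corresponding unnormalised log-posterior could be sketched in MATLAB as below; the function name log_post, the matern_cov helper and the handling of the uniform prior are assumptions consistent with the text rather than the thesis code.

function lp = log_post(theta, f, X)
nu = theta(1); ell = theta(2);
if nu <= 0 || nu >= 10 || ell <= 0 || ell >= 10
    lp = -Inf;                                   % outside the support of the uniform prior
    return
end
C = matern_cov(X, nu, ell, 1);                   % sigma^2 fixed to one
L = chol(C + 1e-10 * eye(size(C, 1)));           % upper triangular, C = L' * L
alpha = L \ (L' \ f);                            % solves C * alpha = f
lp = -0.5 * (f' * alpha) - sum(log(diag(L)));    % log N(f | 0, C) up to an additive constant
end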


Given a realisation, we estimate the parameters ν and ℓ using MCMC methods. Let us estimate the parameters using the realisation presented in Figure 1b. The MAP estimate of the parameters can be used as the starting point for the AM algorithm. Figure 2 illustrates the samples drawn from the posterior of θ using AM. The computation of the samples took approximately 58 hours.

Figure 3 presents the chain of samples drawn from the posterior of θ and the marginal histograms of the parameters. The histograms are normalised to an area of one, and Gaussian distribution curves with means and standard deviations equal to the conditional sample means and standard deviations are drawn with red dashed lines as references. It can be seen from the Figure that the chain mixes quite well and the desired distribution is reached very early on. The conditional sample mean of θ can be calculated by taking the sample mean of the chain. It is advisable to first remove samples from the beginning of the chain, since these samples do not fully reflect the desired distribution. The conditional sample mean of θ, computed by removing the first 100 samples, is [1.534, 1.432].

Figure 1: Two realisations of the random field.


Figure 2: Samples drawn from the posterior of θ.

Figure 3: Sampled chain and marginal histograms.


4.1.3 Parameter estimation with missing data

Missing measurement data due to obstructions like clouds, ice sheets or snow can be simulated by omitting parts of the data. Figure 4 shows the realisation in Figure 1b with two elliptical regions of missing measurements.

Figure 4: Realisation of the random field with missing data.

Now let us estimate the parameter θ using the realisation in Figure 4. The likelihood, prior and therefore also the posterior are all the same as previously. Figure 5 presents the samples drawn from the posterior of θ using AM in the case of missing data. The computation of the samples took about 74 hours.

Figure 6 displays the chain of samples drawn from the posterior of θ and the marginal histograms in the case of missing data. It can be seen from the Figure that the chain mixes well and that the marginal histograms match nicely with the Gaussian reference curves. The conditional sample mean of θ, computed after removing the first 100 samples, is [1.533, 1.433].


Figure 5: Samples drawn from the posterior of θ in the case of missing data.

Figure 6: Sampled chain and marginal histograms in the case of missing data.


4.1.4 Parameter estimation with missing and noisy data

No real-life measurement is of course completely noise free. To simulate this kind of noise, independent identically distributed (i.i.d.) Gaussian noise of mean zero and standard deviation 0.1 is added to the realisation of the field. Figure 7 portrays the realisation with missing data in Figure 4 with added i.i.d. Gaussian noise. It can be seen that the level of noise is quite high; the standard deviation of the added noise is approximately 3 percent of the maximum absolute value of the noise-free realisation.

Figure 7: Noisy realisation of the random field with missing data.

The noisy realisation f̂ can be expressed as f̂ = Af + ε, where f is a realisation of the underlying unobserved field, A is a matrix which selects the non-obstructed points and ε is a vector of i.i.d. Gaussian noise of appropriate size. The prior of θ is once again the same as before, but now the likelihood p(f̂|θ) is, according to Equation (1), expressed as N(f̂ | 0, Cε + A Cθ A^T), where Cε is the covariance matrix of the added i.i.d. Gaussian noise and Cθ is the Matérn covariance matrix of the underlying field F. Let us use the noisy realisation shown in Figure 7 to estimate the parameter θ. Figure 8 presents the samples drawn from the posterior of θ using AM in the case of missing data and noise.
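A minimal sketch of the corresponding log-likelihood with missing and noisy data is given below; obs_idx (linear indices of the observed pixels) and noise_std are hypothetical names, and matern_cov is the helper sketched earlier.

C_theta = matern_cov(X, nu, ell, 1);
C_obs = C_theta(obs_idx, obs_idx) + noise_std^2 * eye(numel(obs_idx));   % A*C_theta*A' + C_eps
L = chol(C_obs);
alpha = L \ (L' \ f_hat);                        % f_hat holds the observed noisy values
loglik = -0.5 * (f_hat' * alpha) - sum(log(diag(L)));    % log N(f_hat | 0, C_obs) up to a constant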

The computation of the samples took around 74 hours.


Figure 8: Samples drawn from the posterior of θ in the case of missing data and added noise.

Figure 9 displays the chain of samples drawn from the posterior of θ and the marginal histograms in the case of missing data and added noise. The conditional sample mean of θ, computed after removing the first 100 samples, is [1.515, 1.480].

After estimating the parameter θ either by using a MAP estimate or a conditional mean estimate, the underlying field realisation f can be reconstructed using Equation (2) for the conditional distribution of f conditioned on f̂, p(f | f̂) = N(f | Σ A^T Cε^{-1} f̂, Σ), where Σ = (Cθ^{-1} + A^T Cε^{-1} A)^{-1}, using the estimated parameter θ value and assuming that the standard deviation of the added noise is known or estimated. Figure 10 depicts two reconstructions of the underlying field realisation f.
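A minimal sketch of such a reconstruction in MATLAB could read as follows, assuming the estimated parameters theta_cm, the observed indices obs_idx and the noise standard deviation noise_std (hypothetical names, not thesis code).

n = size(X, 1);
A = speye(n); A = A(obs_idx, :);                 % selection matrix of the observed pixels
C_theta = matern_cov(X, theta_cm(1), theta_cm(2), 1);
C_eps = noise_std^2 * speye(numel(obs_idx));     % i.i.d. noise covariance
Sigma_post = inv(inv(C_theta) + A' * (C_eps \ A));      % posterior covariance from Equation (2)
f_mean = Sigma_post * (A' * (C_eps \ f_hat));           % posterior mean of the underlying field
imagesc(reshape(f_mean, 128, 128)); axis equal tight;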


Figure 9: Sampled chain and marginal histograms in the case of missing data and added noise.

Figure 10: Two reconstructions of the underlying field realisation f.


Figure 11a presents the expected value of the underlying field realisationf. Comparing to Figures 7 and 11b, it can be seen that the noise has been "filtered" out quite nicely and also the regions with missing data due to obstructions have been reconstructed with some accuracy.

Figure 11: Comparison of the expected value of the underlying field realisation with the actual original realisation.

4.2 Inhomogeneous target with boundary

Consider that we have two independent Matérn fields with different parameters θa and θb and that the fields are separated from each other by a parametric curve (for example, a straight line for simplicity, but it could be more complex, which would simply increase the number of parameters to be estimated). These two fields could correspond to, for example, sea and land, where the curve separating them would represent a shoreline.

Let the composite field of the two fields be a two-dimensional 10 × 10 (for example, kilometers) zero-mean Matérn field, which is discretised using a 128 × 128 grid, resulting in 16384 individual points as before. As in Section 4.1, we shall fix σ = 1 for both fields. Let both ν = 1.5 and ℓ = 1.5 for one field, and ν = 1.8 and ℓ = 0.9 for the other field in the Matérn covariance function. Also let the intercept b and the slope k of the straight separating line be 12 and -1.5, respectively.


4.2.1 Drawing realisations

Realisations of this composite field can be drawn in a similar way as was described in Section 4.1.1. In Figure 12 are two different realisations of the composite random field.

The field to the left of the separating line is the field with the parameter value θa = [νa, ℓa] = [1.5, 1.5] and the field to the right is the field with the parameter value θb = [νb, ℓb] = [1.8, 0.9].


Figure 12: Two realisations of the composite random field.

4.2.2 Parameter estimation with missing data

As in Section 4.1.3, missing measurement data can be simulated by omitting parts of the data. Figure 13 shows the realisation in Figure 12b with the same two elliptical regions of missing measurements as in Figure 4.

Because of the independence assumption, the likelihood of a realisation f of the composite field given parameters θa, θb, k and b is p(f|θ) = N(fa | 0, Cθa) N(fb | 0, Cθb), where θ = [k, b, θa, θb], 0 are zero vectors of suitable sizes, fa is the left part and fb is the right part of the composite field realisation according to the parameters k and b, Cθa is the Matérn covariance matrix for the left part and Cθb is the Matérn covariance matrix for the right part. According to Equation (3), the posterior of θ is then proportional to this likelihood multiplied by the prior of the parameter θ. Let us use a uniform distribution from 0 to 10 for ν and a uniform distribution from 0 to 10 for ℓ of both field parts, a uniform distribution from -3 to 1 for k and a uniform distribution from 5 to 20 for b as the prior. Given a single realisation, the parameter θ can be estimated using Gibbs sampling with AM. Let us estimate the parameter θ using the realisation shown in Figure 13. The MAP estimate of the parameter θ can be used as the starting point for the Gibbs sampling algorithm.

Figure 13: Realisation of the composite field with missing data.
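One way to organise the sampler is sketched below: a blocked Metropolis-within-Gibbs sweep over the boundary parameters [k, b] and the two Matérn parameter blocks. Here am_step, log_post_boundary and log_post_field are hypothetical helpers (a single adaptive Metropolis update and the block-wise log-posteriors), and the first rows of kb, theta_a and theta_b are assumed to hold the MAP estimates; none of this is code from the thesis.

for i = 2:n
    % Block 1: boundary parameters [k, b]
    kb(i, :) = am_step(kb(i-1, :), ...
        @(p) log_post_boundary(p, theta_a(i-1, :), theta_b(i-1, :), f, X));
    % Block 2: Matérn parameters of the field on the left side of the line
    theta_a(i, :) = am_step(theta_a(i-1, :), ...
        @(p) log_post_field(p, kb(i, :), f, X, 'left'));
    % Block 3: Matérn parameters of the field on the right side of the line
    theta_b(i, :) = am_step(theta_b(i-1, :), ...
        @(p) log_post_field(p, kb(i, :), f, X, 'right'));
end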

Figure 14 depicts the samples drawn from the posterior of θ using Adaptive Metropolis-within-Gibbs in the case of missing data and a composite field. The computation of the samples took some 41 hours.

Figure 14: Samples drawn from the posterior of θ with missing data.

Figures 15, 16 and 17 describe the chain of samples drawn from the posterior of θ and the marginal histograms in the case of missing data and a composite field. The chain seems to mix just fine and the desired distribution is reached rather fast. The conditional sample mean of θ, calculated the same way as previously, is [-1.500, 11.996, 1.495, 1.519, 1.805, 0.899].

Figure 15: Sampled chain and marginal histograms of k and b in the case of missing data.


Figure 16: Sampled chain and marginal histograms of θa in the case of missing data.


Figure 17: Sampled chain and marginal histograms of θb in the case of missing data.


4.2.3 Parameter estimation with missing and noisy data

As in Section 4.1.4, noise in the measurement data is simulated by adding i.i.d. Gaussian noise, but this time the standard deviation of the added noise is only 0.001. Figure 18 illustrates the realisation with missing data in Figure 13 with added i.i.d. Gaussian noise. The level of noise is not that high in this case: the standard deviation of the added noise is only around 0.04 percent of the maximum absolute value of the realisation, so the noise is not as apparently visible as it is in Figure 7.

Figure 18: Realisation of the composite field with missing data and added noise.

The noisy composite realisation f̂ can be expressed as f̂ = Af + ε, where f is the realisation of the underlying unobserved composite field, A is a matrix which selects the non-obstructed points and ε is a vector of i.i.d. Gaussian noise of appropriate size. The prior of θ is the same as in the previous section, but now the likelihood p(f̂|θ) is, according to Equation (1), expressed as N(f̂a | 0, Cεa + Cθa) N(f̂b | 0, Cεb + Cθb), where Cεa and Cεb are the covariance matrices of the added i.i.d. Gaussian noise of suitable sizes, Cθa is the Matérn covariance matrix of the underlying field Fa with parameter value θa and Cθb is the Matérn covariance matrix of the underlying field Fb with parameter value θb. Let us estimate the parameter θ using the realisation in Figure 18. Figure 19 shows the samples drawn from the posterior of θ using Adaptive Metropolis-within-Gibbs in the case of missing and noisy data. The computation of the samples took roughly 49 hours.


Figure 19: Samples drawn from the posterior of θ with missing data and added noise.

Figures 20, 21 and 22 present the chain of samples drawn from the posterior of θ and the marginal histograms in the case of missing data and added noise. From Figure 20 it can be seen that the marginal histogram of k looks approximately Gaussian, but the marginal histogram of b looks more like a uniform distribution than a Gaussian. The conditional sample mean of θ, calculated the same way as previously, is [-1.500, 11.997, 1.501, 1.509, 1.811, 0.894].

After estimating the parameter θ either by using a MAP estimate or a conditional mean estimate, the underlying composite field realisation f can be reconstructed using Equation (2) for the conditional distribution of f conditioned on f̂, p(f | f̂) = N(f | Σ A^T Cε^{-1} f̂, Σ), where Σ = (Cθ^{-1} + A^T Cε^{-1} A)^{-1}, using the estimated parameter θ value and assuming that the standard deviation of the added noise is known or has been estimated. Figure 23 presents two different reconstructions of the underlying field realisation f.


Figure 20: Sampled chain and marginal histograms of k and b in the case of missing data and added noise.


Figure 21: Sampled chain and marginal histograms of θa in the case of missing data and added noise.


Figure 22: Sampled chain and marginal histograms of θb in the case of missing data and added noise.



Figure 23: Two reconstructions of the underlying composite field realisationf.

Figure 24a depicts the expected value of the underlying composite field realisation f. Comparing to Figure 24b, the actual underlying realisationf from Figure 12b, it can be seen that the regions with missing data due to obstructions have been reconstructed with some accuracy.

Figure 24: Comparison of the expected value of the field realisation with the actual original realisation.


4.3 Sensitivity of the parameter estimation to added noise

The parameter estimation in the case of a homogeneous region in Section 4.1.4 was not very sensitive to added noise; the estimated parameter values corresponded nicely with the actual underlying parameter values. The standard deviation of the added zero-mean Gaussian noise was 0.1, or roughly 3 % of the maximum absolute value of the obstructed non-noisy realisation, which is a high amount of noise.

In the case of the inhomogeneous region with a boundary in Section 4.2.3, the parameter estimation was a lot more sensitive to added noise. The standard deviation of the added noise was 0.001, or roughly 0.04 % of the maximum absolute value of the obstructed non-noisy realisation. With higher values of the standard deviation (for example, 0.005 or 0.01), the estimated ν and ℓ parameter values were not close to the true values, but the estimated parameters of the border were highly accurate even with heavy added noise.


5 EXPERIMENTS WITH REAL-WORLD DATA

As a real-world data analysis example, we use turbidity data from the Baltic Sea (SYKE data, contains modified Copernicus data 2016-2017) provided by the Finnish Environment Institute (SYKE). The data is openly available for download at https://geoserver2.ymparisto.fi/geoserver/eo/ows. Using the Web Coverage Service, the data can be found under the coverage identification EO_HR_WQ_S2_TURB. The data is based on the MultiSpectral Instrument (MSI) observations of the Sentinel-2 mission of the European Union's Copernicus program from 2016 onwards. The estimation of the turbidity data from the satellite observations has been done using a neural network model called Case-2 Regional CoastColour (C2RCC) (Brockmann et al., 2016).

Three locations were selected for analysis. The first one is Ströömi, Kustavi, which is a strait in the Archipelago Sea off the southwestern coast of Finland. The second one is the bay beside the city of Porvoo on the northern coast of the Gulf of Finland. The third one is near Pakinainen in the Archipelago Sea. Figure 25 displays the three chosen locations on the map of the southern part of Finland.

Figure 25: The chosen locations highlighted on the map of the southern part of Finland (Contains modified Copernicus data, SYKE 2020).



5.1 Spatial analysis

Estimation of the Matérn parameter θ can be done as in Section 4.1, since we assume that the turbidity field is homogeneous. Ten thousand samples were chosen as a sufficiently large size for the sample chains.

5.1.1 Ströömi

Figure 26 shows the turbidity at Ströömi on 23.10.2016. The location is a very narrow but long strait. In the Figure, the grey pixels represent land and the white pixels represent unobserved areas of water due to obstructions like cloud cover. The units on both the x- and y-axis are in kilometers. The size of the field in Figure 26 is 119 pixels in the x-direction and 159 pixels in the y-direction.

Figure 26: Turbidity at Ströömi on 23.10.2016.

Figure 27 illustrates the samples drawn from the posterior of θ using AM in the case of Ströömi on 23.10.2016. The computation of the samples took nearly 80 minutes. Figure 28 presents the chain of samples drawn from the posterior of θ and the marginal histograms of the parameters. The chain mixes well here. The conditional sample mean of θ, computed after removing the first 100 samples, is [0.582, 0.393].


Figure 27: Samples drawn from the posterior of θ.

Figure 28: Sampled chain and marginal histograms.


5.1.2 Porvoo

Figure 29 shows the turbidity at Porvoo on 29.08.2016. The location is a shallow bay fed by the small river Porvoonjoki. The river flows into the bay in the top part of Figure 29.

The grey pixels once again stand for land and the white pixels represent unobserved areas of water. The units are in kilometers. The size of the field in Figure 29 is 172 pixels in the x-direction and 133 pixels in the y-direction.

Figure 29: Turbidity at Porvoo on 29.08.2016.

Figure 30 presents the samples drawn from the posterior of θ using AM in the case of Porvoo on 29.08.2016. The computation of the samples took about 48 minutes. Note the ranges of the axes here. The variance in ν is actually a lot less than the variance in ℓ.

Figure 31 depicts the chain of samples drawn from the posterior of θ and the marginal histograms of the parameters. The conditional sample mean of θ, computed after removing the first 100 samples, is [0.102, 0.796].


Figure 30: Samples drawn from the posterior of θ.

Figure 31: Sampled chain and marginal histograms.


5.1.3 Pakinainen

Figure 32 shows the turbidity at Pakinainen on 05.12.2016. The location is a much wider strait compared to Ströömi. The grey pixels still represent land and the white pixels still stand for areas for which there are no measurements. The units are once more in kilometers. The size of the field in Figure 32 is 101 pixels both in the x- and the y-direction.

Figure 32: Turbidity at Pakinainen on 05.12.2016.

Figure 33 illustrates the samples drawn from the posterior ofθ using AM in the case of Pakinainen on 05.12.2016. The computation of the samples took almost 160 minutes.

Figure 34 depicts the chain of samples drawn from the posterior of θ. The chain mixes well here also. The conditional sample mean of θ, computed after removing the first 100 samples, is [0.108, 0.323].


Figure 33: Samples drawn from the posterior of θ.

Figure 34: Sampled chain and marginal histograms.


5.2 Temporal evolution of the Matérn parameters

In order to study the temporal evolution of the Matérn parameters, a more or less consecutive time period was chosen from the data, where each day contains more than a thousand pixel observations. For each selected day, a chain of ten thousand samples was run.

5.2.1 Ströömi

Figure 35 presents the temporal evolution of the smoothness parameter ν in the Ströömi case from 13.10.2016 to 17.11.2016. The probability density estimates were calculated using the MATLAB function ksdensity, and they were all normalised by dividing by their maximum values. A black curve connecting the maximum values of the density estimates was drawn as a visual aid.
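The per-day density estimates of the figures could be produced along the following lines; chains{d} holding the posterior samples of ν for day d is a hypothetical variable name, not from the thesis code.

for d = 1:numel(chains)
    [pd, xi] = ksdensity(chains{d}(101:end, 1)); % kernel density estimate, burn-in removed
    pd = pd / max(pd);                           % normalise the maximum to one
    plot3(d * ones(size(xi)), xi, pd); hold on;  % stack the daily densities along the time axis
end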

Figure 35: Temporal evolution of the smoothness parameter at Ströömi.

Figure 36 illustrates the temporal evolution of the length-scale parameter ℓ in the Ströömi case from 13.10.2016 to 17.11.2016. Note here that the ℓ-axis is plotted in natural logarithmic scale because of the high variance in the length-scale parameter on certain days. The computation took all in all almost 40 hours.


Figure 36: Temporal evolution of the length-scale parameter at Ströömi.

The smoothness parameter evolves continuously for the most part, but there are some more abrupt changes. There also seems to be some kind of cyclicity. The length-scale parameter, on the other hand, behaves more chaotically, with many abrupt changes and with no clear trends or patterns.

Figure 37 displays two cases of the turbidity at Ströömi for which the parameter values are similar. The turbidity fields also look similar visually. There is a bit more uncertainty in the parameter values of the turbidity field in Figure 37b, most likely partly because of the smaller number of observed pixels.

Figure 37: Turbidity at Ströömi on 1.11.2016 and 3.11.2016.


5.2.2 Porvoo

Figures 38 and 39 display the temporal evolution of the smoothness ν and the length-scale ℓ parameters in the Porvoo case from 29.08.2016 to 8.10.2016. Note that here the ℓ-axis of the length-scale Figure is not in logarithmic scale, since there is no need for that. The computation took altogether around 27.5 hours.

Both the smoothness and the length-scale parameters behave quite chaotically in this case. Compared to the Ströömi case, both parameters are now confined to smaller intervals. An important thing to note about the Porvoo location is that the bay alongside the city is shallow and is fed by the river Porvoonjoki to the north. This river brings with it suspended particulate matter and nutrients from the extensive fields bordering the river. This surely has an effect on the turbidity fluctuations of the bay.

Figure 38: Temporal evolution of the smoothness parameter at Porvoo.


Figure 39: Temporal evolution of the length-scale parameter at Porvoo.

Figure 40 illustrates two cases of the turbidity at Porvoo for which the value of the smoothness parameter ν differs notably. It can also be visually verified that the turbidity field in Figure 40b looks smoother than the turbidity field in Figure 40a.


Figure 40: Turbidity at Porvoo on 4.10.2016 and 5.10.2016.

5.2.3 Pakinainen

Figures 41 and 42 show the temporal evolution of the smoothness ν and the length-scale ℓ parameters in the Pakinainen case from 5.12.2016 to 14.01.2017. The computation took in total approximately three days and six hours.


Figure 41: Temporal evolution of the smoothness parameter at Pakinainen.

The smoothness parameter varies relatively continuously and there seems to be noticeable cyclicity.

Figure 42: Temporal evolution of the length-scale parameter at Pakinainen.

The length-scale parameter seems to evolve quite smoothly, rising and falling in turn, with two sharper changes.

Figure 43 exhibits two cases of the turbidity at Pakinainen with differing length-scale parameter ℓ values and comparable smoothness parameter ν values. Judging visually, the smoothness does look alike and the scale of the turbidity field in Figure 43b does look greater compared to the other turbidity field, keeping in mind that the uncertainty in the length-scale parameter ℓ of the turbidity field in Figure 43b is rather large.


Figure 43: Turbidity at Pakinainen on 11.12.2016 and 12.12.2016.


6 DISCUSSION AND CONCLUSIONS

We conclude that the method described in this thesis is able to characterise regions of turbidity based on the Matérn parameters and to investigate the temporal behaviour of these regions of turbidity. The same methods could perhaps also be extended to satellite measurements of algae-related chlorophyll, vegetation-related chlorophyll and so on.

One possible way to extend what has been done in this thesis would be to take into account the fact that when there are barriers (for example, land) between two points, the distance in the Matérn covariance function should not be the shortest Euclidean distance between the two points. It should rather be the shortest distance that goes around the barrier residing between the two points. Another possibility would be to implement something akin to the method proposed in Bakka et al. (2019).

The variogram is a traditional tool used in spatial statistics. It indicates the extent of spatial dependence of a spatial random field. The variogram (Cressie, 1993) is defined as 2γ(x1 − x2) ≡ Var(F(x1) − F(x2)). If the field is stationary and isotropic, then the variogram is only a function of the distance h = ‖x1 − x2‖, 2γ(h) (Cressie, 1993). Estimation of the variogram is usually done best by first estimating the empirical variogram and subsequently fitting a parametric model function to the empirical variogram in order to obtain 2γ as an explicit mathematical function (Cressie and Hawkins, 1980). There are various ways of estimating the empirical variogram; particularly robust estimators are addressed in Cressie and Hawkins (1980). There are also several ways to fit a model to the empirical variogram, for example, restricted maximum likelihood (REML) and ordinary least squares (OLS). A mathematical function which is a valid variogram (the empirical variogram might not be valid) is needed for spatial prediction (kriging) (Cressie, 1993). Minasny and McBratney (2005) consider a Matérn variogram model for spatial variability in soil. The parameters of the Matérn model are determined by utilising REML and weighted nonlinear least squares (WNLS). Marchant and Lark (2007) also treat the use of the Matérn variogram model, but here its performance is, among other things, contrasted with spherical and exponential variogram models. This Matérn variogram model approach to the parameter estimation should be compared with what has been done in this thesis in order to see how well the results of the different methods agree with each other.
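As a starting point for such a comparison, the classical empirical variogram estimator could be sketched in MATLAB as below; Xobs and fobs (coordinates and values of the observed pixels) and the chosen distance bins are hypothetical assumptions.

edges = linspace(0, 5, 21);                      % distance bins in kilometers
gamma2 = zeros(1, numel(edges) - 1);             % accumulates 2*gamma(h) per bin
counts = zeros(1, numel(edges) - 1);
n = size(Xobs, 1);
for i = 1:n
    for j = i+1:n
        h = norm(Xobs(i, :) - Xobs(j, :));
        b = find(h >= edges(1:end-1) & h < edges(2:end), 1);
        if ~isempty(b)
            gamma2(b) = gamma2(b) + (fobs(i) - fobs(j))^2;
            counts(b) = counts(b) + 1;
        end
    end
end
gamma2 = gamma2 ./ max(counts, 1);               % average squared increment per bin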


REFERENCES

Robert J. Adler and Jonathan E. Taylor. Random Fields and Geometry. Springer, 2007.

Haakon Bakka, Jarno Vanhatalo, Janine B. Illian, Daniel Simpson, and Håvard Rue. Non-stationary Gaussian models with physical barriers. Spatial Statistics, 29:268–288, 2019.

Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.

Carsten Brockmann, Roland Doerffer, Marco Peters, Kerstin Stelzer, Sabine Embacher, and Ana Ruescas. Evolution of the C2RCC Neural Network for Sentinel 2 and 3 for the Retrieval of Ocean Colour Products in Normal and Extreme Optically Complex Waters. In Living Planet Symposium, volume 740 of ESA Special Publication, page 54, August 2016.

Noel Cressie and Douglas M. Hawkins. Robust estimation of the variogram: I. Journal of the International Association for Mathematical Geology, 12:115–125, 1980.

Noel A. C. Cressie. Statistics for Spatial Data. Wiley, 1993.

Andrew Gelman, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. Bayesian Data Analysis. Chapman and Hall/CRC, third edition, 2013.

Stuart Geman and Donald Geman. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6):721–741, 1984.

Heikki Haario, Eero Saksman, and Johanna Tamminen. An adaptive Metropolis algorithm. Bernoulli, 7(2):223–242, 2001.

Wilfred K. Hastings. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1):97–109, 1970.

Dionissios T. Hristopulos. Random Fields for Spatial Data Modeling: A Primer for Scientists and Engineers. Springer, 2020.

Jari Kaipio and Erkki Somersalo. Statistical and Computational Inverse Problems. Springer, 2005.

Finn Lindgren, Håvard Rue, and Johan Lindström. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society Series B (Statistical Methodology), 73(4):423–498, 2011.

B. P. Marchant and R. M. Lark. The Matérn variogram model: Implications for uncertainty propagation and sampling in geostatistical surveys. Geoderma, 140(4):337–345, 2007.

Bertil Matérn. Spatial Variation: Stochastic models and their application to some problems in forest surveys and other sampling investigations. PhD thesis, Stockholm University, Sweden, 1960.

Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller. Equation of State Calculations by Fast Computing Machines. Journal of Chemical Physics, 21(6):1087–1092, 1953.

Budiman Minasny and Alex B. McBratney. The Matérn function as a general model for soil variograms. Geoderma, 128(3):192–207, 2005.

Karla Monterrubio-Gómez, Lassi Roininen, Sara Wade, Theo Damoulas, and Mark Girolami. Posterior Inference for Sparse Hierarchical Non-stationary Models. Computational Statistics & Data Analysis, 148:106954, 2020.

Carl Edward Rasmussen and Christopher K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.

Lassi Roininen, Sari Lasanen, and Janne M. J. Huttunen. Whittle-Matérn priors for Bayesian statistical inversion with applications in electrical impedance tomography. Inverse Problems and Imaging, 8(2):561–586, 2014.

Lassi Roininen, Mark Girolami, Sari Lasanen, and Markku Markkanen. Hyperpriors for Matérn fields with applications in Bayesian inversion. Inverse Problems and Imaging, 13(1):1–29, 2019.
