Computational Methods in Inverse Problems, Mat–1.3626: Parametric or non-parametric

N/A
N/A
Info
Lataa
Protected

Academic year: 2022

Jaa "Computational Methods in Inverse Problems, Mat Parametric or non-parametric"

Copied!
36
0
0

Kokoteksti

(1)

Basic Problem of Statistical Inference

Assume that we have a set of observations
$$S = \{x_1, x_2, \ldots, x_N\}, \quad x_j \in \mathbb{R}^n.$$
The problem is to infer the underlying probability distribution that gives rise to the data $S$.

Statistical modeling

Statistical analysis.


(2)

Parametric or non-parametric?

Parametric problem: The underlying probability density has a specified form and depends on a number of parameters. The problem is to infer those parameters.

Non-parametric problem: No analytic expression for the probability density is available. The description consists of defining the dependency/non-dependency of the data. Numerical exploration.

Typical situation for a parametric model: the distribution is the probability density of a random variable $X : \Omega \to \mathbb{R}^n$.

Parametric problem suitable for inverse problems

Model for a learning process


(3)

Law of Large Numbers

A general result (a "statistical law of nature"):

Assume that $X_1, X_2, \ldots$ are independent and identically distributed random variables with finite mean $\mu$ and variance $\sigma^2$. Then
$$\lim_{n\to\infty} \frac{1}{n}\left(X_1 + X_2 + \cdots + X_n\right) = \mu \quad \text{almost certainly}.$$
Almost certainly means that, with probability one,
$$\lim_{n\to\infty} \frac{1}{n}\left(x_1 + x_2 + \cdots + x_n\right) = \mu,$$
$x_j$ being a realization of $X_j$.
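To illustrate, here is a minimal Matlab sketch (not part of the original notes; all values are illustrative) that plots the running sample average settling toward the mean:

% Running sample average of i.i.d. draws converging to the mean mu.
mu = 2; sigma = 3; n = 1e4;
x = mu + sigma*randn(n,1);       % i.i.d. N(mu, sigma^2) realizations
runavg = cumsum(x)./(1:n)';      % (x_1 + ... + x_k)/k for k = 1, ..., n
plot(1:n, runavg, 'k-', [1 n], [mu mu], 'k--')
xlabel('n'), ylabel('running average')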


(4)

Example: Sample
$$S = \{x_1, x_2, \ldots, x_N\}, \quad x_j \in \mathbb{R}^2.$$
Parametric model: the $x_j$ are realizations of
$$X \sim \mathcal{N}(x_0, \Gamma),$$
with unknown mean $x_0 \in \mathbb{R}^2$ and covariance matrix $\Gamma \in \mathbb{R}^{2\times 2}$. Probability density of $X$:
$$\pi(x \mid x_0, \Gamma) = \frac{1}{2\pi \det(\Gamma)^{1/2}} \exp\left(-\frac{1}{2}(x - x_0)^T \Gamma^{-1} (x - x_0)\right).$$
Problem: estimate the parameters $x_0$ and $\Gamma$.


(5)

The Law of Large Numbers suggests that we calculate
$$x_0 = \mathrm{E}\{X\} \approx \frac{1}{N}\sum_{j=1}^{N} x_j = \widehat{x}_0. \tag{1}$$
Covariance matrix: observe that if $X_1, X_2, \ldots$ are i.i.d., so are $f(X_1), f(X_2), \ldots$ for any function $f : \mathbb{R}^2 \to \mathbb{R}^k$. Try
$$\Gamma = \mathrm{cov}(X) = \mathrm{E}\{(X - x_0)(X - x_0)^T\} \approx \mathrm{E}\{(X - \widehat{x}_0)(X - \widehat{x}_0)^T\} \approx \frac{1}{N}\sum_{j=1}^{N}(x_j - \widehat{x}_0)(x_j - \widehat{x}_0)^T = \widehat{\Gamma}. \tag{2}$$
Formulas (1) and (2) are known as the empirical mean and covariance, respectively.
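As a sketch, formulas (1) and (2) can be checked on a synthetic sample; the true mean and covariance below are illustrative, and chol is used to draw correlated Gaussian samples:

% Draw N samples from N(x0, Gamma), then form the empirical mean/covariance.
N = 200;
x0 = [2; 3];                                 % true mean (illustrative)
Gamma = [2 1; 1 1];                          % true covariance, SPD (illustrative)
S = x0*ones(1,N) + chol(Gamma)'*randn(2,N);  % columns x_j ~ N(x0, Gamma)
x0hat = (1/N)*sum(S,2);                      % empirical mean, formula (1)
CS = S - x0hat*ones(1,N);                    % centered sample
Gammahat = (1/N)*(CS*CS');                   % empirical covariance, formula (2)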


(6)

Case 1: Gaussian sample

[Figure: two scatter plots of the sample; both axes range from 0 to 5.]


(7)

Sample size $N = 200$.

Eigenvectors of the covariance matrix:
$$\widetilde{\Gamma} = U D U^T, \tag{3}$$
where $U \in \mathbb{R}^{2\times 2}$ is an orthogonal matrix, $U^T = U^{-1}$, and $D \in \mathbb{R}^{2\times 2}$ is diagonal:
$$U = [\,v_1 \;\; v_2\,], \quad D = \begin{bmatrix} \lambda_1 & \\ & \lambda_2 \end{bmatrix}, \quad \widetilde{\Gamma} v_j = \lambda_j v_j, \quad j = 1, 2.$$
Scaled eigenvectors:
$$v_{j,\mathrm{scaled}} = 2\sqrt{\lambda_j}\, v_j, \quad \text{where } \sqrt{\lambda_j} = \text{standard deviation (STD)}.$$
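Continuing the sketch above, the decomposition (3) and the scaled eigenvectors can be computed as follows (variable names are illustrative):

% Eigendecomposition of the empirical covariance, Gammahat = U*D*U'.
[U, D] = eig(Gammahat);
v1scaled = 2*sqrt(D(1,1))*U(:,1);   % 2*sqrt(lambda_1) * v_1, i.e., two STDs
v2scaled = 2*sqrt(D(2,2))*U(:,2);   % 2*sqrt(lambda_2) * v_2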


(8)

Case 2: Non-Gaussian Sample

[Figure: two scatter plots of the non-Gaussian sample; both axes range from -1.5 to 1.5.]


(9)

Estimate of normality/non-normality

Consider the sets
$$B_\alpha = \{x \in \mathbb{R}^2 \mid \pi(x) \geq \alpha\}, \quad \alpha > 0.$$
If $\pi$ is Gaussian, $B_\alpha$ is an ellipse or $\emptyset$.

Calculate the integral
$$\mathrm{P}\{X \in B_\alpha\} = \int_{B_\alpha} \pi(x)\, dx. \tag{4}$$
We call $B_\alpha$ the credibility ellipse with credibility $p$, $0 < p < 1$, if
$$\mathrm{P}\{X \in B_\alpha\} = p, \quad \text{giving } \alpha = \alpha(p). \tag{5}$$


(10)

Assume that the Gaussian density $\pi$ has center of mass $\widetilde{x}_0$ and covariance matrix $\widetilde{\Gamma}$ estimated from the sample $S$ of size $N$.

If $S$ is normally distributed,
$$\#\{x_j \in B_{\alpha(p)}\} \approx pN. \tag{6}$$
Deviations are due to non-normality.


(11)

How do we calculate the quantity (4)?

Eigenvalue decomposition:
$$(x - \widetilde{x}_0)^T \widetilde{\Gamma}^{-1} (x - \widetilde{x}_0) = (x - \widetilde{x}_0)^T U D^{-1} U^T (x - \widetilde{x}_0) = \|D^{-1/2} U^T (x - \widetilde{x}_0)\|^2,$$
since $U$ is orthogonal, i.e., $U^{-1} = U^T$, and we wrote
$$D^{-1/2} = \begin{bmatrix} 1/\sqrt{\lambda_1} & \\ & 1/\sqrt{\lambda_2} \end{bmatrix}.$$
We introduce the change of variables
$$w = f(x) = W(x - \widetilde{x}_0), \quad W = D^{-1/2} U^T.$$


(12)

Write the integral in terms of the new variable $w$:
$$\int_{B_\alpha} \pi(x)\, dx = \frac{1}{2\pi \det(\widetilde{\Gamma})^{1/2}} \int_{B_\alpha} \exp\left(-\frac{1}{2}(x - \widetilde{x}_0)^T \widetilde{\Gamma}^{-1}(x - \widetilde{x}_0)\right) dx$$
$$= \frac{1}{2\pi \det(\widetilde{\Gamma})^{1/2}} \int_{B_\alpha} \exp\left(-\frac{1}{2}\|W(x - \widetilde{x}_0)\|^2\right) dx$$
$$= \frac{1}{2\pi} \int_{f(B_\alpha)} \exp\left(-\frac{1}{2}\|w\|^2\right) dw,$$
where we used the fact that
$$dw = |\det(W)|\, dx = \frac{1}{\sqrt{\lambda_1 \lambda_2}}\, dx = \frac{1}{\det(\widetilde{\Gamma})^{1/2}}\, dx.$$
Note:
$$\det(\widetilde{\Gamma}) = \det(U D U^T) = \det(U^T U D) = \det(D) = \lambda_1 \lambda_2.$$


(13)

The equiprobability curves of the density of $w$ are circles centered at the origin, i.e.,
$$f(B_\alpha) = D_\delta = \{w \in \mathbb{R}^2 \mid \|w\| < \delta\}$$
for some $\delta > 0$.

Solve for $\delta$: integrating in polar coordinates $(r, \theta)$,
$$\frac{1}{2\pi} \int_{D_\delta} \exp\left(-\frac{1}{2}\|w\|^2\right) dw = \int_0^\delta \exp\left(-\frac{1}{2}r^2\right) r\, dr = 1 - \exp\left(-\frac{1}{2}\delta^2\right) = p,$$
implying that
$$\delta = \delta(p) = \sqrt{2 \log\left(\frac{1}{1 - p}\right)}.$$


(14)

To see whether a sample point $x_j$ lies within the credibility ellipse with credibility $p$, it suffices to check whether the condition
$$\|w_j\| < \delta(p), \quad w_j = W(x_j - \widetilde{x}_0), \quad 1 \leq j \leq N,$$
holds. Plot
$$p \mapsto \frac{1}{N}\, \#\{x_j \in B_{\alpha(p)}\}.$$


(15)

Example

[Figure: the fraction of sample points inside the credibility ellipse as a function of the credibility p (0-100%), plotted for the Gaussian and the non-Gaussian samples.]


(16)

Matlab code

N = length(S(1,:));                % Size of the sample
xmean = (1/N)*(sum(S')');          % Mean of the sample
CS = S - xmean*ones(1,N);          % Centered sample
Gamma = (1/N)*(CS*CS');            % Covariance matrix

% Whitening of the sample
[V,D] = eig(Gamma);                % Eigenvalue decomposition Gamma = V*D*V'
W = diag([1/sqrt(D(1,1)); 1/sqrt(D(2,2))])*V';
WS = W*CS;                         % Whitened sample
normWS2 = sum(WS.^2);              % Squared norms ||w_j||^2, one per point


(17)

% Calculating the fraction of scatter points that fall
% inside the credibility ellipses for p = 0, 0.1, ..., 1
rinside = zeros(11,1);
rinside(11) = N;                   % p = 1: delta(p) = Inf, all points inside
for j = 1:9
    delta2 = 2*log(1/(1 - j/10));  % delta(p)^2 for p = j/10
    rinside(j+1) = sum(normWS2 < delta2);
end
rinside = (1/N)*rinside;
plot([0:10:100],rinside,'k.-','MarkerSize',12)
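The two fragments above assume only that S is a 2 x N matrix of sample points. For instance, a non-Gaussian test sample in the spirit of Case 2 could be generated along the following lines (a purely illustrative guess; the lecture does not specify how its sample was produced):

% Illustrative non-Gaussian sample: points scattered around the unit circle.
N = 200;
phi = 2*pi*rand(1,N);            % uniformly distributed angles
r = 1 + 0.1*randn(1,N);          % radii fluctuating around 1
S = [r.*cos(phi); r.*sin(phi)];  % 2 x N sample matrix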


(18)

Which one of the following formulae?
$$\widehat{\Gamma} = \frac{1}{N}\sum_{j=1}^{N}(x_j - \widehat{x}_0)(x_j - \widehat{x}_0)^T,$$
or
$$\widetilde{\Gamma} = \frac{1}{N}\sum_{j=1}^{N} x_j x_j^T - \widetilde{x}_0\, \widetilde{x}_0^T?$$
The former, please: the two expressions are algebraically equal, but the centered form $\widehat{\Gamma}$ is numerically safer, since the subtraction in $\widetilde{\Gamma}$ can cancel large, nearly equal terms.


(19)

Example: Calibration of a measurement instrument:

Measure a dummy load whose output is known
Subtract it from the actual measurement
Analyze the noise

Discrete sampling; the output is a vector of length $n$.

The noise vector $x \in \mathbb{R}^n$ is a realization of
$$X : \Omega \to \mathbb{R}^n.$$
Estimate the mean and variance:
$$x_0 = \frac{1}{n}\sum_{j=1}^{n} x_j \quad \text{(offset)}, \qquad \sigma^2 = \frac{1}{n}\sum_{j=1}^{n}(x_j - x_0)^2.$$


(20)

Improving the Signal-to-Noise Ratio (SNR):

Repeat the measurement
Average
Hope that the target is stationary

Averaged noise:
$$\overline{x} = \frac{1}{N}\sum_{k=1}^{N} x^{(k)} \in \mathbb{R}^n.$$
How large must $N$ be to reduce the noise enough?


(21)

The averaged noise $\overline{x}$ is a realization of the random variable
$$\overline{X} = \frac{1}{N}\sum_{k=1}^{N} X^{(k)} \in \mathbb{R}^n.$$
If $X^{(1)}, X^{(2)}, \ldots$ are i.i.d., $\overline{X}$ is asymptotically Gaussian by the Central Limit Theorem, and its variance is
$$\mathrm{var}(\overline{X}) = \frac{\sigma^2}{N}.$$
Repeat until the variance is below a given threshold:
$$\frac{\sigma^2}{N} < \tau^2.$$
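A small sketch of the averaging effect (the signal, noise level, and names are illustrative assumptions):

% Average N repeated noisy measurements of a fixed signal; the residual
% standard deviation drops like sigma/sqrt(N).
n = 50; sigma = 1;
signal = sin(2*pi*(1:n)'/n);                  % stationary target (illustrative)
for N = [1 5 10 25]
    X = signal*ones(1,N) + sigma*randn(n,N);  % N noisy repetitions
    xbar = mean(X,2);                         % averaged measurement
    fprintf('N = %2d: residual STD = %.3f (sigma/sqrt(N) = %.3f)\n', ...
            N, std(xbar - signal), sigma/sqrt(N))
end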


(22)

[Figure: noisy signal over 50 samples, amplitude axis from -3 to 3; panels for number of averaged signals = 1 and number of averaged signals = 5.]


(23)

[Figure: as above, with number of averaged signals = 10 and number of averaged signals = 25.]


(24)

Maximum Likelihood Estimator: the frequentist's approach

Parametric problem:
$$X \sim \pi_\theta(x) = \pi(x \mid \theta), \quad \theta \in \mathbb{R}^k.$$
Independent realizations: assume that the observations $x_j$ are obtained independently. More precisely: $X_1, X_2, \ldots, X_N$ are i.i.d., and $x_j$ is a realization of $X_j$. Independence:
$$\pi(x_1, x_2, \ldots, x_N \mid \theta) = \pi(x_1 \mid \theta)\, \pi(x_2 \mid \theta) \cdots \pi(x_N \mid \theta),$$
or, briefly,
$$\pi(S \mid \theta) = \prod_{j=1}^{N} \pi(x_j \mid \theta).$$


(25)

The maximum likelihood (ML) estimator of $\theta$ is the parameter value that maximizes the probability of the outcome:
$$\theta_{\mathrm{ML}} = \arg\max_\theta \prod_{j=1}^{N} \pi(x_j \mid \theta).$$
Define
$$L(S \mid \theta) = -\log(\pi(S \mid \theta)).$$
Minimizer of $L(S \mid \theta)$ = maximizer of $\pi(S \mid \theta)$.


(26)

Example: Gaussian model,
$$\pi(x \mid x_0, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x - x_0)^2\right), \quad \theta = \begin{bmatrix} x_0 \\ \sigma^2 \end{bmatrix} = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}.$$
The likelihood function is
$$\prod_{j=1}^{N} \pi(x_j \mid \theta) = \left(\frac{1}{2\pi\theta_2}\right)^{N/2} \exp\left(-\frac{1}{2\theta_2}\sum_{j=1}^{N}(x_j - \theta_1)^2\right)$$
$$= \exp\left(-\frac{1}{2\theta_2}\sum_{j=1}^{N}(x_j - \theta_1)^2 - \frac{N}{2}\log(2\pi\theta_2)\right) = \exp(-L(S \mid \theta)).$$


(27)

We have
$$\nabla_\theta L(S \mid \theta) = \begin{bmatrix} \partial L/\partial\theta_1 \\ \partial L/\partial\theta_2 \end{bmatrix} = \begin{bmatrix} -\dfrac{1}{\theta_2}\displaystyle\sum_{j=1}^{N} x_j + \dfrac{N}{\theta_2}\theta_1 \\ -\dfrac{1}{2\theta_2^2}\displaystyle\sum_{j=1}^{N}(x_j - \theta_1)^2 + \dfrac{N}{2\theta_2} \end{bmatrix}.$$
Setting $\nabla_\theta L(S \mid \theta) = 0$ gives
$$x_0 = \theta_{\mathrm{ML},1} = \frac{1}{N}\sum_{j=1}^{N} x_j, \qquad \sigma^2 = \theta_{\mathrm{ML},2} = \frac{1}{N}\sum_{j=1}^{N}(x_j - \theta_{\mathrm{ML},1})^2.$$
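For reference, both ML estimates are one-liners in Matlab; note that var normalizes by N - 1 by default, so the ML variance needs the extra weight argument (a sketch, with x the vector of samples):

% ML estimates for the Gaussian model from a sample vector x.
theta1 = mean(x);     % ML estimate of the mean x_0
theta2 = var(x, 1);   % ML estimate of sigma^2 (normalized by N, not N-1)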


(28)

Example: Parametric (Poisson) model,
$$\pi(n \mid \theta) = \frac{\theta^n}{n!} e^{-\theta},$$
sample $S = \{n_1, \ldots, n_N\}$, $n_k \in \mathbb{N}$, obtained by independent sampling.

The likelihood density is
$$\pi(S \mid \theta) = \prod_{k=1}^{N} \pi(n_k \mid \theta) = e^{-N\theta} \prod_{k=1}^{N} \frac{\theta^{n_k}}{n_k!},$$
and its negative logarithm is
$$L(S \mid \theta) = -\log \pi(S \mid \theta) = \sum_{k=1}^{N} \left(\theta - n_k \log\theta + \log(n_k!)\right).$$


(29)

Setting the derivative with respect to $\theta$ to zero,
$$\frac{\partial}{\partial\theta} L(S \mid \theta) = \sum_{k=1}^{N}\left(1 - \frac{n_k}{\theta}\right) = 0, \tag{7}$$
leads to
$$\theta_{\mathrm{ML}} = \frac{1}{N}\sum_{k=1}^{N} n_k.$$
Warning: the empirical variance,
$$\mathrm{var} \approx \frac{1}{N}\sum_{k=1}^{N} n_k^2 - \left(\frac{1}{N}\sum_{j=1}^{N} n_j\right)^2,$$
is different from the estimate $\theta_{\mathrm{ML}}$ obtained above, even though a Poisson variable has variance equal to its mean $\theta$.


(30)

Assume that $\theta$ is known a priori to be relatively large.

Use the Gaussian approximation:
$$\prod_{j=1}^{N} \pi_{\mathrm{Poisson}}(n_j \mid \theta) \approx \left(\frac{1}{2\pi\theta}\right)^{N/2} \exp\left(-\frac{1}{2\theta}\sum_{j=1}^{N}(n_j - \theta)^2\right)$$
$$= \left(\frac{1}{2\pi}\right)^{N/2} \exp\left(-\frac{1}{2}\left[\frac{1}{\theta}\sum_{j=1}^{N}(n_j - \theta)^2 + N\log\theta\right]\right),$$
so, dropping the constant and the overall factor $1/2$ (neither affects the minimizer),
$$L(S \mid \theta) = \frac{1}{\theta}\sum_{j=1}^{N}(n_j - \theta)^2 + N\log\theta.$$


(31)

An approximation for $\theta_{\mathrm{ML}}$: minimize
$$L(S \mid \theta) = \frac{1}{\theta}\sum_{j=1}^{N}(n_j - \theta)^2 + N\log\theta.$$
Write
$$\frac{\partial}{\partial\theta} L(S \mid \theta) = -\frac{1}{\theta^2}\sum_{j=1}^{N}(n_j - \theta)^2 - \frac{2}{\theta}\sum_{j=1}^{N}(n_j - \theta) + \frac{N}{\theta} = 0,$$
or, multiplying by $\theta^2$,
$$-\sum_{j=1}^{N}(n_j - \theta)^2 - 2\theta\sum_{j=1}^{N}(n_j - \theta) + N\theta = N\theta^2 + N\theta - \sum_{j=1}^{N} n_j^2 = 0,$$
giving
$$\theta = \left(\frac{1}{4} + \frac{1}{N}\sum_{j=1}^{N} n_j^2\right)^{1/2} - \frac{1}{2} \;\neq\; \frac{1}{N}\sum_{j=1}^{N} n_j.$$
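A quick numerical comparison of the two estimates (a sketch on synthetic counts; poissrnd requires the Statistics Toolbox):

% Compare the exact Poisson ML estimate with the Gaussian-approximation one.
theta_true = 50; N = 1000;
n = poissrnd(theta_true, N, 1);               % synthetic Poisson counts
theta_ML = mean(n);                           % exact ML estimate
theta_approx = sqrt(1/4 + mean(n.^2)) - 1/2;  % Gaussian-approximation estimate
fprintf('ML: %.3f, approximation: %.3f\n', theta_ML, theta_approx)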


(32)

Example: Multivariate Gaussian model,
$$X \sim \mathcal{N}(x_0, \Gamma),$$
where $x_0 \in \mathbb{R}^n$ is unknown and $\Gamma \in \mathbb{R}^{n\times n}$ is symmetric positive definite (SPD) and known.

Model reduction: assume that $x_0$ depends on hidden parameters $z \in \mathbb{R}^k$ through a linear equation,
$$x_0 = Az, \quad A \in \mathbb{R}^{n\times k}, \quad z \in \mathbb{R}^k. \tag{8}$$
Model for an inverse problem: $z$ is the true physical quantity that, in the ideal case, is related to the observable $x_0$ through the linear model (8).


(33)

Noisy observations:
$$X = Az + E, \quad E \sim \mathcal{N}(0, \Gamma).$$
Obviously,
$$\mathrm{E}\{X\} = Az + \mathrm{E}\{E\} = Az = x_0,$$
and
$$\mathrm{cov}(X) = \mathrm{E}\{(X - Az)(X - Az)^T\} = \mathrm{E}\{EE^T\} = \Gamma.$$
The probability density of $X$, given $z$, is
$$\pi(x \mid z) = \frac{1}{(2\pi)^{n/2}\det(\Gamma)^{1/2}} \exp\left(-\frac{1}{2}(x - Az)^T \Gamma^{-1}(x - Az)\right).$$


(34)

Independent observations:
$$S = \{x_1, \ldots, x_N\}, \quad x_j \in \mathbb{R}^n.$$
The likelihood function
$$\prod_{j=1}^{N} \pi(x_j \mid z) \propto \exp\left(-\frac{1}{2}\sum_{j=1}^{N}(x_j - Az)^T \Gamma^{-1}(x_j - Az)\right)$$
is maximized by minimizing
$$L(S \mid z) = \frac{1}{2}\sum_{j=1}^{N}(x_j - Az)^T \Gamma^{-1}(x_j - Az) = \frac{N}{2}\, z^T\!\left[A^T \Gamma^{-1} A\right] z - z^T\!\left[A^T \Gamma^{-1} \sum_{j=1}^{N} x_j\right] + \frac{1}{2}\sum_{j=1}^{N} x_j^T \Gamma^{-1} x_j.$$


(35)

Setting the gradient to zero gives
$$\nabla_z L(S \mid z) = N\left[A^T \Gamma^{-1} A\right] z - A^T \Gamma^{-1} \sum_{j=1}^{N} x_j = 0,$$
i.e., the maximum likelihood estimator $z_{\mathrm{ML}}$ is the solution of the linear system
$$\left[A^T \Gamma^{-1} A\right] z = A^T \Gamma^{-1} \overline{x}, \quad \overline{x} = \frac{1}{N}\sum_{j=1}^{N} x_j.$$
The solution need not exist; everything depends on the properties of the model reduction matrix $A \in \mathbb{R}^{n\times k}$.


(36)

Particular case: one observation, $S = \{x\}$,
$$L(z \mid x) = (x - Az)^T \Gamma^{-1}(x - Az).$$
With the eigenvalue decomposition of the covariance matrix, $\Gamma = U D U^T$, or
$$\Gamma^{-1} = W^T W, \quad W = D^{-1/2} U^T,$$
we have
$$L(z \mid x) = \|W(Az - x)\|^2.$$
Hence, the problem reduces to a weighted least squares problem.
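In Matlab the whitened formulation can be solved directly with the backslash operator; a minimal sketch, assuming A, Gamma, and the observation x are given:

% Weighted least squares solution of min_z ||W(Az - x)||^2.
[U, D] = eig(Gamma);             % Gamma = U*D*U', SPD
W = diag(1./sqrt(diag(D)))*U';   % whitening matrix, so that inv(Gamma) = W'*W
zML = (W*A) \ (W*x);             % least squares solution via backslash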

