Numerical Simulation of Stochastic Differential Equations


LAPPEENRANTA UNIVERSITY OF TECHNOLOGY Faculty of Technology

Department of Mathematics and Physics

Alain Christian Nsengiyumva

NUMERICAL SIMULATION OF STOCHASTIC DIFFERENTIAL EQUATIONS

Examiners: Professor Heikki Haario.

Professor Matti Heiliö.


Lappeenranta University of Technology Department of Mathematics and Physics Alain Christian Nsengiyumva

Numerical Simulation of Stochastic Differential Equations Master's thesis

2013

51 pages, 8 figures, 6 tables

Examiners: Professor Heikki Haario.

Professor Matti Heiliö.

Key words: Stochastic differential equations, Euler-Maruyama method, Milstein method, Runge-Kutta method, Shoji-Ozaki schemes, Kalman filter, Extended Kalman filter, Epidemic models.

A stochastic differential equation (SDE) is a differential equation in which one or more of the terms, and hence the solution, are stochastic processes. SDEs play a central role in modeling systems in finance, biology and engineering, to mention a few. In the modeling process, computing the trajectories (sample paths) of solutions to SDEs is very important. However, the exact solution of an SDE is generally difficult to obtain because the realizations of Brownian motion are non-differentiable.

Approximation methods therefore exist for the solutions of SDEs. The solutions are continuous stochastic processes that represent diffusive dynamics, a common modeling assumption for financial, biological, physical and environmental systems. This Master's thesis is an introduction to and survey of numerical solution methods for stochastic differential equations. Standard numerical methods, local linearization methods and filtering methods are described. We compute the root mean square error of each method and use it to propose the better numerical scheme.

Stochastic differential equations can be formulated from given ordinary differential equations. In this thesis we describe two kinds of formulation: parametric and non-parametric techniques. The formulation is based on the epidemiological SEIR model. These methods tend to increase the number of parameters in the constructed SDEs and hence require more data. We compare the two techniques numerically.


Acknowledgements

I would like to express my deep appreciation to the Department of Mathematics, Lappeenranta University of Technology (LUT), for the scholarship granted during my studies.

My sincere gratitude to my guide Professor Heikki Haario for supervising this thesis.

I also thank Professor Matti Heiliö for being my second supervisor. Special thanks go to Isambi Sailon Mbalawata for the assistance and support he gave me during my studies.

Thanks to all my friends and classmates at LUT and National University of Rwanda who encouraged me a lot in this project.

I owe my gratitude to my late father Mr. Nsengiyumva Léopold, my mother Mrs. Mukanyarwaya Félicité, my sister Gloriose Nsengiyumva and to the Family Twahirwa Manassé; as well as my Wife and Son who supported me a lot; for their blessings and inspiration.

"Mwarakoze cyane"

Lappeenranta; November 18, 2013 Alain Christian Nsengiyumva


1 Introduction

Stochastic differential equations (SDEs) were first introduced and developed by Itô (1942). The theory of SDEs provides a useful tool for introducing stochasticity into models and for characterizing the evolution of many processes in finance, mathematics, physics, chemistry, biology, medical science and almost all other sciences. In many cases the solutions cannot be given explicitly because the realizations of Brownian motion are non-differentiable; numerical approximations are therefore used to study the properties of these models.

The convergence of numerical schemes for SDEs requires conditions on the drift and diffusion coefficients, namely the linear growth and global Lipschitz conditions (Skorochod, 1965; Kloeden and Platen, 1999; Mao, 1997). Yamada (1978) relaxed the global Lipschitz condition, whilst Kaneko and Nakao (1988) showed that the Euler scheme converges in the strong sense to the solution of the stochastic differential equation whenever pathwise uniqueness of the solution holds. However, both results require the linear growth condition, and the latter provides no information on the order of the approximation. Marion et al. (2002) showed convergence in probability of the Euler scheme under specific conditions on the coefficients.

The basic theoretical problems concerned with stochastic differential equations are, generally speaking, the same as those in the case of deterministic differential equations, namely: existence and uniqueness of a solution, analytical properties of the solutions, dependence of the solutions on parameters and initial values, etc. Yet the introduction of random elements into differential equations leads to new probabilistic problems and specific difficulties. For instance, the very sense of a stochastic differential equation should be clearly defined, since it can differ depending on how a stochastic process and its derivatives are understood. For the analysis of stochastic differential equations, however, a crucial point is the regularity of the random functions occurring in a given equation.

The aim of this thesis is to study how to simulate linear and nonlinear SDEs. We review standard numerical methods such as the Euler-Maruyama, Milstein and Runge-Kutta schemes. Similarly, we study local linearization methods, namely the Ozaki and Shoji-Ozaki schemes. We also introduce filtering techniques, in particular the Kalman filter for linear SDEs and the extended Kalman filter for nonlinear SDEs. The goal is to see which numerical simulation method performs better. The criterion for selecting the better method is the root mean square error (RMSE): the lower the RMSE, the better the method.

We also study how to construct an SDE from a given deterministic model, and review two modeling procedures: parametric and non-parametric methods. The emphasis is on epidemiological SEIR models, although the same techniques can be applied to other models. When should one use a deterministic model and when a stochastic one? This is a question to ask before starting to model. We describe the difference between deterministic and stochastic models in the context of epidemic models. In general, however, one chooses the model that fits the problem: the problem determines whether a stochastic or a deterministic model is needed.

The structure of this thesis is as follows. Section 2 reviews stochastic calculus: Brownian motion, stochastic integrals, the Itô formula and stochastic differential equations. Section 3 deals with standard numerical methods, reviewing the Euler-Maruyama, Milstein and Runge-Kutta schemes. Section 4 contains the local linearization methods (the Ozaki and Shoji-Ozaki schemes) as well as the Kalman filter and the extended Kalman filter. Section 5 is devoted to the construction of an epidemiological SEIR SDE from a deterministic SEIR model, while Section 6 presents numerical simulations of linear and nonlinear SDEs. The conclusion of this work is given in Section 7.


2 Stochastic Calculus

This section introduces stochastic processes and some general definitions. We discuss stochastic calculus, which provides the mathematical foundation for the treatment of stochastic differential equations. If we allow randomness in some of the parameters of a differential equation, we obtain a mathematical model that accounts for possible uncertainties of the situation.

Let $N(t)$ be the size of the population at time $t$, and let $a(t)$ be the relative rate of growth at time $t$. Consider the following ordinary differential equation (ODE), which describes a simple population growth model (Øksendal, 2003):

$$\frac{dN}{dt} = a(t)\,N(t), \tag{1}$$

where the initial population is $N(0) = N_0$. In model (1) it may happen that the parameter $a(t)$ is not completely known but is subject to some random environmental effects. To include this randomness, we add noise to the parameter, that is, we define

$$a(t) = r(t) + \text{"noise"},$$

where the exact behavior of the noise term is unknown and only its probability distribution is known.

We can rewrite (1) using the perturbed parameter. This results in

$$\frac{dN}{dt} = \big(r(t) + \text{"noise"}\big)\,N(t),$$

or, more generally, in the form

$$\frac{dX_t}{dt} = b(t, X_t) + \sigma(t, X_t)\cdot\text{"noise"}, \tag{2}$$

where $b$ and $\sigma$ are given functions. We represent the noise term by a stochastic process $W_t$. For the univariate model we then have

$$\frac{dX_t}{dt} = b(t, X_t) + \sigma(t, X_t)\cdot W_t. \tag{3}$$

The noise term $W_t$ has, at least approximately, these properties:

(i) $W_{t_1}$ and $W_{t_2}$ are independent for $t_1 \neq t_2$.

(ii) $\{W_t\}$ is stationary.

(iii) The expectation of $W_t$ is zero for all $t$, that is, $E[W_t] = 0$.

Assumptions (i), (ii) and (iii) on $W_t$ suggest that $W_t$ should be interpreted as the formal time derivative of a process $V_t$ that has stationary independent increments with mean 0. It turns out that the only such process with continuous paths is the Brownian motion $B_t$ (Knight, 1981).


2.1 Brownian Motion

Brownian motion is a fundamental example of a stochastic process. Loosely speaking, Brownian motion can be defined as a continuous-time random walk with the following properties:

(i) B₀ = 0.

(ii) $B_t$ is almost surely continuous, that is, the sample trajectories $t \mapsto B_t$ are continuous with probability 1.

(iii) For any finite sequence of times $t_0 < t_1 < \dots < t_n$, the increments $B_{t_1} - B_{t_0},\ B_{t_2} - B_{t_1},\ \dots,\ B_{t_n} - B_{t_{n-1}}$ are independent.

(iv) For any times $0 \le s \le t$, $B_t - B_s$ is normally distributed with mean zero and variance $t - s$.

In particular, this implies that $E[B_t - B_s] = 0$ and $\mathrm{var}[B_t - B_s] = t - s$ for $0 \le s \le t$.

For convenience we informally regard Brownian motion as a random walk over infinitesimal time intervals of length $\Delta t$, with increments $\Delta B_t$ over the time interval $[t, t + \Delta t]$ given by

$$\Delta B_t = \pm\sqrt{\Delta t} \quad \text{with equal probabilities } \left(\tfrac{1}{2}, \tfrac{1}{2}\right). \tag{4}$$

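The choice of the square root in (4) is not fortuitous. With increments $\pm(\Delta t)^{\alpha}$, the variance accumulated over $[0, t]$ by the $t/\Delta t$ independent steps is $(t/\Delta t)(\Delta t)^{2\alpha} = t\,(\Delta t)^{2\alpha - 1}$. Hence any power $\alpha \in (0, \tfrac{1}{2})$ would lead to explosion of the process as $\Delta t$ tends to zero, whereas a power $\alpha > \tfrac{1}{2}$ would lead to a vanishing process. Only $\alpha = \tfrac{1}{2}$ gives a finite, non-zero limit.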

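Note that we have

$$E[\Delta B_t] = \tfrac{1}{2}\sqrt{\Delta t} - \tfrac{1}{2}\sqrt{\Delta t} = 0$$

and

$$\mathrm{var}[\Delta B_t] = E[(\Delta B_t)^2] = \tfrac{1}{2}\Delta t + \tfrac{1}{2}\Delta t = \Delta t.$$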

According to this representation, the paths of Brownian motion are not differentiable, although they are continuous. Note that there is no point in computing the value of $B_t$, as it is a random variable for all $t > 0$; however, we can generate samples of $B_t$, which are distributed according to the centered Gaussian distribution with variance $t$ (Folland, 1999; Revuz and Yor, 1994). Figure 1 shows an example of one-dimensional Brownian motion paths generated from the same initial point. As seen from the figure, each trajectory follows its own path even though all of them are generated using the same parameter values.
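The sampling of Brownian paths described above can be sketched in a few lines. The following is a minimal illustration, assuming NumPy is available; the function name `brownian_paths`, the seed and the grid sizes are choices made here for illustration and are not taken from the thesis. It builds paths by accumulating independent $N(0, \Delta t)$ increments and checks properties (i) and (iv) empirically.

```python
import numpy as np

def brownian_paths(n_paths, n_steps, T, rng):
    """Sample Brownian paths on [0, T] at n_steps + 1 equally spaced times."""
    dt = T / n_steps
    # Independent increments B_{t+dt} - B_t ~ N(0, dt), property (iv).
    increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    paths = np.zeros((n_paths, n_steps + 1))   # B_0 = 0 for every path, property (i)
    paths[:, 1:] = np.cumsum(increments, axis=1)
    return paths

rng = np.random.default_rng(seed=0)            # seed chosen arbitrarily
paths = brownian_paths(n_paths=20000, n_steps=100, T=1.0, rng=rng)
print("sample mean of B_1:    ", paths[:, -1].mean())  # should be close to 0
print("sample variance of B_1:", paths[:, -1].var())   # should be close to T = 1
```

Plotting a few rows of `paths` against the time grid gives trajectories of the kind shown in Figure 1.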


Figure 1: Three trajectories of one-dimensional sample Brownian motion

2.1.1 Brownian Motion as a Limit of Random Walks

One of the many reasons that Brownian motion is important in probability theory is that it is, in a certain sense, a limit of rescaled simple random walks. Let $\xi_1, \xi_2, \dots$ be a sequence of independent, identically distributed random variables with mean 0 and variance 1. For each $n \ge 1$ define a continuous-time stochastic process $\{B_n(t)\}_{t \ge 0}$ by

$$B_n(t) = \frac{1}{\sqrt{n}} \sum_{1 \le j \le \lfloor nt \rfloor} \xi_j. \tag{5}$$

This is a random step function with jumps of size $\pm\frac{1}{\sqrt{n}}$ at times $\frac{k}{n}$, where $k \in \mathbb{Z}_+$. Since the random variables $\xi_j$ are independent, the increments of $B_n(t)$ are independent. Moreover, for large $n$ the distribution of $B_n(t+s) - B_n(s)$ is close to the normal distribution with mean 0 and variance $t$, by the central limit theorem.

Thus, it requires only a small leap of faith to believe that, as $n \to \infty$, the distribution of the random function $B_n(t)$ approaches, in a suitable sense, that of a standard Brownian motion.
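The rescaling in (5) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the steps $\xi_j$ to be $\pm 1$ with probability $\tfrac{1}{2}$ each (one admissible choice with mean 0 and variance 1), and the function name `scaled_random_walk` and the sample sizes are invented here. It draws many values of $B_n(t)$ and compares their sample variance with $t$.

```python
import numpy as np

def scaled_random_walk(n, t, n_samples, rng):
    """Draw n_samples values of B_n(t) = n**-0.5 * sum_{j <= floor(n t)} xi_j."""
    k = int(np.floor(n * t))
    # Assumed step distribution: xi_j = +1 or -1 with probability 1/2 each.
    xi = rng.choice([-1.0, 1.0], size=(n_samples, k))
    return xi.sum(axis=1) / np.sqrt(n)

rng = np.random.default_rng(seed=1)
samples = scaled_random_walk(n=400, t=0.5, n_samples=20000, rng=rng)
print("mean:", samples.mean())      # should be close to 0
print("variance:", samples.var())   # should be close to t = 0.5
```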

Why is this important? First, it explains, at least in part, why the Wiener process arises so commonly in nature. Many stochastic processes behave, at least for long stretches of time, like random walks with small but frequent jumps. The argument above suggests that such processes will look, at least approximately and on the appropriate time scale, like Brownian motion. Second, it suggests that many important statistics of the random walk will have limiting distributions, and that the limiting distributions will be the distributions of the corresponding statistics of Brownian motion. The simplest instance of this principle is the central limit theorem: the distribution of $B_n(1)$ is, for large $n$, close to that of $B(1)$ (the Gaussian distribution with mean 0 and variance 1). Other important instances do not follow so easily from the central limit theorem. For example, the distribution of

$$M_n(t) := \max_{0 \le s \le t} B_n(s) = \max_{0 \le k \le nt} \frac{1}{\sqrt{n}} \sum_{1 \le j \le k} \xi_j \tag{6}$$

converges, as $n \to \infty$, to that of

$$M(t) := \max_{0 \le s \le t} B(s). \tag{7}$$

The distribution of $M(t)$ can be calculated explicitly, along with the distributions of several related random variables connected with the Brownian path.

2.2 Itô integral

Now consider the integral

$$I[f](\omega) = \int_a^b f(t, \omega)\, dB_t(\omega), \tag{8}$$

where $B_t$ is Brownian motion. We define $\int f\, dB$ as the limit of $\int \varphi\, dB$ as $\varphi \to f$ (limit in $L^2(P)$). We now give the details of this construction. A function $\varphi \in V$ is called elementary if it has the form

$$\varphi(t, \omega) = \sum_j e_j(\omega)\, \chi_{[t_j, t_{j+1})}(t). \tag{9}$$

Note that since $\varphi \in V$, each function $e_j$ must be $\mathcal{F}_{t_j}$-measurable. For elementary functions $\varphi(t, \omega)$ we define the integral according to (8), which means

$$\int_S^T \varphi(t, \omega)\, dB_t(\omega) = \sum_{j \ge 0} e_j(\omega)\,[B_{t_{j+1}} - B_{t_j}](\omega). \tag{10}$$

From Equation (10) we can deduce the so-called Itô isometry: if $\varphi(t, \omega)$ is bounded and elementary, then

$$E\left[\left(\int_S^T \varphi(t, \omega)\, dB_t(\omega)\right)^2\right] = E\left[\int_S^T \varphi(t, \omega)^2\, dt\right]. \tag{11}$$
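The isometry (11) can be checked by Monte Carlo for a concrete elementary function. The sketch below is an illustration only: the choice $e_j = \sin(B_{t_j})$ (bounded and $\mathcal{F}_{t_j}$-measurable), the grid and the sample size are assumptions made here, not part of the thesis. Both sides of (11) are estimated from the same simulated paths and should agree up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(seed=2)
n_paths, n_steps, T = 10000, 100, 1.0
dt = T / n_steps

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B_left = np.cumsum(dB, axis=1) - dB          # B_{t_j} at the left end of each subinterval
e = np.sin(B_left)                           # assumed coefficients e_j = sin(B_{t_j})

ito_integral = np.sum(e * dB, axis=1)        # sum_j e_j (B_{t_{j+1}} - B_{t_j}), as in (10)
lhs = np.mean(ito_integral ** 2)             # estimates E[(int phi dB)^2]
rhs = np.mean(np.sum(e ** 2, axis=1) * dt)   # estimates E[int phi^2 dt]
print("left-hand side of (11): ", lhs)
print("right-hand side of (11):", rhs)
print("sample mean of the integral:", ito_integral.mean())  # should be near 0
```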

The idea now is to use the isometry (11) to extend the definition from elementary functions to functions in $V$. This is done in several steps:

1. Let $g \in V$ be bounded and $g(\cdot, \omega)$ continuous for each $\omega$. Then there exist elementary functions $\varphi_n \in V$ such that

$$E\left[\int_S^T (g - \varphi_n)^2\, dt\right] \to 0 \quad \text{as } n \to \infty.$$

2. Let $h \in V$ be bounded. Then there exist bounded functions $g_n \in V$ such that $g_n(\cdot, \omega)$ is continuous for all $\omega$ and $n$, and

$$E\left[\int_S^T (h - g_n)^2\, dt\right] \to 0.$$

3. Let $f \in V$. Then there exists a sequence $\{h_n\} \subset V$ such that $h_n$ is bounded for each $n$ and

$$E\left[\int_S^T (f - h_n)^2\, dt\right] \to 0 \quad \text{as } n \to \infty.$$

Then the Itô integral of $f$ (from $S$ to $T$) is defined by

$$\int_S^T f(t, \omega)\, dB_t(\omega) = \lim_{n \to \infty} \int_S^T \varphi_n(t, \omega)\, dB_t(\omega) \tag{12}$$

(limit in $L^2(P)$), where $\{\varphi_n\}$ is a sequence of elementary functions such that

$$E\left[\int_S^T \big(f(t, \omega) - \varphi_n(t, \omega)\big)^2\, dt\right] \to 0 \quad \text{as } n \to \infty. \tag{13}$$

Note that such a sequence $\{\varphi_n\}$ satisfying (13) exists by Steps 1-3 above. Itô integrals do not behave like Riemann integrals, though they share some properties.

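Some properties of the Itô integral are listed as follows. Let $f, g \in V(0, T)$ and let $0 \le S < U < T$. Then

1. $\int_S^T f\, dB_t = \int_S^U f\, dB_t + \int_U^T f\, dB_t$.

2. $\int_S^T (cf + g)\, dB_t = c \int_S^T f\, dB_t + \int_S^T g\, dB_t$ for any constant $c$.

3. $E\left[\int_S^T f\, dB_t\right] = 0$.

4. $\int_S^T f\, dB_t$ is $\mathcal{F}_T$-measurable.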

The above description applies to the one-dimensional case. It can be generalized to the multidimensional case (Kloeden and Platen, 1999). In that setting one can use, for instance, the well-known iterated Itô integrals, which satisfy

$$n! \int\!\cdots\!\int_{0 \le u_1 \le \cdots \le u_n \le t} dB_{u_1}\, dB_{u_2} \cdots dB_{u_n} = t^{n/2}\, h_n\!\left(\frac{B_t}{\sqrt{t}}\right), \quad n = 0, 1, 2, \dots \tag{14}$$

where $h_n$ is the Hermite polynomial of degree $n$, defined by $h_n(x) = (-1)^n e^{x^2/2} \frac{d^n}{dx^n}\left(e^{-x^2/2}\right)$.


2.3 Stratonovich integrals

Besides the Itô stochastic integral, the most common stochastic integral is the Stratonovich stochastic integral, defined as

$$\int_a^b f(t) \circ dB(t) = \lim_{m \to \infty} \sum_{i=0}^{m-1} \tfrac{1}{2}\left(f(t_i^{(m)}) + f(t_{i+1}^{(m)})\right)\left(B(t_{i+1}^{(m)}) - B(t_i^{(m)})\right). \tag{15}$$

The symbol "$\circ$" denotes that the integral is a Stratonovich integral. In the summation, $\tfrac{1}{2}\big(f(t_i^{(m)}) + f(t_{i+1}^{(m)})\big)$ and $B(t_{i+1}^{(m)}) - B(t_i^{(m)})$ are generally not independent. Consequently, Itô and Stratonovich integrals generally have different values. Consider, for example, the integrals

$$\text{Itô integral: } \int_0^t B(s)\, dB(s) \quad \text{and} \quad \text{Stratonovich integral: } \int_0^t B(s) \circ dB(s). \tag{16}$$

The value of the Itô integral is

$$\int_0^t B(s)\, dB(s) = \tfrac{1}{2}\left(B^2(t) - B^2(0)\right) - \tfrac{t}{2},$$

and that of the Stratonovich integral is

$$\int_0^t B(s) \circ dB(s) = \tfrac{1}{2}\left(B^2(t) - B^2(0)\right).$$

Hence, for this integrand, the Itô and Stratonovich integrals are not the same; their difference is

$$\int_0^t B(s)\, dB(s) - \int_0^t B(s) \circ dB(s) = -\tfrac{t}{2}.$$
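The gap of $-t/2$ between the two integrals can be seen directly from their discrete sums. The following sketch, with an arbitrarily chosen seed and partition size, evaluates the left-endpoint sum (the Itô convention) and the endpoint-average sum of (15) on one simulated Brownian path and compares each with its closed-form value.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
t, m = 1.0, 100000                               # horizon and number of subintervals
dB = rng.normal(0.0, np.sqrt(t / m), size=m)
B = np.concatenate(([0.0], np.cumsum(dB)))       # B(t_0), ..., B(t_m) with B(0) = 0

ito_sum = np.sum(B[:-1] * dB)                    # integrand at the left end point
strat_sum = np.sum(0.5 * (B[:-1] + B[1:]) * dB)  # average of both end points, as in (15)

print("Ito sum:         ", ito_sum, " closed form:", 0.5 * B[-1] ** 2 - 0.5 * t)
print("Stratonovich sum:", strat_sum, " closed form:", 0.5 * B[-1] ** 2)
print("difference:      ", ito_sum - strat_sum, " expected about", -0.5 * t)
```

The Stratonovich sum telescopes exactly to $\tfrac{1}{2}B^2(t)$, while the Itô sum differs from it by $\tfrac{1}{2}\sum_i (\Delta B_i)^2$, which concentrates around $t/2$ as the partition is refined.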

A Comparison of Itô and Stratonovich integrals

As we have argued, the mathematical interpretation of the white noise equation

$$\frac{dX_t}{dt} = b(t, X_t) + \sigma(t, X_t)\cdot W_t \tag{17}$$

is that $X_t$ is a solution of the integral equation

$$X_t = X_0 + \int_0^t b(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dB_s \tag{18}$$

for some suitable interpretation of the last integral. However, as indicated earlier, the Itô interpretation of an integral of the form

$$\int_0^t f(s, \omega)\, dB_s(\omega) \tag{19}$$

is just one of several reasonable choices. For example, the Stratonovich integral is another possibility, leading (in general) to a different result.

So the question remains: which interpretation of (19) makes (18) the "right" mathematical model for equation (17)? The Itô integral is the most widely used. However, this does not mean that Stratonovich integrals are not useful: there are a few situations in which Stratonovich integrals are preferable to Itô integrals.

The Stratonovich version of (18) is defined as

$$X_t = X_0 + \int_0^t b(s, X_s)\, ds + \int_0^t \sigma(s, X_s) \circ dB_s. \tag{20}$$

The solution of Equation (20) is also the solution of the following modified Itô equation:

$$X_t = X_0 + \int_0^t b(s, X_s)\, ds + \frac{1}{2}\int_0^t \sigma'(s, X_s)\,\sigma(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dB_s, \tag{21}$$

where $\sigma'$ denotes the derivative of $\sigma(t, x)$ with respect to $x$ (Stratonovich, 1966).

Equations (20) and (21) show how to switch from Stratonovich integrals to Itô integrals. Therefore, given either form, it is possible to convert to the other.
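As a small worked illustration of this conversion, take a linear drift $b(t, x) = \theta_1 x$ and a linear diffusion coefficient $\sigma(t, x) = \theta_2 x$; these coefficients and the constants $\theta_1$, $\theta_2$ are chosen here only as an example. Then $\sigma'(t, x)\,\sigma(t, x) = \theta_2^2 x$, so by (21) the Stratonovich equation on the left has the same solution as the Itô equation on the right:

```latex
\begin{aligned}
dX_t = \theta_1 X_t\,dt + \theta_2 X_t \circ dB_t
\quad\Longleftrightarrow\quad
dX_t = \left(\theta_1 + \tfrac{1}{2}\theta_2^2\right) X_t\,dt + \theta_2 X_t\,dB_t .
\end{aligned}
```

In this example the two interpretations differ only through the additional drift $\tfrac{1}{2}\theta_2^2 X_t$, which vanishes when the noise is additive ($\sigma$ independent of $x$).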

In general one can say that the Stratonovich integral has the advantage of leading to ordinary chain rule formulas under a transformation (change of variable), i.e. there are no second-order terms in the Stratonovich analogue of the Itô transformation formula. This property makes the Stratonovich integral natural to use, for example, in connection with stochastic differential equations on manifolds (Elworthy, 1998; Ikeda and Watanabe, 1989). However, Stratonovich integrals are not martingales, whereas Itô integrals are. This gives the Itô integral an important computational advantage. For our purposes the Itô integral is the most convenient, so we base our work on it from now on.

2.4 Itô Formula

The Itô formula plays an important role in evaluating Itô integrals. Equation (18) can be written in differential form as

$$dX_t = u(t, X_t)\, dt + v(t, X_t)\, dB_t. \tag{22}$$

If $g(t, x)$ is a twice continuously differentiable function and we define $Y_t = g(t, X_t)$, then the following general Itô formula shows that $Y_t$ is again an Itô process:

$$dY_k = \frac{\partial g_k}{\partial t}(t, X)\, dt + \sum_i \frac{\partial g_k}{\partial x_i}(t, X)\, dX_i + \frac{1}{2} \sum_{i,j} \frac{\partial^2 g_k}{\partial x_i \partial x_j}(t, X)\, dX_i\, dX_j. \tag{23}$$

Here $k$ indexes the components of $Y$. Note that $dB_i\, dB_j = dt$ if $i = j$ and 0 otherwise, and that $dB_i\, dt = dt\, dB_i = 0$.

Example 2.1 Suppose we want to compute the integral $\int_0^t B_s\, dB_s$. We define $Y_t = g(t, B_t) = B_t^2/2$, for which $\frac{\partial g}{\partial t} = 0$, $\frac{\partial g}{\partial B_t} = B_t$ and $\frac{\partial^2 g}{\partial B_t^2} = 1$.

Substituting into the Itô formula, we have

$$dY_t = d\left(\tfrac{1}{2} B_t^2\right) = B_t\, dB_t + \tfrac{1}{2}(dB_t)^2 = B_t\, dB_t + \tfrac{1}{2}\, dt. \tag{24}$$

Integrating both sides of (24), we have $\tfrac{1}{2} B_t^2 = \int_0^t B_s\, dB_s + \tfrac{1}{2} t$, which means

$$\int_0^t B_s\, dB_s = \tfrac{1}{2} B_t^2 - \tfrac{1}{2} t.$$

2.5 Stochastic Differential Equations

The differential equation (22) can be written in a general form that involves parameters $\theta$, continuous time $t$ and the variable $x(t)$, as follows:

$$dx(t) = f(x(t), t, \theta)\, dt + L(x(t), t, \theta)\, dB(t), \tag{25}$$

or in integral form as

$$x(t) = c + \int_{t_0}^t f(x(s), s, \theta)\, ds + \int_{t_0}^t L(x(s), s, \theta)\, dB(s). \tag{26}$$

The above equation is called an Itô stochastic differential equation, where $f(x(t), t, \theta)$ and $L(x(t), t, \theta)$ are the drift and diffusion coefficients, respectively. The random variable $c$ is called the initial value at the instant $t_0$. A solution $x(t)$ of equation (25) or (26) is a stochastic process.

The history of stochastic differential equations (SDEs) starts from the paper of Einstein (1905), who gave a mathematical connection between the microscopic random motion of Brownian particles and the macroscopic diffusion equation. Currently, SDEs are attracting a lot of attention because physical processes in real-life systems experience random forcing and stochastic inputs that cannot be captured by ordinary differential equations. SDEs are commonly used to model diverse phenomena in neural networks, ecosystem dynamics, population genetics, and macroeconomic and physical systems. Understanding SDE theory requires familiarity with advanced probability and stochastic processes (see Chung, 2001), which are not covered in this thesis.


3 Standard Numerical Methods

In this chapter we study numerical schemes for stochastic differential equations that are derived from stochastic Taylor expansions: the Euler-Maruyama scheme, the Milstein scheme and the Runge-Kutta scheme. For simplicity, we consider the univariate Itô stochastic differential equation (25), whose integral form, when the parameter dependence is dropped, is

$$X(t) = X(t_0) + \int_{t_0}^t f(s, X(s))\, ds + \int_{t_0}^t L(s, X(s))\, dB(s). \tag{27}$$

Here, the first integral is pathwise a deterministic Riemann integral and the second is an Itô stochastic integral. The latter looks as if it could be defined pathwise as a deterministic Riemann-Stieltjes integral, but this is not possible because the sample paths of Brownian motion, though continuous, are not differentiable or even of bounded variation on any finite subinterval. The idea here is to approximate the SDE numerically using the aforementioned schemes.

3.1 Euler-Maruyama Numerical Scheme

Numerical schemes can be constructed in several ways. The most common schemes used to approximate SDEs are based on the stochastic Taylor expansion. The concept is similar to that for deterministic differential equations: the more terms of the Taylor expansion are included, the higher the order of convergence and thus the more accurate the scheme.

Expansions can be derived in both the Stratonovich and the Itô sense, but let us consider only the expansion of the following Itô SDE:

$$dX(t) = f(X(t), t)\, dt + g(X(t), t)\, dB(t), \tag{28}$$

with solution

$$X(t) = X(t_0) + \int_{t_0}^t f(X(s), s)\, ds + \int_{t_0}^t g(X(s), s)\, dB(s), \tag{29}$$

where $X(t_0)$ is the initial value. Assume that $v$ is a sufficiently smooth function. Using the one-dimensional Itô SDE (28), the differential of $v(X(t), t)$ is given by the following Itô formula:

$$d[v(X(t), t)] = \frac{\partial v}{\partial t}\, dt + f(X(t), t) \frac{\partial v}{\partial x}\, dt + \frac{1}{2} g^2(X(t), t) \frac{\partial^2 v}{\partial x^2}\, dt + g(X(t), t) \frac{\partial v}{\partial x}\, dB(t) + o(dt), \tag{30}$$

with all partial derivatives evaluated at $(X(t), t)$. Consequently,

$$d[v(X(t), t)] = \left[\frac{\partial v}{\partial t} + f(X(t), t) \frac{\partial v}{\partial x} + \frac{1}{2} g^2(X(t), t) \frac{\partial^2 v}{\partial x^2}\right] dt + g(X(t), t) \frac{\partial v}{\partial x}\, dB(t), \tag{31}$$

that is, $dv = L^0 v\, dt + L^1 v\, dB(t)$, with the partial differential operators

$$L^0 = \frac{\partial}{\partial t} + f(X(t), t) \frac{\partial}{\partial x} + \frac{1}{2} g^2(X(t), t) \frac{\partial^2}{\partial x^2}, \tag{32}$$

$$L^1 = g(X(t), t) \frac{\partial}{\partial x}. \tag{33}$$

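Applying this differentiation rule to the function $f(X(s), s)$ in equation (29) yields

$$d[f(X(s), s)] = L^0 f\, ds + L^1 f\, dB(s), \tag{34}$$

whose solution is

$$f(X(s), s) = f(X(t_0), t_0) + \int_{t_0}^s L^0 f\, dz + \int_{t_0}^s L^1 f\, dB(z). \tag{35}$$

Similarly, for $g(X(s), s)$ we get

$$d[g(X(s), s)] = L^0 g\, ds + L^1 g\, dB(s), \tag{36}$$

whose solution is given by

$$g(X(s), s) = g(X(t_0), t_0) + \int_{t_0}^s L^0 g\, dz + \int_{t_0}^s L^1 g\, dB(z). \tag{37}$$

By substituting equations (35) and (37) into (29) we get

$$\begin{aligned}
X(t) = X(t_0) &+ \int_{t_0}^t \Big\{ f(X(t_0), t_0) + \int_{t_0}^s L^0 f(X(z), z)\, dz + \int_{t_0}^s L^1 f(X(z), z)\, dB(z) \Big\}\, ds \\
&+ \int_{t_0}^t \Big\{ g(X(t_0), t_0) + \int_{t_0}^s L^0 g(X(z), z)\, dz + \int_{t_0}^s L^1 g(X(z), z)\, dB(z) \Big\}\, dB(s),
\end{aligned} \tag{38}$$

that is,

$$\begin{aligned}
X(t) = X(t_0) &+ f(X(t_0), t_0) \int_{t_0}^t ds + g(X(t_0), t_0) \int_{t_0}^t dB(s) + \int_{t_0}^t\!\int_{t_0}^s L^0 f\, dz\, ds \\
&+ \int_{t_0}^t\!\int_{t_0}^s L^1 f\, dB(z)\, ds + \int_{t_0}^t\!\int_{t_0}^s L^0 g\, dz\, dB(s) + \int_{t_0}^t\!\int_{t_0}^s L^1 g\, dB(z)\, dB(s).
\end{aligned} \tag{39}$$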

This leads to a first approximation of the form

$$\begin{aligned}
X(t) = X(t_0) &+ f(X(t_0), t_0)[t - t_0] + g(X(t_0), t_0)[B(t) - B(t_0)] + \int_{t_0}^t\!\int_{t_0}^s L^0 f(X(z), z)\, dz\, ds \\
&+ \int_{t_0}^t\!\int_{t_0}^s L^1 f(X(z), z)\, dB(z)\, ds + \int_{t_0}^t\!\int_{t_0}^s L^0 g(X(z), z)\, dz\, dB(s) \\
&+ \int_{t_0}^t\!\int_{t_0}^s L^1 g(X(z), z)\, dB(z)\, dB(s),
\end{aligned} \tag{40}$$

or, collecting the four double integrals into a remainder term,

$$X(t) = X(t_0) + f(X(t_0), t_0)[t - t_0] + g(X(t_0), t_0)[B(t) - B(t_0)] + \mathrm{Err}_{t_1}. \tag{41}$$

Neglecting the remainder over a single step of length $\Delta t$ gives

$$X(t + \Delta t) = X(t) + f(X(t), t)\, \Delta t + g(X(t), t)\, [B(t + \Delta t) - B(t)],$$

or, with $t_n = n \Delta t$,

$$X_{n+1} = X_n + f(X_n, t_n)\, \Delta t_n + g(X_n, t_n)\, \Delta B_n. \tag{42}$$

This is the simplest non-trivial scheme obtained from the stochastic Taylor expansion and is called the Euler-Maruyama scheme. In this derivation we have assumed that the coefficient functions $f$ and $g$ are sufficiently smooth. The noise increments $\Delta B_n$ are Gaussian random variables with mean 0 and variance $\Delta t_n$. They can be generated from uniformly distributed random (or pseudo-random) numbers through the Box-Muller method, although more efficient methods are available for very long simulations. In practice one needs to simulate a large number of realizations, which can be done efficiently in parallel on a distributed network of computers or processors, with a master computer coordinating the calculations of the individual realizations on the individual processors (in particular, ensuring the independence of the random variables used) and collecting the results (P.E., 2002).
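The update (42) translates almost line by line into code. The following is a minimal sketch for a scalar SDE on a uniform grid, assuming NumPy; the function name `euler_maruyama`, its argument order, and the mean-reverting test equation are assumptions made for illustration. The Gaussian increments are drawn with NumPy's generator rather than an explicit Box-Muller transform.

```python
import numpy as np

def euler_maruyama(f, g, x0, t0, T, n_steps, rng):
    """Simulate one path of dX = f(X, t) dt + g(X, t) dB with scheme (42)."""
    dt = (T - t0) / n_steps
    t = t0 + dt * np.arange(n_steps + 1)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for n in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt))    # Delta B_n ~ N(0, dt)
        x[n + 1] = x[n] + f(x[n], t[n]) * dt + g(x[n], t[n]) * dB
    return t, x

# Hypothetical test equation: mean-reverting drift with constant (additive) noise.
rng = np.random.default_rng(seed=4)
t, x = euler_maruyama(f=lambda x, t: 1.0 - 2.0 * x,
                      g=lambda x, t: 0.3,
                      x0=0.0, t0=0.0, T=5.0, n_steps=500, rng=rng)
print("X at final time:", x[-1])
```

Setting the diffusion to zero reduces the loop to the deterministic explicit Euler method, which is a convenient sanity check.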

The stochastic Euler scheme was first investigated by Maruyama in the early 1950s and is often called the Euler-Maruyama scheme. It is consistent with the Itô stochastic calculus because the noise term approximates the Itô stochastic integral over a discretization subinterval $[t_n, t_{n+1}]$ by evaluating its integrand at the left-hand end point of this interval:

$$\int_{t_n}^{t_{n+1}} L(s, X(s))\, dB(s) \approx \int_{t_n}^{t_{n+1}} L(t_n, X(t_n))\, dB(s) = L(t_n, X(t_n)) \int_{t_n}^{t_{n+1}} dB(s).$$

Convergence for numerical schemes can be defined in a number of useful ways. It is usual to distinguish between strong and weak convergence, depending on whether the realizations or only their probability distributions are required to be close, respectively. Consider a fixed interval $[t_0, T]$ and let $\Delta$ be the maximum step size of any partition of $[t_0, T]$. Then a numerical scheme is said to converge with strong order $\gamma$ if, for sufficiently small $\Delta$,

$$E\big(|X(T) - X_{N_T}|\big) \le K_T\, \Delta^{\gamma},$$

and with weak order $\beta$ if

$$\big|E(g(X(T))) - E(g(X_{N_T}))\big| \le K_{g,T}\, \Delta^{\beta},$$

where $X_{N_T}$ denotes the numerical approximation at time $T$. These are global discretization errors, and the largest possible values of $\gamma$ and $\beta$ give the corresponding strong and weak orders, respectively, of the scheme.

The stochastic Euler scheme has strong order $\gamma = \tfrac{1}{2}$ and weak order $\beta = 1$. These orders of convergence hold for classes of SDEs, e.g., those with continuously differentiable coefficients whose derivatives are uniformly bounded. For restricted classes a higher order is sometimes possible; for example, SDEs with additive noise, i.e., those whose diffusion coefficient $L$ is independent of the state variable $x$, attain strong order $\gamma = 1$. The strong and weak orders of the stochastic Euler scheme are quite low, particularly given that a large number of realizations need to be generated for most practical applications. Thus there is a need for higher-order numerical schemes (P.E., 2002).
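The strong order can be estimated empirically when an exact solution is available. The sketch below is an illustration under stated assumptions: it uses the linear test equation $dX = \theta_1 X\,dt + \theta_2 X\,dB$, whose exact solution is $X(T) = X_0 \exp\{(\theta_1 - \tfrac{1}{2}\theta_2^2)T + \theta_2 B(T)\}$, and the function name `em_strong_errors`, the parameter values and the step counts are chosen here only for demonstration. The same Brownian increments drive the exact solution and every Euler-Maruyama approximation, and the slope of $\log E|X(T) - X_{N_T}|$ against $\log \Delta$ estimates $\gamma$.

```python
import numpy as np

def em_strong_errors(theta1, theta2, x0, T, step_counts, n_paths, rng):
    """Estimate E|X(T) - X_N| for Euler-Maruyama on dX = theta1 X dt + theta2 X dB."""
    n_fine = max(step_counts)                # every step count must divide n_fine
    dB_fine = rng.normal(0.0, np.sqrt(T / n_fine), size=(n_paths, n_fine))
    B_T = dB_fine.sum(axis=1)
    exact = x0 * np.exp((theta1 - 0.5 * theta2 ** 2) * T + theta2 * B_T)
    errors = []
    for n in step_counts:
        dt = T / n
        # Coarse increments: sum consecutive blocks of fine increments of the same path.
        dB = dB_fine.reshape(n_paths, n, n_fine // n).sum(axis=2)
        x = np.full(n_paths, float(x0))
        for k in range(n):
            x = x + theta1 * x * dt + theta2 * x * dB[:, k]
        errors.append(np.mean(np.abs(exact - x)))
    return np.array(errors)

rng = np.random.default_rng(seed=5)
step_counts = [16, 32, 64, 128, 256]
errors = em_strong_errors(theta1=2.0, theta2=1.0, x0=1.0, T=1.0,
                          step_counts=step_counts, n_paths=5000, rng=rng)
dts = 1.0 / np.array(step_counts)
gamma_hat = np.polyfit(np.log(dts), np.log(errors), 1)[0]   # slope of the log-log fit
print("estimated strong order:", gamma_hat)
```

With multiplicative noise as in this test equation, the fitted slope should lie in the vicinity of $\tfrac{1}{2}$; the exact value varies with the seed, the sample size and the range of step sizes.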

3.2 Milstein Numerical Scheme

The Milstein scheme is the simplest nontrivial numerical scheme for stochastic differential equations with strong order of convergence one (higher than that of the Euler-Maruyama scheme). The scheme has been extended to stochastic delay differential equations, but the analysis of the convergence is technically complicated due to anticipative integrals in the remainder terms.

It was first derived by Milstein, who used the Itô formula to expand an integrand involving the solution in one of the error terms of the Euler-Maruyama scheme. The iterative repetition of this idea underlies the systematic derivation of stochastic Taylor expansions and numerical schemes of arbitrarily high strong and weak orders, as expounded in Kloeden and Platen (1999) and Milstein (1995). Consider the homogeneous scalar stochastic differential equation

$$dY_t = a(Y_t)\, dt + b(Y_t)\, dB_t, \tag{43}$$

and let $t_i$, $t_{i+1}$ be two consecutive points in our time discretization. The Itô formula says that, for a given twice continuously differentiable function $f$ and $s \ge t_i$, we can write

$$f(Y_s) = f(Y_{t_i}) + \int_{t_i}^s \left(f'(Y_u)\, a(Y_u) + \tfrac{1}{2} f''(Y_u)\, b(Y_u)^2\right) du + \int_{t_i}^s f'(Y_u)\, b(Y_u)\, dB_u. \tag{44}$$

We can apply the Itô formula to the expressions $a(Y_s)$ and $b(Y_s)$, which are the coefficients in our SDE. We then obtain

$$\begin{aligned}
Y_{t_{i+1}} = Y_{t_i} &+ \int_{t_i}^{t_{i+1}} \Big( a(Y_{t_i}) + \int_{t_i}^s \big(a'(Y_u)\, a(Y_u) + \tfrac{1}{2} a''(Y_u)\, b^2(Y_u)\big)\, du + \int_{t_i}^s a'(Y_u)\, b(Y_u)\, dB_u \Big)\, ds \\
&+ \int_{t_i}^{t_{i+1}} \Big( b(Y_{t_i}) + \int_{t_i}^s \big(b'(Y_u)\, a(Y_u) + \tfrac{1}{2} b''(Y_u)\, b^2(Y_u)\big)\, du + \int_{t_i}^s b'(Y_u)\, b(Y_u)\, dB_u \Big)\, dB_s.
\end{aligned} \tag{45}$$

We want a method that converges strongly with order 1. In the time discretization, the differentials $dB$ and $dt$ are replaced by the corresponding discrete versions $\Delta B$ and $\delta t$. For a method of strong order 1, we can neglect the double integrals above that are of type $dB \cdot ds$ and $ds \cdot ds$. We then obtain

$$\begin{aligned}
Y_{t_{i+1}} &\approx Y_{t_i} + \int_{t_i}^{t_{i+1}} a(Y_{t_i})\, ds + \int_{t_i}^{t_{i+1}} \Big( b(Y_{t_i}) + \int_{t_i}^s b'(Y_u)\, b(Y_u)\, dB_u \Big)\, dB_s \\
&\approx Y_{t_i} + a(Y_{t_i})\, \delta t + b(Y_{t_i})\, \Delta B_i + \int_{t_i}^{t_{i+1}}\!\int_{t_i}^s b'(Y_u)\, b(Y_u)\, dB_u\, dB_s.
\end{aligned} \tag{46}$$

The first two increments in the equation above are well known from the Euler-Maruyama scheme. The third one is new. We approximate it by

$$\int_{t_i}^{t_{i+1}}\!\int_{t_i}^s b'(Y_u)\, b(Y_u)\, dB_u\, dB_s \approx b'(Y_{t_i})\, b(Y_{t_i}) \int_{t_i}^{t_{i+1}}\!\int_{t_i}^s dB_u\, dB_s. \tag{47}$$

The double integral on the right-hand side is well known from continuous-time finance. We obtain

$$b'(Y_{t_i})\, b(Y_{t_i}) \int_{t_i}^{t_{i+1}}\!\int_{t_i}^s dB_u\, dB_s = \tfrac{1}{2}\, b'(Y_{t_i})\, b(Y_{t_i}) \left((\Delta B_i)^2 - \delta t\right). \tag{48}$$

Substituting this into our previous approximation, we finally obtain the Milstein scheme

$$Y_{i+1} = Y_i + a(Y_i)\, \delta t + b(Y_i)\, \Delta B_i + \tfrac{1}{2}\, b'(Y_i)\, b(Y_i) \left((\Delta B_i)^2 - \delta t\right). \tag{49}$$

The Milstein scheme makes use of Itô's lemma to increase the accuracy of the approximation by adding the second-order term. In the notation of Section 2, with drift $b(t, x)$ and diffusion $\sigma(t, x)$ (so that $b$ now denotes the drift, not the diffusion coefficient of (43)), and denoting by $\sigma_x$ the partial derivative of $\sigma(t, x)$ with respect to $x$, the Milstein approximation reads

$$Y_{i+1} = Y_i + b(t_i, Y_i)(t_{i+1} - t_i) + \sigma(t_i, Y_i)(B_{i+1} - B_i) + \tfrac{1}{2}\, \sigma(t_i, Y_i)\, \sigma_x(t_i, Y_i) \left\{(B_{i+1} - B_i)^2 - (t_{i+1} - t_i)\right\}, \tag{50}$$

or, in symbolic form,

$$Y_{i+1} = Y_i + b\, \Delta t + \sigma\, \Delta B_t + \tfrac{1}{2}\, \sigma \sigma_x \left\{(\Delta B_t)^2 - \Delta t\right\}. \tag{51}$$
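A direct transcription of the Milstein update in the form (50) is sketched below, assuming NumPy. The names `milstein_step` and `milstein_path` and the coefficient functions used in the demonstration are hypothetical; the caller must supply the derivative $\sigma_x$ explicitly, which is exactly the requirement that distinguishes this scheme from Euler-Maruyama.

```python
import numpy as np

def milstein_step(y, t, dt, dB, b, sigma, sigma_x):
    """One step of (50): Euler-Maruyama terms plus 0.5*sigma*sigma_x*((dB)^2 - dt)."""
    s = sigma(t, y)
    return y + b(t, y) * dt + s * dB + 0.5 * s * sigma_x(t, y) * (dB ** 2 - dt)

def milstein_path(y0, T, n_steps, b, sigma, sigma_x, rng):
    """Simulate one Milstein path on a uniform grid over [0, T]."""
    dt = T / n_steps
    y = np.empty(n_steps + 1)
    y[0] = y0
    for i in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt))    # B_{i+1} - B_i ~ N(0, dt)
        y[i + 1] = milstein_step(y[i], i * dt, dt, dB, b, sigma, sigma_x)
    return y

# Hypothetical coefficients: linear drift and state-dependent diffusion.
rng = np.random.default_rng(seed=6)
path = milstein_path(y0=1.0, T=1.0, n_steps=200,
                     b=lambda t, x: -x,
                     sigma=lambda t, x: 0.5 * np.sqrt(1.0 + x ** 2),
                     sigma_x=lambda t, x: 0.5 * x / np.sqrt(1.0 + x ** 2),
                     rng=rng)
print("Y at final time:", path[-1])
```

When `sigma_x` returns zero the correction term disappears and each step reduces to the Euler-Maruyama update.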


This scheme has strong and weak orders of convergence equal to one. As a first example, consider the process with drift $b(t, x) = \theta_1 - \theta_2 x$ and diffusion $\sigma(t, x) = \theta_3$. Then $\sigma_x(t, x) = 0$, and the Euler and Milstein schemes coincide. This is one case in which the Euler scheme has strong order of convergence $\gamma = 1$.

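Example 3.1 (Geometric Brownian motion) A more interesting case is that of geometric Brownian motion, given by the stochastic differential equation

$$dX_t = \theta_1 X_t\, dt + \theta_2 X_t\, dB_t. \tag{52}$$

For this process, the drift is $b(t, x) = \theta_1 x$, the diffusion is $\sigma(t, x) = \theta_2 x$, and $\sigma_x(t, x) = \theta_2$. The Euler-Maruyama discretization for this process is

$$Y^E_{i+1} = Y^E_i (1 + \theta_1 \Delta t) + \theta_2 Y^E_i\, \Delta B_t, \tag{53}$$

and the Milstein scheme reads

$$\begin{aligned}
Y^M_{i+1} &= Y^M_i + \theta_1 Y^M_i\, \Delta t + \theta_2 Y^M_i\, \Delta B_t + \tfrac{1}{2} \theta_2^2\, Y^M_i \left\{(\Delta B_t)^2 - \Delta t\right\} \\
&= Y^M_i \left(1 + \left(\theta_1 - \tfrac{1}{2} \theta_2^2\right) \Delta t\right) + \theta_2 Y^M_i\, \Delta B_t + \tfrac{1}{2} \theta_2^2\, Y^M_i\, (\Delta B_t)^2.
\end{aligned}$$

Recall that $\Delta B_t \sim \sqrt{\Delta t}\, Z$ with $Z \sim N(0, 1)$. Thus

$$Y^E_{i+1} = Y^E_i \left(1 + \theta_1 \Delta t + \theta_2 \sqrt{\Delta t}\, Z\right)$$

and

$$\begin{aligned}
Y^M_{i+1} &= Y^M_i \left(1 + \left(\theta_1 - \tfrac{1}{2} \theta_2^2\right) \Delta t\right) + \theta_2 Y^M_i \sqrt{\Delta t}\, Z + \tfrac{1}{2} \theta_2^2\, Y^M_i\, \Delta t\, Z^2 \\
&= Y^M_i \left(1 + \left(\theta_1 + \tfrac{1}{2} \theta_2^2 (Z^2 - 1)\right) \Delta t + \theta_2 \sqrt{\Delta t}\, Z\right).
\end{aligned}$$

The Milstein scheme makes the expansion exact up to order $O(\Delta t)$. Indeed, a formal Taylor expansion of the exact solution leads to

$$\begin{aligned}
X_{t + \Delta t} &= X_t \exp\left\{\left(\theta_1 - \tfrac{\theta_2^2}{2}\right) \Delta t + \theta_2 \sqrt{\Delta t}\, Z\right\} \\
&= X_t \left\{1 + \left(\theta_1 - \tfrac{\theta_2^2}{2}\right) \Delta t + \theta_2 \sqrt{\Delta t}\, Z + \tfrac{1}{2} \theta_2^2\, \Delta t\, Z^2 + O(\Delta t^{3/2})\right\},
\end{aligned}$$

whose terms up to order $\Delta t$ coincide with the Milstein update $Y^M_{i+1}$ started from $Y^M_i = X_t$.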
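The two discretizations of Example 3.1 can be compared against the exact solution of (52) using the RMSE criterion adopted in this thesis. The sketch below is illustrative only: the function name `gbm_rmse`, the parameter values $\theta_1 = 1$, $\theta_2 = 0.5$, and the grid and sample sizes are assumptions made here. Both schemes and the exact solution are driven by the same Gaussian increments.

```python
import numpy as np

def gbm_rmse(theta1, theta2, x0, T, n_steps, n_paths, rng):
    """RMSE at time T of the Euler-Maruyama and Milstein updates against the exact GBM solution."""
    dt = T / n_steps
    Z = rng.normal(size=(n_paths, n_steps))
    y_e = np.full(n_paths, float(x0))
    y_m = np.full(n_paths, float(x0))
    for i in range(n_steps):
        dB = np.sqrt(dt) * Z[:, i]
        y_e = y_e * (1.0 + theta1 * dt + theta2 * dB)                    # update (53)
        y_m = y_m * (1.0 + theta1 * dt + theta2 * dB
                     + 0.5 * theta2 ** 2 * (dB ** 2 - dt))               # Milstein update
    B_T = np.sqrt(dt) * Z.sum(axis=1)                                    # B(T) on the same paths
    exact = x0 * np.exp((theta1 - 0.5 * theta2 ** 2) * T + theta2 * B_T)

    def rmse(y):
        return np.sqrt(np.mean((y - exact) ** 2))

    return rmse(y_e), rmse(y_m)

rng = np.random.default_rng(seed=7)
rmse_e, rmse_m = gbm_rmse(theta1=1.0, theta2=0.5, x0=1.0, T=1.0,
                          n_steps=100, n_paths=5000, rng=rng)
print("Euler-Maruyama RMSE:", rmse_e)
print("Milstein RMSE:      ", rmse_m)
```

For these settings the Milstein RMSE is expected to be noticeably smaller than the Euler-Maruyama RMSE, in line with the strong orders $1$ and $\tfrac{1}{2}$ quoted above.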


3.3 Runge-Kutta Numerical Scheme

The Milstein scheme involves derivatives of the coefficients. If these derivatives do not exist, or are difficult to evaluate, the scheme becomes difficult to implement, and a derivative-free method is needed. In this section we consider implicit schemes that avoid the use of derivatives. They are obtained from the corresponding implicit strong Taylor schemes by replacing the derivatives with finite differences expressed in terms of appropriate supporting values. For this reason we call them implicit strong Runge-Kutta schemes, but it must be emphasized that they are not simply heuristic stochastic adaptations of the deterministic Runge-Kutta schemes (Kloeden and Platen, 1992).

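In the 1-dimensional case an implicit order 1.0 strong Runge-Kutta scheme is

$$Y_{n+1} = Y_n + f(Y_{n+1})\Delta + L\,\Delta B_n + \frac{1}{2\sqrt{\Delta}}\left[L(\bar{Y}_n) - L\right]\left[(\Delta B_n)^2 - \Delta\right] \tag{54}$$

with supporting value $\bar{Y}_n = Y_n + f\Delta + L\sqrt{\Delta}$. Here and below, $f$ and $L$ written without an argument are evaluated at $Y_n$.

By interpolating between this scheme and the corresponding explicit scheme we can form a family of implicit order 1.0 strong Runge-Kutta schemes. In the $d$-dimensional case these have the form

$$Y_{n+1} = Y_n + \left[\alpha f(Y_{n+1}) + (1 - \alpha)f\right]\Delta + L\,\Delta B_n + \frac{1}{2\sqrt{\Delta}}\left[L(\bar{Y}_n) - L\right]\left[(\Delta B_n)^2 - \Delta\right] \tag{55}$$

with the vector supporting value $\bar{Y}_n = Y_n + f\Delta + L\sqrt{\Delta}$ and the parameter $\alpha \in [0,1]$. There is also a Stratonovich version

$$Y_{n+1} = Y_n + \left[\alpha f(Y_{n+1}) + (1 - \alpha)f\right]\Delta + \frac{1}{2}\left[L(\bar{\Psi}_n) + L\right]\Delta B_n \tag{56}$$

with vector supporting value $\bar{\Psi}_n = Y_n + f\Delta + L\,\Delta B_n$ and the parameter $\alpha \in [0,1]$. We remark that we still obtain convergence with the strong order $\gamma = 1$ if we omit the $f\Delta$ term in the supporting value $\bar{\Psi}_n$.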
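Since $Y_{n+1}$ appears on both sides of scheme (55), each step requires solving an equation. The following minimal sketch handles the scalar case by fixed-point iteration, which is an assumption of this example (it converges when $\alpha\Delta\,|f'|$ is small) and not a method prescribed by the thesis. The test problem (geometric Brownian motion) and all names are illustrative.

```python
import numpy as np

def implicit_rk1_step(y, dt, dB, f, L, alpha=0.5, n_iter=20):
    """One step of scheme (55), scalar case, solved by fixed-point iteration."""
    fy, Ly = f(y), L(y)
    y_bar = y + fy * dt + Ly * np.sqrt(dt)            # supporting value
    # Everything that does not depend on the unknown Y_{n+1}
    explicit = (y + (1.0 - alpha) * fy * dt + Ly * dB
                + (L(y_bar) - Ly) * (dB**2 - dt) / (2.0 * np.sqrt(dt)))
    y_new = y + fy * dt + Ly * dB                     # Euler predictor as initial guess
    for _ in range(n_iter):
        y_new = explicit + alpha * f(y_new) * dt      # fixed-point update
    return y_new

# Assumed test problem: geometric Brownian motion, whose exact solution is known
theta1, theta2 = 1.0, 0.5
f = lambda x: theta1 * x
L = lambda x: theta2 * x

rng = np.random.default_rng(1)
T, n = 1.0, 1000
dt = T / n
y, B = 1.0, 0.0
for _ in range(n):
    dB = rng.normal(0.0, np.sqrt(dt))
    y = implicit_rk1_step(y, dt, dB, f, L, alpha=0.5)
    B += dB                                           # accumulate the Brownian path
exact = np.exp((theta1 - 0.5 * theta2**2) * T + theta2 * B)
print(y, exact)
```

Note that the derivative of $L$ is never evaluated: the finite difference $[L(\bar{Y}_n) - L]/\sqrt{\Delta}$ takes its place.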

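In the $d$-dimensional case an implicit order 1.5 strong Runge-Kutta scheme is given by

$$\begin{aligned}
Y_{n+1} ={}& Y_n + \frac{1}{2}\left[f(Y_{n+1}) + f\right]\Delta + L\,\Delta B_n + \frac{1}{2\sqrt{\Delta}}\left[f(\bar{Y}_+) - f(\bar{Y}_-)\right]\left(\Delta Z_n - \frac{1}{2}\Delta B_n\Delta\right) \\
&+ \frac{1}{4\sqrt{\Delta}}\left[L(\bar{Y}_+) - L(\bar{Y}_-)\right]\left[(\Delta B_n)^2 - \Delta\right] + \frac{1}{2\Delta}\left[L(\bar{Y}_+) - 2L + L(\bar{Y}_-)\right]\left(\Delta B_n\Delta - \Delta Z_n\right) \\
&+ \frac{1}{4\Delta}\left\{L(\bar{\Phi}_+) - L(\bar{\Phi}_-) - \left[L(\bar{Y}_+) - L(\bar{Y}_-)\right]\right\}\left[\frac{1}{3}(\Delta B_n)^2 - \Delta\right]\Delta B_n
\end{aligned} \tag{57}$$

with supporting values

$$\bar{Y}_\pm = Y_n + f\Delta \pm L\sqrt{\Delta} \quad\text{and}\quad \bar{\Phi}_\pm = \bar{Y}_+ \pm L(\bar{Y}_+)\sqrt{\Delta}.$$

Here we have chosen the degree of implicitness $\alpha = \frac{1}{2}$, which simplifies the scheme.

We note that for additive noise only the first four terms remain in this scheme. We shall only consider the additive noise case here for the strong order $\gamma = 2.0$ scheme.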

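Then in the $d$-dimensional case we have the implicit order 2.0 strong Runge-Kutta scheme for additive noise

$$Y_{n+1} = Y_n + L\,\Delta B_n + \left\{f(\bar{Y}_+) + f(\bar{Y}_-) - \frac{1}{2}\left[f(Y_{n+1}) + f\right]\right\}\Delta \tag{58}$$

with

$$\bar{Y}_\pm = Y_n + \frac{1}{2}f\Delta + \frac{1}{\Delta}L\,(\bar{\eta} \pm \xi),$$

where

$$\bar{\eta} = \frac{1}{2}\Delta Z_n + \frac{1}{4}\Delta B_n\Delta \quad\text{and}\quad \xi = \left\{J_{(1,1,0),n}\Delta - \frac{1}{2}(\Delta Z_n)^2 + \frac{1}{8}\left[(\Delta B_n)^2 + \frac{1}{2}\left(\frac{2\Delta Z_n}{\Delta} - \Delta B_n\right)^2\right]\Delta^2\right\}^{1/2}.$$

This scheme has a surprisingly simple structure in spite of its strong order $\gamma = 2.0$.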
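Schemes (57) and (58) use the random variable $\Delta Z_n$ alongside $\Delta B_n$. This section does not restate its definition, so the sketch below assumes the convention of Kloeden and Platen (1992), where $\Delta Z_n = \int_{\tau_n}^{\tau_{n+1}}\int_{\tau_n}^{s} dB_r\,ds$ is Gaussian with variance $\Delta^3/3$ and covariance $\Delta^2/2$ with $\Delta B_n$. Under that assumption the pair can be generated from two independent standard normal variables; the function name is hypothetical.

```python
import numpy as np

def sample_dB_dZ(dt, size, rng):
    """Jointly sample (dB, dZ) from two independent N(0,1) variables.

    Assumes dZ is the double integral of dB over one step, so that
    Var(dB) = dt, Var(dZ) = dt^3 / 3 and Cov(dB, dZ) = dt^2 / 2.
    """
    u1 = rng.standard_normal(size)
    u2 = rng.standard_normal(size)
    dB = np.sqrt(dt) * u1
    dZ = 0.5 * dt**1.5 * (u1 + u2 / np.sqrt(3.0))
    return dB, dZ

rng = np.random.default_rng(0)
dt = 0.1
dB, dZ = sample_dB_dZ(dt, 200_000, rng)
var_dZ = dZ.var()          # sample estimate of dt**3 / 3
cov = np.mean(dB * dZ)     # sample estimate of dt**2 / 2
print(var_dZ, cov)
```

The sample variance and covariance can be compared with the theoretical values $\Delta^3/3$ and $\Delta^2/2$ as a sanity check before the pair is used in an order 1.5 scheme.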


4 Other Numerical Methods

We have seen that the Euler-based scheme approximates the SDE by holding its coefficients constant over each time step. In this section, we discuss other numerical methods that can be used to simulate the SDE. These are the local linearization methods, which include the Ozaki and Shoji-Ozaki schemes, and the filtering-based methods. The local linearization methods approximate the drift of the stochastic differential equation locally by a linear function. The linearization of the drift can be done either deterministically, which leads to the Ozaki scheme (Ozaki, 1993), or stochastically, which leads to the Shoji-Ozaki scheme (Shoji and Ozaki, 1997, 1998).

The filtering methods involve the recursive estimation of the SDE realizations by assuming Gaussian approximations of the distributions of the realizations.

4.1 Ozaki Scheme

The Ozaki scheme has the form of a linear multivariate autoregressive time series with state-dependent coefficients, and it does not involve a stochastic Taylor expansion of the solution. The discretized process follows the Gaussian distribution with mean $E(t_k)$ and covariance $V(t_k)$ given as follows:

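$$E(t_k) = x(t_k) + \frac{f(x(t_k),\theta)}{\frac{\partial}{\partial x}f(x(t_k),\theta)}\left(\exp\left(\frac{\partial}{\partial x}f(x(t_k),\theta)\,\Delta t\right) - 1\right)$$

$$V(t_k) = L(\theta)L^T(\theta)\,\frac{\exp(2K(t_k)\Delta t) - 1}{2K(t_k)},$$

where

$$K(t_k) = \frac{1}{\Delta t}\log\left(1 + \frac{f(x(t_k),\theta)}{x(t_k)\,\frac{\partial}{\partial x}f(x(t_k),\theta)}\left(\exp\left(\frac{\partial}{\partial x}f(x(t_k),\theta)\,\Delta t\right) - 1\right)\right)$$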





One disadvantage of the Ozaki scheme is that it assumes that the diffusion coefficient is constant and that the drift function depends only on the state variable.
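The Ozaki moments can be computed directly from the formulas above. The following is a minimal scalar sketch; the function name `ozaki_moments` and the test drift are assumptions of this example, and the formulas require $x(t_k) \neq 0$, a nonzero drift derivative and a positive argument of the logarithm.

```python
import numpy as np

def ozaki_moments(x, dt, f, f_x, sigma):
    """Mean E and variance V of the Ozaki one-step Gaussian transition (scalar case)."""
    fx = f_x(x)
    growth = np.expm1(fx * dt)                     # exp(f_x * dt) - 1
    mean = x + f(x) / fx * growth
    K = np.log1p(f(x) / (x * fx) * growth) / dt    # K(t_k)
    var = sigma**2 * np.expm1(2.0 * K * dt) / (2.0 * K)
    return mean, var

# Assumed test case: linear drift f(x) = -theta * x with constant diffusion
theta, sigma, dt, x0 = 2.0, 0.4, 0.1, 1.5
mean, var = ozaki_moments(x0, dt, lambda x: -theta * x, lambda x: -theta, sigma)

# One simulated step draws the next state from N(mean, var)
rng = np.random.default_rng(0)
x1 = rng.normal(mean, np.sqrt(var))
```

For this linear drift the scheme gives $K(t_k) = -\theta$, so the mean $x\,e^{-\theta\Delta t}$ and variance $\sigma^2(1 - e^{-2\theta\Delta t})/(2\theta)$ agree with the exact transition density of the linear SDE.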

4.2 Shoji-Ozaki Scheme

The Shoji-Ozaki scheme is an extension of the Ozaki scheme in which the drift can depend on the time variable and the diffusion coefficient can vary. The discretized process follows the Gaussian distribution with mean $E(t_k)$ and covariance $V(t_k)$ given as follows:

$$E(t_k) = x(t_k) + \frac{f(x(t_k), t_k,\theta)}{L(t_k)}\left(\exp(L(t_k)\Delta t) - 1\right) + \frac{M(t_k)}{L^2(t_k)}\left(\exp(L(t_k)\Delta t) - 1 - L(t_k)\Delta t\right)$$

$$V(t_k) = L(x(t_k), t_k,\theta)L^T(x(t_k), t_k,\theta)\,\frac{\exp(2L(t_k)\Delta t) - 1}{2L(t_k)},$$

where

$$L(t_k) = \frac{\partial}{\partial x}f(x(t_k), t_k,\theta)$$

$$M(t_k) = \frac{1}{2}L(x(t_k), t_k,\theta)L^T(x(t_k), t_k,\theta)\,\frac{\partial^2}{\partial x^2}f(x(t_k), t_k,\theta) + \frac{\partial}{\partial t}f(x(t_k), t_k,\theta)$$

Note that $L(t_k)$ with a single time argument denotes the derivative of the drift, whereas $L(x(t_k), t_k,\theta)$ denotes the diffusion coefficient.

The Shoji-Ozaki scheme is stable even if the time step $\Delta t$ is large. Note that the Ozaki, Shoji-Ozaki and Euler-Maruyama schemes all draw the increments from a Gaussian distribution. However, the variance in the Euler-Maruyama scheme is independent of the previous state of the process, while in the Ozaki and Shoji-Ozaki schemes it depends on it. In most cases, the Shoji-Ozaki scheme performs differently from the Euler and Ozaki methods because it takes into account the stochastic behavior of the discretization. For linear homogeneous SDEs the Euler, Shoji-Ozaki, and Ozaki methods coincide.
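The Shoji-Ozaki moments translate into code in the same way as the formulas read. This is a minimal scalar sketch; the function name, the argument layout (drift, its partial derivatives and the diffusion passed as callables of `(x, t)`) and the test problem are assumptions of this example, and the drift derivative must be nonzero.

```python
import numpy as np

def shoji_ozaki_moments(x, t, dt, f, f_x, f_xx, f_t, sigma):
    """Mean E and variance V of the Shoji-Ozaki one-step transition (scalar case)."""
    Lk = f_x(x, t)                                 # L(t_k): derivative of the drift
    s2 = sigma(x, t) ** 2                          # squared diffusion coefficient
    Mk = 0.5 * s2 * f_xx(x, t) + f_t(x, t)         # M(t_k)
    e = np.expm1(Lk * dt)                          # exp(L dt) - 1
    mean = x + f(x, t) / Lk * e + Mk / Lk**2 * (e - Lk * dt)
    var = s2 * np.expm1(2.0 * Lk * dt) / (2.0 * Lk)
    return mean, var

# Assumed test case: linear, time-homogeneous drift, for which M(t_k) = 0
theta, sig, dt, x0 = 2.0, 0.4, 0.1, 1.5
mean, var = shoji_ozaki_moments(
    x0, 0.0, dt,
    f=lambda x, t: -theta * x,
    f_x=lambda x, t: -theta,
    f_xx=lambda x, t: 0.0,
    f_t=lambda x, t: 0.0,
    sigma=lambda x, t: sig,
)
```

In this linear homogeneous case the second term of the mean vanishes and the result reduces to the Ozaki mean $x\,e^{-\theta\Delta t}$; a time-dependent or nonlinear drift activates the $M(t_k)$ correction.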

4.3 Kalman filter

For a linear SDE, one can use the Kalman filter (Kalman, 1960) to compute the posterior distribution of the states, where the predictive differential equations are solved either analytically or by a numerical scheme for ordinary differential equations, such as the Euler scheme, or by matrix fraction decomposition. The idea is that, given an SDE (called the dynamic process), we define another process called the measurement process. Together they form a state space model defined as follows:

$$\begin{aligned}
dx(t) &= F(t,\theta)\,x(t)\,dt + L(t,\theta)\,dB(t) \\
y_k &= H_k\,x(t_k) + r_k,
\end{aligned} \tag{59}$$

where $x(t_k)$ is the state at time $t_k$, $\theta \in \Phi \subseteq \mathbb{R}^d$ is the vector of parameters to be estimated, $F : [0,\infty)\times\Phi \to \mathbb{R}^{n\times n}$ is the matrix-valued function of the linear dynamic model, $L : [0,\infty)\times\Phi \to \mathbb{R}^{n\times s}$ is a matrix-valued function, $t \mapsto B(t)$ is an $s$-dimensional Brownian motion with diffusion matrix $Q_c \in \mathbb{R}^{s\times s}$, $y_k \in \mathbb{R}^m$ is the measurement at time $t_k$, $H_k \in \mathbb{R}^{m\times n}$ is the measurement model matrix, and $r_k \sim N(0, R_k)$ is the Gaussian measurement noise with $R_k \in \mathbb{R}^{m\times m}$ being the covariance matrix of the measurement error at $t_k$. At time $t_0$ the state is assumed to have the prior distribution $p(x(t_0)) = N(x(t_0) \mid m_0, P_0)$, where $m_0$ is the prior mean and $P_0$ is the prior covariance.

The algorithm below is the Kalman filter (KF) algorithm, which provides an efficient recursive computation of the estimates of the dynamic states $x(t_1), x(t_2), \ldots, x(t_M)$ that minimize the mean squared error. For the derivation of the filtering steps of the KF algorithm see, for instance, Särkkä (2013).

1. Initialize the mean $m_0$ and covariance $P_0$.

2. For $k = 1, 2, \ldots$ perform the following:

• Prediction step:

$$\frac{dm_k^-(t)}{dt} = F(t,\theta)\,m_k^-(t) \tag{60}$$

$$\frac{dP_k^-(t)}{dt} = F(t,\theta)P_k^-(t) + P_k^-(t)F^T(t,\theta) + \Sigma(t,\theta), \tag{61}$$

where $\Sigma(t,\theta) = L(t,\theta)Q_cL^T(t,\theta)$ and the initial conditions are $m_k^-(t_{k-1}) = m_{k-1}$, $P_k^-(t_{k-1}) = P_{k-1}$. The prediction result is given as $m_k^- = m_k^-(t_k)$, $P_k^- = P_k^-(t_k)$.

• Update step:

$$S_k = H_kP_k^-H_k^T + R_k \tag{62}$$

$$K_k = P_k^-H_k^TS_k^{-1} \tag{63}$$

$$m_k = m_k^- + K_k\left[y_k - H_km_k^-\right] \tag{64}$$

$$P_k = P_k^- - K_kS_kK_k^T, \tag{65}$$

where $m_k^-$ is the a priori state estimate, $m_k$ is the a posteriori state estimate, $P_k^-$ is the a priori estimate error covariance and $P_k$ is the a posteriori estimate error covariance. Note that $m_k^-$ and $P_k^-$ are sometimes written as $m^-(t_k)$ and $P^-(t_k)$ respectively.

For a continuous dynamic model, the predictive differential equations (60) and (61) can be solved by any numerical scheme such as a Runge-Kutta scheme (Butcher, 2003) or by matrix fraction decomposition (Grewal and Andrews, 2001; Mbalawata et al., 2013). The integrations start from the initial values $m(t_{k-1})$ and $P(t_{k-1})$.
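The prediction and update steps can be sketched as follows. This is a minimal illustration, assuming a time-invariant $F$ and $\Sigma$ and integrating (60) and (61) with explicit Euler substeps rather than the Runge-Kutta or matrix fraction methods cited above; the function names and the scalar test model are hypothetical.

```python
import numpy as np

def kf_predict(m, P, F, Sigma, dt, n_sub=100):
    """Integrate the mean and covariance ODEs (60)-(61) over one sampling interval."""
    step = dt / n_sub
    for _ in range(n_sub):
        dm = F @ m                                  # right-hand side of (60)
        dP = F @ P + P @ F.T + Sigma                # right-hand side of (61)
        m = m + step * dm
        P = P + step * dP
    return m, P

def kf_update(m_pred, P_pred, y, H, R):
    """Measurement update (62)-(65)."""
    S = H @ P_pred @ H.T + R                        # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)             # Kalman gain
    m = m_pred + K @ (y - H @ m_pred)
    P = P_pred - K @ S @ K.T
    return m, P

# Assumed scalar model: dx = -a x dt + dB with Sigma = q, observed directly
a, q, dt = 1.0, 0.5, 0.5
F, Sigma = np.array([[-a]]), np.array([[q]])
H, R = np.array([[1.0]]), np.array([[0.1]])
m0, P0 = np.array([1.0]), np.array([[1.0]])

m_pred, P_pred = kf_predict(m0, P0, F, Sigma, dt)
m_post, P_post = kf_update(m_pred, P_pred, np.array([0.8]), H, R)
```

For this scalar model the predicted moments can be compared with the closed-form solutions $m_0e^{-a\Delta t}$ and $P_0e^{-2a\Delta t} + \frac{q}{2a}(1 - e^{-2a\Delta t})$; the Euler substeps reproduce them up to a small discretization error.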


4.4 Extended Kalman Filter

This idea of computing sufficient statistics using the Kalman filter cannot be used for simulating a nonlinear SDE, because the Kalman filter applies only to linear dynamic models.

There are various filters that deal with nonlinear SDEs. Examples of such filters are the extended Kalman filter, the unscented Kalman filter, Gaussian filters and particle filters (Jazwinski, 1970; Särkkä, 2013). In this section, only the extended Kalman filter is discussed.

For non-linear SDEs we consider the following state space model.

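$$dx(t) = f(x(t), t,\theta)\,dt + L(x(t), t,\theta)\,dB(t) \tag{66}$$

$$y_k = h(x(t_k)) + r_k, \tag{67}$$

where $f : \mathbb{R}^n\times[0,\infty)\times\Phi \to \mathbb{R}^n$ is the dynamic model function, $L : \mathbb{R}^n\times[0,\infty)\times\Phi \to \mathbb{R}^{n\times s}$ is a matrix-valued function, and $h : \mathbb{R}^n \to \mathbb{R}^m$ is the measurement model function.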

The extended Kalman filter forms a Gaussian approximation to the distribution of states and measurements using a Taylor series expansion (Jazwinski, 1970; Grewal and Andrews, 2001). The idea is that the nonlinear SDE is linearized with the Taylor series expansion and then the Kalman filter is applied. The estimation of states is thus done recursively as in the following algorithm.

1. Initialize the mean $m_0$ and covariance $P_0$.

2. For $k = 1, 2, \ldots$ perform the following:

• Prediction step:

$$\frac{dm_k^-(t)}{dt} = f(m_k^-(t), t,\theta) \tag{68}$$

$$\frac{dP_k^-(t)}{dt} = F_x(m_k^-(t), t,\theta)P_k^-(t) + P_k^-(t)F_x^T(m_k^-(t), t,\theta) + \Sigma(m_k^-(t), t,\theta), \tag{69}$$

• Update step:

$$\mu_k = h(m_k^-, t_k) \tag{70}$$

$$S_k = H_x(m_k^-, t_k)P_k^-H_x^T(m_k^-, t_k) + R_k \tag{71}$$

$$K_k = P_k^-H_x^T(m_k^-, t_k)S_k^{-1} \tag{72}$$

$$m_k = m_k^- + K_k(y_k - \mu_k) \tag{73}$$

$$P_k = P_k^- - K_kS_kK_k^T, \tag{74}$$

where $\Sigma(m, t,\theta) = L(m, t,\theta)Q_cL^T(m, t,\theta)$, $F_x(x, t,\theta)$ is the Jacobian matrix of $f(x, t,\theta)$ with respect to $x$, and $H_x(x, t)$ is the Jacobian matrix of $h(x, t)$ with respect to $x$. The initial conditions are $m_k^-(t_{k-1}) = m_{k-1}$, $P_k^-(t_{k-1}) = P_{k-1}$, and the prediction result is given as $m_k^- = m_k^-(t_k)$, $P_k^- = P_k^-(t_k)$.
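The extended Kalman filter recursion can be sketched in the same style as the linear filter. The example below is an illustration under stated assumptions: the parameter vector $\theta$ is absorbed into the callables, the prediction ODEs (68) and (69) are integrated with explicit Euler substeps, and the double-well drift $f(x) = x - x^3$ with a direct measurement $h(x) = x$ is a hypothetical test model not taken from the thesis.

```python
import numpy as np

def ekf_predict(m, P, t, dt, f, F_x, Sigma, n_sub=100):
    """Integrate the linearized moment ODEs (68)-(69) over one sampling interval."""
    step = dt / n_sub
    for _ in range(n_sub):
        A = F_x(m, t)                               # Jacobian of the drift at the mean
        dm = f(m, t)
        dP = A @ P + P @ A.T + Sigma(m, t)
        m, P, t = m + step * dm, P + step * dP, t + step
    return m, P

def ekf_update(m_pred, P_pred, y, h_fun, H_x, R):
    """Measurement update (70)-(74) with the measurement model linearized at m_pred."""
    mu = h_fun(m_pred)
    H = H_x(m_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    return m_pred + K @ (y - mu), P_pred - K @ S @ K.T

# Assumed scalar test model (illustrative only)
f = lambda m, t: m - m**3
F_x = lambda m, t: np.array([[1.0 - 3.0 * m[0] ** 2]])
Sigma = lambda m, t: np.array([[0.04]])
h_fun = lambda m: m
H_x = lambda m: np.array([[1.0]])
R = np.array([[0.1]])

m_pred, P_pred = ekf_predict(np.array([0.5]), np.array([[0.1]]), 0.0, 0.1, f, F_x, Sigma)
m_post, P_post = ekf_update(m_pred, P_pred, np.array([0.7]), h_fun, H_x, R)
```

The only structural change from the linear filter is that the Jacobians are re-evaluated at the current mean, so the covariance equation depends on the state estimate.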


5 Formulation of Stochastic Epidemic Models

Epidemics are often modelled by nonlinear systems observed through partial noisy data. The modeling can be broadly classified into two main categories: deterministic models and stochastic models. It is interesting to note that after a deterministic system of ordinary differential equations has been formulated for the population dynamics, one can derive several different kinds of stochastic models that take into account the random nature of the population processes in a consistent manner. The idea here is to capture unknown influences such as changing behaviors, public interventions, seasonal effects and social cycles. Note that the deterministic and stochastic models have complementary strengths. There are many kinds of epidemic models, depending on the type of disease, for example the SIS, SIR and SEIR models. Although the SIR model is able to capture most of the features of epidemic processes, its validity is in doubt when applied to diseases where the incubation period is relatively long. The SIR model can be extended to include an exposed state, leading to the SEIR model. In this extended model, an initially susceptible individual who is infected first enters the exposed state. In this chapter we discuss the deterministic SEIR model and its corresponding SDEs.

The basic SEIR model represents infection dynamics in a total effective population of size $N_e$, which is divided into four compartments: Susceptible $S$, Exposed $E$, Infectious $I$ and Recovered $R$. Transmission of infection from infectious to susceptible individuals is controlled by a bilinear contact term $\beta SI/N_e$. In particular, the scaling by $N_e$ makes the reproduction ratio proportional to the local density of contacts and independent of population size. Newly infected individuals first enter the exposed (not yet infectious) class, move into the infectious class after an average incubation period $1/k$, and subsequently (if they escape natural mortality) leave the infectious class after an average time $1/\gamma$. This deterministic approximation assumes an exponential distribution of incubation and infectious periods; though this is a tractable approximation for exploring overall dynamics, the observed durations of infectious periods are often much more nearly constant. The model assumes that recovered individuals are immune from infection (strictly, from the ability to retransmit) for life. The transition process is modeled by
