
2.4 Stochastic calculus

2.4.4 Existence and uniqueness of a solution

Next we formulate two theorems about the existence and uniqueness of a solution. However, first we need to define what we mean by uniqueness.

Assume that any two processes $X = (X_t)_{t \in I}$ and $Y = (Y_t)_{t \in I}$ solve the SDE (2.4). If $X$ and $Y$ are indistinguishable, that is,

$$\mathbb{P}(X_t = Y_t \text{ for all } t \in I) = 1,$$

then it is said that the SDE (2.4) has a unique strong solution.

Our first existence and uniqueness theorem is usually referred to as existence under the Lipschitz condition, although alongside the global Lipschitz condition we also assume that the coefficient functions satisfy a linear growth condition, which ensures that the coefficients do not grow too fast.

Theorem 2.27 ([Mao07, Section 2.3, Theorem 3.1, Lemma 3.2]). Suppose that the coefficient functions $\sigma$ and $b$ are continuous and there exists a constant $C > 0$ such that

(C1) $\|b(t, x)\| + \|\sigma(t, x)\| \le C(1 + \|x\|)$ for all $t \in I$ and $x \in \mathbb{R}^d$, and

(C2) $\|b(t, x) - b(t, y)\| + \|\sigma(t, x) - \sigma(t, y)\| \le C\|x - y\|$ for all $t \in I$ and $x, y \in \mathbb{R}^d$.

Under these conditions the SDE (2.4) admits a unique strong solution.
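As an aside, conditions of the type (C1) and (C2) are also the standard hypotheses under which the Euler-Maruyama scheme converges to the strong solution. The following sketch simulates one path of such an equation; the coefficient choices (a mean-reverting drift $b(t,x) = -0.5x$ and constant diffusion), the function names, and the step count are our illustrative assumptions, not taken from the thesis.

```python
import numpy as np

# Euler-Maruyama sketch for dX_t = b(t, X_t) dt + sigma(t, X_t) dB_t.
# The coefficients below satisfy both (C1) (linear growth) and (C2)
# (global Lipschitz); they are illustrative choices.
def b(t, x):
    return -0.5 * x          # globally Lipschitz, linear growth

def sigma(t, x):
    return 0.2               # constant, trivially Lipschitz

def euler_maruyama(x0, T=1.0, n=1000, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        t = k * dt
        dB = rng.normal(0.0, np.sqrt(dt))   # Brownian increment
        x[k + 1] = x[k] + b(t, x[k]) * dt + sigma(t, x[k]) * dB
    return x

path = euler_maruyama(1.0)
```

Because the drift pulls the process toward zero and the diffusion is bounded, the simulated path stays finite on any finite horizon.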

Our second theorem gives the uniqueness of a solution under weaker conditions than our previous theorem. However, it should be noted that this theorem does not imply the existence of a solution, just the uniqueness, and the theorem is given in the one-dimensional case.

Theorem 2.28 (Yamada-Watanabe, [Gei19, Proposition 4.2.3]). Let $h : [0, \infty) \to [0, \infty)$ and $K : [0, \infty) \to \mathbb{R}$ be strictly increasing functions such that $K(0) = h(0) = 0$, $K$ is concave, and for all $\varepsilon > 0$ it holds that

$$\int_0^\varepsilon \frac{1}{K(u)} \, \mathrm{d}u = \int_0^\varepsilon \frac{1}{h(u)^2} \, \mathrm{d}u = \infty.$$

If the coefficient functions $\sigma$ and $b$ are continuous and

$$|\sigma(t, x) - \sigma(t, y)| \le h(|x - y|), \qquad |b(t, x) - b(t, y)| \le K(|x - y|)$$

for all $x, y \in \mathbb{R}$, then any two solutions to the SDE (2.4) are indistinguishable.
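The standard example covered by this theorem but not by Theorem 2.27 is the square-root modulus $h(u) = \sqrt{u}$, which is only $1/2$-Hölder: then $1/h(u)^2 = 1/u$ and the divergence condition holds because $\int_\delta^\varepsilon u^{-1} \, \mathrm{d}u = \ln(\varepsilon/\delta) \to \infty$ as $\delta \to 0$. A minimal numerical illustration of this divergence (the helper name is ours):

```python
import math

# For h(u) = sqrt(u) one has 1/h(u)^2 = 1/u, and the exact tail integral
# int_delta^eps du/u = ln(eps/delta) grows without bound as delta -> 0.
def tail_integral(delta, eps=1.0):
    return math.log(eps / delta)

# the partial integrals increase without bound as the lower limit shrinks
vals = [tail_integral(10.0 ** -k) for k in range(1, 6)]
```

Each factor-of-ten decrease of the lower limit adds $\ln 10$ to the integral, so the sequence diverges.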

3 Wasserstein space

In this thesis our primary objective is to generalize the theory of ordinary stochastic differential equations to a wider class of equations, where the coefficients may depend upon the law of the unknown process. However, we need to address the following problems:

(1) The coefficient functions $\sigma$ and $b$ are required to be Borel measurable.

If $X : \Omega \to \mathbb{R}^d$ is a random variable, then its law $\mathbb{P}_X$ is a probability measure on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$. It follows that we need to define a Borel $\sigma$-algebra on the space of probability measures on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$.

(2) If we want to formulate the existence and uniqueness Theorem 2.27 for this wider class of stochastic differential equations, we may need to define Lipschitz continuity with respect to the distribution variable, that is, for all probability measures $\mu, \nu$ on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ one has

$$\|b(t, x, \mu) - b(t, x, \nu)\| \le d(\mu, \nu),$$

where $d$ is a distance between two probability measures.

For this purpose we introduce the Wasserstein distance, which is a metric on the space of probability measures with finite $p$-th moments. In this section we define the Wasserstein space and prove some important properties.

The definition of marginal distributions follows [Vil06, Chapter 1]. The space $\mathcal{P}_p(\mathbb{R}^d)$ and the Wasserstein distance $W_p$ are defined in [Vil06, Definitions 6.1 and 6.4].

Let $X$ be a non-empty set. A map $d : X \times X \to [0, \infty)$ is a metric or distance if for all $x, y, z \in X$ one has

(M1) $d(x, y) = 0$ if and only if $x = y$.

(M2) $d(x, y) = d(y, x)$.

(M3) $d(x, y) \le d(x, z) + d(z, y)$.

The pair $(X, d)$ forms a metric space. One important example of a metric space is $\mathbb{R}^d$ with the Euclidean metric $d_E(x, y) := \|x - y\|$, where $\|\cdot\|$ is the ordinary Euclidean norm. In particular this space is complete and separable.

Let $\mathcal{P}(\mathbb{R}^d)$ be the space of all probability measures on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$. For $p \ge 1$, let $\mathcal{P}_p(\mathbb{R}^d)$ be the subspace of $\mathcal{P}(\mathbb{R}^d)$ given by

$$\mathcal{P}_p(\mathbb{R}^d) := \left\{ \mathbb{P} \in \mathcal{P}(\mathbb{R}^d) \;\middle|\; \int_{\mathbb{R}^d} \|x - x_0\|^p \, \mathrm{d}\mathbb{P}(x) < \infty \right\},$$

where $x_0 \in \mathbb{R}^d$ is fixed.

Denote by $\Pi(\mu, \nu)$ the set of probability measures on $(\mathbb{R}^d \times \mathbb{R}^d, \mathcal{B}(\mathbb{R}^d \times \mathbb{R}^d))$ whose first and second marginals are $\mu$ and $\nu$ respectively. This means $\xi \in \Pi(\mu, \nu)$ if

(1) $\xi$ is a measure on $(\mathbb{R}^d \times \mathbb{R}^d, \mathcal{B}(\mathbb{R}^d \times \mathbb{R}^d))$,

(2) $\xi(\mathbb{R}^d \times \mathbb{R}^d) = 1$,

(3) for all $A \in \mathcal{B}(\mathbb{R}^d)$ one has

$$\mu(A) = \int_{\mathbb{R}^d \times \mathbb{R}^d} \mathbf{1}_A(x) \, \mathrm{d}\xi(x, y) = \xi(A \times \mathbb{R}^d) \quad \text{and} \quad \nu(A) = \int_{\mathbb{R}^d \times \mathbb{R}^d} \mathbf{1}_A(y) \, \mathrm{d}\xi(x, y) = \xi(\mathbb{R}^d \times A).$$

Example 3.1. Let $X, Y : \Omega \to \mathbb{R}^d$ be random variables. The law of the random vector $(X, Y)$ is defined by

$$\mathbb{P}_{(X,Y)}(B) := \mathbb{P}((X, Y) \in B) \quad \text{for all } B \in \mathcal{B}(\mathbb{R}^d \times \mathbb{R}^d).$$

For $A \in \mathcal{B}(\mathbb{R}^d)$ we have

$$\mathbb{P}_{(X,Y)}(A \times \mathbb{R}^d) = \mathbb{P}((X, Y) \in A \times \mathbb{R}^d) = \mathbb{P}(X \in A) = \mathbb{P}_X(A),$$

and in a similar way $\mathbb{P}_{(X,Y)}(\mathbb{R}^d \times A) = \mathbb{P}_Y(A)$. It follows that

$$\mathbb{P}_{(X,Y)} \in \Pi(\mathbb{P}_X, \mathbb{P}_Y).$$
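For distributions on a finite set, a coupling in $\Pi(\mu, \nu)$ is simply a matrix whose row sums give the first marginal and whose column sums give the second. The following sketch checks this for the product coupling $\mu \otimes \nu$, which always lies in $\Pi(\mu, \nu)$; the concrete weight vectors are our illustrative choices.

```python
import numpy as np

# A coupling of two finitely supported distributions is a nonnegative
# matrix pi with row sums mu and column sums nu.  The product coupling
# mu (x) nu is one such matrix.
mu = np.array([0.2, 0.5, 0.3])
nu = np.array([0.6, 0.4])

pi = np.outer(mu, nu)            # product coupling mu (x) nu

row_marginal = pi.sum(axis=1)    # should recover mu
col_marginal = pi.sum(axis=0)    # should recover nu
```

The marginal conditions (3) above reduce here to the two sums along the axes of the matrix.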

Definition 3.2 ([Vil06, Definitions 6.1 and 6.4]). For all $p \ge 1$, define $W_p : \mathcal{P}_p(\mathbb{R}^d) \times \mathcal{P}_p(\mathbb{R}^d) \to [0, \infty)$,

$$W_p(\mu, \nu) := \inf_{\pi \in \Pi(\mu, \nu)} \left( \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^p \, \mathrm{d}\pi(x, y) \right)^{\frac{1}{p}}.$$

The map $W_p$ is called the $p$-Wasserstein distance. The space $(\mathcal{P}_p(\mathbb{R}^d), W_p)$ is called the Wasserstein space.
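For empirical measures on the real line with the same number of atoms, the infimum in the definition is attained by the monotone coupling that pairs the sorted samples, so $W_p$ can be computed exactly by sorting. A small sketch under that one-dimensional assumption (the helper name and the sample points are ours):

```python
import numpy as np

# p-Wasserstein distance between two empirical measures on R with the
# same number of atoms: in one dimension the optimal coupling pairs the
# sorted samples, so the infimum over Pi(mu, nu) reduces to a sort.
def wasserstein_1d(xs, ys, p=1):
    xs = np.sort(np.asarray(xs, dtype=float))
    ys = np.sort(np.asarray(ys, dtype=float))
    return float(np.mean(np.abs(xs - ys) ** p) ** (1.0 / p))

# translating a sample by a constant c moves it by exactly |c| in W_p
d = wasserstein_1d([0.0, 1.0, 2.0], [3.0, 4.0, 5.0], p=2)
```

Here every sorted pair differs by exactly $3$, so the distance is $3$ for every $p$.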

Theorem 3.3. Let $p \ge 1$. Then the Wasserstein space $(\mathcal{P}_p(\mathbb{R}^d), W_p)$ is a complete and separable metric space.

Proof. First we need to show that $W_p$ satisfies properties (M1), (M2) and (M3). The triangle inequality (M3) is proven in [Vil06, Chapter 6, p. 77]. The remaining parts we prove here.

To prove the symmetry property (M2), let $\mu, \nu \in \mathcal{P}_p(\mathbb{R}^d)$. For all $A \in \mathcal{B}(\mathbb{R}^d)$, $\pi \in \Pi(\mu, \nu)$ and $\xi \in \Pi(\nu, \mu)$ one has

$$\pi(A \times \mathbb{R}^d) = \mu(A) = \xi(\mathbb{R}^d \times A) \quad \text{and} \quad \pi(\mathbb{R}^d \times A) = \nu(A) = \xi(A \times \mathbb{R}^d).$$

Now we may define a map $\rho : \mathcal{P}_p(\mathbb{R}^d \times \mathbb{R}^d) \to \mathcal{P}_p(\mathbb{R}^d \times \mathbb{R}^d)$ such that for all $\pi \in \mathcal{P}_p(\mathbb{R}^d \times \mathbb{R}^d)$ and all $B \in \mathcal{B}(\mathbb{R}^d \times \mathbb{R}^d)$ one has

$$(\rho(\pi))(B) = \pi(\{(x, y) \in \mathbb{R}^d \times \mathbb{R}^d \mid (y, x) \in B\}).$$

In particular $\rho(\Pi(\mu, \nu)) = \Pi(\nu, \mu)$. We see that $\rho^{-1} = \rho$ because $\rho(\rho(\pi)) = \pi$ for all $\pi \in \mathcal{P}_p(\mathbb{R}^d \times \mathbb{R}^d)$. Hence $\rho^{-1}(\Pi(\nu, \mu)) = \rho(\Pi(\nu, \mu)) = \Pi(\mu, \nu)$. Now

$$\begin{aligned}
W_p(\mu, \nu)^p &= \inf_{\pi \in \Pi(\mu, \nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^p \, \mathrm{d}\pi(x, y) \\
&= \inf_{\pi \in \rho(\Pi(\nu, \mu))} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^p \, \mathrm{d}\pi(x, y) \\
&= \inf_{\rho^{-1}(\pi) \in \Pi(\nu, \mu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^p \, \mathrm{d}\pi(x, y) \\
&= \inf_{\xi \in \Pi(\nu, \mu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^p \, \mathrm{d}(\rho(\xi))(x, y) \\
&= \inf_{\xi \in \Pi(\nu, \mu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|y - x\|^p \, \mathrm{d}\xi(y, x) \\
&= W_p(\nu, \mu)^p.
\end{aligned}$$

We prove the final property (M1) in two steps. First we prove that $\mu = \nu$ implies that $W_p(\mu, \nu) = 0$. We define a measure

$$\pi_0(B) := \int_{\mathbb{R}^d} \mathbf{1}_{\{x \in \mathbb{R}^d \mid (x, x) \in B\}}(y) \, \mathrm{d}\mu(y) \quad \text{for } B \in \mathcal{B}(\mathbb{R}^d \times \mathbb{R}^d).$$

Clearly,

$$\pi_0(A \times \mathbb{R}^d) = \int_{\mathbb{R}^d} \mathbf{1}_{\{x \in \mathbb{R}^d \mid (x, x) \in A \times \mathbb{R}^d\}}(y) \, \mathrm{d}\mu(y) = \int_{\mathbb{R}^d} \mathbf{1}_A(y) \, \mathrm{d}\mu(y) = \mu(A).$$

The same arguments can be used to show that $\pi_0(\mathbb{R}^d \times A) = \mu(A)$. Hence $\pi_0 \in \Pi(\mu, \mu)$.

We see that for all sets $B \in \mathcal{B}(\mathbb{R}^d \times \mathbb{R}^d)$ with $B \cap \{(x, x) \mid x \in \mathbb{R}^d\} = \emptyset$ one has $\pi_0(B) = 0$. Therefore

$$\begin{aligned}
W_p(\mu, \mu)^p &\le \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^p \, \mathrm{d}\pi_0(x, y) \\
&= \int_{\{(x, x) \mid x \in \mathbb{R}^d\}} \|x - y\|^p \, \mathrm{d}\pi_0(x, y) \\
&= \int_{\{(x, x) \mid x \in \mathbb{R}^d\}} \|x - x\|^p \, \mathrm{d}\pi_0(x, x) = 0.
\end{aligned}$$

To prove the converse implication, we let $\mu, \nu \in \mathcal{P}_p(\mathbb{R}^d)$ and assume that $W_p(\mu, \nu) = 0$. By [Vil06, Theorem 4.1] this implies that there exists $\pi_0 \in \Pi(\mu, \nu)$ such that

$$\int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^p \, \mathrm{d}\pi_0(x, y) = 0.$$

Since $\pi_0$ is a probability measure, it follows that

$$\pi_0(\{(x, y) \in \mathbb{R}^d \times \mathbb{R}^d \mid x = y\}) = 1.$$

In particular

$$\pi_0(\{(x, y) \in \mathbb{R}^d \times \mathbb{R}^d \mid x \ne y\}) = 0.$$

Then for all $A \in \mathcal{B}(\mathbb{R}^d)$ it holds that

$$\begin{aligned}
\mu(A) = \pi_0(A \times \mathbb{R}^d) &= \pi_0(\{(x, y) \in A \times \mathbb{R}^d \mid x = y\} \cup \{(x, y) \in A \times \mathbb{R}^d \mid x \ne y\}) \\
&= \pi_0(\{(x, y) \in A \times \mathbb{R}^d \mid x = y\}) \\
&= \pi_0(A \times A).
\end{aligned}$$

With similar arguments we obtain

$$\nu(A) = \pi_0(A \times A).$$

Hence $\mu = \nu$.

In [Vil06, Theorem 6.16] it is proven that if $X$ is a complete separable metric space, then the space $(\mathcal{P}_p(X), W_p)$ is also a complete separable metric space. We use the fact that $\mathbb{R}^d$ with the Euclidean metric is complete and separable.

We recall some more results concerning the Wasserstein distance. The following lemma and its proof follow [BMM19, Section 2.2].

Lemma 3.4. Let $X, Y : \Omega \to \mathbb{R}^d$ be random variables. Then

$$W_p(\mathbb{P}_X, \mathbb{P}_Y)^p \le \mathbb{E}[\|X - Y\|^p].$$

Proof. We have shown in Example 3.1 that $\mathbb{P}_{(X,Y)} \in \Pi(\mathbb{P}_X, \mathbb{P}_Y)$. Then

$$\begin{aligned}
W_p(\mathbb{P}_X, \mathbb{P}_Y)^p &= \left( \inf_{\pi \in \Pi(\mathbb{P}_X, \mathbb{P}_Y)} \left( \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^p \, \mathrm{d}\pi(x, y) \right)^{\frac{1}{p}} \right)^p \\
&= \inf_{\pi \in \Pi(\mathbb{P}_X, \mathbb{P}_Y)} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^p \, \mathrm{d}\pi(x, y) \\
&\le \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^p \, \mathrm{d}\mathbb{P}_{(X,Y)}(x, y).
\end{aligned}$$

By letting $\varphi(u, v) := \|u - v\|^p$ we may use the change of variables formula [GG18, Proposition 5.6.1] to conclude that

$$\begin{aligned}
\mathbb{E}\|X - Y\|^p = \mathbb{E}\varphi(X, Y) &= \int_\Omega \varphi(X(\omega), Y(\omega)) \, \mathrm{d}\mathbb{P}(\omega) \\
&= \int_{\mathbb{R}^d \times \mathbb{R}^d} \varphi(x, y) \, \mathrm{d}\mathbb{P}_{(X,Y)}(x, y) \\
&= \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\|^p \, \mathrm{d}\mathbb{P}_{(X,Y)}(x, y).
\end{aligned}$$

Hence

$$W_p(\mathbb{P}_X, \mathbb{P}_Y)^p \le \mathbb{E}\|X - Y\|^p.$$
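Lemma 3.4 can be observed numerically on empirical measures: the coupling induced by the joint samples $(X, Y)$ never costs less than the optimal coupling. The sketch below uses the one-dimensional sorted-sample formula for $W_p$ of equal-size empirical laws (an assumption valid on $\mathbb{R}$); the sample model and names are our illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
x = rng.normal(0.0, 1.0, n)            # samples of X
y = 0.5 * x + rng.normal(0.0, 0.5, n)  # Y correlated with X

p = 2
# W_p^p between the empirical laws: optimal monotone coupling in 1d
w_p_p = float(np.mean(np.abs(np.sort(x) - np.sort(y)) ** p))
# cost of the coupling induced by the joint law of (X, Y):
# pair the samples as drawn, estimating E|X - Y|^p
coupling_cost = float(np.mean(np.abs(x - y) ** p))
```

By the rearrangement inequality the sorted pairing minimizes the cost among all pairings, so `w_p_p <= coupling_cost` holds for every sample, mirroring the lemma.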

Because the infimum in Definition 3.2 runs over all couplings, computing an explicit value for the Wasserstein distance might not be possible. However, in the case $p = 1$, we may apply a theorem known as Kantorovich-Rubinstein duality.

Proposition 3.5 (Kantorovich-Rubinstein, [CD18a, Corollary 5.4]). For $\mu, \nu \in \mathcal{P}_1(\mathbb{R}^d)$ one has

$$W_1(\mu, \nu) = \sup_{h \in \mathrm{Lips}_1(\mathbb{R}^d)} \int_{\mathbb{R}^d} h \, \mathrm{d}(\mu - \nu),$$

where $\mathrm{Lips}_1(\mathbb{R}^d)$ consists of all the functions $h : \mathbb{R}^d \to \mathbb{R}$ satisfying

$$|h(x) - h(y)| \le \|x - y\| \quad \text{for all } x, y \in \mathbb{R}^d.$$

The following example demonstrates how the Wasserstein distance can be computed in a simple case using Proposition 3.5.

Example 3.6. We define the Dirac measure on $(\mathbb{R}^d, \mathcal{B}(\mathbb{R}^d))$ by

$$\delta_c(A) = \begin{cases} 1, & c \in A \\ 0, & c \notin A \end{cases}$$

for some fixed $c \in \mathbb{R}^d$. It is clearly a probability measure. In particular, if we integrate an integrable function $f$ with respect to $\delta_c$, we obtain

$$\int_{\mathbb{R}^d} f \, \mathrm{d}\delta_c = f(c).$$

This implies, for any $p \ge 1$,

$$\int_{\mathbb{R}^d} \|u\|^p \, \mathrm{d}\delta_c(u) = \|c\|^p < \infty.$$

Hence $\delta_c \in \mathcal{P}_p(\mathbb{R}^d)$ for all $p \ge 1$.

We let $a, b \in \mathbb{R}^d$. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a $1$-Lipschitz function. Then

$$\left| \int_{\mathbb{R}^d} f \, \mathrm{d}(\delta_a - \delta_b) \right| = \left| \int_{\mathbb{R}^d} f \, \mathrm{d}\delta_a - \int_{\mathbb{R}^d} f \, \mathrm{d}\delta_b \right| = |f(a) - f(b)| \le \|a - b\|.$$

Next we define an orthogonal projection

$$P : \mathbb{R}^d \to \langle b - a \rangle = \{x \in \mathbb{R}^d \mid x = \lambda(b - a) \text{ for some } \lambda \in \mathbb{R}\}.$$

It holds that $\|P(x)\| \le \|x\|$. Now we may let $f(x) = \|P(x - a)\|$, since

$$|f(a) - f(b)| = |\|P(a - a)\| - \|P(b - a)\|| = |\|P(0)\| - \|P(b - a)\|| = \|b - a\|.$$

Furthermore, for all $x, y \in \mathbb{R}^d$ it holds that

$$|f(x) - f(y)| = |\|P(x - a)\| - \|P(y - a)\|| \le \|P(x - a) - P(y - a)\| = \|P(x - y)\| \le \|x - y\|,$$

implying that $f \in \mathrm{Lips}_1(\mathbb{R}^d)$. Now we may apply Proposition 3.5 to conclude that

$$W_1(\delta_a, \delta_b) = \sup_{f \in \mathrm{Lips}_1(\mathbb{R}^d)} \int_{\mathbb{R}^d} f \, \mathrm{d}(\delta_a - \delta_b) = \|a - b\|.$$
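In one dimension the value of $W_1$ also equals $\int_{\mathbb{R}} |F_\mu(x) - F_\nu(x)| \, \mathrm{d}x$, where $F_\mu, F_\nu$ are the cumulative distribution functions. For the Dirac measures of Example 3.6 this integral reproduces $\|a - b\|$; a numerical sketch of that check (the concrete atoms and grid parameters are our arbitrary choices):

```python
import numpy as np

a, b = 1.5, -0.5
xs = np.linspace(-3.0, 3.0, 60_001)   # grid with step 1e-4
F = (xs >= a).astype(float)           # CDF of delta_a
G = (xs >= b).astype(float)           # CDF of delta_b

# W_1(delta_a, delta_b) = int |F - G| dx, here a left Riemann sum;
# |F - G| equals 1 exactly on the interval between b and a
w1 = float(np.sum(np.abs(F - G)[:-1] * np.diff(xs)))
```

The two CDFs differ exactly on the interval between the atoms, so the integral equals the distance $|a - b| = 2$ up to the grid resolution.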

The Wasserstein distances with different $p$ have the following relation. This property is mentioned in [CD18a, p. 353], but it is not proven there.

Lemma 3.7. Let $\mu, \nu \in \mathcal{P}_q(\mathbb{R}^d)$. Then

$$W_p(\mu, \nu) \le W_q(\mu, \nu) \quad \text{for all } 1 \le p < q < \infty.$$

Proof. Fixµ, ν ∈ Pq(Rd)and choose anyπ∈Π(µ, ν). Letr = qp ands = q−pq . Now 1r + 1s = 1, so we may apply Hölder inequality 2.9 to obtain

Z

Rd

kx−ykpdπ(x, y)≤ Z

Rd

|kx−ykp·1|dπ(x, y)

≤ Z

Rd

1sdπ(x, y) 1s Z

Rd

kx−ykpqpdπ(x, y) 1r

= Z

Rd

kx−ykqdπ(x, y) pq

. Hence

Z

Rd

kx−ykpdπ(x, y) 1p

≤ Z

Rd

kx−ykqdπ(x, y) 1q

. Now we have that

inf

˜ π∈Π(µ,ν)

Z

Rd

kx−ykpd˜π(x, y) 1p

≤ Z

Rd

kx−ykqdπ(x, y) 1q

.

This inequality holds for arbitrary π ∈Π(µ, ν), therefore inf

˜π∈Π(µ,ν)

Z

Rd

kx−ykpd˜π(x, y) 1p

≤ inf

π0∈Π(µ,ν)

Z

Rd

kx−ykq0(x, y) 1q

.
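The monotonicity of Lemma 3.7 can be checked numerically on empirical measures; in one dimension $W_p$ of two equal-size samples is given by the sorted coupling, which is optimal for every $p$ simultaneously, so the inequality here reduces to the monotonicity of power means. The helper and sample model below are our illustrative choices.

```python
import numpy as np

def empirical_wp(xs, ys, p):
    # W_p between two equal-size empirical measures on R (sorted coupling)
    diffs = np.abs(np.sort(xs) - np.sort(ys))
    return float(np.mean(diffs ** p) ** (1.0 / p))

rng = np.random.default_rng(7)
x = rng.normal(0.0, 1.0, 5000)
y = rng.normal(1.0, 2.0, 5000)

# the distances are nondecreasing in p, as Lemma 3.7 predicts
w1 = empirical_wp(x, y, p=1)
w2 = empirical_wp(x, y, p=2)
w4 = empirical_wp(x, y, p=4)
```

Since the same coupling is optimal for all $p$, the chain $W_1 \le W_2 \le W_4$ holds exactly, not just up to sampling error.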

4 McKean-Vlasov stochastic differential equations

In this thesis we consider a broader class of stochastic differential equations than what we have mentioned in Section 2.4.3. We add a third parameter to the coefficients, a so-called distribution parameter, which allows the coefficients to depend on the law of the random process, and therefore in particular on its expected value. This class of stochastic differential equations is called McKean-Vlasov stochastic differential equations; we use the abbreviation MVSDE. It should be noted that ordinary SDEs are a special subset of MVSDEs.

Our goal is to generalize some known results on ordinary SDEs to the context of MVSDEs. First we consider theorems for the existence and uniqueness of a solution, generalizing the results we introduced in Section 2.4.4. We present some elementary examples to demonstrate how to apply these results.

Throughout this and the following sections, we assume a finite time horizon $T > 0$ and a stochastic basis $(\Omega, \mathcal{F}, \mathbb{P}, (\mathcal{F}_t)_{t \in [0, T]})$ that satisfies the usual conditions. Let $B = (B_t)_{t \in [0, T]}$ be a $d$-dimensional $(\mathcal{F}_t)_{t \in [0, T]}$-Brownian motion, where $d \ge 1$.

4.1 Motivation

To give a motivation for McKean-Vlasov stochastic differential equations, we consider an example related to physics. This is a natural choice since, as mentioned in the introduction, the theory of MVSDEs originated in physics. This example is inspired by [CD18b, Section 2.1.2].

We want to model a system of $N$ weakly interacting particles on some time interval $[0, T]$, where $T > 0$. For every $i = 1, 2, \ldots, N$ we model the position of a particle by a stochastic process $X^i = (X_t^i)_{t \in [0, T]}$. We denote by $(B^1, B^2, \ldots, B^N)$ an $N$-dimensional Brownian motion. In our model we assume that each particle solves the following stochastic differential equation:

$$\begin{cases} \mathrm{d}X_t^i = \sigma(t, X_t^i, \mu_N) \, \mathrm{d}B_t^i + b(t, X_t^i, \mu_N) \, \mathrm{d}t \\ X_0^i = x_0^i, \end{cases}$$

where $x_0^i$ is the initial position and

$$\mu_N := \frac{1}{N} \sum_{i=1}^N X_t^i.$$

The term µN gives the dependence on the positions of the other particles.

To model the weak interaction, we assume that, when $N$ is large enough, for all $t \in [0, T]$ the particles $(X_t^i)_{i=1}^N$ behave approximately like independent particles with identical distributions. This lets us use the strong law of large numbers [GG18, Proposition 8.2.6] to obtain

$$\mu_N \xrightarrow{\text{a.s.}} \mathbb{E}X_t^1$$

as $N$ tends to infinity. Since we assume that the particles have identical distributions, we have that $\mathbb{E}X_t^i = \mathbb{E}X_t^1$ for all $i = 1, 2, \ldots$. This means that, if $N$ is large enough, we may approximate individual particles by the following stochastic differential equation:

$$\begin{cases} \mathrm{d}X_t^i = \sigma(t, X_t^i, \mathbb{E}X_t^i) \, \mathrm{d}B_t^i + b(t, X_t^i, \mathbb{E}X_t^i) \, \mathrm{d}t \\ X_0^i = x_0^i. \end{cases} \tag{4.1}$$

This equation is a special case of McKean-Vlasov stochastic differential equations, as we will see in the next section.
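The passage from the interacting system to its mean-field approximation can be illustrated by simulation. The sketch below runs Euler-Maruyama for the $N$-particle system; the coefficient choices are ours and purely illustrative: a mean-reverting drift $b(t, x, m) = -(x - m)$ toward the empirical mean and a constant diffusion $\sigma$.

```python
import numpy as np

def simulate_particles(n_particles=500, n_steps=200, T=1.0,
                       sigma=0.2, seed=3):
    # Euler-Maruyama for dX^i = -(X^i - mu_N) dt + sigma dB^i, where
    # mu_N is the empirical mean of all particles (the interaction term).
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = rng.normal(0.0, 1.0, n_particles)   # initial positions x_0^i
    for _ in range(n_steps):
        mu_n = x.mean()                      # empirical mean interaction
        dB = rng.normal(0.0, np.sqrt(dt), n_particles)
        x = x - (x - mu_n) * dt + sigma * dB
    return x

final = simulate_particles()
# for this drift the system mean is (almost) preserved, matching the
# McKean-Vlasov limit where mu_N is replaced by the expectation E[X_t]
final_mean = float(final.mean())
```

With this drift the increment of the empirical mean is purely diffusive with variance $\sigma^2 t / N$, so for large $N$ the mean barely moves, in line with the law-of-large-numbers approximation behind equation (4.1).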