Introduction Some classical distribution flows Interacting stochastic sampling technology Functional fluctuation & comparisons Some references
A new class of interacting Markov Chain Monte Carlo methods
P. Del Moral, A. Doucet INRIA Bordeaux & UBC Vancouver
Workshop on Numerics and Stochastics, Helsinki, August 2008
Outline
1 Introduction
   Stochastic sampling problems
   Some stochastic engineering models
   Interacting sampling methods
2 Some classical distribution flows
   Feynman-Kac models
   "Static" Boltzmann-Gibbs models
3 Interacting stochastic sampling technology
   Mean field particle methods
   Interacting Markov chain Monte Carlo models (i-MCMC)
4 Functional fluctuation & comparisons
5 Some references
Stochastic sampling problems
"Nonlinear" distribution flow with increasing level of complexity:

   η_n(dx_n) = γ_n(dx_n) / γ_n(1)   [time index n ∈ N, state variable x_n ∈ E_n]

Two objectives:
1. Sampling of ∼independent random variables w.r.t. η_n.
2. Computation of the normalizing constants γ_n(1) (= Z_n, the partition functions).
Stochastic engineering: conditional & Boltzmann-Gibbs measures

Filtering: signal-observation pair (X_n, Y_n) [radar, sonar, GPS, ...]
   η_n = Law(X_n | (Y_0, ..., Y_n))

Rare events: [overflows, ruin processes, epidemic propagations, ...]
   η_n = Law(X_n | n intermediate events)   &   Z_n = P(rare event)

Molecular simulation: [ground state energies, directed polymers, ...]
   η_n := Feynman-Kac/Boltzmann-Gibbs ∼ free Markov motion in an absorbing medium

Combinatorial counting, global optimization, HMMs:
   η_n(dx) = (1/Z_n) e^{-β_n V(x)} λ(dx)   or   η_n(dx) = (1/Z_n) 1_{A_n}(x) λ(dx)
Two simple ingredients
Find or understand the probability mass transformation η_n = Φ_n(η_{n-1})
   ∼ cooling schemes, temperature variations, constraint sequences, subset restrictions, observation data, conditional events, ...

Natural interacting sampling idea:
   Use η_{n-1}, or its empirical approximation, to sample w.r.t. η_n.

1. Monte Carlo / mean field models:
   η_n = Law(X_n), with a Markov transition X_{n-1} → X_n that depends on η_{n-1}.
2. Interacting MCMC models:
   Use the occupation measures of an MCMC with target η_{n-1} to drive an MCMC with target η_n.
Feynman-Kac distribution flows
Weak representation [f_n a test function on the state space E_n]:

   η_n(f_n) = γ_n(f_n) / γ_n(1)   with   γ_n(f_n) = E[ f_n(X_n) ∏_{0≤p<n} G_p(X_p) ]

A key formula:

   Z_n = E[ ∏_{0≤p<n} G_p(X_p) ] = ∏_{0≤p<n} η_p(G_p)

Path space models: X_n = (X′_0, ..., X′_n)

Examples:
   G_n ∈ [0, 1]: particle absorption models.
   G_n = observation likelihood function: filtering models.
   G_n = 1_{A_n}: conditional/restriction models.
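As a sanity check, the weak representation can be evaluated by brute-force Monte Carlo on a toy model. The sketch below is illustrative only: the Gaussian random-walk dynamics and the potential G_p(x) = exp(−x²/2) are hypothetical choices, and this naive iid-path estimator degenerates as n grows, which is precisely what the particle methods later in the talk address.

```python
import random, math

def naive_fk_estimate(f, n, num_paths=100_000, seed=0):
    """Estimate eta_n(f) and Z_n from the weak representation
    gamma_n(f) = E[ f(X_n) * prod_{0<=p<n} G_p(X_p) ]
    by averaging over iid paths of a toy model (hypothetical):
    X_p a standard Gaussian random walk, G_p(x) = exp(-x^2/2)."""
    rng = random.Random(seed)
    G = lambda x: math.exp(-x * x / 2.0)
    acc_f, acc_1 = 0.0, 0.0
    for _ in range(num_paths):
        x, w = 0.0, 1.0
        for _p in range(n):
            w *= G(x)                 # accumulate prod_{0<=p<n} G_p(X_p)
            x += rng.gauss(0.0, 1.0)  # free Markov move X_p -> X_{p+1}
        acc_f += f(x) * w
        acc_1 += w
    gamma_f, Z_n = acc_f / num_paths, acc_1 / num_paths
    return gamma_f / Z_n, Z_n         # eta_n(f) = gamma_n(f) / gamma_n(1)

eta_f, Z = naive_fk_estimate(lambda x: x * x, n=3)
```

The ratio gamma_f / Z_n implements η_n(f_n) = γ_n(f_n)/γ_n(1) directly, and the second return value is the normalizing constant Z_n = γ_n(1).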
Nonlinear distribution flows

Evolution equation:

   η_{n+1} = Φ_{n+1}(η_n) = Ψ_{G_n}(η_n) M_{n+1}

with only two transformations:

Free Markov transport equation [M_n(x_{n-1}, dx_n) from E_{n-1} into E_n]:
   (η_{n-1} M_n)(dx_n) := ∫_{E_{n-1}} η_{n-1}(dx_{n-1}) M_n(x_{n-1}, dx_n)

Bayes-Boltzmann-Gibbs transformation:
   Ψ_{G_n}(η_n)(dx_n) := (1 / η_n(G_n)) G_n(x_n) η_n(dx_n)
Boltzmann-Gibbs distribution flows
Target distribution flow: η_n(dx) ∝ g_n(x) λ(dx)

Product hypothesis:
   g_n = g_{n-1} × G_{n-1}   ⟹   η_n = Ψ_{G_{n-1}}(η_{n-1})

Running examples:
   g_n = 1_{A_n} with A_n ↓   ⟹   G_{n-1} = 1_{A_n}
   g_n = e^{-β_n V} with β_n ↑   ⟹   G_{n-1} = e^{-(β_n - β_{n-1}) V}

Problem: η_n = Ψ_{G_{n-1}}(η_{n-1}) is an unstable equation.
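The product hypothesis in the tempering example can be checked numerically: with g_n = e^{−β_n V}, the incremental potentials G_p = e^{−(β_{p+1}−β_p)V} multiply back to g_n. A minimal sketch, in which the quadratic V and the schedule β are hypothetical choices:

```python
import math

def check_telescoping(V, x, betas):
    """For g_n = exp(-beta_n * V), the incremental potentials
    G_p(x) = exp(-(beta_{p+1} - beta_p) * V(x)) satisfy the
    telescoping product g_0(x) * prod_p G_p(x) = g_n(x)."""
    prod = math.exp(-betas[0] * V(x))                # g_0(x)
    for b_prev, b_next in zip(betas, betas[1:]):
        prod *= math.exp(-(b_next - b_prev) * V(x))  # G_p(x)
    gn = math.exp(-betas[-1] * V(x))                 # target weight g_n(x)
    return prod, gn

prod, gn = check_telescoping(lambda x: x * x, x=1.5, betas=[0.0, 0.3, 0.7, 1.0])
```

The differences β_n − β_{n-1} are exactly the temperature increments of the cooling scheme, so a slowly increasing schedule keeps each G_{n-1} close to 1.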
Feynman-Kac modeling
Choose M_n(x, dy) satisfying the local fixed point equation η_n = η_n M_n (Metropolis, Gibbs, ...).

Stable equation:
   g_n = g_{n-1} × G_{n-1}   ⟹   η_n = Ψ_{G_{n-1}}(η_{n-1})
                             ⟹   η_n = η_n M_n = Ψ_{G_{n-1}}(η_{n-1}) M_n = FK-model

Feynman-Kac "dynamical" formulation (X_n Markov with transitions M_n):

   ∫ f(x) g_n(x) λ(dx) ∝ E[ f(X_n) ∏_{0≤p<n} G_p(X_p) ]

⟹ Interacting Metropolis/Gibbs/... stochastic algorithms.
Mean field interpretation
Nonlinear Markov models: there always exists a Markov kernel K_{n,η}(x, dy) s.t.

   η_n = Φ_n(η_{n-1}) = η_{n-1} K_{n,η_{n-1}} = Law(X_n)

i.e.:
   P(X_n ∈ dx_n | X_{n-1}) = K_{n,η_{n-1}}(X_{n-1}, dx_n)

Mean field particle interpretation: a Markov chain ξ_n = (ξ_n^1, ..., ξ_n^N) ∈ E_n^N s.t.

   η_n^N := (1/N) ∑_{1≤i≤N} δ_{ξ_n^i} ≃_{N↑∞} η_n

Particle approximation transitions (∀ 1 ≤ i ≤ N):
   ξ_{n-1}^i → ξ_n^i ∼ K_{n,η_{n-1}^N}(ξ_{n-1}^i, dx_n)
Discrete generation mean field particle model
Schematic picture (ξ_n ∈ E_n^N → ξ_{n+1} ∈ E_{n+1}^N):

   ξ_n^1 ──K_{n+1,η_n^N}──→ ξ_{n+1}^1
   ...
   ξ_n^i ──────────────────→ ξ_{n+1}^i
   ...
   ξ_n^N ──────────────────→ ξ_{n+1}^N

Rationale:
   η_n^N ≃_{N↑∞} η_n   ⟹   K_{n+1,η_n^N} ≃_{N↑∞} K_{n+1,η_n}   ⟹   ξ_n^i almost iid copies of X_n
Ex.: Feynman-Kac distribution flows
FK nonlinear Markov models: choose ε_n = ε_n(η_n) ≥ 0 s.t. ε_n G_n ∈ [0, 1] η_n-a.e. (ε_n = 0 not excluded), and set

   K_{n+1,η_n}(x, dz) = ∫ S_{n,η_n}(x, dy) M_{n+1}(y, dz)

with the selection kernel

   S_{n,η_n}(x, dy) := ε_n G_n(x) δ_x(dy) + (1 − ε_n G_n(x)) Ψ_{G_n}(η_n)(dy)

Mean field genetic type particle model:

   ξ_n^i ∈ E_n ──accept/reject/selection──→ ξ̂_n^i ∈ E_n ──proposal/mutation──→ ξ_{n+1}^i ∈ E_{n+1}

Examples:
   G_n = 1_A: killing with uniform replacement.
   M_n: Metropolis/Gibbs moves; G_n: interaction function (subset fitting or change of temperature).
Mean field genetic type particle model :
   (ξ_n^1, ..., ξ_n^N) ──S_{n,η_n^N}──→ (ξ̂_n^1, ..., ξ̂_n^N) ──M_{n+1}──→ (ξ_{n+1}^1, ..., ξ_{n+1}^N)

Accept/reject/selection transition:

   S_{n,η_n^N}(ξ_n^i, dx) := ε_n G_n(ξ_n^i) δ_{ξ_n^i}(dx) + (1 − ε_n G_n(ξ_n^i)) ∑_{j=1}^N [ G_n(ξ_n^j) / ∑_{k=1}^N G_n(ξ_n^k) ] δ_{ξ_n^j}(dx)

Ex.: G_n = 1_A and ε_n = 1   ⟹   ε_n G_n(ξ_n^i) = 1_A(ξ_n^i)
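The selection/mutation transition above can be sketched as a plain genetic-type update. Everything concrete below is a hypothetical toy instance: state space R, potential G_n(x) = exp(−x²/2), and a Gaussian random walk standing in for the mutation kernel M_{n+1}.

```python
import random, math

def selection_mutation_step(particles, G, mutate, eps, rng):
    """One mean field genetic step: the selection kernel S_{n,eta^N}
    (accept/reject/selection) followed by the mutation kernel M_{n+1}."""
    weights = [G(x) for x in particles]
    selected = []
    for x, g in zip(particles, weights):
        if rng.random() < eps * g:
            selected.append(x)  # accept: keep the particle in place
        else:
            # reject: resample from the weighted empirical measure Psi_G(eta^N)
            selected.append(rng.choices(particles, weights=weights)[0])
    return [mutate(x, rng) for x in selected]  # proposal/mutation M_{n+1}

rng = random.Random(1)
G = lambda x: math.exp(-x * x / 2.0)            # hypothetical potential
mutate = lambda x, r: x + r.gauss(0.0, 0.5)     # hypothetical mutation kernel
particles = [rng.gauss(0.0, 2.0) for _ in range(500)]
for _n in range(10):
    particles = selection_mutation_step(particles, G, mutate, eps=0.0, rng=rng)
```

With eps = 0 every particle is resampled from the weighted empirical measure, recovering the simple genetic/bootstrap scheme; taking eps > 0 with ε_n G_n ≤ 1 keeps high-potential particles in place and reduces resampling noise.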
Path space models
X_n = (X′_0, ..., X′_n)   →   genealogical tree / ancestral lines:

   η_n^N := (1/N) ∑_{1≤i≤N} δ_{ξ_n^i} = (1/N) ∑_{1≤i≤N} δ_{(ξ_{0,n}^i, ξ_{1,n}^i, ..., ξ_{n,n}^i)} ≃_{N↑∞} η_n

Unbiased particle approximations:

   γ_n^N(1) = ∏_{0≤p<n} η_p^N(G_p) ≃_{N↑∞} γ_n(1) = ∏_{0≤p<n} η_p(G_p)

Ex. G_n = 1_A:
   γ_n^N(1) = ∏_{0≤p<n} (success proportion at time p)

FK mean field particle models = sequential Monte Carlo, population Monte Carlo, particle filters, pruning, spawning, reconfiguration, quantum Monte Carlo, go with the winner, ...
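The unbiased product formula is easy to try in the indicator case G_n = 1_A. The sketch below is a hypothetical toy: a Gaussian random walk restricted to A = [−a, a], where the estimate of Z_n multiplies the per-step success proportions η_p^N(1_A) while resampling survivors uniformly (killing with uniform replacement).

```python
import random

def restriction_Z_estimate(n, a=1.0, N=2000, seed=0):
    """Estimate Z_n = P(X_p in [-a, a] for 0 <= p < n) for a Gaussian
    random walk started at 0, as the product of per-step success
    proportions eta_p^N(1_A)."""
    rng = random.Random(seed)
    particles = [0.0] * N
    Z = 1.0
    for _p in range(n):
        alive = [x for x in particles if abs(x) <= a]   # G_p = 1_A
        if not alive:
            return 0.0                                  # all particles killed
        Z *= len(alive) / N                             # success proportion at p
        # uniform replacement among survivors, then free mutation
        particles = [rng.choice(alive) + rng.gauss(0.0, 1.0) for _ in range(N)]
    return Z

Z5 = restriction_Z_estimate(n=5)
```

Each factor stays close to η_p(1_A) even when the final probability Z_n is tiny, which is why this product estimator does not collapse the way a naive iid estimator would.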
Objective
Find a series of MCMC models X^{(n)} := (X_k^{(n)})_{k≥0} s.t.

   η_k^{(n)} := (1/(k+1)) ∑_{0≤l≤k} δ_{X_l^{(n)}} ≃_{k↑∞} η_n

⟹ Use η_k^{(n)} ≃ η_n to define X^{(n+1)} with target η_{n+1}.

Advantages:
   Using η_n, sampling w.r.t. η_{n+1} is often easier.
   Improves the proposal step in any Metropolis type model with target η_{n+1} (the stability properties of the flow η_n enter here).
   Increases the precision at every time step.
   But the CLT variance is often ≥ the CLT variance of mean field models.
   Easy to combine with mean field stochastic algorithms.
Interacting Markov chain Monte Carlo models

Find M_0 and a collection of transitions M_{n,μ} s.t.

   η_0 = η_0 M_0   and   Φ_n(μ) = Φ_n(μ) M_{n,μ}

(X_k^{(0)})_{k≥0}: a Markov chain with transition M_0.
Given X^{(n)}, we let X_k^{(n+1)} evolve with the Markov transitions M_{n+1,η_k^{(n)}}.

Rationale: η_k^{(n)} ≃ η_n ⟹

   Φ_{n+1}(η_k^{(n)}) ≃ Φ_{n+1}(η_n) = η_{n+1}
   M_{n+1,η_k^{(n)}} ≃ M_{n+1,η_n}, with fixed point η_{n+1}

⟹ η_k^{(n+1)} ≃ η_{n+1}

Example: M_{n,μ}(x, dy) = Φ_n(μ)(dy), i.e. X_k^{(n+1)} is a r.v. ∼ Φ_{n+1}(η_k^{(n)}).
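The closing example M_{n,μ}(x, dy) = Φ_n(μ)(dy) admits a direct sketch: chain n+1 draws each state by selecting from the occupation measure of chain n proportionally to G, then mutating. All concrete choices below (potential, Gaussian move, chain lengths) are hypothetical, and for simplicity each level uses the full history of the previous chain rather than the growing occupation measure η_k^{(n)} of a synchronously running chain.

```python
import random, math

def imcmc_next_level(history, G, mutate, k_steps, rng):
    """Given the occupation history of chain n, run chain n+1 for k_steps
    using M_{n+1,mu}(x, dy) = Phi_{n+1}(mu)(dy): select a past state of
    chain n with probability proportional to G, then mutate it."""
    weights = [G(x) for x in history]
    new_chain = []
    for _k in range(k_steps):
        y = rng.choices(history, weights=weights)[0]  # selection: Psi_G(eta^(n))
        new_chain.append(mutate(y, rng))              # mutation: M_{n+1}
    return new_chain

rng = random.Random(2)
G = lambda x: math.exp(-x * x / 2.0)         # hypothetical potential
mutate = lambda x, r: x + r.gauss(0.0, 0.5)  # hypothetical Markov move
chain = [rng.gauss(0.0, 2.0) for _ in range(1000)]  # level-0 chain ~ M_0
for _level in range(3):
    chain = imcmc_next_level(chain, G, mutate, k_steps=1000, rng=rng)
```

Unlike the mean field scheme, the population size here is the chain length k, which grows over time, so each level refines its target as the previous level's occupation measure converges.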
((n−1)-th chain):   X_0^{(n−1)} → X_1^{(n−1)} → ... → X_k^{(n−1)} → ...
                     with occupation measure η_k^{(n−1)} ≃ η_{n−1}

(n-th chain):       X_0^{(n)} → ... → X_k^{(n)} ──M_{n,η_k^{(n−1)}} ≃ M_{n,η_{n−1}}──→ X_{k+1}^{(n)}
[MEAN FIELD PARTICLE MODEL]   Nonlinear semigroup Φ_{p,n}(η_p) := η_n

Local fluctuation theorem:

   W_n^N := √N [ η_n^N − Φ_n(η_{n−1}^N) ] ≃ W_n   (an independent centered Gaussian field)

Local transport formulation:

   η_0       → η_1 = Φ_1(η_0) → η_2 = Φ_{0,2}(η_0) → ... → η_n = Φ_{0,n}(η_0)
   η_0^N     → Φ_1(η_0^N)     → Φ_{0,2}(η_0^N)     → ... → Φ_{0,n}(η_0^N)
   η_1^N     → Φ_2(η_1^N)     → ...                → Φ_{1,n}(η_1^N)
   η_2^N     → ...                                 → Φ_{2,n}(η_2^N)
   ...
   η_{n−1}^N → Φ_n(η_{n−1}^N)
   η_n^N

Key decomposition formula:

   η_n^N − η_n = ∑_{q=0}^n [ Φ_{q,n}(η_q^N) − Φ_{q,n}(Φ_q(η_{q−1}^N)) ] ≃ (1/√N) ∑_{q=0}^n W_q^N D_{q,n}

using the first order decomposition Φ_{p,n}(η) − Φ_{p,n}(μ) ≃ (η − μ) D_{p,n} + (η − μ)^{⊗2} ...

⟹ Functional CLT:

   √N [ η_n^N − η_n ] ≃ ∑_{q=0}^n W_q D_{q,n}
[i-MCMC]   Nonlinear semigroup Φ_{p,n}(η_p) = η_n with a first order decomposition:

   Φ_{p,n}(η) − Φ_{p,n}(μ) ≃ (η − μ) D_{p,n} + (η − μ)^{⊗2} ...

⟹ Functional CLT for correlated/interacting MCMC models:

   √k [ η_k^{(n)} − η_n ] ≃ ∑_{q=0}^n [ √((2(n−q))!) / (n−q)! ] V_q D_{q,n}

with (V_q)_{q≥0} an independent centered Gaussian field:

   E( V_q(f)^2 ) = η_q[ (f − η_q(f))^2 ] + 2 ∑_{m≥1} η_q[ (f − η_q(f)) M_{q,η_{q−1}}^m (f − η_q(f)) ]

"Comparisons" [mean field case]: (W_q)_{q≥0} an independent centered Gaussian field with

   E( W_q(f)^2 ) = η_{q−1}{ K_{q,η_{q−1}}( (f − K_{q,η_{q−1}}(f))^2 ) }

Case K_{q,η}(x, dy) = M_{q,η}(x, dy) = Φ_q(η)(dy)   ⟹   (V_q = W_q)   ⟹   [Mean field] > [i-MCMC]
Some references
Interacting stochastic simulation algorithms
Mean field and Feynman-Kac particle models:

   Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications. Springer (2004), and references therein.

   Joint work with L. Miclo. A Moran particle system approximation of Feynman-Kac formulae. Stochastic Processes and their Applications, Vol. 86, 193-216 (2000).

   Joint work with L. Miclo. Branching and Interacting Particle Systems Approximations of Feynman-Kac Formulae. Séminaire de Probabilités XXXIV, Lecture Notes in Mathematics, Vol. 1729, Springer-Verlag Berlin, 1-145 (2000).

Sequential Monte Carlo models:

   Joint work with A. Doucet and A. Jasra. Sequential Monte Carlo Samplers. Journal of the Royal Statistical Society B (2006).

   Joint work with A. Doucet. On a class of genealogical and interacting Metropolis models. Séminaire de Probabilités XXXVII (2003).