Academic year: 2022


Large Random Systems

Antti Kemppainen and Kalle Kytölä


Contents

Foreword iv

Chapter I. Introduction 1

1. Random permutations as large random systems 1

1.1. Sorting algorithms 2

1.2. Interface in a disordered material 4

1.3. Shufflings of a deck of cards 6

2. Random walk 10

2.1. The simple random walk on d-dimensional lattice 10

2.2. Recurrence and transience of simple random walk 12

Chapter II. Percolation 15

1. Bond percolation on hypercubic lattice 16

1.1. Monotone coupling for percolation with different parameters 17

1.2. Phase transition by a zero-one law 19

1.3. Non-percolation for small p 20

1.4. Percolation for large p in two dimensions 21

1.5. Proof of non-triviality of the critical point 22

2. Percolation on a regular infinite tree 22

Chapter III. Law of iterated logarithm 25

1. Auxiliary estimates 26

2. Upper bound 27

3. Lower bound 30

4. Proof of Theorem III.1 31

Chapter IV. Weak convergence on the real line 33

1. The idea and definition of weak convergence 33

2. Equivalent characterizations of weak convergence 34

3. Tightness 36

4. Weak convergence with characteristic functions 37

Chapter V. Curie-Weiss model 39

1. Definition and key properties of the Curie–Weiss model 40

2. Analysis of the Curie–Weiss model 41

Chapter VI. Weak convergence on metric spaces 47

1. Weak convergence of probability measures 47

Chapter VII. Tightness and Prokhorov’s theorem 51

1. Tightness and Prokhorov’s theorem 51

2. Metrizability of weak convergence 52

Chapter VIII. Random walks and Brownian motion 55

1. Brownian motion 56


Defining properties of Brownian motion 58

2. Probability measures on a space of continuous functions 59

Finite dimensional distributions and Borel sigma algebra 60

Compactness and tightness on the space of continuous functions 61

3. Proof of Donsker’s theorem 62

Tightness of scaled random walks 62

Brownian motion as the limit of scaled random walks 65

Chapter IX. Ising model 69

1. Ising model on finite graphs 69

Definition of the Ising model 69

FKG inequality for Ising model 70

2. Weak convergence on countable products of finite sets 72

A criterion for weak convergence 72

Weak convergence with cylinders 73

3. Infinite volume limit by monotonicity 74

Limit via increasing sequence of finite subgraphs 74

Appendix A. Probability theory fundamentals 79

1. Measure spaces and probability spaces 79

2. Random variables and their laws 79

Laws of real valued random variables 79

3. Integrals, expected values and convergence theorems 81

Appendix B. Product measures 83

1. Product sigma-algebra 83

2. Products of finitely many measures 83

3. Fubini’s theorem 83

4. Countable products of probability measures 84

Appendix C. Finite state space Markov processes 85

1. Markov chains 86

2. Continuous time Markov processes 86

Appendix D. Couplings 87

1. Coupling of probability measures 87

2. Coupling and order 87

3. Holley’s criterion 89

Appendix E. Zero-one laws 93

1. Tail sigma-algebra and Kolmogorov 0-1-law 93

2. Borel-Cantelli lemmas 94

Appendix F. Large deviations 97

1. The idea of large deviations 97

2. Simple large deviations examples 97

Appendix G. Calculus facts 99

1. Stirling approximation 99

2. Multi-dimensional Gaussian distribution 99

Appendix H. Background in topology 101

1. Topological properties of the real line 101

2. Metric space topology 101


Basic concepts of metric space topology 101

Borel probability measures 103

Compactness and sequential compactness 104

3. Space of continuous functions 106

Metric on the space of continuous functions 106

Arzelà-Ascoli theorem 106

4. Countable products of discrete spaces 108

Metric on countable product of discrete spaces 108

Cylinder sets 109

Appendix. Index 113

Appendix. References 115

Bibliography 115


Foreword

These lecture notes are primarily intended for the regular Master’s level course Large Random Systems at Aalto University. The notes grew out of lectures of courses given by us at the University of Helsinki in 2014 and at Aalto University in 2015 and 2017.

One of the principal aims of the course is to learn to apply probability theory to interesting probabilistic models. We thus assume familiarity with measure theoretic probability as well as various undergraduate level topics in mathematics. To facilitate the study of the main content of the lectures, we nevertheless also recall some of the needed background in the Appendices. Our preference is to include a large number of different models in the lectures, at the cost that none of the models can be studied in great depth. We devote no more than two lectures to any given model, and we can therefore only establish some basic results about each model. We refer the interested reader to more specialized texts for further results. Besides treating specific models, the course contains some development of general theory, in particular related to weak convergence and tightness.

The notes are still in a very preliminary and incomplete form, and it is our goal to gradually improve and extend them. The notes will in particular be frequently updated during the current course. Please send all feedback about mistakes, misprints, needs for clarification etc. to Kalle Kytölä (kalle.kytola@aalto.fi).


Lecture I

Introduction

In this introductory lecture we will discuss two examples:

• Random permutations: different applications

• Random walk: recurrence and transience depending on dimension

The two examples are treated in different fashions.

For random permutations the objective is not to prove any theorems, but rather to illustrate how such apparently very simple random objects are relevant to modeling various interesting phenomena. As we describe the modeling, we also state a few known mathematical results relevant to the analysis of the models.

For random walks, instead, our focus is on proving Pólya’s theorem: a basic result which shows a qualitative difference in the long-time behavior of the random walk depending on the dimensionality of the space. Some applications of random walks and Pólya’s theorem will be encountered later on in this course, and with some imagination the reader will have no trouble finding other applications.
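The dimension dependence asserted by Pólya’s theorem can be previewed numerically before any proof. The following Monte Carlo sketch is ours, not from the notes: it estimates the fraction of simple random walks on Zᵈ that revisit the origin within a fixed number of steps (the function name and parameters are our own choices).

```python
import random

def return_frequency(d, steps=1000, trials=300, rng=random):
    """Fraction of simple random walks on Z^d that revisit the origin
    within `steps` steps (a crude finite-time proxy for recurrence)."""
    returned = 0
    for _ in range(trials):
        pos = [0] * d
        came_back = False
        for _ in range(steps):
            axis = rng.randrange(d)           # pick a coordinate direction
            pos[axis] += rng.choice((-1, 1))  # step +-1 along that axis
            if not any(pos):                  # back at the origin?
                came_back = True
                break
        returned += came_back
    return returned / trials
```

With these parameters one typically sees a noticeably larger fraction for d = 1 and d = 2 than for d = 3, in line with the theorem; the cutoff at `steps` makes this only a heuristic, of course, not a proof.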

The examples in this lecture are thus intended as introductions to the topic of large random systems from the perspectives of modeling and of mathematical analysis.

1. Random permutations as large random systems

As a preparation, we recall some facts about permutations and fix notation.

A permutation of a set X is a bijective function σ: X → X. The set of permutations of X is denoted by S(X), and it naturally has the structure of a group: the group operation is the composition of functions. We are commonly interested in permutations of a finite set X, and if the number of elements of that set is n, it is conventional to choose X = {1, 2, . . . , n} for simplicity. For this case, we use the special notation

Sn = { σ: {1, . . . , n} → {1, . . . , n} bijective }. (I.1)

The group Sn is called the symmetric group on n symbols. It is a finite group: the number of different permutations of n symbols is

#Sn = n! = n · (n − 1) · · · 2 · 1. (I.2)

Since Sn is a finite group, there is one particularly natural probability measure on it: the uniform measure, which associates probability 1/n! to each permutation σ ∈ Sn. If σ is a random variable with values in Sn, whose law is this uniform measure on Sn, we say that σ is a uniform random permutation of n symbols.

To get a feeling for uniform random permutations, here are two problems that you should think about.


Exercise I.1. Let σ be a uniformly distributed random permutation of the set {1, 2, . . . , n}. Compute the following quantities about its cycle decomposition.1

(a) Let L be the length of the cycle that contains the element 1. What is the distribution of L, i.e., the probabilities P[L = ℓ]? Calculate also E[L].

(b) Let S be the number of cycles in the cycle decomposition. Calculate E[S].

(c) What is the probability that elements 1 and 2 belong to the same cycle?
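Before solving the exercise exactly, the quantities L and S can be explored by simulation. A Monte Carlo sketch (all function names are ours; it only produces empirical estimates, not the closed-form answers):

```python
import random

def cycle_lengths(sigma):
    """Cycle lengths of a permutation given as a list with sigma[i-1] = sigma(i)."""
    n = len(sigma)
    seen = [False] * n
    lengths = []
    for start in range(1, n + 1):
        if seen[start - 1]:
            continue
        length, x = 0, start
        while not seen[x - 1]:   # trace the cycle containing `start`
            seen[x - 1] = True
            x = sigma[x - 1]
            length += 1
        lengths.append(length)
    return lengths

def estimate_cycle_stats(n, trials=2000, rng=random):
    """Monte Carlo estimates of E[L] (cycle containing the element 1)
    and E[S] (number of cycles) for a uniform permutation of n symbols."""
    total_L = total_S = 0
    for _ in range(trials):
        sigma = list(range(1, n + 1))
        rng.shuffle(sigma)
        total_S += len(cycle_lengths(sigma))
        # follow the cycle of the element 1 to get L
        length, x = 1, sigma[0]
        while x != 1:
            x = sigma[x - 1]
            length += 1
        total_L += length
    return total_L / trials, total_S / trials
```

Comparing the estimates for growing n against your computed answers is a good sanity check.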

Exercise I.2.

(a) Let E1, . . . , En ⊂ Ω be events. Prove the inclusion-exclusion formula:

P[ ⋃_{j=1}^{n} Ej ] = Σ_{∅ ≠ J ⊂ {1,...,n}} (−1)^{#J+1} P[ ⋂_{j∈J} Ej ]

= Σ_{1≤j1≤n} P[Ej1] − Σ_{1≤j1<j2≤n} P[Ej1 ∩ Ej2] + Σ_{1≤j1<j2<j3≤n} P[Ej1 ∩ Ej2 ∩ Ej3] − · · · .

(b) What is the probability that a uniformly distributed random permutation of the set {1, 2, . . . , n} has a fixed point, i.e., a cycle of length 1? Compute the limit of this probability as n → ∞.

Hint. In part (a), you may want to use indicator random variables and consider the complementary event. In part (b), set Ej = {the point j is a fixed point}.
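Once you have computed the limit in part (b), it can be checked against a simulation. A small sketch (ours; the function name is invented for illustration):

```python
import random

def fixed_point_probability(n, trials=5000, rng=random):
    """Monte Carlo estimate of the probability that a uniform random
    permutation of n symbols has at least one fixed point."""
    hits = 0
    for _ in range(trials):
        sigma = list(range(n))
        rng.shuffle(sigma)
        # sigma[i] == i means position i is a fixed point
        hits += any(sigma[i] == i for i in range(n))
    return hits / trials
```

The estimates should stabilize quickly as n grows, reflecting the existence of the limit.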

1.1. Sorting algorithms

A common problem in programming and computer science is to sort a list of n elements, often for n very large. There are various sorting algorithms for this purpose, e.g., “Quicksort”, “Merge sort”, “Insertion sort”, . . . Roughly speaking, a sorting algorithm is a procedure which makes pairwise comparisons between the order of elements in the list, and then makes rearrangements of the order of the list according to the results of the comparisons, so that eventually the list becomes completely sorted, i.e., its elements appear in an increasing order.

The performance of an algorithm is measured by its use of computational resources, mainly by the amount of processor time used before the algorithm outputs a sorted list (one could also consider other aspects such as memory requirement etc.). For sorting algorithms, the required processor time is (usually) well approximated by the number C of pairwise comparisons that were needed.

The number C of comparisons depends on the input, i.e. the original list provided to the algorithm, which was to be sorted. In the absence of any further information about the input, it is reasonable to assume that the input list is equally likely to be in any possible order — thus represented by a uniform random permutation of n elements. We thus model the input as being random. Although there may be nothing random about the behavior of the algorithm for a given input, quantities such as the required number of comparisons depend on the random input, and as such become random.

To give some concreteness to the above discussion of sorting algorithm performance, in Example I.1 below we briefly consider the average case performance of the widely used QuickSort algorithm. The interested reader will find more details both about sorting algorithms in general and also specifically on the analysis of QuickSort in the excellent book [Knu97].

1 Recall: A permutation can be written as a composition of disjoint cycles so that each element appears in exactly one cycle, and up to the order of cycles this cycle decomposition is unique.

Example I.1 (QuickSort). QuickSort is a recursive algorithm, the simplest variant of which is informally described as follows:

The input to the algorithm QuickSort is a list of elements a = (a1, . . . , an) in some set X with a total order relation ≤.

If the input list contains no more than one element, then just output the list itself. Specifically, for a one element list (a1) return QuickSort(a1) := (a1), and for an empty list return QuickSort(∅) := ∅.

Otherwise the input list contains more than one element. Then choose one element ar from the list, and compare it to all other elements: the other elements aj with aj ≤ ar form a list a− of length at most n − 1, and those with aj > ar form another list a+ of length at most n − 1.

Apply QuickSort to the (shorter) sublists a− and a+, to get the sorted sublists QuickSort(a−) and QuickSort(a+).

Output the list constructed from the sorted sublist QuickSort(a−), the element ar, and the sorted sublist QuickSort(a+):

QuickSort(a) := ( QuickSort(a−) ; ar ; QuickSort(a+) ).

As the comparison element ar one could take the first element a1, but it is often safer to actually take a randomly chosen element of the list.
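The recursive description above translates almost line by line into code. A minimal Python sketch (ours), with a uniformly random comparison element and a counter for the number of pairwise comparisons; as in the text, distinct elements are assumed:

```python
import random

def quicksort(a, rng=random):
    """The QuickSort of Example I.1 with a uniformly random comparison
    element. Returns (sorted list, number of pairwise comparisons used).
    Assumes the elements of `a` are distinct."""
    if len(a) <= 1:
        return list(a), 0          # nothing to compare
    pivot = a[rng.randrange(len(a))]
    # Compare the pivot to the n - 1 other elements to form the sublists.
    a_minus = [x for x in a if x < pivot]
    a_plus = [x for x in a if x > pivot]
    sorted_minus, c_minus = quicksort(a_minus, rng)
    sorted_plus, c_plus = quicksort(a_plus, rng)
    return sorted_minus + [pivot] + sorted_plus, len(a) - 1 + c_minus + c_plus
```

The returned comparison count is the random variable Cn studied below.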

Consider the performance of the QuickSort algorithm above for an input of n elements in a uniformly random order, with the simplifying assumption that the list cannot contain two equal elements. Denote the random number of comparisons by Cn. We can first of all ask about the expected number of comparisons needed,

m_n = E[C_n].

This represents the average case performance of QuickSort. It is also important to know how big the random fluctuations around the average case are, and for this purpose one can compute the variance

v_n = Var(C_n).

The numbers m_n and v_n can be calculated as solutions to recursive equations. The recursion for m_n, for example, is the following. To sort a list of n elements, the algorithm first needs to compare one chosen element ar to the n − 1 other elements. Then we obtain two sublists, of lengths k and n − 1 − k, where k ∈ {0, 1, . . . , n − 1} is uniformly random, because the comparison element was the (k + 1):st smallest with probability 1/n. The total number of comparisons is n − 1 plus the number of comparisons needed to sort the sublists, and the expected value is

m_n = n − 1 + (1/n) Σ_{k=0}^{n−1} ( m_k + m_{n−1−k} ). (I.3)

With initial conditions m_0 = 0 and m_1 = 0, the solution to this recursion is (verify by yourself)

m_n = 2(n + 1) Σ_{j=1}^{n} 1/j − 4n, (I.4)

whose asymptotic behavior for large n is

m_n ∼ 2 n log(n). (I.5)

One can similarly show that

v_n ∼ c × n², where c ≈ 0.4202.

In particular, for large n, the typical random fluctuations of C_n around the expected value m_n ≈ const. × n log(n) are on a scale √v_n ≈ const. × n. Since √v_n / m_n → 0, the random number of comparisons C_n is well concentrated around the average case value m_n.


Exercise I.3. Check that m_n given by (I.4) satisfies the recursion (I.3).

Exercise I.4. Check that the asymptotic behavior of m_n given by (I.4) is as stated in (I.5), or more precisely, show that

lim_{n→∞} m_n / (n log(n)) = 2.

1.2. Interface in a disordered material

As our second example of random permutations in interesting applications, we will discuss a model of disordered material, with an interface in it.

The model we will consider is visualized in Figure I.1, and is formally described as follows. Consider the unit square S = [0,1] × [0,1] in the plane. Let Zj = (Xj, Yj), j = 1, . . . , n, be n independent random points uniformly distributed in the square S. Our object of interest will be a certain directed path γ from the bottom left corner (0,0) ∈ S to the top right corner (1,1) ∈ S, both of whose coordinates are non-decreasing functions, i.e., the direction of the path is restricted to the North-East quadrant. We want this path γ to go through as many of the random points Z1, . . . , Zn ∈ S as possible. There may not be a unique such path, but the maximal number ℓ of points on any such path is well defined, given the locations of the points.

Optimal paths γ in various samples of random points are drawn in the illustrations of Figure I.1.

This is a simple model used in disordered materials physics. The easiest interpretation is that the square S represents a piece of some material, and the random points Z1, . . . , Zn ∈ S represent impurities in this material. The material is considered to be weaker at the locations of the impurities, so it is easier to break at the locations of the impurities: the energy needed to break the material along a path γ could have a negative contribution from each of the impurities on γ, so that it requires less energy to break along paths with many impurities. The optimal interface γ is then the fracture that will actually be formed when the material is torn apart, and the number of impurities on it is related to the amount of energy needed to tear the material. Alternatively, the interface could be modeling a directed polymer in a disordered environment, or a domain wall between two phases that are enforced by boundary conditions. The reader can find more about modeling disordered materials and interfaces in them for example in [KZ87].

The maximal number of points that the directed path can pass through depends on the randomly chosen points Z1, . . . , Zn, and is therefore itself a random variable, which we denote by Ln to emphasize the underlying choice of n random points. In the disordered materials interpretation this random number represents the energy advantage that the interface can achieve by making use of the randomly located impurities. We will next explain that Ln arises in a simple way from random permutations.
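The way Ln arises is via the longest increasing subsequence of the y-coordinates once the points are sorted by their x-coordinates; taking that standard identification as given, Ln can be computed in O(n log n) time by patience sorting. A Python sketch (ours; the function names are invented for illustration):

```python
import bisect

def longest_increasing_subsequence_length(seq):
    """Length of the longest strictly increasing subsequence, via patience
    sorting: tails[k] is the smallest possible last value of an increasing
    subsequence of length k + 1 among the elements seen so far."""
    tails = []
    for y in seq:
        k = bisect.bisect_left(tails, y)  # first tail >= y (strict increase)
        if k == len(tails):
            tails.append(y)               # extend the longest subsequence
        else:
            tails[k] = y                  # improve an existing tail
    return len(tails)

def maximal_points_on_path(points):
    """L_n: the maximal number of the given (x, y) points lying on a single
    NE-directed path: sort by x-coordinate, then take the longest increasing
    subsequence of the y-coordinates."""
    ys = [y for _, y in sorted(points)]
    return longest_increasing_subsequence_length(ys)
```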

Fix the random points Z1 = (X1, Y1), . . . , Zn = (Xn, Yn). It is convenient to order these points by the values of their x-coordinates, so let us agree to relabel them so that 0 < X1 < X2 < · · · < Xn < 1 (note that the x-coordinate values are almost surely different, so no equalities arise with probability one). Then there is no reason for the y-coordinates to be in any specific order, and instead the rearrangement
