Outline Calculus Probability Inequalities
Information-Theoretic Modeling
Lecture 2: Mathematical Preliminaries
Jyrki Kivinen
Department of Computer Science, University of Helsinki
Autumn 2012
Outline Calculus Probability Inequalities
Lecture 2: Mathematical Preliminaries
Outline Calculus Probability Inequalities
1 Calculus
Limits and Convergence Convexity
2 Probability
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
3 Inequalities
Jensen’s Inequality Gibbs’s Inequality
Outline Calculus Probability Inequalities
1 Calculus
Limits and Convergence Convexity
2 Probability
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
3 Inequalities
Jensen’s Inequality Gibbs’s Inequality
Outline Calculus Probability Inequalities
1 Calculus
Limits and Convergence Convexity
2 Probability
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
3 Inequalities
Jensen’s Inequality Gibbs’s Inequality
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
1 Calculus
Limits and Convergence Convexity
2 Probability
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
3 Inequalities
Jensen’s Inequality Gibbs’s Inequality
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Exponent Function
-10 -5 0 5 10
-10 -5 0 5 10
exp x
Exponent function exp : R→ R+, expk =ek =
k
z }| { e×e×. . .×e:
multiplicative growth (nuclear reaction, “interest on interest”, ...)
expx·expy = exp(x+y) Derivative dexpx
dx = expx.
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Exponent Function
-10 -5 0 5 10
-10 -5 0 5 10
exp x
Exponent function exp : R→ R+, expk =ek =
k
z }| { e×e×. . .×e:
multiplicative growth (nuclear reaction, “interest on interest”, ...) expx·expy = exp(x+y)
Derivative dexpx
dx = expx.
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Exponent Function
-10 -5 0 5 10
-10 -5 0 5 10
exp x
Exponent function exp : R→ R+, expk =ek =
k
z }| { e×e×. . .×e:
multiplicative growth (nuclear reaction, “interest on interest”, ...) expx·expy = exp(x+y) Derivative dexpx
dx = expx.
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Examples: Logarithm
-10 -5 0 5 10
-10 -5 0 5 10
exp x ln x
Natural logarithm ln : R+→R, ln expx =x:
time to grow tox, number of digits (log10).
General (basea) logarithm, logaax =x: logax= 1 lnalnx
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Examples: Logarithm
-10 -5 0 5 10
-10 -5 0 5 10
exp x ln x
Natural logarithm ln : R+→R, ln expx =x:
time to grow tox, number of digits (log10).
General (basea) logarithm, logaax =x: logax= 1 lnalnx
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Logarithm Function
-10 -5 0 5 10
-10 -5 0 5 10
exp x ln x
lnxy = lnx+ lny
lnxr =rlnx ln 1
x =−lnx lnx
y = lnx−lny lnx≤x−1 with equality if and only ifx = 1
(NB: doesn’t work with logax ifa6=e)
dlnx dx = 1
x
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Logarithm Function
-10 -5 0 5 10
-10 -5 0 5 10
exp x ln x
lnxy = lnx+ lny lnxr =rlnx
ln 1
x =−lnx lnx
y = lnx−lny lnx≤x−1 with equality if and only ifx = 1
(NB: doesn’t work with logax ifa6=e)
dlnx dx = 1
x
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Logarithm Function
-10 -5 0 5 10
-10 -5 0 5 10
exp x ln x
lnxy = lnx+ lny lnxr =rlnx ln 1
x =−lnx
lnx
y = lnx−lny lnx≤x−1 with equality if and only ifx = 1
(NB: doesn’t work with logax ifa6=e)
dlnx dx = 1
x
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Logarithm Function
-10 -5 0 5 10
-10 -5 0 5 10
exp x ln x
lnxy = lnx+ lny lnxr =rlnx ln 1
x =−lnx lnx
y = lnx−lny
lnx≤x−1 with equality if and only ifx = 1 (NB: doesn’t work with logax ifa6=e)
dlnx dx = 1
x
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Logarithm Function
-10 -5 0 5 10
-10 -5 0 5 10
exp x ln x x-1
lnxy = lnx+ lny lnxr =rlnx ln 1
x =−lnx lnx
y = lnx−lny lnx≤x−1 with equality if and only ifx = 1
(NB: doesn’t work with logax ifa6=e)
dlnx dx = 1
x
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Logarithm Function
-10 -5 0 5 10
-10 -5 0 5 10
exp x ln x x-1
lnxy = lnx+ lny lnxr =rlnx ln 1
x =−lnx lnx
y = lnx−lny lnx≤x−1 with equality if and only ifx = 1
(NB: doesn’t work with logax ifa6=e)
dlnx dx = 1
x
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Limits and Convergence
A sequence of values (xi : i ∈N) convergesto limit L, limi→∞xi =L, iff for any >0 there exists a numberN ∈N such that
|xi −L|< for all i ≥N .
f(x) has alimit L asx approaches c, limx→cf(x) =L, (from above c+/belowc−) iff for any >0 there exists a number δ >0 such that
|f(x)−L|< for all
c <x <c+δ ‘above’ c−δ <x <c ’below’ 0<|x−c|< δ —
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Limits and Convergence
A sequence of values (xi : i ∈N) convergesto limit L, limi→∞xi =L, iff for any >0 there exists a numberN ∈N such that
|xi −L|< for all i ≥N .
f(x) has alimit L asx approaches c, limx→cf(x) =L, (from abovec+/belowc−) iff for any >0 there exists a number δ >0 such that
|f(x)−L|< for all
c <x <c+δ ‘above’
c−δ <x <c ’below’
0<|x−c|< δ —
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Example: Logarithm Again
-0.5 -0.4 -0.3 -0.2 -0.1 0 0.1
0 0.2 0.4 0.6 0.8 1 x ln x
Even thoughxlnx is undefined atx = 0, we have(by l’Hˆopital’s rule):
x→0lim+xlnx = 0 .
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Convexity
Functionf : X →Ris said to be convexiff for any x,y ∈ X and any 0≤t≤1 we have
f(tx+ (1−t)y)≤tf(x) + (1−t)f(y) .
0 2 4 6 8 10
-4 -2 0 2 4
exp x
Function f isstrictly convexiff the above inequality holds strictly (‘<’ instead of ‘≤’) when 0<t <1.
Function f is (strictly)concaveiff the above holds for −f.
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Convexity
Functionf : X →Ris said to be convexiff for any x,y ∈ X and any 0≤t≤1 we have
f(tx+ (1−t)y)≤tf(x) + (1−t)f(y) .
0 2 4 6 8 10
-4 -2 0 2 4
exp x
Functionf is strictly convexiff the above inequality holds strictly (‘<’ instead of ‘≤’) when 0<t<1.
Function f is (strictly)concaveiff the above holds for −f.
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Convexity
Functionf : X →Ris said to be convexiff for any x,y ∈ X and any 0≤t≤1 we have
f(tx+ (1−t)y)≤tf(x) + (1−t)f(y) .
0 2 4 6 8 10
-4 -2 0 2 4
exp x
Functionf is strictly convexiff the above inequality holds strictly (‘<’ instead of ‘≤’) when 0<t<1.
Functionf is (strictly) concaveiff the above holds for −f.
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Convexity and Derivatives
Theorem
If functionf has a second derivativef00, andf00 is non-negative (≥0) for all x, then f is convex. Iff00 is positive (>0) for all x, thenf is strictlyconvex.
0 2 4 6 8 10
-4 -2 0 2 4
exp x
ex is convex!
Example: f0(x) = dexpx
dx = expx ⇒f00(x) = expx >0. Hence exp is strictly convex.
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Convexity and Derivatives
Theorem
If functionf has a second derivativef00, andf00 is non-negative (≥0) for all x, then f is convex. Iff00 is positive (>0) for all x, thenf is strictlyconvex.
0 2 4 6 8 10
-4 -2 0 2 4
exp x
ex is convex!
Example: f0(x) = dexpx
dx = expx
⇒f00(x) = expx >0. Hence exp is strictly convex.
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Convexity and Derivatives
Theorem
If functionf has a second derivativef00, andf00 is non-negative (≥0) for all x, then f is convex. Iff00 is positive (>0) for all x, thenf is strictlyconvex.
0 2 4 6 8 10
-4 -2 0 2 4
exp x
ex is convex!
Example: f0(x) = dexpx
dx = expx ⇒f00(x) = expx >0.
Hence exp is strictly convex.
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Convexity and Derivatives
Theorem
If functionf has a second derivativef00, andf00 is non-negative (≥0) for all x, then f is convex. Iff00 is positive (>0) for all x, thenf is strictlyconvex.
0 2 4 6 8 10
-4 -2 0 2 4
exp x
ex is convex!
Example: f0(x) = dexpx
dx = expx ⇒f00(x) = expx >0. Hence exp is strictly convex.
Outline Calculus Probability Inequalities
Limits and Convergence Convexity
Convexity and Derivatives
Theorem
If functionf has a second derivativef00, andf00 is non-negative (≥0) for all x, then f is convex. Iff00 is positive (>0) for all x, thenf is strictlyconvex.
0 2 4 6 8 10
-4 -2 0 2 4
exp x
ex is convex!
Example: f0(x) = dexpx
dx = expx ⇒f00(x) = expx >0. Hence exp is strictly convex.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Probability
A.N. Kolmogorov, 1903–1987
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
1 Calculus
Limits and Convergence Convexity
2 Probability
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
3 Inequalities
Jensen’s Inequality Gibbs’s Inequality
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Probability Space
A probability space (Ω,F,P) is defined by
thesample space Ω whose elements are called outcomes ω, a sigma algebraF of subsets of Ω, whose elements are called eventsE, and
a measure P which determines the probabilities of events, P : F →[0,1].
MeasureP has to satisfy the probability axioms: P(E)≥0 for all E ∈ F,P(Ω) = 1, and P(E1∪E2∪. . .) =P
iP(Ei) if (Ei) is a countable sequence ofdisjoint events.
These axioms imply the usual rules of probability calculus, e.g., P(A∪B) =P(A) +P(B)−P(A∩B),P(Ω\E) = 1−P(E), etc.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Probability Space
A probability space (Ω,F,P) is defined by
thesample space Ω whose elements are called outcomes ω,
a sigma algebraF of subsets of Ω, whose elements are called eventsE, and
a measure P which determines the probabilities of events, P : F →[0,1].
MeasureP has to satisfy the probability axioms: P(E)≥0 for all E ∈ F,P(Ω) = 1, and P(E1∪E2∪. . .) =P
iP(Ei) if (Ei) is a countable sequence ofdisjoint events.
These axioms imply the usual rules of probability calculus, e.g., P(A∪B) =P(A) +P(B)−P(A∩B),P(Ω\E) = 1−P(E), etc.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Probability Space
A probability space (Ω,F,P) is defined by
thesample space Ω whose elements are called outcomes ω, a sigma algebraF of subsets of Ω, whose elements are called eventsE, and
a measure P which determines the probabilities of events, P : F →[0,1].
MeasureP has to satisfy the probability axioms: P(E)≥0 for all E ∈ F,P(Ω) = 1, and P(E1∪E2∪. . .) =P
iP(Ei) if (Ei) is a countable sequence ofdisjoint events.
These axioms imply the usual rules of probability calculus, e.g., P(A∪B) =P(A) +P(B)−P(A∩B),P(Ω\E) = 1−P(E), etc.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Probability Space
A probability space (Ω,F,P) is defined by
thesample space Ω whose elements are called outcomes ω, a sigma algebraF of subsets of Ω, whose elements are called eventsE, and
a measure P which determines the probabilities of events, P : F →[0,1].
MeasureP has to satisfy the probability axioms: P(E)≥0 for all E ∈ F,P(Ω) = 1, and P(E1∪E2∪. . .) =P
iP(Ei) if (Ei) is a countable sequence ofdisjoint events.
These axioms imply the usual rules of probability calculus, e.g., P(A∪B) =P(A) +P(B)−P(A∩B),P(Ω\E) = 1−P(E), etc.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Probability Space
A probability space (Ω,F,P) is defined by
thesample space Ω whose elements are called outcomes ω, a sigma algebraF of subsets of Ω, whose elements are called eventsE, and
a measure P which determines the probabilities of events, P : F →[0,1].
MeasureP has to satisfy the probability axioms: P(E)≥0 for all E ∈ F,P(Ω) = 1, and P(E1∪E2∪. . .) =P
iP(Ei) if (Ei) is a countable sequence ofdisjointevents.
These axioms imply the usual rules of probability calculus, e.g., P(A∪B) =P(A) +P(B)−P(A∩B),P(Ω\E) = 1−P(E), etc.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Probability Space
A probability space (Ω,F,P) is defined by
thesample space Ω whose elements are called outcomes ω, a sigma algebraF of subsets of Ω, whose elements are called eventsE, and
a measure P which determines the probabilities of events, P : F →[0,1].
MeasureP has to satisfy the probability axioms: P(E)≥0 for all E ∈ F,P(Ω) = 1, and P(E1∪E2∪. . .) =P
iP(Ei) if (Ei) is a countable sequence ofdisjointevents.
These axioms imply the usual rules of probability calculus, e.g., P(A∪B) =P(A) +P(B)−P(A∩B),P(Ω\E) = 1−P(E), etc.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Venn Diagrams
0000000 0000000 0000000 0000000 0000000 0000000 0000000 0000000 0000000 0000000 0000000
1111111 1111111 1111111 1111111 1111111 1111111 1111111 1111111 1111111 1111111 1111111
A B
Ω
A B
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Probability Calculus
1 The conditional probabilityof event B given that event A occurs is defined as
P(B |A) = P(A∩B)
P(A) for A such thatP(A)>0.
2 P(A∩B) =P(A)·P(B |A) =P(B)·P(A|B) .
3 Bayes’ rule: P(B |A) = P(A|B)·P(B)
P(A) .
4 Chain rule: P(∩Ni=1Ei) =
N
Y
i=1
P(Ei | ∩ij−1=1Ej)
=P(E1)·P(E2 |E1)·P(E3 |E1∩E2)·. . .
·P(EN |E1∩. . .∩EN−1) .
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Probability Calculus
1 The conditional probabilityof event B given that event A occurs is defined as
P(B |A) = P(A∩B)
P(A) for A such thatP(A)>0.
2 P(A∩B) =P(A)·P(B |A) =P(B)·P(A|B) .
3 Bayes’ rule: P(B |A) = P(A|B)·P(B)
P(A) .
4 Chain rule: P(∩Ni=1Ei) =
N
Y
i=1
P(Ei | ∩ij−1=1Ej)
=P(E1)·P(E2 |E1)·P(E3 |E1∩E2)·. . .
·P(EN |E1∩. . .∩EN−1) .
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Probability Calculus
1 The conditional probabilityof event B given that event A occurs is defined as
P(B |A) = P(A∩B)
P(A) for A such thatP(A)>0.
2 P(A∩B) =P(A)·P(B |A) =P(B)·P(A|B) .
3 Bayes’ rule: P(B |A) = P(A|B)·P(B) P(A) .
4 Chain rule: P(∩Ni=1Ei) =
N
Y
i=1
P(Ei | ∩ij−1=1Ej)
=P(E1)·P(E2 |E1)·P(E3 |E1∩E2)·. . .
·P(EN |E1∩. . .∩EN−1) .
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Probability Calculus
1 The conditional probabilityof event B given that event A occurs is defined as
P(B |A) = P(A∩B)
P(A) for A such thatP(A)>0.
2 P(A∩B) =P(A)·P(B |A) =P(B)·P(A|B) .
3 Bayes’ rule: P(B |A) = P(A|B)·P(B) P(A) .
4 Chain rule:
P(∩Ni=1Ei) =
N
Y
i=1
P(Ei | ∩ij−1=1Ej)
=P(E1)·P(E2 |E1)·P(E3 |E1∩E2)·. . .
·P(EN |E1∩. . .∩EN−1) .
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
Technically, a random variable is a (measurable) function X : Ω→Rfrom the sample space to the reals.
The probability measureP on Ω determines the distribution of X: PX(A) = Pr[X ∈A] =P({ω : X(ω)∈A}) ,
whereA⊆R.
It is often more natural to relabel the outcomes and denote them, for instance, by letters,A,B,C,..., or words red,black, ... In practice, we often forget about the underlying probability space Ω, and just speak of random variableX and its distribution PX.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
Technically, a random variable is a (measurable) function X : Ω→Rfrom the sample space to the reals.
The probability measureP on Ω determines the distribution of X: PX(A) = Pr[X ∈A] =P({ω : X(ω)∈A}) ,
whereA⊆R.
It is often more natural to relabel the outcomes and denote them, for instance, by letters,A,B,C,..., or words red,black, ... In practice, we often forget about the underlying probability space Ω, and just speak of random variableX and its distribution PX.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
Technically, a random variable is a (measurable) function X : Ω→Rfrom the sample space to the reals.
The probability measureP on Ω determines the distribution of X: PX(A) = Pr[X ∈A] =P({ω : X(ω)∈A}) ,
whereA⊆R.
It is often more natural to relabel the outcomes and denote them, for instance, by letters,A,B,C,..., or words red,black, ...
In practice, we often forget about the underlying probability space Ω, and just speak of random variableX and its distribution PX.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
Technically, a random variable is a (measurable) function X : Ω→Rfrom the sample space to the reals.
The probability measureP on Ω determines the distribution of X: PX(A) = Pr[X ∈A] =P({ω : X(ω)∈A}) ,
whereA⊆R.
It is often more natural to relabel the outcomes and denote them, for instance, by letters,A,B,C,..., or words red,black, ...
In practice, we often forget about the underlying probability space Ω, and just speak of random variableX and its distribution PX.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
The distribution of a random variable canalwaysbe represented as acumulative distribution function(cdf)FX(x) = Pr[X ≤x].
In addition:
A discreterandom variableX with countable alphabetX has a probability mass function(pmf) pX such that
Pr[X =x] =pX(x).
A continuous random variableY has aprobability density function(pdf) fY such that Pr[Y ∈A] =R
AfY(x)dy.
There are alsomixed random variables that are neither discrete nor continuous. They don’t have a pmf or pdf, but they do have a cdf. We often omit the subscriptsX,Y, . . .and write p(x),f(y), etc.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
The distribution of a random variable canalwaysbe represented as acumulative distribution function(cdf)FX(x) = Pr[X ≤x].
In addition:
A discreterandom variableX with countable alphabetX has a probability mass function(pmf) pX such that
Pr[X =x] =pX(x).
A continuous random variableY has aprobability density function(pdf) fY such that Pr[Y ∈A] =R
AfY(x)dy.
There are alsomixed random variables that are neither discrete nor continuous. They don’t have a pmf or pdf, but they do have a cdf. We often omit the subscriptsX,Y, . . .and write p(x),f(y), etc.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
The distribution of a random variable canalwaysbe represented as acumulative distribution function(cdf)FX(x) = Pr[X ≤x].
In addition:
A discreterandom variableX with countable alphabetX has a probability mass function(pmf) pX such that
Pr[X =x] =pX(x).
A continuousrandom variable Y has aprobability density function(pdf) fY such that Pr[Y ∈A] =R
AfY(x)dy.
There are alsomixed random variables that are neither discrete nor continuous. They don’t have a pmf or pdf, but they do have a cdf. We often omit the subscriptsX,Y, . . .and write p(x),f(y), etc.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
The distribution of a random variable canalwaysbe represented as acumulative distribution function(cdf)FX(x) = Pr[X ≤x].
In addition:
A discreterandom variableX with countable alphabetX has a probability mass function(pmf) pX such that
Pr[X =x] =pX(x).
A continuousrandom variable Y has aprobability density function(pdf) fY such that Pr[Y ∈A] =R
AfY(x)dy.
There are alsomixed random variables that are neither discrete nor continuous. They don’t have a pmf or pdf, but they do have a cdf.
We often omit the subscriptsX,Y, . . .and write p(x),f(y), etc.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
The distribution of a random variable canalwaysbe represented as acumulative distribution function(cdf)FX(x) = Pr[X ≤x].
In addition:
A discreterandom variableX with countable alphabetX has a probability mass function(pmf) pX such that
Pr[X =x] =pX(x).
A continuousrandom variable Y has aprobability density function(pdf) fY such that Pr[Y ∈A] =R
AfY(x)dy.
There are alsomixed random variables that are neither discrete nor continuous. They don’t have a pmf or pdf, but they do have a cdf.
We often omit the subscriptsX,Y, . . . and write p(x),f(y), etc.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
Since random variables are functions, we can define more random variables as functions of random variables: iff is a function, andX andY are r.v.’s, thenf(X) : Ω→Ris a r.v.,X +Y is a r.v., etc.
Example: Let r.v.X be the outcome of a die.
The pmf of X is given bypX(x) = 1/6 for all x ∈ {1,2,3,4,5,6}.
The pmf of r.v. X2 is given bypX2(x) = 1/6 for all x ∈ {1,4,9,16,25,36}.
!
In particular, a pmfpX is a function, and hence, pX(X) is also a random variable. Further, pX2(X),lnpX(X), etc. are random variables.Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
Since random variables are functions, we can define more random variables as functions of random variables: iff is a function, andX andY are r.v.’s, thenf(X) : Ω→Ris a r.v.,X +Y is a r.v., etc.
Example: Let r.v.X be the outcome of a die.
The pmf of X is given bypX(x) = 1/6 for all x ∈ {1,2,3,4,5,6}.
The pmf of r.v. X2 is given by pX2(x) = 1/6 for all x ∈ {1,4,9,16,25,36}.
!
In particular, a pmfpX is a function, and hence, pX(X) is also a random variable. Further, pX2(X),lnpX(X), etc. are random variables.Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
Since random variables are functions, we can define more random variables as functions of random variables: iff is a function, andX andY are r.v.’s, thenf(X) : Ω→Ris a r.v.,X +Y is a r.v., etc.
Example: Let r.v.X be the outcome of a die.
The pmf of X is given bypX(x) = 1/6 for all x ∈ {1,2,3,4,5,6}.
The pmf of r.v. X2 is given by pX2(x) = 1/6 for all x ∈ {1,4,9,16,25,36}.
!
In particular, a pmfpX is a function, and hence, pX(X) is also a random variable. Further, pX2(X),lnpX(X), etc. are random variables.Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Random Variables
Since random variables are functions, we can define more random variables as functions of random variables: iff is a function, andX andY are r.v.’s, thenf(X) : Ω→Ris a r.v.,X +Y is a r.v., etc.
Example: Let r.v.X be the outcome of a die.
The pmf of X is given bypX(x) = 1/6 for all x ∈ {1,2,3,4,5,6}.
The pmf of r.v. X2 is given by pX2(x) = 1/6 for all x ∈ {1,4,9,16,25,36}.
!
In particular, a pmfpX is a function, and hence, pX(X) is also a random variable. Further, pX2(X),lnpX(X), etc. are random variables.Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Multivariate Distributions
The probabilistic behavior of two or more random variables is described by multivariate distributions.
Thejoint distributionof r.v.’s X andY is PX,Y(A,B) = Pr[X ∈A ∧ Y ∈B]
=P({ω : X(ω)∈A,Y(ω)∈B}) .
For each multivariate distributionPX,Y, there are uniquemarginal distributionsPX andPY such that
PX(A) =PX,Y(A,R), PY(B) =PX,Y(R,B) ,
pmf:pY(y) = X
x∈X
pX,Y(x,y) pdf:fY(y) = Z
R
fX,Y(x,y)dx .
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Multivariate Distributions
The probabilistic behavior of two or more random variables is described by multivariate distributions.
Thejoint distributionof r.v.’s X andY is PX,Y(A,B) = Pr[X ∈A ∧ Y ∈B]
=P({ω : X(ω)∈A,Y(ω)∈B}) .
For each multivariate distributionPX,Y, there are uniquemarginal distributionsPX andPY such that
PX(A) =PX,Y(A,R), PY(B) =PX,Y(R,B) ,
pmf:pY(y) = X
x∈X
pX,Y(x,y) pdf:fY(y) = Z
R
fX,Y(x,y)dx .
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Multivariate Distributions
The probabilistic behavior of two or more random variables is described by multivariate distributions.
Thejoint distributionof r.v.’s X andY is PX,Y(A,B) = Pr[X ∈A ∧ Y ∈B]
=P({ω : X(ω)∈A,Y(ω)∈B}) .
For each multivariate distributionPX,Y, there are uniquemarginal distributionsPX andPY such that
PX(A) =PX,Y(A,R), PY(B) =PX,Y(R,B) ,
pmf:pY(y) =X
x∈X
pX,Y(x,y) pdf:fY(y) = Z
R
fX,Y(x,y)dx .
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Multivariate Distributions
Theconditional distributionis defined similar to conditional probability:
PY|X(B |A) =PX,Y(A,B)
PX(A) for Asuch that PX(A)>0.
For discrete/continuous variables we have: discrete r.v.’s:
pY|X(y |x) = pX,Y(x,y)
pX(x) , pX(x)>0 , continuousr.v.’s:
fY|X(y|x) = fX,Y(x,y)
fX(x) , fX(x)>0 .
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Multivariate Distributions
Theconditional distributionis defined similar to conditional probability:
PY|X(B |A) =PX,Y(A,B)
PX(A) for Asuch that PX(A)>0.
For discrete/continuous variables we have:
discrete r.v.’s:
pY|X(y |x) = pX,Y(x,y)
pX(x) , pX(x)>0 ,
continuousr.v.’s:
fY|X(y|x) = fX,Y(x,y)
fX(x) , fX(x)>0 .
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Multivariate Distributions
Theconditional distributionis defined similar to conditional probability:
PY|X(B |A) =PX,Y(A,B)
PX(A) for Asuch that PX(A)>0.
For discrete/continuous variables we have:
discrete r.v.’s:
pY|X(y |x) = pX,Y(x,y)
pX(x) , pX(x)>0 , continuousr.v.’s:
fY|X(y|x) = fX,Y(x,y)
fX(x) , fX(x)>0 .
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Independence
VariableX is said to be independent of variableY (X Y) iff PX,Y(A,B) =PX(A)·PY(B) for all A,B⊆R.
This is equivalent to
PX|Y(A|B) =PX(A) for all B such that P(B)>0, and
PY|X(B|A) =PY(B) for all Asuch thatP(A)>0. In words, knowledge about one variable tells nothing about the other. Note that independence is symmetric,X Y ⇔Y X.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Independence
VariableX is said to be independent of variableY (X Y) iff PX,Y(A,B) =PX(A)·PY(B) for all A,B⊆R. This is equivalent to
PX|Y(A|B) =PX(A) for all B such thatP(B)>0,
and
PY|X(B|A) =PY(B) for all Asuch thatP(A)>0. In words, knowledge about one variable tells nothing about the other. Note that independence is symmetric,X Y ⇔Y X.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Independence
VariableX is said to be independent of variableY (X Y) iff PX,Y(A,B) =PX(A)·PY(B) for all A,B⊆R. This is equivalent to
PX|Y(A|B) =PX(A) for all B such thatP(B)>0, and
PY|X(B|A) =PY(B) for all Asuch thatP(A)>0.
In words, knowledge about one variable tells nothing about the other. Note that independence is symmetric,X Y ⇔Y X.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Expectation
Theexpectation(or expected value, or mean) of a discrete random variable is given by
E[X] = X
x∈X
p(x)x .
The expectation of a continuous random variable is given by E[X] =
Z
X
f(x)x dx . In both cases, it is possible thatE[X] =±∞.
E[kX] =kE[X] E[X+Y] =E[X] +E[Y] E[XY] =E[X]E[Y] if X Y
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Expectation
Theexpectation(or expected value, or mean) of a discrete random variable is given by
E[X] = X
x∈X
p(x)x .
The expectation of a continuous random variable is given by E[X] =
Z
X
f(x)x dx .
In both cases, it is possible thatE[X] =±∞.
E[kX] =kE[X] E[X+Y] =E[X] +E[Y] E[XY] =E[X]E[Y] if X Y
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Expectation
Theexpectation(or expected value, or mean) of a discrete random variable is given by
E[X] = X
x∈X
p(x)x .
The expectation of a continuous random variable is given by E[X] =
Z
X
f(x)x dx . In both cases, it is possible thatE[X] =±∞.
E[kX] =kE[X] E[X+Y] =E[X] +E[Y] E[XY] =E[X]E[Y] if X Y
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Expectation
Theexpectation(or expected value, or mean) of a discrete random variable is given by
E[X] = X
x∈X
p(x)x .
The expectation of a continuous random variable is given by E[X] =
Z
X
f(x)x dx . In both cases, it is possible thatE[X] =±∞.
E[kX] =kE[X] E[X+Y] =E[X] +E[Y] E[XY] =E[X]E[Y] if X Y
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
Let X1,X2, . . . be a sequence of independent outcomes of a die, so that pXi(x) = 1/6 for all i ∈N,x ∈ {1,2,3,4,5,6}.
0 0.05 0.1 0.15 0.2 0.25 0.3
1 2 3 4 5 6
E[Xi] =
6
X
x=1
1
6x= 21
6 = 3.5 for all i ∈N.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
Let X1,X2, . . . be a sequence of independent outcomes of a die, so that pXi(x) = 1/6 for all i ∈N,x ∈ {1,2,3,4,5,6}.
0 0.05 0.1 0.15 0.2 0.25 0.3
1 2 3 4 5 6
E[Xi] =
6
X
x=1
1
6x= 21
6 = 3.5 for alli ∈N.
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
LetSn=Pn
i=1Xnbe the sum of the first n outcomes.
The distribution ofSn is given by
PSn(x) = # of ways to get sum x withn dice 6n
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
LetSn=Pn
i=1Xnbe the sum of the first n outcomes.
The distribution ofSn is given by
PSn(x) = # of ways to get sum x withn dice 6n
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
LetSn=Pn
i=1Xnbe the sum of the first n outcomes.
The distribution ofSn is given by
PSn(x) = # of ways to get sum x withn dice 6n
distribution ofS1
0 0.05 0.1 0.15 0.2 0.25 0.3
1 2 3 4 5 6
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
LetSn=Pn
i=1Xnbe the sum of the first n outcomes.
The distribution ofSn is given by
PSn(x) = # of ways to get sum x withn dice 6n
distribution ofS2
0 0.05 0.1 0.15 0.2 0.25
2 4 6 8 10 12
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
LetSn=Pn
i=1Xnbe the sum of the first n outcomes.
The distribution ofSn is given by
PSn(x) = # of ways to get sum x withn dice 6n
distribution ofS3
0 0.05 0.1 0.15 0.2
4 6 8 10 12 14 16 18
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
LetSn=Pn
i=1Xnbe the sum of the first n outcomes.
The distribution ofSn is given by
PSn(x) = # of ways to get sum x withn dice 6n
distribution ofS4
0 0.02 0.04 0.06 0.08 0.1 0.12 0.14
5 10 15 20
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
LetSn=Pn
i=1Xnbe the sum of the first n outcomes.
The distribution ofSn is given by
PSn(x) = # of ways to get sum x withn dice 6n
distribution ofS5
0 0.02 0.04 0.06 0.08 0.1 0.12
5 10 15 20 25 30
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
LetSn=Pn
i=1Xnbe the sum of the first n outcomes.
The distribution ofSn is given by
PSn(x) = # of ways to get sum x withn dice 6n
distribution ofS10
0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
10 20 30 40 50 60
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
LetSn=Pn
i=1Xnbe the sum of the first n outcomes.
The distribution ofSn is given by
PSn(x) = # of ways to get sum x withn dice 6n
distribution ofS20
0 0.01 0.02 0.03 0.04 0.05 0.06
20 40 60 80 100 120
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
LetSn=Pn
i=1Xnbe the sum of the first n outcomes.
The distribution ofSn is given by
PSn(x) = # of ways to get sum x withn dice 6n
distribution of S100
0 0.005 0.01 0.015 0.02 0.025 0.03
100 200 300 400 500 600
Outline Calculus Probability Inequalities
Probability Space and Random Variables Joint and Conditional Distributions Expectation
Law of Large Numbers
Law of Large Numbers
Source: Wikipedia