Probability Theory
Reference: lecture notes by Wu Hao (吴昊)
Convergences
The convergence of a sequence \(\{X_n\}\):
- almost sure convergence,
- convergence in probability,
- convergence in \(L^p\) with \(p\in [1,\infty)\),
- convergence in distribution (also called convergence in law, weak convergence).
Almost sure convergence
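Definition The sequence of random variables \(\{X_n\}\) converges a.s. to the random variable \(X\) if there exists a null set \(\mathcal{N}\) such that
\[
\lim_{n\to\infty}X_n(\omega)=X(\omega),\quad \forall\,\omega\in\Omega\setminus\mathcal{N}.
\]
Lemma The sequence \(\{X_n\}\) converges a.s. to \(X\) if and only if, for any \(\varepsilon>0\),
\[
\lim_{m\to\infty}\mathbb{P}\big(|X_n-X|\le\varepsilon\ \text{for all } n\ge m\big)=1.
\]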
Convergence in probability
Definition The sequence \(\{X_n\}\) converges in probability to the random variable \(X\) if, for every \(\varepsilon>0\),
\[
\lim_{n\to\infty}\mathbb{P}\big(|X_n-X|>\varepsilon\big)=0.
\]
Almost sure convergence implies convergence in probability. But the converse is false.
Convergence in \(L^p\)
Definition Assume \(p\ge 1\). The sequence \(\{X_n\}\) converges in \(L^p\) to the random variable \(X\) if \(X_n\in L^p\), \(X\in L^p\) and
\[
\lim_{n\to\infty}\mathbb{E}\big[|X_n-X|^p\big]=0.
\]
Lemma Assume \(p>0\). If \(X_n\to X\) in \(L^p\), then \(X_n\to X\) in probability.
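The lemma is a direct consequence of Markov's inequality. A short derivation, writing \(\mathbb{P}\) and \(\mathbb{E}\) for probability and expectation (notation assumed here): for any \(\varepsilon>0\),
\[
\mathbb{P}\big(|X_n-X|>\varepsilon\big)
=\mathbb{P}\big(|X_n-X|^p>\varepsilon^p\big)
\le \varepsilon^{-p}\,\mathbb{E}\big[|X_n-X|^p\big]
\xrightarrow[n\to\infty]{}0.
\]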
Examples of different convergences
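Example In \(([0,1],\mathcal{B},\mathrm{Leb})\), define
\[
X_n(\omega)=2^n\,\mathbf{1}_{(0,1/n)}(\omega).
\]
We have \(X_n\to 0\) almost surely, but
\[
\mathbb{E}\big[|X_n|^p\big]=\frac{2^{np}}{n}\to\infty
\]
for any \(p>0\). Thus a.s. convergence does not imply convergence in \(L^p\).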
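Example In \(([0,1],\mathcal{B},\mathrm{Leb})\), let \(\varphi_{k,j}\) be the indicator function of the interval
\[
\Big[\frac{j-1}{k},\frac{j}{k}\Big],\quad k\ge 1,\ 1\le j\le k.
\]
Order these functions first according to \(k\) increasing, and then for each \(k\) according to \(j\) increasing, into one sequence \(\varphi_{k_n,j_n}\). Set
\[
X_n=\varphi_{k_n,j_n}.
\]
Then we have
\[
\mathbb{E}\big[|X_n|^p\big]=\frac{1}{k_n}\to 0\quad\text{for every } p>0,
\]
but \(\{X_n(\omega)\}\) does not converge for any \(\omega\).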
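The sliding-interval example can be checked numerically. The sketch below assumes the intervals are the closed intervals \([(j-1)/k,\,j/k]\) with \(1\le j\le k\), ordered by \(k\) and then by \(j\); the function names are illustrative and not part of the notes. It uses exact rational arithmetic to show that \(\mathbb{E}|X_n|^p=1/k_n\) shrinks while \(X_n(\omega)\) keeps taking both values 0 and 1.

```python
from fractions import Fraction

def typewriter_index(n):
    """Map n = 1, 2, 3, ... to (k, j) with 1 <= j <= k, ordered by k and then by j."""
    k = 1
    while n > k:
        n -= k
        k += 1
    return k, n

def X(n, omega):
    """Indicator of the closed interval [(j-1)/k, j/k] evaluated at omega."""
    k, j = typewriter_index(n)
    return 1 if Fraction(j - 1, k) <= omega <= Fraction(j, k) else 0

def lp_norm_p(n):
    """E|X_n|^p equals the interval length 1/k, for every p > 0."""
    k, _ = typewriter_index(n)
    return Fraction(1, k)

omega = Fraction(1, 3)
values = [X(n, omega) for n in range(1, 211)]  # covers the blocks k = 1, ..., 20
print(lp_norm_p(210))                          # length of the last interval in block k = 20
print(values.count(1), values.count(0))        # both values keep occurring
```

Within every block \(k\ge 3\) the point \(\omega\) is hit by at least one interval and missed by at least one, which is why the sequence \(X_n(\omega)\) oscillates.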
Lemma (L2 weak law) Let \(X_1,X_2,\ldots\) be independent random variables with \(\mathbb{E}[X_i]=m\) and \(\operatorname{var}(X_i)\le C<\infty\). Set
\[
S_n=X_1+\cdots+X_n.
\]
Then
\[
\frac{S_n}{n}\to m\quad\text{in } L^2 \text{ and in probability}.
\]
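A sketch of why the L2 weak law holds, assuming \(S_n=X_1+\cdots+X_n\) denotes the partial sum. Independence makes the variances add, so
\[
\mathbb{E}\Big[\Big(\frac{S_n}{n}-m\Big)^2\Big]
=\frac{\operatorname{var}(S_n)}{n^2}
=\frac{1}{n^2}\sum_{i=1}^n\operatorname{var}(X_i)
\le\frac{C}{n}\to 0.
\]
This is convergence of \(S_n/n\) to \(m\) in \(L^2\); convergence in probability then follows from Chebyshev's inequality.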
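Example (Polynomial approximation) Let \(f\) be a continuous function on \([0,1]\). Define the polynomial
\[
f_n(x)=\sum_{m=0}^{n}\binom{n}{m}x^m(1-x)^{n-m}f\Big(\frac{m}{n}\Big).
\]
This is called the Bernstein polynomial of degree \(n\) associated to \(f\). Then
\[
\sup_{x\in[0,1]}|f_n(x)-f(x)|\to 0\quad\text{as } n\to\infty.
\]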
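The uniform convergence of Bernstein polynomials can be observed numerically. The following is a minimal sketch, assuming the standard formula \(f_n(x)=\sum_{m=0}^n\binom{n}{m}x^m(1-x)^{n-m}f(m/n)\); the test function \(|x-1/2|\) and the grid size are arbitrary choices made for illustration, and a grid maximum is only a proxy for the supremum.

```python
from math import comb

def bernstein(f, n, x):
    """Evaluate the degree-n Bernstein polynomial of f at a point x in [0, 1]."""
    return sum(comb(n, m) * x**m * (1 - x)**(n - m) * f(m / n) for m in range(n + 1))

def sup_error(f, n, grid_size=201):
    """Maximum of |f_n - f| over an evenly spaced grid on [0, 1]."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in grid)

f = lambda x: abs(x - 0.5)  # continuous, but not differentiable at 1/2
for n in (10, 50, 200):
    print(n, sup_error(f, n))  # the error shrinks as n grows
```

The link to the weak law is that \(f_n(x)=\mathbb{E}[f(S_n/n)]\) when \(S_n\) is binomial with parameters \(n\) and \(x\).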
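Example (Coupon collecting) Let \(X_1,X_2,\ldots\) be i.i.d. uniform on \(\{1,2,\ldots,N\}\). Let \(T_N\) be the first time \(n\) that
\[
\{X_1,\ldots,X_n\}=\{1,2,\ldots,N\}.
\]
Then
\[
\frac{T_N}{N\log N}\to 1\quad\text{in probability as } N\to\infty.
\]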
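The scale \(N\log N\) in the coupon collecting example can be seen from the mean of \(T_N\). Once \(k-1\) distinct values have been seen, the wait for a new one is geometric with success probability \((N-k+1)/N\), so \(\mathbb{E}[T_N]=N\sum_{k=1}^N 1/k\). The sketch below only evaluates this formula (the function name is illustrative); it is a sanity check on the normalization, not a proof of convergence in probability, which also needs a variance bound.

```python
from math import log

def expected_collection_time(N):
    """E[T_N] = N * (1 + 1/2 + ... + 1/N), a sum of geometric waiting times."""
    return N * sum(1.0 / k for k in range(1, N + 1))

for N in (10, 1_000, 100_000):
    # The ratio E[T_N] / (N log N) decreases slowly towards 1.
    print(N, expected_collection_time(N) / (N * log(N)))
```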
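Borel-Cantelli Lemma
Definition Let \(\{E_n\}\) be a sequence of subsets in \(\mathcal{F}\). Define
\[
\limsup_n E_n=\bigcap_{m=1}^{\infty}\bigcup_{n=m}^{\infty}E_n,\qquad
\liminf_n E_n=\bigcup_{m=1}^{\infty}\bigcap_{n=m}^{\infty}E_n.
\]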
Lemma A point belongs to \(\limsup_n E_n\) if and only if it belongs to infinitely many terms of the sequence \(\{E_n,n\ge 1\}\).
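In more intuitive language: the event \(\limsup_n E_n\) occurs if and only if the events \(E_n\) occur infinitely often, and we write
\[
\limsup_n E_n=\{E_n\ \text{i.o.}\}.
\]
Theorem (Borel-Cantelli lemma)
For an arbitrary sequence \(\{E_n\}\), we have
\[
\sum_{n}\mathbb{P}(E_n)<\infty\ \Longrightarrow\ \mathbb{P}(E_n\ \text{i.o.})=0.
\]
If the events \(\{E_n\}\) are independent, we have
\[
\sum_{n}\mathbb{P}(E_n)=\infty\ \Longrightarrow\ \mathbb{P}(E_n\ \text{i.o.})=1.
\]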
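The dichotomy in the Borel-Cantelli lemma can be illustrated by simulation. The sketch below draws independent events with \(\mathbb{P}(E_n)=1/n^2\) (summable) and with \(\mathbb{P}(E_n)=1/n\) (not summable) and counts how many occur up to a finite horizon. The helper name, the horizon and the seed are arbitrary choices for illustration, and a finite simulation can only suggest the limiting behaviour.

```python
import random

def count_occurrences(prob, n_max, seed=0):
    """Simulate independent events E_1, ..., E_{n_max} with P(E_n) = prob(n); return how many occur."""
    rng = random.Random(seed)
    return sum(1 for n in range(1, n_max + 1) if rng.random() < prob(n))

n_max = 100_000
print(count_occurrences(lambda n: 1 / n**2, n_max))  # summable case: only a few events occur
print(count_occurrences(lambda n: 1 / n, n_max))     # divergent case: the count grows like log(n_max)
```

Raising `n_max` typically leaves the first count unchanged, while the second keeps growing slowly.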
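Corollary Convergence in probability implies almost sure convergence along a subsequence.
Example Suppose \(X_1,X_2,\ldots\) are i.i.d. with \(\mathbb{E}[X_j]=m\) and \(\mathbb{E}[X_j^4]<\infty\). Set
\[
S_n=X_1+\cdots+X_n.
\]
Then
\[
\frac{S_n}{n}\to m\quad\text{almost surely}.
\]
Theorem The implication
\[
\sum_{n}\mathbb{P}(E_n)=\infty\ \Longrightarrow\ \mathbb{P}(E_n\ \text{i.o.})=1
\]
remains true if the events \(\{E_n\}\) are pairwise independent.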
Weak Convergence
Definition A sequence of measures \(\{\mu_n\}\) converges weakly to a measure \(\mu\) if
\[
\lim_{n\to\infty}\mu_n((a,b])=\mu((a,b])
\]
for all continuity points \(a<b\) of \(\mu\). We denote this by
\[
\mu_n\Rightarrow\mu.
\]
Helly's extraction principle
A measure \(\mu\) on \((\mathbb{R},\mathcal{B})\) is a subprobability measure if
\[
\mu(\mathbb{R})\le 1.
\]
Proposition Given any sequence of subprobability measures, there is a subsequence that converges weakly to a subprobability measure.
Proposition Suppose \(\{\mu_n\}\) is a sequence of subprobability measures. If every weakly convergent subsequence converges to the same limit \(\mu\), then
\[
\mu_n\Rightarrow\mu.
\]
Relatively compact vs. tight
Definition A family of probability measures \(\{\mu_{\alpha},\alpha\in A\}\) is tight if, for any \(\varepsilon>0\), there exists a finite interval \(I\) such that
\[
\mu_{\alpha}(I)>1-\varepsilon,\quad \forall\,\alpha\in A.
\]
Theorem Let \(\{\mu_{\alpha},\alpha\in A\}\) be a family of probability measures. In order that any sequence contains a subsequence which converges weakly to a probability measure, it is necessary and sufficient that the family is tight.
This statement can also be phrased as follows: a family of probability measures is relatively compact if and only if it is tight.
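A standard example (not taken from the notes) shows what goes wrong without tightness. Let \(\mu_n=\delta_n\) be the point mass at \(n\). For any finite interval \(I\) we have \(\mu_n(I)=0\) as soon as \(n\) lies to the right of \(I\), so the family is not tight. Moreover,
\[
\lim_{n\to\infty}\mu_n((a,b])=\lim_{n\to\infty}\mathbf{1}_{\{a<n\le b\}}=0\quad\text{for all } a<b,
\]
so the only possible limit of any subsequence is the zero measure. This limit is a subprobability measure, in agreement with Helly's extraction principle, but it is not a probability measure: the mass has escaped to infinity.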
Criterion for weak convergence
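Let
\[
\begin{aligned}
C_c&=\{\text{continuous functions on }\mathbb{R}\text{ with compact support}\},\\
C_0&=\{\text{continuous functions on }\mathbb{R}\text{ vanishing at infinity}\},\\
C_b&=\{\text{bounded continuous functions on }\mathbb{R}\}.
\end{aligned}
\]
We have
\[
C_c\subset C_0\subset C_b.
\]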
It is well known that \(C_0\) is the closure of \(C_c\) with respect to uniform convergence.
Proposition Suppose \(\{\mu_n\}\) and \(\mu\) are probability measures. Then \(\mu_n\Rightarrow \mu\) if and only if
\[
\lim_{n\to\infty}\int f\,d\mu_n=\int f\,d\mu,\quad \forall f\in C_c.
\]
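A function \(f\) on \(\mathbb{R}\) is lower semicontinuous if
\[
\liminf_{y\to x}f(y)\ge f(x),\quad \forall x\in\mathbb{R},
\]
and upper semicontinuous if \(-f\) is lower semicontinuous.
A bounded function \(f\) is lower semicontinuous if and only if there exists a sequence \(f_n\in C_b\) which increases to \(f\) everywhere.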
Corollary Suppose \(\{\mu_n\}\) and \(\mu\) are probability measures. Then the following statements are equivalent:
- \(\mu_n\Rightarrow \mu\).
- \(\lim_{n\to\infty}\int f\,d\mu_n=\int f\,d\mu,\quad \forall f\in C_b\).
- \(\liminf_{n\to\infty}\int f\,d\mu_n\ge \int f\,d\mu,\quad \forall\) bounded lower semicontinuous \(f\).
- \(\limsup_{n\to\infty}\int f\,d\mu_n\le \int f\,d\mu,\quad \forall\) bounded upper semicontinuous \(f\).
Corollary Suppose \(\{\mu_n\}\) and \(\mu\) are probability measures. Then the following statements are equivalent:
- \(\mu_n\Rightarrow \mu\).
- \(\liminf_{n\to\infty}\mu_n(O)\ge \mu(O)\), for any open set \(O\).
- \(\limsup_{n\to\infty}\mu_n(K)\le \mu(K)\), for any closed set \(K\).
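Combining the criteria above, we can also use the equivalent condition
\[
\lim_{n\to\infty}\mu_n(A)=\mu(A)\quad\text{for every Borel set } A \text{ with } \mu(\partial A)=0.
\]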
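A simple worked example (a standard illustration, not taken from the notes) shows why the set criteria are inequalities rather than equalities. Take \(\mu_n=\delta_{1/n}\) and \(\mu=\delta_0\). For every \(f\in C_b\),
\[
\int f\,d\mu_n=f(1/n)\to f(0)=\int f\,d\mu,
\]
so \(\mu_n\Rightarrow\mu\). For the open set \(O=(0,1)\) and the closed set \(K=\{0\}\),
\[
\liminf_{n\to\infty}\mu_n(O)=1>0=\mu(O),\qquad
\limsup_{n\to\infty}\mu_n(K)=0<1=\mu(K).
\]
Both inequalities are strict because \(\mu\) charges the boundary point \(0\). For a set whose boundary has \(\mu\)-measure zero, such as \(A=(-1,1)\), the masses do converge: \(\mu_n(A)=1=\mu(A)\).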
Convergence in distribution
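Definition A sequence of random variables \(\{X_n\}\) converges in distribution to a random variable \(X\) if
\[
\mu_{X_n}\Rightarrow\mu_X,
\]
where \(\mu_Y\) denotes the law of the random variable \(Y\). We denote this by
\[
X_n\xrightarrow{d}X,
\]
or \(X_n\to X\) in distribution.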
Lemma Convergence in probability implies convergence in distribution.
Proposition If
in distribution where \(a,b\) are constants, then
in distribution.
Lemma Suppose \(X_n\) converges to a constant \(c\) in distribution. Then \(X_n\to c\) in probability.
Lemma Suppose \(X_n\to X\) in distribution and \(Y_n\to 0\) in distribution. Then
- \(X_n+Y_n\to X\) in distribution.
- \(X_nY_n\to 0\) in distribution.
Theorem Suppose \(X_n\to X\) in distribution. Then there exist a probability space and random variables \(\{Y_n\}\) and \(Y\) defined on it such that
\[
Y_n\overset{d}{=}X_n\ \text{for all } n,\qquad Y\overset{d}{=}X,\qquad Y_n\to Y\ \text{almost surely}.
\]
This is the coupling viewpoint: convergence in distribution can be realized as almost sure convergence on another probability space.
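One standard way to build such a coupling on the real line is the quantile transform: on \(((0,1),\mathcal{B},\mathrm{Leb})\), set \(Y_n(u)=F_n^{-1}(u)\), where \(F_n\) is the distribution function of \(X_n\). The sketch below is an illustration under assumed data, not necessarily the construction used in the notes: it takes \(X_n\) exponential with rate \(1+1/n\) and \(X\) exponential with rate \(1\), and checks that \(Y_n(u)\to Y(u)\) at a fixed sample point \(u\).

```python
from math import log

def quantile_exponential(u, rate):
    """Inverse distribution function of the exponential law with the given rate."""
    return -log(1.0 - u) / rate

def Y(n, u):
    """Y_n(u) = F_n^{-1}(u), where F_n is the Exponential(1 + 1/n) distribution function."""
    return quantile_exponential(u, 1.0 + 1.0 / n)

def Y_limit(u):
    """Y(u) = F^{-1}(u) for the limiting Exponential(1) law."""
    return quantile_exponential(u, 1.0)

u = 0.75  # one sample point of ((0,1), Borel, Lebesgue)
for n in (1, 10, 100, 1000):
    print(n, abs(Y(n, u) - Y_limit(u)))  # the gap shrinks to 0 for this fixed u
```

Each \(Y_n\) has the same law as \(X_n\) because \(u\) is uniform on \((0,1)\), yet all the \(Y_n\) live on one probability space and converge pointwise.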
Uniform Integrability
Definition A collection \((X_i,i\in I)\) of random variables is uniformly integrable (UI) if
\[
\lim_{M\to\infty}\sup_{i\in I}\mathbb{E}\big[|X_i|\,\mathbf{1}_{\{|X_i|>M\}}\big]=0.
\]
Lemma If the family contains finitely many random variables in \(L^1\), then it is UI.
Lemma
- A UI family is bounded in \(L^1\).
- If a family of random variables is bounded in \(L^p\) for \(p>1\), then it is UI.
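Boundedness in \(L^1\) alone does not give UI. A standard counterexample (assumed here for illustration, not taken from the notes) is \(X_n=n\,\mathbf{1}_{(0,1/n)}\) on \(([0,1],\mathcal{B},\mathrm{Leb})\): every \(X_n\) has mean 1, yet the tail expectation \(\mathbb{E}[|X_n|;|X_n|>M]\) equals 1 whenever \(n>M\). The sketch computes these tails exactly; the supremum over a finite range of \(n\) is only a proxy for the supremum over the whole family.

```python
from fractions import Fraction

def tail_expectation(n, M):
    """E[|X_n|; |X_n| > M] for X_n = n * 1_{(0, 1/n)} under Lebesgue measure on [0, 1]."""
    # X_n takes the value n on a set of measure 1/n and is 0 elsewhere.
    return n * Fraction(1, n) if n > M else Fraction(0)

def sup_tail(M, n_max):
    """Supremum of the tail expectations over n = 1, ..., n_max."""
    return max(tail_expectation(n, M) for n in range(1, n_max + 1))

for M in (1, 10, 100):
    print(M, sup_tail(M, n_max=1000))  # stays at 1 as long as n_max > M
```

With `n_max` fixed, the supremum drops to 0 once \(M\ge\) `n_max`, which matches the fact that a finite family in \(L^1\) is always UI.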
Convergence in \(L^1\) vs. almost sure convergence
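Proposition Suppose that \(X_n,X\in L^1\) and \(X_n\to X\) a.s. Then
\[
X_n\to X\ \text{in } L^1\quad\Longleftrightarrow\quad \{X_n,n\ge 1\}\ \text{is UI}.
\]
Corollary Suppose that \(\{X_n,n\ge 1\}\) is UI, and that \(X_n\to X\) in distribution. Then
\[
\lim_{n\to\infty}\mathbb{E}[X_n]=\mathbb{E}[X].
\]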
Summary
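The basic implication chain is
\[
\text{almost sure convergence}\ \Longrightarrow\ \text{convergence in probability}\ \Longrightarrow\ \text{convergence in distribution}.
\]
Also,
\[
\text{convergence in } L^p\ \Longrightarrow\ \text{convergence in probability}.
\]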
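Given almost sure convergence, UI is equivalent to convergence in \(L^1\): if \(X_n,X\in L^1\) and \(X_n\to X\) a.s., then
\[
\{X_n\}\ \text{is UI}\quad\Longleftrightarrow\quad X_n\to X\ \text{in } L^1.
\]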
By the coupling theorem, convergence in distribution can be represented as almost sure convergence after moving to a suitable probability space.
Exercises of Chapter 1
Exercise 1.4.2 Suppose \(\{X_1,\ldots,X_n,X_{n+1},\ldots,X_{n+m}\}\) are independent random variables. Then
\[
(X_1,\ldots,X_n)\quad\text{and}\quad(X_{n+1},\ldots,X_{n+m})
\]
are independent.
Question: What is \(\mathcal{B}(\mathbb{R}^n)\)?
Exercise 1.7.5 Suppose that \(Z\) is a random variable such that \(Z\) is independent of itself. Show that \(Z\) is almost surely a constant.
Hint: Cauchy inequality?