|
EECS 126 - Probability and Random
Processes - J. Walrand |
RANDOM PROCESSES – BERNOULLI - POISSON
·
Markov
We have looked at a finite number of random variables. In many applications, one is interested in the evolution in time of random variables. For instance, one watches on an oscilloscope the noise across two terminals. One may observe packets that arrive at an Internet router, or cosmic rays as they hit a detector. If w designates the outcome of the random experiment (as usual) and t the time, then one is interested in a collection of random variables X = {X(t, w), t Î T} where T designates the set of times. Typically, T = {0, 1, 2, …} or {…, -2, -1, 0, 1, 2, …} or [0, T] for some T < ¥ or [0, ¥) or (-¥,¥). When T is countable, one says that X is a discrete-time random process. When T is an interval (possibly infinite), one says that X is a continuous-time random process. In this section, we look at two simple random processes.
Let X = {Xn, n ³ 1} be i.i.d. with P(Xn = 1) = p = 1 – P(Xn = 0). The discrete-time random process is called the Bernoulli process. This process models flipping a coin.
Assume we have watched the first n coin flips. How long do we wait for the next 1? That is, let
t = min{m > 0 | Xn + m = 1}.
We want to calculate
P[t > m | X1, X2, …, Xn].
Because of the independence, we find
P[t > m | X1, X2, …, Xn] = P[Xn + 1 = 0, Xn + 2 = 0, …, Xn + m = 0 | X1, X2, …, Xn]
= (1 – p)m.
Thus, the time until the next 1 is geometrically distributed with mean 1/p.
Assume that we have been flipping the coin forever. How has it been since the last 1? We can argue as before and say that it has been more than m steps since the last 1 if Xn = 0, Xn - 1 = 0, …, Xn - m = 0 and the probability of that event is also (1 – p)m + 1. Thus, the time since the last 1 is a geometrically distributed random variable with mean 1/p minus one. That is, the mean number of steps since the last one is (1/p) – 1.
There are two ways of looking at the time between two successive 1s (recall Bertrand’s paradox).
The first way is to choose some time n and to look at the time since the last 1 and the time until the next 1. We know that these times have mean (1/p) – 1 and 1/p, respectivelly. In particular, the average time between these two 1s around some time n is (2/p) - 1. Note that this time is equal to 1 when p = 1, as it should be.
The second way is to start at some time n, wait until the next 1 and count the time until the next 1. We can redefine the time of the first 1 as time 0 and count until the next 1. But in this way, we find that the time between two consecutive 1s is geometrically distributed with mean 1/p.
The two different answers that we just described are called the Saint Petersburg paradox. The way to explain it is to notice that when we pick some time n and we look at the previous and next 1s, we are more likely to have picked an n that falls in a large gap between two 1s than an n that falls in a small gap. Accordingly, we expect the average duration of the interval in which we have picked n to be larger than the typical interval between consecutive 1s. In other words, by picking n we face a sampling bias.
The geometric distribution is memoryless. That is, if t is geometrically distributed with mean 1/p, then if we know that t is larger than n, then t – n is still geometrically distributed with mean 1/p. Indeed, we find that
P[t– n > m | t > n] = P[t > m + n | t > n] = P(t > m + n and t > n)/P(t > n)
= P(t > m + n)/P(t > n) = (1 – p)m + n/(1 – p)n = (1 – p)m.
Intuitively, this result is immediate if we think of t as the time until the next 1. Knowing that we have already flipped the coin n times and got 0s does not change how many more times we still have to flip it to get the next 1. (Remember that we assume that we know the probability of getting a 1.)
Define Yn = Y0 + 2(X1 + … + Xn) – n. The interpretation of Yn is that it represents the accumulated fortune of a gamble who gains 1 with probability p and loses 1 otherwise at each step of a game; the steps are all independent. Two typical evolutions of Yn are shown below when Y0 = 0. The top graph corresponds to p = 0.54, the bottom one to p = 0.46.

Assume the gambler plays the above game and starts with an initial fortune Y0 = A > 0. What is the probability that her fortune will reach B > A before it reaches 0 and the gambler is bankrupt? To solve this problem, we define TB = min{n ³ 0 | Yn = B}. We define T0 in a similar way. We want to compute a(A) = P[TB < T0|Y0 = A]. A moment of reflection shows that if A > 0, then
a(A) = P[TB < T0 and X1 = 1|Y0 = A] + P[TB < T0 and X1 = 1|Y0 = A]
= P[TB < T0 and X1 = 1|Y0 = A] + P[TB < T0 and X1 = -1|Y0 = A]
= p´P[TB < T0|Y0 = A, X1 = 1] + (1 – p)´P[TB < T0 |Y0 = A, X1 = -1]
= p´P[TB < T0|Y1 = A + 1, X1 = 1] + (1 – p)´P[TB < T0 |Y1 = A - 1, X1 = -1]
= p´P[TB < T0|Y1 = A + 1] + (1 – p)´P[TB < T0 |Y1 = A - 1]
= pa(A + 1) + (1 – p)a(A - 1).
The boundary conditions of this ordinary difference equation are a(0) = 0 and a(B) = 1. You can verify that the solution is
a(A) = A/B when p = 0.5 and a(A) = (rA – 1)/(rB – 1) with r = (1 – p)/p when p ¹ 0.5.
Note for instance that with p = 0.48, A = 100, and B = 1000, one finds a(A) = 2´10-35. Remember this on your next trip to Las Vegas.
Assume now that you have a rich uncle who gives you $1.00 every time you go bankrupt, so that you can keep on playing the game forever. To define the game precisely, say that when you hit 0, you can still play the game. If you lose again, you are back at 0, otherwise to have 1, and so on. The figure below shows a typical evolution of your fortune. The top graph corresponds to p = 0.43 and the other one to p = 0.3. Not surprisingly, with p = 0.3 you are always poor.

One interesting question is how much money you have, on average. For instance, looking at the lower graph, we can see that a good fraction of the time your fortune is 0, some other fraction of the time it is 1, and a very small fraction of the time it is larger than 2. How van we calculate these fractions? One way to answer this question is as follows. Let p(k) = P(Yn = k) for k ³ 1. We assume that this probability does not depend on n. Then, for k > 0,
P(Yn+1 = k) = P(Yn+1 = k, Yn = k –1) + P(Yn+1 = k, Yn = k + 1), so that
p(k) = P[Yn+1 = k | Yn = k –1]P(Yn = k –1) + P[Yn+1 = k | Yn = k + 1]P(Yn = k +1)
= pp(k - 1) + (1 – p)p(k + 1), k = 1, 2, ….
For k = 0, one has
P(Yn+1 = 0) = P(Yn+1 = 0, Yn = 0) + P(Yn+1 = 0, Yn = 1), so that
p(0) = P[Yn+1 = 0 | Yn = 0]P(Yn = 0) + P[Yn+1 = 0 | Yn = 1]P(Yn = 1)
= pp(0) + (1 – p)p(1).
You can verify that the solution of these equations is
p(k) = Ark, k ³ 0 where r = p/(1 – p).
Since Skp(k) = 1, we see that a solution is possible if r < 1. i.e.. if p < 0.5. The solution corresponds to A = (1 – r), so that
p(k) = (1 – r)rk, k ³ 0 where r = p/(1 – p) when p < 0.5.
When p ³ 0.5, there is not solution where P(Yn = k) does not depend on n. To understand what happens in that case, let’s look at the case p = 0.6. The evolution of the fortune in that case is shown below.

The graph shows that, as time evolves, you get richer and richer. The case p = 0.5 is more subtle. Here is a simulation result.

In this case, the fortune fluctuates and makes very long excursions. One can show that the fortune comes back to 0 infinitely often but that it takes a very long time to come back. So long in fact that the fraction of time that the fortune is zero is negligible. In fact, the fraction of the time that the fortune is any given value is zero!
How do we show all this? Let p = 0.5. We know that P[TB < T0|Y0 = A] = A/B. Thus, P[T0 < TB|Y0 = A] = (B – A)/B. As B increases to infinity, we see that TB also increases to infinity, to that P[T0 < TB|Y0 = A] increases to P[T0 < ¥ |Y0 = A] (by continuity of probabilities). Thus, P[T0 < ¥ |Y0 = A] = 1 (because (B – A)/B tends to 1 as B increases to infinity). This shows that the fortune comes back to 0 infinitely often. We can calculate how longs it takes to come back to 0. To do this, define b(A) = E[min{T0, TB}|Y0 = A]. Arguing as we have for the gambler’s ruin problem, you can justify that
b(A) = 1 + 0.5b(A - 1) + 0.5b(A + 1) for A > 0.
Also, b(0) = 0 = b(B). You can verify that the solution is b(A) = A(B – A). Now, as B increases to infinity, TB also increases to infinity. Since T0 is finite, it follows that min{T0, TB} increases to T0. Now, there is a result (called the Monotone Convergence Theorem) that says that if positive random variables increase to another random variable, then their expectations increase to that of the limiting random variable. Consequently, E[T0|Y0 = A] = ¥ for all A > 0.
Going back to the case p > 0.5, we see that the fortune appears to be increasing at an approximately constant rate. In fact, we can see that
(Yn + m – Yn)/m = (Xn + 1 + … + Xn + m)/m » E(Xn) = 1´p + (- 1)´(1 – p) = 2p - 1 for m >> 1.
We can use that observation to say that Yn grows at rate 2p – 1. We can make this more precise by defining a scaled process
Zs(t) = Y[st]/s where [ts] is the smallest integer larger than or equal to ts. We can then say that Zs(t) ® (2p – 1)t as s ® ¥. Now, the convergence is certainly true almost surely for any given t. However, we would like to say that this is true for the trajectories of the process. To see what we mean by this convergence, lets look as a few scaled versions.



These three scaled versions show that the fluctuations gets smaller as we scale and that the trajectory becomes closer to a straight line, in some uniform sense.
Of course, if we look very closely at the last graph above, we can still see some fluctuations. One way to see these well is to blow up the Y-axis. We show a portion of this graph below.

The fluctuations are still there, obviously. One way to analyze these fluctuations is to use the central limit theorem. We can look at
[Yn + m – Yn- m(2p – 1)]/m0.5 = [Xn + 1 + … + Xn + m - m(2p – 1)]/m0.5 » N(0, s2)
where s2 = p(1 – p). This shows that, property scaled, the increments of the fortune look Gaussian. The case p = 0.5 is particularly interesting, because then we don’t have to worry about the mean. In that case,
[Yn + m – Yn]/m0.5 » N(0, 1/4).
We can then scale the process
differently and look at Zs(t) = Y[ts]/s0.5. We find that as s becomes very large, the
increments of Zs(t) become independent and Gaussian. In fact, Zs(t + u) - Zs(t)
is N(0, u/4). If we multiply the
process by 2, we end up with a process W(t) with independent increments such
that W(t + u) – W(t) = N(0, u). Such a process is called a standard Brownian motion process
or Wiener process.
[definition, memoryless, number of jumps, scaling LLN, scaling: Bernoulli -> Poisson, sampling, paradox]
Let X = {X(t), t ³ 0} be defined as
follows. For t ³ 0, X(t) is the number of
jumps in [0, t]. The jump times are {Tn,
n ³
1} where {T1, T2 - T1, T3 – T2,
T4 – T3,…} are i.i.d. and exponentially distributed with
mean 1/l where l > 0. Thus, the times
between jumps are exponentially distributed with mean 1/l and are all independent.
The process X is called a Poisson process with rate l.
Recall that the exponential distribution is memoryless. This implies that, for any t > 0, given {X(s), 0 £ s £ t}, the process {X(s) – X(t), s ³ t} is again Poisson with rate l.
The number of jumps in [0, t], X(t), is a Poisson random variable with mean lt. That is, P(X(t) = n) = (lt)nexp{- lt}/n! for n ³ 0. In view of the memoryless property, the increments of the Poisson process are independent and Poisson distributed.
Indeed, the jump times {T1, T2, ...}
are separated by i.i.d. Exp(l) random
variables. That is, for any selection of 0 < t1 < ...< tn
< t,
P(T1 Î (t1,
t1 + e), ..., Tn Î (tn, tn + e), Tn+1 > t)
= le.exp{- lt1}le.exp{-
l(t2 - t1)}...le.exp{-
l(tn - tn-1)}exp{- l(t
- tn)} = (le)nexp{- lt}.
Hence, the probability that there are n jumps in [0, t] is the integral of the above density on the set S = {t1, ... , tn | 0 < t1 < ...< tn < t}, i.e., |S|lnexp{- lt} where |S| designates the volume of the set S. This set is the fraction 1/n! of the cube [0, t]n. Hence |S| = tn/n! and
P(n jumps in [0, t]) = [(lt)n/n!]exp{- lt}.
We know, from the strong law of large numbers, that X(nt)/n ® lt almost surely as n ® ¥. As in the case of the Bernoulli process, the process {X(nt)/n, t ³ 0} converges to {lt, t ³ 0} in a “trajectory” sense that we have not defined.
Imagine a Bernoulli process {Xn, n ³ 1} with probability p. We look at Yn = X1 + … + Xn. The claim is that if p ® 0 and s ® ¥ in a way that sp ® l, then the process {Ws(t) = Yst, t ³ 0} approaches a Poisson process with rate l. This property follows from the corresponding approximation of a Poisson random variable by a binomial random variable.
Imagine customers that arrive as a Poisson process with rate
l.
With probability p, a customer is male, independently of the other
customers and of the arrival times. The
claim is that the processes that count the arrivals of male and female
customers are independent Poisson processes with rates lp and l(1
– p), respectively. To see this, designate
by X(t) and Y(t) be the number of male and female customers who arrive in [0,
t]. We find that P(X(t) = m, Y(t)=n) = P(m + n arrivals)C(m+n, m)pm(1
– p)n = [lm + n/(m
+ n)!][(m + n)!/m!n!] pm(1 – p)n = P(X(t) = m)P(Y(t) = n)
if X and Y are independent Poisson variables ….
The Saint Petersburg paradox holds for Poisson processes. In fact, the Poisson process seen from an arrival time is the Poisson process plus one point at time 0.
Let X = {X(t), t Î Â} be a random process. We say that it is stationary if {X(t + u), t Î Â} has the same distribution for all u Î Â. In other words, the statistics do not change over time. We cannot use this process to measure time. This would be the case of the weather if there were no long-term trend such as global warming. As an exercise, you can show that our reflected fortune process with p < 0.5 and started with P(Y0 = k) = p(k) is stationary. Also, the Poisson process is stationary.
Let X = {X(t), t Î Â} be a random process. We say that it is time-reversible if {X(t), t Î Â} and if {X(u - t), t Î Â} have the same distribution for all u Î Â. In other words, the statistics do not change when we watch the movie in reverse.
You should show that a time-reversible process is necessarily stationary.
Also, you should show that our reflected fortune process with p < 0.5 and started with P(Y0 = k) = p(k) is time-reversible. Also, the Poisson process is time-reversible.
Roughly, a stochastic process is ergodic if statistics that do not depend on the initial phase of the process are constant. That is, such statistics do not depend on the realization of the process. For instance, if you simulate an ergodic process, you need only one simulation run; it is representative of all possible runs.
Let X = {X(t), t Î Â} be a random process. We compute some statistic Z(w, u) = F(X(w, t + u), t Î Â). That is, we perform calculations on the process starting at time u. We are interested in calculations such that Z(w, u) = Z(w, 0) for all u. (We give examples shortly; don’t worry.) Let’s call such calculations “invariant” random variables of the process X. The process X is ergodic if all its invariant random variables are constant.
As an example, let Z(w, u) = lim T ® ¥ (1/T)ò X(u + t)1{0 £ t £ T}dt. This random variable is invariant. If the process is ergodic, then Z is the same for all w.
You should show that our reflected fortune process with p £ 0.5 is ergodic. The trick is that this process must eventually go back to 0. One can then couple two versions of the process that start off independently and merge the first time they meet. Since they remained glued forever, the long-term statistics are the same.
A random process X = {X(t), t Î Â} is Markov if given X(t), the past {X(s), s < t} and the future {X(s), s > t} are independent. Markov chains are examples of Markov process.
A process with independent increments is Markov. For instance, the Poisson process and the Brownian motion process are Markov.
The reflected fortune process is Markov.
For a simple example of a process that is not Markov, let Yn be the reflected fortune process and define Wn = 1{Yn < 2}.
It is not Markov because if you see that Wn – 1 = 0 and Wn = 1, then you know that Yn = 1. Consequently, we can write
P[W2 = 0 | W1 = 1, W0 = 0] = p.
However,
P[W2 = 0 | W1 = 1, W0 = 1] < p.
Note that this example shows that a function of a Markov process may not be a Markov process.
Jean Walrand – November 2003
--- INDEX