EECS 126 - Probability and Random Processes - J. Walrand
A random variable takes real values. The definition is “a random variable is a measurable real-valued function of the outcome of a random experiment.”
Mathematically, one is given a probability space and some function X: Ω → ℝ := (−∞, +∞). If the outcome of the random experiment is ω, then the value of the random variable is X(ω) ∈ ℝ.
Physical examples: noise voltage at a given time and place, temperature at a given time and place, height of the next person to enter the room, and so on. The color of a randomly picked apple is not a random variable since its value is not a real number.
An arbitrary real-valued function defined on Ω is not necessarily a random variable.
For instance, let Ω = [0, 1] and A = [0, 0.5]. Assume that the events are [0, 1], [0, 0.5], (0.5, 1], and ∅, that we have defined P([0, 0.5]) = 0.73, and that this is all we know. Consider the function X(ω) that takes the value 0 when ω is in [0, 0.3] and the value 1 when ω is in (0.3, 1]. This function is not a random variable: the set {X = 0} = [0, 0.3] is not one of the events, so we cannot determine P(X = 0) from the information we have. Accordingly, the statistical properties of X are not defined. This is what we mean by measurability. Thus, measurability is not a subtle notion. It is a first-order idea: which functions have their statistics defined by the model? These are the measurable functions.
Let ℱ be a collection of events of Ω. (Recall that ℱ is closed under countable set operations.) A function X: Ω → ℝ is ℱ-measurable if X⁻¹((−∞, a]) ∈ ℱ for all a ∈ ℝ. Thus, we can define P(X ≤ a) for all a ∈ ℝ.
A random variable X is discrete if it takes values in a countable set {x_n}. We can define P(X = x_n) = p_n with p_n > 0 and Σ_n p_n = 1.
A random variable X is continuous if one can write P(X ∈ (a, b]) = ∫ f(x) 1{a < x ≤ b} dx for all real numbers a < b. In this expression, f(.) is a nonnegative function called the probability density function (p.d.f.) of X, and ∫(.)dx represents the definite integral over the real line.
A discrete random variable is not continuous.
A random variable may be neither discrete nor continuous.
In general, the function {F(x) = P(X ≤ x), x real}, called the cumulative probability distribution function (c.p.d.f.) of X, completely characterizes the “statistics” of X.
Bernoulli, Geometric, Uniform, Poisson, Exponential, Gaussian
Methods to generate a random variable X with a given c.p.d.f. from uniform random variables are useful in practice and provide a good insight into the meaning of the c.p.d.f. and of the p.d.f.
The first method is to generate a random variable Z uniform in [0, 1] (using the random number generator of your computer) and to define X(Z) = min{a | F(a) ≥ Z}. Then, for any real number b, X ≤ b if and only if Z ≤ F(b), which occurs with probability F(b), as desired.
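As an illustration of the first method, here is a minimal Python sketch. The exponential c.p.d.f. F(x) = 1 − exp(−rate·x), the three-point discrete distribution, the seed, and the function names are all choices made for this example, not part of the notes.

```python
import math
import random

def sample_exponential(rate, rng):
    """Draw X with c.p.d.f. F(x) = 1 - exp(-rate * x) for x >= 0.

    Solving F(a) = z gives a = -ln(1 - z) / rate, which equals
    min{a | F(a) >= z} because F is continuous and strictly increasing.
    """
    z = rng.random()  # Z uniform in [0, 1)
    return -math.log(1.0 - z) / rate

def sample_discrete(values, probs, rng):
    """Draw X with P(X = values[i]) = probs[i]; values sorted increasingly.

    Scans the c.p.d.f. and returns the smallest value a with F(a) >= Z.
    """
    z = rng.random()
    cumulative = 0.0
    for value, p in zip(values, probs):
        cumulative += p
        if cumulative >= z:
            return value
    return values[-1]  # guard against floating-point round-off in the sum

rng = random.Random(126)  # arbitrary fixed seed, for reproducibility
n = 100_000

exp_samples = [sample_exponential(2.0, rng) for _ in range(n)]
# Empirical P(X <= 0.5); should be close to F(0.5) = 1 - exp(-1).
frac_below = sum(x <= 0.5 for x in exp_samples) / n

disc_samples = [sample_discrete([0, 1, 2], [0.2, 0.5, 0.3], rng) for _ in range(n)]
# Empirical P(X = 1); should be close to 0.5.
freq_one = disc_samples.count(1) / n
```

In the discrete case the c.p.d.f. is a staircase, and the scan simply finds the step on which Z lands.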
The second method uses the p.d.f. Assume that X is continuous with p.d.f. f(x) and that P(a < X < b) ≈ 1 and f(x) ≤ c. Pick a point (X, Y) uniformly in [a, b] × [0, c] (by generating two uniform random variables). If the point falls under the curve f(.), i.e., if Y ≤ f(X), then keep the value X; otherwise, repeat. To see why this works, let B = {(u, y) | y ≤ f(u)} be the event that the point is kept and, for x in (a, b) and small ε > 0, let A = {(u, y) | x < u < x + ε, y ≤ f(u)} ≈ [x, x + ε] × [0, f(x)]. The value that is kept then satisfies P(x < X < x + ε) = P[A|B]. Since A is a subset of B, P[A|B] = P(A)/P(B) with P(A) ≈ f(x)ε/[(b − a)c] and P(B) ≈ 1/[(b − a)c], so that P[A|B] ≈ f(x)ε, as desired. (The factor 1/[(b − a)c] normalizes our uniform distribution on [a, b] × [0, c], and the term 1 in P(B) comes from the fact that the p.d.f. f(.) integrates to 1.)
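The second method can be sketched in a few lines of Python. The density f(x) = 6x(1 − x) on [0, 1], the bound c = 1.5, the seed, and the function names are assumptions made for this example only.

```python
import random

def rejection_sample(f, a, b, c, rng):
    """Draw one value with p.d.f. f, assuming f is 0 outside [a, b] and f <= c."""
    while True:
        x = a + (b - a) * rng.random()  # X uniform in [a, b]
        y = c * rng.random()            # Y uniform in [0, c]
        if y <= f(x):                   # the point is under the curve: keep X
            return x

def f(x):
    """Example p.d.f.: f(x) = 6x(1 - x) on [0, 1]; its maximum is 1.5 at x = 0.5."""
    return 6.0 * x * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0

rng = random.Random(2003)  # arbitrary fixed seed, for reproducibility
n = 100_000
samples = [rejection_sample(f, 0.0, 1.0, 1.5, rng) for _ in range(n)]

# By symmetry the mean is 0.5; F(0.25) = 3(0.25)^2 - 2(0.25)^3 = 0.15625.
sample_mean = sum(samples) / n
frac_below = sum(x <= 0.25 for x in samples) / n
```

Each attempt is kept with probability 1/[(b − a)c] = 2/3 here, so a tighter bound c wastes fewer points.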
For a discrete random variable, E(X) = Σ_n x_n p_n.
For a continuous random variable, E(X) = ∫ x f(x) dx.
There are some potential problems. The sum or the integral could yield ∞ − ∞. In that case, we say that the expectation of the random variable is not defined.
Let X be a random variable and h: ℝ → ℝ be a function. Then X is some function from Ω to ℝ and, consequently, so is h(X). Is h(X) a random variable? Well, there is that measurability question. It could be that (h(X))⁻¹((−∞, a]) ∉ ℱ for some a ∈ ℝ. One can show that h(X) is a random variable if h(.) is a nice (Borel-measurable) function. We won’t worry about that point here. All the functions h: ℝ → ℝ that we will encounter are Borel-measurable. Thus, Borel-measurability is a subtle point. The fact that X is a random variable (being ℱ-measurable) is not subtle.
In some cases, if X has a p.d.f., Y = h(X) may also have one. For instance, if X has p.d.f. f(.) and Y = aX + b with a > 0, then P(y < Y < y + dy) = P((y − b)/a < X < (y − b)/a + dy/a) = f((y − b)/a)dy/a, so that the p.d.f. of Y, say g(y), is g(y) = f((y − b)/a)/a.
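The formula g(y) = f((y − b)/a)/a can be checked numerically. The sketch below assumes, purely for illustration, that X is exponential with rate 1 and that a = 2, b = 3; the helper integrate is a plain midpoint rule written for this example.

```python
import math

def f(x):
    """p.d.f. of X: exponential with rate 1 (an arbitrary choice for the example)."""
    return math.exp(-x) if x >= 0 else 0.0

def g(y, a=2.0, b=3.0):
    """p.d.f. of Y = aX + b with a > 0, computed as f((y - b)/a)/a."""
    return f((y - b) / a) / a

def integrate(func, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

# P(4 < Y < 5) obtained by integrating g ...
prob_from_g = integrate(g, 4.0, 5.0)
# ... and directly from X, since 4 < 2X + 3 < 5 means 0.5 < X < 1.
prob_from_x = math.exp(-0.5) - math.exp(-1.0)
```

The two numbers agree up to the error of the numerical integration, which is what the change of variable predicts.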
How do we compute E(h(X))? If X is discrete, then so is h(X). One could then look at the values y_n of Y = h(X) and their probabilities, say q_n = P(Y = y_n). Then E(h(X)) = Σ_n y_n q_n. There is a clever observation that is very useful: E(h(X)) = Σ_n h(x_n) p_n where p_n = P(X = x_n). If you think about it, it looks a bit like magic. However, it is not hard to understand what is going on: q_n is the sum of the p_m over all m with h(x_m) = y_n, so grouping the terms of Σ_m h(x_m) p_m by the value of h(x_m) gives Σ_n y_n q_n. This expression is useful because you don’t have to calculate the probability of the different values of Y.
If X has p.d.f. f(.), then E(h(X)) = ∫ h(x) f(x) dx. Again, you don’t have to calculate the c.p.d.f. of Y.
In the general case, if F(.) is the c.p.d.f. of X, then E(h(X)) = ∫ h(x) dF(x).
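For the discrete case, the two ways of computing E(h(X)) can be compared on a small example. The distribution on {−2, −1, 0, 1, 2} and the function h(x) = x² below are assumptions chosen so that several values of X map to the same value of Y; exact fractions avoid round-off.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical discrete X: values x_n and probabilities p_n.
xs = [-2, -1, 0, 1, 2]
ps = [Fraction(1, 10), Fraction(2, 10), Fraction(4, 10),
      Fraction(2, 10), Fraction(1, 10)]

def h(x):
    return x * x

# Shortcut: sum of h(x_n) p_n, using only the distribution of X.
direct = sum(h(x) * p for x, p in zip(xs, ps))

# Long way: first build the distribution q of Y = h(X), then sum y_n q_n.
q = defaultdict(Fraction)  # missing keys start at Fraction(0)
for x, p in zip(xs, ps):
    q[h(x)] += p           # collect the probability of every x with the same h(x)
via_y = sum(y * qy for y, qy in q.items())
```

Both direct and via_y equal 6/5; the second computation only regroups the terms of the first.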
I strongly suggest that you look at many examples to make sure that you understand this section well.
The n-th moment of X is E(X^n).
The variance of X is E((X − E(X))²) = E(X²) − (E(X))². It measures the “spread” around the mean.
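As a quick check of the two expressions for the variance, the sketch below uses a fair six-sided die, which is an example chosen here and not taken from the notes, and computes everything with exact fractions.

```python
from fractions import Fraction

# Fair die: values 1..6, each with probability 1/6 (illustrative choice).
xs = [1, 2, 3, 4, 5, 6]
ps = [Fraction(1, 6)] * 6

def moment(n):
    """n-th moment E(X^n) of the discrete distribution (xs, ps)."""
    return sum(x**n * p for x, p in zip(xs, ps))

mean = moment(1)                                               # E(X)
var_definition = sum((x - mean)**2 * p for x, p in zip(xs, ps))  # E((X - E(X))^2)
var_shortcut = moment(2) - mean**2                             # E(X^2) - (E(X))^2
```

Here the mean is 7/2 and both expressions give 35/12.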
Inequalities are often useful to estimate some expected values. Here are a few particularly useful ones.
1 + x ≤ exp{x}
Chebychev (Markov’s research advisor): P(|X| ≥ a) ≤ E(X²)/a² for a > 0.
Markov Inequality: If f(.) is nonnegative, and nondecreasing on [a, ∞) with f(a) > 0, then P(X ≥ a) ≤ {E(f(X))}/f(a).
Jensen: If f(.) is convex, then E(f(X)) ≥ f(E(X)).
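These inequalities can be checked on a small example. The sketch below again assumes a fair six-sided die (a choice made for illustration), takes f(x) = max(x, 0) in the Markov inequality and f(x) = x² in Jensen's inequality, and tests 1 + x ≤ exp{x} at a few arbitrary points.

```python
import math
from fractions import Fraction

# Fair die: values 1..6, each with probability 1/6 (illustrative choice).
xs = [1, 2, 3, 4, 5, 6]
ps = [Fraction(1, 6)] * 6

def expect(h):
    """E(h(X)) for the discrete distribution (xs, ps)."""
    return sum(h(x) * p for x, p in zip(xs, ps))

def tail(a):
    """P(X >= a)."""
    return sum(p for x, p in zip(xs, ps) if x >= a)

mean = expect(lambda x: x)
second_moment = expect(lambda x: x * x)

# One row per threshold a: (a, P(X >= a), Chebychev bound, Markov bound).
rows = []
for a in xs:
    chebychev_bound = second_moment / a**2  # E(X^2)/a^2
    markov_bound = mean / a                 # f(x) = max(x, 0), equal to x on 1..6
    rows.append((a, tail(a), chebychev_bound, markov_bound))

# Jensen with the convex function f(x) = x^2: E(X^2) - (E(X))^2 >= 0.
jensen_gap = second_moment - mean**2

# 1 + x <= exp(x), checked at a few sample points.
exp_ok = all(1 + x <= math.exp(x) for x in [-3.0, -1.0, 0.0, 0.5, 2.0])
```

For a = 5, for instance, P(X ≥ 5) = 1/3 while the two bounds are 91/150 and 7/10: the bounds hold but are far from tight, which is typical.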
Jean Walrand – January 2000 – Updated February 2003