Probstat/notes/random variables
- This is part of probstat.
In many cases, after we perform a random experiment, we are interested in a certain quantity derived from the outcome, not in the outcome itself. In that case, we can define a random variable, which is a function from the sample space to the real numbers, to represent the random quantity that we are interested in.
For example, consider the following experiment. We toss two dice. Let a random variable X be the sum of the values of these two dice. The table below shows the outcomes and probabilities related to X.
i | Outcomes for which X = i | Probability P{ X = i } |
2 | (1,1) | 1/36 |
3 | (1,2), (2,1) | 2/36 |
4 | (1,3), (2,2), (3,1) | 3/36 |
5 | (1,4), (2,3), (3,2), (4,1) | 4/36 |
6 | (1,5), (2,4), (3,3), (4,2), (5,1) | 5/36 |
7 | (1,6), (2,5), (3,4), (4,3), (5,2), (6,1) | 6/36 |
8 | (2,6), (3,5), (4,4), (5,3), (6,2) | 5/36 |
9 | (3,6), (4,5), (5,4), (6,3) | 4/36 |
10 | (4,6), (5,5), (6,4) | 3/36 |
11 | (5,6), (6,5) | 2/36 |
12 | (6,6) | 1/36 |
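The table above can be reproduced by enumerating the sample space. The following is a minimal sketch, assuming fair dice so that all 36 ordered outcomes are equally likely; the names sample_space, X and table are chosen here for illustration only.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (first die, second die).
# Assumption: the dice are fair, so each of the 36 outcomes has probability 1/36.
sample_space = list(product(range(1, 7), repeat=2))

def X(outcome):
    """The random variable: a function from an outcome to a real number."""
    return outcome[0] + outcome[1]

# Group the outcomes by the value that X assigns to them.
outcomes_by_value = defaultdict(list)
for w in sample_space:
    outcomes_by_value[X(w)].append(w)

# P{X = i} is the number of outcomes with X = i divided by 36.
table = {i: Fraction(len(ws), len(sample_space))
         for i, ws in sorted(outcomes_by_value.items())}

for i, p in table.items():
    # Fraction reduces automatically, so 2/36 is shown as 1/18, and so on.
    print(i, outcomes_by_value[i], p)
```

Exact fractions are used instead of floating-point numbers so that the results can be compared directly with the entries of the table.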
A random variable X also induces events related to it. From the previous example, the event that X = 10 corresponds to the subset {(4,6), (5,5), (6,4)} of the sample space. Also, the event X >= 11 corresponds to {(5,6), (6,5), (6,6)}.
Therefore, it is reasonable to consider the probability of events defined by random variables. From the two-dice example, we have P{ X >= 11 } = P({(5,6), (6,5), (6,6)}) = 3/36.
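As a small illustration of this idea, an event defined by a random variable can be treated as the set of outcomes that satisfy a condition. This is only a sketch under the assumption of fair dice; the helper prob is a hypothetical name introduced here, not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Assumption: two fair dice, 36 equally likely ordered outcomes.
sample_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on outcomes."""
    favorable = [w for w in sample_space if event(w)]
    return Fraction(len(favorable), len(sample_space))

# The event "X >= 11", where X is the sum of the two dice.
event_outcomes = [w for w in sample_space if sum(w) >= 11]
p = prob(lambda w: sum(w) >= 11)

print(event_outcomes)  # the subset of the sample space behind the event
print(p)               # equals 3/36, printed in lowest terms
```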
Given a random variable X, the probability mass function p of X is defined as p(i) = P{ X = i }. We usually abbreviate probability mass function as pmf.
Another example
Suppose that we pick two numbers from the set {1,2,3,4} without replacement, so that there are 12 equally likely ordered outcomes. Let Y be the larger of the two numbers. The following table shows the events defined by the various values of Y.
i | Outcomes | Probability P{ Y = i } |
1 | - | 0 |
2 | (1,2), (2,1) | 2/12 = 1/6 |
3 | (1,3), (2,3), (3,1), (3,2) | 4/12 = 1/3 |
4 | (1,4), (2,4), (3,4), (4,1), (4,2), (4,3) | 6/12 = 1/2 |
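The same enumeration approach works for this example. The sketch below assumes that the two numbers are drawn in order and that every ordered pair of distinct numbers is equally likely, which matches the 12 outcomes listed in the table; the name pmf_Y is introduced here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

# Ordered picks of two distinct numbers from {1,2,3,4}: 4 * 3 = 12 outcomes.
# Assumption: each ordered pair is equally likely.
sample_space = list(permutations(range(1, 5), 2))

# Y is the larger of the two picked numbers.
counts = Counter(max(w) for w in sample_space)

# Include i = 1 explicitly: no outcome gives Y = 1, so its probability is 0.
pmf_Y = {i: Fraction(counts[i], len(sample_space)) for i in range(1, 5)}

for i, p in pmf_Y.items():
    print(i, p)
```

Using permutations rather than combinations keeps the outcomes ordered, as in the table; counting unordered pairs instead would give the same probabilities, since each unordered pair corresponds to exactly two ordered ones.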
Expectations
The whole point of having probability models is that we want to say "something" about the experiments without having to perform them or exhaustively try all their possible outcomes. For a given random variable, we would like to have "some number" that represents it on average. From this motivation, we have the definition of the expectation as follows.
For an integer random variable X, the expected value of X (or the expectation of X), denoted by E[X], is defined as