- This is part of probstat
Consider a certain distribution. The mean $\mu$ of the distribution is the expected value of a random variable $X$ sampled from the distribution. I.e.,
$$\mu = E[X].$$
Also recall that the variance of the distribution is
$$\sigma^2 = Var[X] = E[(X - \mu)^2].$$
And finally, the standard deviation is $\sigma = \sqrt{\sigma^2}$.
Sample Statistics
Suppose that you take $n$ samples $X_1, X_2, \ldots, X_n$ independently from this distribution. (Note that $X_1, X_2, \ldots, X_n$ are random variables.)
Sample means
The statistic
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
is called a sample mean. Since $X_1, X_2, \ldots, X_n$ are random variables, the mean $\bar{X}$ is also a random variable.
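We hope that $\bar{X}$ approximates $\mu$ well. We can compute:
$$E[\bar{X}] = E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\cdot n\mu = \mu,$$
and since $X_1, X_2, \ldots, X_n$ are independent, we have that
$$Var[\bar{X}] = Var\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{1}{n^2}\sum_{i=1}^{n} Var[X_i] = \frac{1}{n^2}\cdot n\sigma^2 = \frac{\sigma^2}{n}.$$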
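The two properties of the sample mean can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes a normal population (the formulas hold for any distribution with finite variance), and the parameter values and function names are hypothetical, chosen only for illustration.

```python
import random
import statistics


def sample_mean(xs):
    """Return the sample mean of a list of numbers."""
    return sum(xs) / len(xs)


def simulate_sample_means(mu, sigma, n, trials, seed=0):
    """Draw `trials` independent samples of size n from N(mu, sigma^2)
    and return the list of their sample means."""
    rng = random.Random(seed)
    return [
        sample_mean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(trials)
    ]


# Hypothetical population parameters and sample size.
mu, sigma, n = 10.0, 2.0, 25
means = simulate_sample_means(mu, sigma, n, trials=20000)

# The average of the sample means should be close to mu.
print("average of sample means :", statistics.fmean(means))
# The variance of the sample means should be close to sigma^2 / n.
print("variance of sample means:", statistics.pvariance(means))
print("sigma^2 / n             :", sigma ** 2 / n)
```

With more trials, both simulated quantities should move closer to $\mu$ and $\sigma^2/n$ respectively.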
Sample variances and sample standard deviations
We can also use the sample to estimate $\sigma^2$.
The statistic
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$
is called a sample variance. The sample standard deviation is $S = \sqrt{S^2}$.
Note that the denominator is $n-1$ instead of $n$.
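As an illustration of the $n-1$ denominator, here is a small sketch that computes the sample variance by hand and compares it with Python's standard library. The data values are made up for the example; `statistics.variance` divides by $n-1$ while `statistics.pvariance` divides by $n$.

```python
import math
import statistics


def sample_variance(xs):
    """Sample variance with the n - 1 denominator."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data points")
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


# Hypothetical data, used only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

s2 = sample_variance(data)   # sum of squared deviations is 32, so s2 = 32 / 7
s = math.sqrt(s2)            # sample standard deviation

print("sample variance (n - 1):", s2)
print("statistics.variance    :", statistics.variance(data))
print("divide by n instead    :", statistics.pvariance(data))
print("sample std deviation   :", s)
```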
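We can show that $E[S^2] = \sigma^2$.
We note that since $X_i$ and $X_j$ are independent for $i \neq j$, we have that $E[X_i X_j] = E[X_i]E[X_j] = \mu^2$. Also recall that $E[X_i^2] = Var[X_i] + (E[X_i])^2 = \sigma^2 + \mu^2$.
Expanding the square, for each $i$ we get
$$E[(X_i - \bar{X})^2] = E[X_i^2] - 2E[X_i\bar{X}] + E[\bar{X}^2].$$
Let's deal with the middle term here.
$$E[X_i\bar{X}] = \frac{1}{n}\sum_{j=1}^{n} E[X_i X_j] = \frac{1}{n}\left(E[X_i^2] + (n-1)\mu^2\right) = \frac{1}{n}\left(\sigma^2 + n\mu^2\right) = \mu^2 + \frac{\sigma^2}{n}.$$
Let's work on the third term which ends up being the same as the middle term.
$$E[\bar{X}^2] = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} E[X_i X_j] = \frac{1}{n^2}\left(n(\sigma^2 + \mu^2) + n(n-1)\mu^2\right) = \mu^2 + \frac{\sigma^2}{n}.$$
Let's put everything together:
$$E[S^2] = \frac{1}{n-1}\sum_{i=1}^{n} E[(X_i - \bar{X})^2] = \frac{1}{n-1}\sum_{i=1}^{n}\left[(\sigma^2 + \mu^2) - 2\left(\mu^2 + \frac{\sigma^2}{n}\right) + \left(\mu^2 + \frac{\sigma^2}{n}\right)\right] = \frac{1}{n-1}\cdot n\cdot\frac{(n-1)\sigma^2}{n} = \sigma^2.$$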
Summary
Sample means: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
Sample variance: $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$
Properties of sample means and sample variances
![{\displaystyle E[{\bar {X}}]=\mu }](https://wikimedia.org/api/rest_v1/media/math/render/svg/0b24da826c471c03227c6d06047b99ea9209638c)
![{\displaystyle Var[{\bar {X}}]=\sigma ^{2}/n}](https://wikimedia.org/api/rest_v1/media/math/render/svg/7b4eb60e5404746f2244eb4ec7a4c9d07beb2773)
![{\displaystyle E[S^{2}]=\sigma ^{2}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/7163f09dd5cfeb7bb5f38d475062cc5676f072e5)
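The property $E[S^2] = \sigma^2$ can be illustrated with a short simulation that averages many sample variances. This is only a sketch under assumed conditions: a normal population with hypothetical parameters and a deliberately small sample size, so that the difference between dividing by $n-1$ and dividing by $n$ is visible.

```python
import random


def average_variance_estimates(mu, sigma, n, trials, seed=1):
    """Average two variance estimates over many samples of size n:
    one dividing by n - 1 (the sample variance S^2), one dividing by n."""
    rng = random.Random(seed)
    total_unbiased = 0.0
    total_biased = 0.0
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        ss = sum((x - xbar) ** 2 for x in xs)  # sum of squared deviations
        total_unbiased += ss / (n - 1)
        total_biased += ss / n
    return total_unbiased / trials, total_biased / trials


# Hypothetical parameters: sigma^2 = 9 and a small sample size.
mu, sigma, n = 0.0, 3.0, 5
avg_s2, avg_biased = average_variance_estimates(mu, sigma, n, trials=20000)

print("sigma^2                     :", sigma ** 2)
print("average with n - 1 (S^2)    :", avg_s2)      # close to sigma^2
print("average with n              :", avg_biased)  # close to (n - 1) / n * sigma^2
```

The average of the $n$-denominator estimate settles near $\frac{n-1}{n}\sigma^2$, which is why the sample variance divides by $n-1$.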
Distribution of sample means
While we know basic properties of sample means $\bar{X}$, if we want to perform other statistical calculations (e.g., computing confidence intervals or testing hypotheses), it is very useful to know the exact distribution of $\bar{X}$.
For a general population, it will be hard to deal with the distribution of $\bar{X}$ exactly. However, if the population is normal, we are in very good shape.
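Recall the definition of $\bar{X}$:
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}.$$
Therefore, $\bar{X}$ is a sum of independent normally distributed random variables (each scaled by $1/n$). A nice property of normal random variables is that the sum of independent normally distributed random variables remains a normal random variable. Since a normal random variable is uniquely determined by its mean and variance, we have the following observation.
If the population is normal with mean $\mu$ and variance $\sigma^2$, then the sample mean $\bar{X}$ is normal with mean $\mu$ and variance $\sigma^2/n$.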
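For a normal population with mean $\mu$ and variance $\sigma^2$, the sample mean is normal with mean $\mu$ and variance $\sigma^2/n$. The sketch below compares a tail probability computed from that normal distribution with the fraction observed in a simulation. The parameters and the threshold are hypothetical, and `statistics.NormalDist` from the Python standard library stands in for a normal table.

```python
import math
import random
from statistics import NormalDist

# Hypothetical normal population and sample size.
mu, sigma, n = 50.0, 10.0, 16

# Claimed distribution of the sample mean: N(mu, sigma^2 / n).
xbar_dist = NormalDist(mu, sigma / math.sqrt(n))

# Simulate many sample means and count how often they exceed a threshold.
rng = random.Random(2)
trials = 20000
threshold = 53.0
hits = 0
for _ in range(trials):
    xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
    if xbar > threshold:
        hits += 1

empirical = hits / trials
theoretical = 1.0 - xbar_dist.cdf(threshold)

print("simulated   P(sample mean > threshold):", empirical)
print("theoretical P(sample mean > threshold):", theoretical)
```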
Examples
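Ex1. Suppose that the population has mean $\mu = \ldots$ and variance $\sigma^2 = \ldots$. If you select a sample of size 20, what is the probability that the sample mean $\bar{X}$ is greater than 17?
Solution:
Assuming the population is normal, the sample mean $\bar{X}$ is normal with mean $\mu$ and variance $\sigma^2/20$. Therefore,
$$\frac{\bar{X} - \mu}{\sigma/\sqrt{20}}$$
is unit normal.
Note that
$$P(\bar{X} > 17) = P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{20}} > \frac{17 - \mu}{\sigma/\sqrt{20}}\right).$$
We can look at the standard normal table and find out that $P\left(Z > \frac{17 - \mu}{\sigma/\sqrt{20}}\right) \approx 0.01$, for a unit normal random variable Z. Thus, the probability $P(\bar{X} > 17) \approx 0.01$, which is roughly 1%.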
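The same kind of calculation can be sketched in code, with `statistics.NormalDist` replacing the table lookup. The function name and the input values below are hypothetical; they are not the values of Ex1 and only show the steps: standardize the threshold, then read off the upper tail of the unit normal.

```python
import math
from statistics import NormalDist


def prob_sample_mean_exceeds(mu, sigma2, n, threshold):
    """P(sample mean > threshold) for a normal population with mean mu
    and variance sigma2, using a sample of size n."""
    std_of_mean = math.sqrt(sigma2 / n)   # standard deviation of the sample mean
    z = (threshold - mu) / std_of_mean    # standardized threshold
    return 1.0 - NormalDist().cdf(z)      # upper tail of the unit normal


# Hypothetical inputs (NOT the values of Ex1): here z = (110 - 100) / 5 = 2.
p = prob_sample_mean_exceeds(mu=100.0, sigma2=225.0, n=9, threshold=110.0)
print("P(sample mean > 110):", p)
```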
Ex2.
- To be added...
Why do we use normal distributions?
Normal random variables appear very often in our treatment of statistics. This is not just a coincidence. See limit theorems.