Distribution of Sample Means
- Brian Sterr
In the diagram above, we start with a simple data set: -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5. Then we take a sample of 3 out of the 11 elements. There are 165 different 3-element subsets we could choose for this sample. Once we choose a 3-element subset, we average its values to get the sample mean, shown above and also indicated by the dark blue point on the dot plot at right. The slider runs through all 165 possible samples, plotting each sample mean on the right as well.

We see that the 165 possible sample means appear to be approximately normally distributed, even though our original data set was just 11 consecutive integers, uniformly distributed.

The standard error here would be $\sigma/\sqrt{n}$, where $\sigma$ is the standard deviation of the data set and $n$ is the size of the sample. Treating the 11 values as the whole population gives $\sigma = \sqrt{10} \approx 3.16$, so with $n = 3$ this yields $\sqrt{10}/\sqrt{3} \approx 1.83$. In theory, the standard error should equal the standard deviation of the sample means, which is shown in the diagram above, but the two values differ. The reason for the difference is that the formula used for standard error assumes the population is essentially infinite, rather than having just 11 values as we have here. There is a formula to correct it for finite populations, but that is beyond the scope of our course.
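
For readers who want to check these numbers without the slider, here is a minimal Python sketch that enumerates every 3-element sample and compares the standard error formula with the actual standard deviation of the sample means. It is not part of the original diagram. It assumes the 11 values are treated as the whole population (so the population standard deviation is used throughout), and the variable names are our own. The last step applies the usual finite population correction factor, the square root of (N - n)/(N - 1), only to show that it accounts for the gap.

```python
from itertools import combinations
from math import sqrt
from statistics import pstdev

# The data set from the text, treated here as the entire population (assumption).
population = list(range(-5, 6))   # -5, -4, ..., 4, 5
N = len(population)               # population size: 11
n = 3                             # sample size

# Mean of every possible 3-element subset (sampling without replacement).
sample_means = [sum(s) / n for s in combinations(population, n)]

# Standard error formula, which assumes an essentially infinite population.
sigma = pstdev(population)
se_formula = sigma / sqrt(n)

# Actual standard deviation of all possible sample means.
sd_of_means = pstdev(sample_means)

# Standard error with the finite population correction factor applied.
se_corrected = se_formula * sqrt((N - n) / (N - 1))

print(len(sample_means))        # 165
print(round(se_formula, 3))     # 1.826
print(round(sd_of_means, 3))    # 1.633
print(round(se_corrected, 3))   # 1.633
```

Under these assumptions the uncorrected formula overstates the spread of the sample means, and the corrected value agrees with the standard deviation computed directly from all 165 means.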