# ZvsT

- Author: William C. Evans
- Topic: Statistics

This is a simple plot illustrating the relation between the standard Normal PDF (zero mean, unit variance), which is the distribution of the Z-statistic, and the "Student's" T PDF, which is the distribution of the T-statistic, for various degrees of freedom (the sample size minus one).

It is often stated that we switch from the T to the Z at a sample size of 30. This is incorrect. The determining factor for T vs. Z is whether or not the standard deviation of the data is estimated from the same dataset used for estimating the mean. It either is, or it isn't. If it is, use T, regardless of sample size. If it is not, that is, if the standard deviation (or variance) is considered "known" and is NOT estimated from the dataset, then use Z. That's all there is to it.

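
The rule above can be sketched in a few lines of Python. This is only an illustration: the function name `standardized_mean`, its parameters (`mu0` for a hypothesized mean, `known_sigma` for an externally supplied standard deviation), and the example numbers are all hypothetical and do not come from the original graphic. Notice that the branch depends only on where the standard deviation comes from, never on the sample size.

```python
import math
import statistics


def standardized_mean(data, mu0, known_sigma=None):
    """Standardize the sample mean against mu0 (hypothetical helper).

    Returns (statistic, reference distribution, degrees of freedom).
    The choice between T and Z depends only on whether the standard
    deviation is estimated from `data` or supplied as "known".
    """
    n = len(data)
    xbar = statistics.fmean(data)
    if known_sigma is None:
        # Standard deviation estimated from the same dataset: use T.
        s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
        return (xbar - mu0) / (s / math.sqrt(n)), "T", n - 1
    # Standard deviation considered "known": use Z, whatever n is.
    return (xbar - mu0) / (known_sigma / math.sqrt(n)), "Z", None


# Made-up measurements, for illustration only.
sample = [9.8, 10.4, 10.1, 9.6, 10.3, 10.0]
print(standardized_mean(sample, mu0=10.0))                   # sigma estimated
print(standardized_mean(sample, mu0=10.0, known_sigma=0.3))  # sigma "known"
```

With a larger sample the first call would still report T; only the degrees of freedom would change.
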
What happened in olden times, before computers, calculators, etc., was that the T-table in the back of your stat textbook only went up to about 30 degrees of freedom or so, after which you used the Z-table. Why? Because, as this graphic shows, the numerical difference becomes negligibly small beyond that point. Why print two tables with essentially the same numbers in them? So, teachers got used to instructing their students to "switch from T to Z" at sample sizes of roughly 30. This is no longer necessary, and hasn't been for many years now.

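
To put rough numbers on "negligibly small," here is a minimal sketch that evaluates both PDFs on a grid and reports the largest gap between them for a few degrees of freedom. It uses only the Python standard library; the function names, the grid limits of -5 to 5, and the chosen degrees of freedom are assumptions made for this example, not part of the original graphic.

```python
import math


def normal_pdf(x):
    """Standard Normal (Z) density: zero mean, unit variance."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def t_pdf(x, df):
    """Student's T density with `df` degrees of freedom.

    The normalizing constant is computed on the log scale with
    math.lgamma so that large df values do not overflow.
    """
    log_const = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2.0 * math.log1p(x * x / df))


def max_pdf_gap(df, lo=-5.0, hi=5.0, steps=1000):
    """Largest absolute difference between the T and Z densities on a grid."""
    grid = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return max(abs(t_pdf(x, df) - normal_pdf(x)) for x in grid)


for df in (1, 2, 5, 10, 30, 100):
    print(f"df = {df:3d}   max |T - Z| = {max_pdf_gap(df):.4f}")
```

The gap shrinks steadily as the degrees of freedom grow, which is why a printed T-table past about 30 rows added little. Note that this compares densities; tail quantiles (the numbers actually printed in the tables) differ by somewhat more in relative terms.
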
It can be shown that the formal mathematical expression for the T PDF does indeed approach the form for Z as the sample size approaches infinity (via the calculus process of "taking a limit"). Intuitively, what is happening is that our estimate of the standard deviation gets better and better as the sample size increases; it is pretty poor at small sample sizes, and the T distribution accounts for this. At some point of increasing sample size, we might consider the standard deviation (or, equivalently here, the variance) to be "known." If that is the case, then in effect we are using Z, but only because the T looks almost exactly like Z at larger sample sizes. You can clearly see that in this graphic.
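
For reference, the limit can be written out as follows. The symbols are introduced here for illustration and do not appear in the original text: $\nu$ is the degrees of freedom, $t$ is the value of the statistic, and $\Gamma$ is the gamma function.

$$
f_\nu(t) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}
$$

As $\nu \to \infty$, the two factors converge separately:

$$
\left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}} \to e^{-t^2/2},
\qquad
\frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)} \to \frac{1}{\sqrt{2\pi}},
$$

so that

$$
\lim_{\nu \to \infty} f_\nu(t) = \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2},
$$

which is the standard Normal PDF.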

One more time: use T if the population standard deviation (variance) is estimated from the same dataset used to estimate the mean. Use Z if the population standard deviation (variance) is considered "known." **Sample size has nothing to do with it!**

In the graphic, select the degrees of freedom slider with the mouse, then use the arrow keys to change the degrees of freedom; it will be easier to see the changes in the graphs.