8.3: Hypothesis Tests for the Mean (sigma unknown)
Tests About μ When σ is Unknown—The t-test for the Population Mean
As we mentioned earlier, only in a few cases is it reasonable to assume that the population standard deviation, σ, is known. The case where σ is unknown is much more common in practice. What can we use to replace σ? If you don’t know the population standard deviation, the best you can do is find the sample standard deviation, s, and use it instead of σ. (Note that this is exactly what we did when we discussed confidence intervals.)
Is that it? Can we just use S instead of σ, and the rest is the same as the previous case? Unfortunately, it’s not that simple, but not very complicated either.
We will first go through the four steps of the t-test for the population mean and explain in what way this test is different from the z-test in the previous case. For comparison purposes, we will then apply the t-test to a variation of the two examples we used in the previous case, and end with an activity where you’ll get to carry out the t-test yourself.
Let’s start by describing the four steps for the t-test:
I. Stating the hypotheses.
In this step there are no changes:
* The null hypothesis has the form:
$H_0: \mu = \mu_0$
(where μ0 is the null value).
* The alternative hypothesis takes one of the following three forms (depending on the context):
$H_a: \mu < \mu_0$ (one-sided)
$H_a: \mu > \mu_0$ (one-sided)
$H_a: \mu \neq \mu_0$ (two-sided)
II. Checking the conditions under which the t-test can be safely used and summarizing the data.
Technically, this step only changes slightly compared to what we do in the z-test. However, as you’ll see, this small change has important implications. The conditions under which the t-test can be safely carried out are exactly the same as those for the z-test:
(i) The sample is random (or at least can be considered random in context).
(ii) We are in one of the three situations marked with a green check mark in the following table (which ensure that $\bar{X}$ is at least approximately normal):
Assuming that the conditions are met, we calculate the sample mean $\bar{x}$ and the sample standard deviation, s (which replaces σ), and summarize the data with a test statistic. As in the z-test, our test statistic will be the standardized score of $\bar{x}$, assuming that $\mu = \mu_0$ ($H_0$ is true). The difference here is that we don’t know σ, so we use s instead. The test statistic for the t-test for the population mean is therefore:
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
The change is in the denominator: while in the z-test we divided by the standard deviation of $\bar{X}$, namely $\sigma/\sqrt{n}$, here we divide by the standard error of $\bar{X}$, namely $s/\sqrt{n}$. Does this have an effect on the rest of the test? Yes. The t-test statistic in the test for the mean does not follow a standard normal distribution. Rather, it follows another bell-shaped distribution called the t distribution. So we first need to introduce you to this new distribution as a general object. Then, we’ll come back to our discussion of the t-test for the mean and how the t distribution arises in that context.
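To make the calculation concrete, here is a minimal sketch in Python of how the test statistic is computed from summary statistics (the function name and the numbers passed to it are illustrative, not from any particular software package):

```python
import math

def t_statistic(xbar, mu0, s, n):
    """Standardized score of the sample mean, using s in place of sigma."""
    standard_error = s / math.sqrt(n)   # estimated standard error of the sample mean
    return (xbar - mu0) / standard_error

# Illustrative numbers: sample mean 550, null value 500, s = 100, n = 4
print(t_statistic(xbar=550, mu0=500, s=100, n=4))  # prints 1.0
```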
The t Distribution
We have seen that variables can be visually modeled by many different sorts of shapes, and we call these shapes distributions. Several distributions arise so frequently that they have been given special names, and they have been studied mathematically. So far in the course, the only one we’ve named is the normal distribution, but there are others. One of them is called the t distribution.
The t distribution is another bell-shaped (unimodal and symmetric) distribution, like the normal distribution; and the center of the t distribution is standardized at zero, like the center of the normal distribution.
Like all distributions that are used as probability models, the normal and the t distribution are both scaled, so the total area under each of them is 1.
So how is the t distribution fundamentally different from the normal distribution?
The spread.
The following picture illustrates the fundamental difference between the normal distribution and the t distribution:
You can see in the picture that the t distribution has slightly less area near the expected central value than the normal distribution does, and you can see that the t distribution has correspondingly more area in the “tails” than the normal distribution does. (It’s often said that the t distribution has “fatter tails” or “heavier tails” than the normal distribution.)
This reflects the fact that the t distribution has a larger spread than the normal distribution. The same total area of 1 is spread out over a slightly wider range on the t distribution, making it a bit lower near the center compared to the normal distribution, and giving the t distribution slightly more probability in the ‘tails’ compared to the normal distribution.
Therefore, the t distribution ends up being the appropriate model in certain cases where there is more variability than would be predicted by the normal distribution. One of these cases is stock values, which have more variability (or “volatility,” to use the economic term) than would be predicted by the normal distribution.
There’s actually an entire family of t distributions. They all have similar formulas (but the math is beyond the scope of this introductory course in statistics), and they all have slightly “fatter tails” than the normal distribution. But some are closer to normal than others. The t distributions that are closer to normal are said to have higher “degrees of freedom” (that’s a mathematical concept that we won’t use in this course, beyond merely mentioning it here). So, there’s a t distribution “with one degree of freedom,” another t distribution “with 2 degrees of freedom” which is slightly closer to normal, another t distribution “with 3 degrees of freedom,” which is a bit closer to normal than the previous ones, and so on.
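You can check this convergence numerically. Here is a minimal sketch (in Python, assuming scipy is available) that compares the right-tail area beyond 2 under several t distributions and under the standard normal; the tail area shrinks toward the normal value as the degrees of freedom grow:

```python
from scipy.stats import t, norm

# P(T > 2) under t distributions with increasing degrees of freedom
for df in [1, 2, 5, 10, 30, 100]:
    print(f"t with {df:>3} d.f.: P(T > 2) = {t.sf(2, df):.4f}")

# The normal tail area is the limiting value
print(f"standard normal:  P(Z > 2) = {norm.sf(2):.4f}")
```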
The following picture illustrates this idea with just a couple of t distributions (note that “degrees of freedom” is abbreviated “d.f.” on the picture):
Recall that we were discussing the situation of testing for a mean in the case when σ is unknown. We’ve seen previously that when σ is known, the test statistic is $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (note the σ in the formula), which follows a standard normal distribution. But when σ is unknown, the test statistic becomes $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ (note the s in the formula, in place of the unknown σ). This is where the t distribution arises in the context of a test for a mean: with s in place of σ, the statistic follows a t distribution.
Notice the only difference between the formula for the Z statistic and the formula for the t statistic: In the formula for the Z statistic, sigma (the standard deviation of the population) must be known; whereas, when sigma isn’t known, then “s” (the standard deviation of the sample data) is used in place of the unknown sigma. That’s the change that causes the statistic to be a t statistic.
Why would this single change (using “s” in place of “sigma”) result in a sampling distribution that is the t distribution instead of the standard normal (Z) distribution? Remember that the t distribution is more appropriate in cases where there is more variability. So why is there more variability when s is used in place of the unknown sigma?
Well, remember that sigma (σ) is a parameter (it’s the standard deviation of the population), whose value therefore never changes. Whereas, s (the standard deviation of the sample data) varies from sample to sample, and therefore it’s another source of variation. So, using s in place of sigma causes the sampling distribution to be the t distribution because of that extra source of variation:
In the formula $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, the only source of variation is the sampling variability of the sample mean $\bar{X}$ (none of the other terms in that formula vary randomly in a given study);
Whereas in the formula $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, there are two sources of variation: one is the sampling variability of the sample mean $\bar{X}$; the other is the sampling variability of the sample standard deviation s.
So, in a test for a mean, if sigma isn’t known, then s is used in place of the unknown sigma and that results in the test statistic being a t score.
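A quick simulation can make this extra source of variation concrete. The sketch below (assuming numpy and scipy are available; the population values are hypothetical) draws many small samples from a normal population, computes the t statistic for each, and compares how often |t| exceeds 2 with what the standard normal and the t(n − 1) distributions predict:

```python
import numpy as np
from scipy.stats import norm, t

rng = np.random.default_rng(0)
n, mu, sigma = 5, 250, 12            # hypothetical population; small n on purpose
samples = rng.normal(mu, sigma, size=(100_000, n))

xbar = samples.mean(axis=1)
s = samples.std(axis=1, ddof=1)      # s varies from sample to sample
t_stats = (xbar - mu) / (s / np.sqrt(n))

print("observed  P(|T| > 2):", np.mean(np.abs(t_stats) > 2))
print("normal    prediction:", 2 * norm.sf(2))
print("t(n - 1)  prediction:", 2 * t.sf(2, n - 1))
```

The observed proportion lands near the t(n − 1) prediction, well above what the standard normal predicts, which is exactly the “fatter tails” effect described above.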
The t score, in the context of a test for a mean, is summarized by the following figure:
In fact, the t score that arises in the context of a test for a mean is a t score with (n − 1) degrees of freedom. Recall that each t distribution is indexed according to “degrees of freedom.” Notice that, in the context of a test for a mean, the degrees of freedom depend on the sample size in the study. Remember that we said that higher degrees of freedom indicate that the t distribution is closer to normal. So in the context of a test for the mean, the larger the sample size, the higher the degrees of freedom, and the closer the t distribution is to a normal z distribution. This is summarized with the notation near the bottom of the following image:
As a result, in the context of a test for a mean, the effect of the t distribution is most important for a study with a relatively small sample size.
We are now done introducing the t distribution. What are the implications of all of this?
1. The null distribution of our t-test statistic $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ is the t distribution with (n − 1) d.f. In other words, when $H_0$ is true (i.e., when $\mu = \mu_0$), our test statistic has a t distribution with (n − 1) d.f., and this is the distribution under which we find p-values.
2. For a large sample size (n), the null distribution of the test statistic is approximately Z, so whether we use t(n − 1) or Z to calculate the p-values should not make a big difference. Here is another practical way to look at this point: if we have a large n, our sample has more information about the population. Therefore, we can expect the sample standard deviation s to be close enough to the population standard deviation, σ, that for practical purposes we can use s as the known σ, and we’re back to the z-test.
3. Finding the p-value
The p-value of the t-test is found exactly the same way as it is found for the z-test, except that the t distribution is used instead of the Z distribution, as the figures below illustrate.
Comment:
Even though tables exist for the different t distributions, we will only use software to do the calculation for us.
Comment:
Note that due to the symmetry of the t distribution, for a given value of the test statistic t, the p-value for the two-sided test is twice as large as the p-value of the corresponding one-sided test (the one whose tail the test statistic falls in). The same relationship holds when p-values are calculated under the Z distribution.
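In software, this symmetry is just a factor of two. Here is a minimal sketch (scipy assumed; the test statistic and degrees of freedom below are hypothetical):

```python
from scipy.stats import t

t_stat, df = 1.8, 24                 # hypothetical test statistic and degrees of freedom

p_right = t.sf(t_stat, df)           # Ha: mu > mu0  (right tail)
p_left = t.cdf(t_stat, df)           # Ha: mu < mu0  (left tail)
p_two = 2 * t.sf(abs(t_stat), df)    # Ha: mu != mu0 (both tails)

print(p_right, p_left, p_two)        # p_two is twice the smaller one-sided p-value
```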
4. Drawing Conclusions
As usual, based on the p-value (and some significance level of choice) we assess the significance of results, and draw our conclusions in context.
To summarize:
The main difference between the z-test and the t-test for the population mean is that we use the sample standard deviation s instead of the unknown population standard deviation σ. As a result, the p-values are calculated under the t distribution instead of under the Z distribution. Since we are using software, this doesn’t really impact us practically. However, it is important to understand what is going on behind the scenes, and not just use the software mechanically. This is why we went through the trouble of explaining the t distribution.
We are now ready to look at two examples.
For comparison purposes, we will use a modified version of the two problems we used in the previous case. We’ll first introduce the modified versions and explain the changes.
Example 1
The SAT is constructed so that scores have a national average of 500. The distribution is close to normal. The dean of students of Ross College suspects that in recent years the college attracts students who are more quantitatively inclined. A random sample of 4 students entering Ross college had an average math SAT (SAT-M) score of 550, and a sample standard deviation of 100. Does this provide enough evidence for the dean to conclude that the mean SAT-M of all Ross College students is higher than the national mean of 500?
Here is a figure that represents this example where the changes are marked in blue:
Note that the problem was changed so that the population standard deviation (which was assumed to be 100 before) is now unknown, and instead we assume that the sample of 4 students produced a sample mean of 550 (no change) and a sample standard deviation of s=100. (Sample standard deviations are never such nice rounded numbers, but for the sake of comparison we left it as 100.) Note that due to the changes, the z-test for the population mean is no longer appropriate, and we need to use the t-test.
Example 2
A certain prescription medicine is supposed to contain an average of 250 parts per million (ppm) of a certain chemical. If the concentration is higher than this, the drug may cause harmful side effects; if it is lower, the drug may be ineffective. The manufacturer runs a check to see if the mean concentration in a large shipment conforms to the target level of 250 ppm or not. A simple random sample of 100 portions is tested, and the sample mean concentration is found to be 247 ppm with a sample standard deviation of 12 ppm. Again, here is a figure that represents this example where the changes are marked in blue:
The changes are similar to example 1: we no longer assume that the population standard deviation is known, and instead use the sample standard deviation of 12. Again, the problem was thus changed from a z-test problem to a t-test problem.
However, as we mentioned earlier, due to the large sample size (n = 100) there should not be much difference whether we use the z-test or the t-test. The sample standard deviation, s, is expected to be close enough to the population standard deviation, σ. We’ll see this as we solve the problem.
Let’s carry out the t-test for both of these problems:
Example 1:
1. There are no changes in the hypotheses being tested:
2. The conditions that allow us to use the t-test are met since:
(i) The sample is random.
(ii) SAT-M is known to vary normally in the population (which is crucial here, since the sample size is only 4).
In other words, we are in the following situation:
The test statistic is $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{550 - 500}{100/\sqrt{4}} = 1$
The data (represented by the sample mean) are 1 standard error above the null value.
3. Finding the p-value.
Recall that in general the p-value is calculated under the null distribution of the test statistic, which, in the t-test case, is t(n − 1). In our case, in which n = 4, the p-value is calculated under the t(3) distribution:
Using statistical software, we find that the p-value is 0.196. For comparison purposes, the p-value that we got when we carried out the z-test for this problem (when we assumed that 100 was the known σ rather than the calculated sample standard deviation, s) was 0.159.
It is not surprising that the p-value of the t-test is larger, since the t distribution has fatter tails. Even though in this particular case the difference between the two values does not have practical implications (since both are large and will lead to the same conclusion), the difference is not trivial.
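As an illustration of what the software is doing, here is a minimal sketch in Python (using scipy, one of many packages that could be used) that reproduces both p-values from the summary statistics:

```python
from scipy.stats import t, norm

xbar, mu0, s, n = 550, 500, 100, 4
t_stat = (xbar - mu0) / (s / n**0.5)   # = 1.0

p_t = t.sf(t_stat, df=n - 1)           # one-sided p-value under t(3): about 0.196
p_z = norm.sf(t_stat)                  # z-test comparison: about 0.159
print(t_stat, round(p_t, 3), round(p_z, 3))
```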
4. Making conclusions.
The p-value (0.196) is large, indicating that the results are not significant. The data do not provide enough evidence to conclude that the mean SAT-M among Ross College students is higher than the national mean (500).
Here is a summary:
Example 2:
1. There are no changes in the hypotheses being tested:
2. The conditions that allow us to use the t-test are met:
(i) The sample is random
(ii) The sample size is large enough for the Central Limit Theorem to apply and ensure the normality of $\bar{X}$. In other words, we are in the following situation:
The test statistic is: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{247 - 250}{12/\sqrt{100}} = -2.5$
The data (represented by the sample mean) are 2.5 standard errors below the null value.
3. Finding the p-value.
To find the p-value we use statistical software, and we calculate a p-value of 0.014 with a 95% confidence interval of (244.619, 249.381). For comparison purposes, the output we got when we carried out the z-test for the same problem was a p-value of 0.012 with a 95% confidence interval of (244.648, 249.352).
Note that here the difference between the p-values is quite negligible (0.002). This is not surprising, since the sample size is quite large (n = 100), in which case, as we mentioned, the z-test (in which we treat s as the known σ) is a very good approximation to the t-test. Note also how similar the two 95% confidence intervals are (for the same reason).
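Again as an illustration, here is a minimal scipy sketch that reproduces the t-test p-value and the 95% confidence interval from the summary statistics:

```python
from scipy.stats import t

xbar, mu0, s, n = 247, 250, 12, 100
se = s / n**0.5
t_stat = (xbar - mu0) / se                 # = -2.5

p_two = 2 * t.sf(abs(t_stat), df=n - 1)    # two-sided p-value: about 0.014

# 95% confidence interval for mu: xbar +/- t* times the standard error
t_crit = t.ppf(0.975, df=n - 1)
print(p_two, (xbar - t_crit * se, xbar + t_crit * se))  # about (244.619, 249.381)
```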
4. Conclusions:
The p-value is small (0.014), indicating that at the 5% significance level the results are significant. The data therefore provide evidence to conclude that the mean concentration in the entire shipment is not the required 250 ppm.
Here is a summary:
Comments
- The 95% confidence interval for μ can be used here in the same way it is used when σ is known: either as a way to conduct the two-sided test (checking whether the null value falls inside or outside the confidence interval), or following a t-test in which $H_0$ was rejected (in order to get insight into the value of μ).
- While it is true that when σ is unknown and for large sample sizes the z-test is a good approximation for the t-test, since we are using software to carry out the t-test anyway, there is not much gain in using the z-test as an approximation instead. We might as well use the more exact t-test regardless of the sample size.
However, it is always worthwhile knowing what happens behind the scenes.
To Summarize
1. In hypothesis testing for the population mean (μ), we distinguish between two cases:
I. The less common case when the population standard deviation (σ) is known.
II. The more practical case when the population standard deviation is unknown and the sample standard deviation (s) is used instead.
2. In the case when σ is known, the test for μ is called the z-test, and in case when σ is unknown and s is used instead, the test is called the t-test.
3. In both cases, the null hypothesis is: $H_0: \mu = \mu_0$
and the alternative, depending on the context, is one of the following:
$H_a: \mu < \mu_0$, or $H_a: \mu > \mu_0$, or $H_a: \mu \neq \mu_0$
4. Both tests can be safely used as long as the following two conditions are met:
(i) The sample is random (or can at least be considered random in context).
(ii) Either the sample size is large (n > 30) or, if not, the variable of interest can be assumed to vary normally in the population.
5. In the z-test, the test statistic is:
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
whose null distribution is the standard normal distribution (under which the p-values are calculated).
6. In the t-test, the test statistic is:
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
whose null distribution is t(n – 1) (under which the p-values are calculated).
7. For large sample sizes, the z-test is a good approximation for the t-test.
8. Confidence intervals can be used to carry out the two-sided test $H_a: \mu \neq \mu_0$, and in cases where $H_0$ is rejected, the confidence interval can give insight into the value of the population mean (μ).
9. Here is a summary of which test to use under which conditions:
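Finally, to tie the summary together, here is a hedged, self-contained sketch of the t-test for a population mean computed from summary statistics (the function name and interface are our own, for illustration only):

```python
from scipy.stats import t

def t_test_mean(xbar, mu0, s, n, alternative="two-sided"):
    """One-sample t-test for a population mean from summary statistics.

    alternative: "less", "greater", or "two-sided" (matching Ha).
    Returns the t statistic and the p-value under t(n - 1).
    """
    t_stat = (xbar - mu0) / (s / n**0.5)
    df = n - 1
    if alternative == "less":
        p = t.cdf(t_stat, df)       # left-tail p-value
    elif alternative == "greater":
        p = t.sf(t_stat, df)        # right-tail p-value
    else:
        p = 2 * t.sf(abs(t_stat), df)  # two-sided p-value
    return t_stat, p

print(t_test_mean(550, 500, 100, 4, alternative="greater"))  # Example 1: (1.0, ~0.196)
print(t_test_mean(247, 250, 12, 100))                        # Example 2: (-2.5, ~0.014)
```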