{"id":555,"date":"2024-10-18T02:38:30","date_gmt":"2024-10-18T02:38:30","guid":{"rendered":"https:\/\/pressbooks.ccconline.org\/mat1260\/?post_type=chapter&#038;p=555"},"modified":"2024-12-19T16:49:59","modified_gmt":"2024-12-19T16:49:59","slug":"8-1-introduction-to-hypothesis-testing","status":"publish","type":"chapter","link":"https:\/\/pressbooks.ccconline.org\/mat1260\/chapter\/8-1-introduction-to-hypothesis-testing\/","title":{"raw":"8.1: Introduction to Hypothesis Testing","rendered":"8.1: Introduction to Hypothesis Testing"},"content":{"raw":"<section class=\"standard post-344 chapter type-chapter status-publish hentry focusable\" data-type=\"chapter\">\r\n<div id=\"lobjh\" class=\"\">\r\n<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<h2 class=\"textbox__title\">Learning Objectives<\/h2>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ul>\r\n \t<li id=\"explain_hypothesis_testing\">Explain the logic behind and the process of hypothesis testing. In particular, explain what the p-value is and how it is used to draw conclusions.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<\/div>\r\nThe purpose of this section is to gradually build your understanding of how statistical hypothesis testing works. We start by explaining the general logic behind the process of hypothesis testing. 
Once we are confident that you understand this logic, we will add some more details and terminology.\r\n\r\n<\/div>\r\n<div id=\"N10AF4\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">General Idea and Logic of Hypothesis Testing<\/span><\/h2>\r\n<p id=\"N10AFB\">To start our discussion about the idea behind statistical hypothesis testing, consider the following example:<\/p>\r\n\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10B00\">A case of suspected cheating on an exam is brought in front of the disciplinary committee at a certain university.<\/p>\r\n<p id=\"N10B03\">There are\u00a0<em>two<\/em>\u00a0opposing\u00a0<em>claims<\/em>\u00a0in this case:<\/p>\r\n\r\n<ul class=\"none\">\r\n \t<li>\r\n<p id=\"N10B10\">The\u00a0<em>student\u2019s claim:<\/em>\u00a0I did not cheat on the exam.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10B17\">The\u00a0<em>instructor\u2019s claim:<\/em>\u00a0The student did cheat on the exam.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10B1D\">Adhering to the principle\u00a0<em>\u201cinnocent until proven guilty,\u201d<\/em>\u00a0the committee asks the instructor for\u00a0<em>evidence<\/em>\u00a0to support his claim. 
The instructor explains that the exam had two versions, and shows the committee members that on three separate exam questions, the student\u2019s solutions used numbers that were given in the other version of the exam.<\/p>\r\n<p id=\"N10B26\">The committee members all agree that\u00a0<em>it would be extremely unlikely to get evidence like that if the student\u2019s claim of not cheating had been true.<\/em>\u00a0In other words, the committee members all agree that the instructor brought forward strong enough evidence to reject the student\u2019s claim, and conclude that the student did cheat on the exam.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\nWhat does this example have to do with statistics?\r\n\r\n<\/div>\r\n<\/div>\r\n<p id=\"N10B30\">While it is true that this story seems unrelated to statistics, it captures all the elements of hypothesis testing and the logic behind it. Before you read on to understand why, it would be useful to read the example again. Please do so now.<\/p>\r\n<p id=\"N10B33\"><em>Statistical hypothesis testing<\/em>\u00a0is defined as:<\/p>\r\n<p id=\"N10B38\"><em>Assessing evidence provided by the data in favor of or against some claim about the population.<\/em><\/p>\r\n<p id=\"N10B3E\">Here is how the process of statistical hypothesis testing works:<\/p>\r\n\r\n<ol>\r\n \t<li>We have\u00a0<em>two claims<\/em>\u00a0about what is going on in the population. Let\u2019s call them for now\u00a0<em>claim 1<\/em>\u00a0and\u00a0<em>claim 2<\/em>. 
Much like the story above, where the student\u2019s claim is challenged by the instructor\u2019s claim, claim 1 is challenged by claim 2.\r\n<p id=\"N10B4D\">(<em>Comment:<\/em>\u00a0as you\u2019ll see in the examples that follow, these claims are usually about the value of population parameter(s) or about the existence or nonexistence of a relationship between two variables in the population).<\/p>\r\n<\/li>\r\n \t<li>We choose a sample, collect relevant data and summarize them (this is similar to the instructor collecting evidence from the student\u2019s exam).<\/li>\r\n \t<li>We figure out how likely it is to observe data like the data we got, had claim 1 been true. (Note that the wording \u201chow likely \u2026\u201d implies that this step requires some kind of probability calculation). In the story, the committee members assessed how likely it is to observe the evidence like that which the instructor provided, had the student\u2019s claim of not cheating been true.<\/li>\r\n \t<li>Based on what we found in the previous step, we make our decision:\r\n<ul>\r\n \t<li>If we find that if claim 1 were true it would be extremely unlikely to observe the data that we observed, then we have strong evidence against claim 1, and we reject it in favor of claim 2.<\/li>\r\n \t<li>If we find that if claim 1 were true observing the data that we observed is not very unlikely, then we do not have enough evidence against claim 1, and therefore we cannot reject it in favor of claim 2.<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ol>\r\n<p id=\"N10B64\">In our story, the committee decided that it would be extremely unlikely to find the evidence that the instructor provided had the student\u2019s claim of not cheating been true. In other words, the members felt that it is extremely unlikely that it is just a coincidence that the student used the numbers from the other version of the exam on three separate problems. 
The committee members therefore decided to reject the student\u2019s claim and concluded that the student had, indeed, cheated on the exam. (Wouldn\u2019t you conclude the same?)<\/p>\r\n<p id=\"N10B67\">Hopefully this example helped you understand the logic behind hypothesis testing. To strengthen your understanding of the process of hypothesis testing and the logic behind it, let\u2019s look at three statistical examples.<\/p>\r\n\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example 1<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<div>\r\n<p id=\"N10B6F\">A recent study estimated that 20% of all college students in the United States smoke. The head of Health Services at Goodheart University suspects that the proportion of smokers may be lower there. In hopes of confirming her claim, the head of Health Services chooses a random sample of 400 Goodheart students, and finds that 70 of them are smokers.<\/p>\r\n<p id=\"N10B72\">Let\u2019s analyze this example using the 4 steps outlined above:<\/p>\r\n\r\n<ol>\r\n \t<li>\r\n<p id=\"N10B79\"><em>Stating the claims:<\/em><\/p>\r\n<p id=\"N10B7F\">There are two claims here:<\/p>\r\n\r\n<ul class=\"none\">\r\n \t<li>\r\n<p id=\"N10B86\"><em>claim 1:<\/em>\u00a0The proportion of smokers at Goodheart is .20.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10B8D\"><em>claim 2:<\/em>\u00a0The proportion of smokers at Goodheart is less than .20.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10B93\">Claim 1 basically says \u201cnothing special goes on in Goodheart University; the proportion of smokers there is no different from the proportion in the entire country.\u201d This claim is challenged by the head of Health Services, who suspects that the proportion of smokers at Goodheart is lower.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10B97\"><em>Choosing a sample and collecting data:<\/em><\/p>\r\n<p 
id=\"N10B9B\">A sample of n = 400 was chosen, and summarizing the data revealed that the sample proportion of smokers is\u00a0[latex]\\hat{p}=\\frac{70}{400}=.175[\/latex].<\/p>\r\n<p id=\"N10BC4\">While it is true that .175 is less than .20, it is not clear whether this is strong enough evidence against claim 1.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10BC8\"><em>Assessment of evidence:<\/em><\/p>\r\n<p id=\"N10BCD\">In order to assess whether the data provide strong enough evidence against claim 1, we need to ask ourselves: How surprising is it to get a sample proportion as low as\u00a0[latex]\\hat{p}=.175[\/latex]\u00a0(or lower), assuming claim 1 is true?<\/p>\r\n<p id=\"N10BEA\">In other words, we need to find how likely it is that in a random sample of size n = 400 taken from a population where the proportion of smokers is p = .20 we\u2019ll get a sample proportion as low as\u00a0[latex]\\hat{p}=.175[\/latex]\u00a0(or lower).<\/p>\r\n<p id=\"N10C07\">It turns out that the probability that we\u2019ll get a sample proportion as low as\u00a0[latex]\\hat{p}=.175[\/latex]\u00a0(or lower) in such a sample is roughly .106 (do not worry about how this was calculated at this point).<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10C25\"><em>Conclusion:<\/em><\/p>\r\n<p 
id=\"N10C2A\">Well, we found that if claim 1 were true, there would be a probability of .106 of observing data like those we observed.<\/p>\r\n<p id=\"N10C2D\">Now you have to decide \u2026<\/p>\r\n<p id=\"N10C30\">Do you think that a probability of .106 makes our data rare enough (surprising enough) under claim 1 so that the fact that we\u00a0<em>did<\/em>\u00a0observe them is enough evidence to reject claim 1?<\/p>\r\n<p id=\"N10C36\">Or do you feel that a probability of .106 means that data like those we observed are not very likely when claim 1 is true, but not unlikely enough for such data to be sufficient evidence to reject claim 1?<\/p>\r\n<p id=\"N10C39\">Basically, this is your decision. However, it would be nice to have some kind of guideline about what is generally considered surprising enough.<\/p>\r\n<\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example 2<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<div>\r\n<p id=\"N10B0E\">A certain prescription allergy medicine is supposed to contain an average of 245 parts per million (ppm) of a certain chemical. If the concentration is higher than 245 ppm, the drug will likely cause unpleasant side effects, and if the concentration is below 245 ppm, the drug may be ineffective. The manufacturer wants to check whether the mean concentration in a large shipment is the required 245 ppm or not. To this end, a random sample of 64 portions from the large shipment is tested, and it is found that the sample mean concentration is 250 ppm with a sample standard deviation of 12 ppm. 
Let\u2019s analyze this example according to the four steps of hypothesis testing we outlined above:<\/p>\r\n\r\n<ol>\r\n \t<li>\r\n<p id=\"N10B13\"><em>Stating the claims:<\/em><\/p>\r\n\r\n<ul class=\"none\">\r\n \t<li>\r\n<p id=\"N10B1C\"><em>Claim 1:<\/em>\u00a0The mean concentration in the shipment is the required 245 ppm.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10B23\"><em>Claim 2:<\/em>\u00a0The mean concentration in the shipment is not the required 245 ppm.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10B29\">Note that again, claim 1 basically says: \u201cThere is nothing unusual about this shipment, the mean concentration is the required 245 ppm.\u201d This claim is challenged by the manufacturer, who wants to check whether that is, indeed, the case or not.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10B2D\"><em>Choosing a sample and collecting data:<\/em><\/p>\r\n<p id=\"N10B31\">A sample of n = 64 portions is chosen, and after summarizing the data it is found that the sample mean concentration is\u00a0[latex]\\bar{x}=250[\/latex]\u00a0and the sample standard deviation is s = 12.<\/p>\r\n<p id=\"N10B4A\">Is the fact that\u00a0[latex]\\bar{x}=250[\/latex]\u00a0differs from 245 strong enough evidence to reject claim 1 and conclude that the mean concentration in the whole shipment is not the required 245? In other words, do the data provide strong enough evidence to reject claim 1?<\/p>\r\n<\/li>\r\n \t<li><em>Assessing the evidence:<\/em>\r\n<p id=\"N10B69\">In order to assess whether the data provide strong enough evidence against claim 1, we need to ask ourselves the following question: If the mean concentration in the whole shipment were really the required 245 ppm (i.e., if claim 1 were true), how surprising would it be to observe a sample of 64 portions where the sample mean concentration is off by 5 ppm or more (as we did)? It turns out that it would be extremely unlikely to get such a result if the mean concentration were really the required 245. There is only a probability of .0007 (i.e., 7 in 10,000) of that happening. (Do not worry about how this was calculated at this point.)<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10B6D\"><em>Making conclusions:<\/em><\/p>\r\n<p id=\"N10B73\">Here, it is pretty clear that a sample like the one we observed would be extremely rare (or extremely unlikely) if the mean concentration in the shipment were really the required 245 ppm. The fact that we\u00a0<em>did<\/em>\u00a0observe such a sample therefore provides strong evidence against claim 1, so we reject it and conclude with very little doubt that the mean concentration in the shipment is not the required 245 ppm.<\/p>\r\n<\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<p id=\"N10B7A\">Do you think that you\u2019re getting it? 
Let\u2019s make sure, and look at another example.<\/p>\r\n\r\n<div class=\"examplewrap\">\r\n<div class=\"exHead\"><\/div>\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example 3<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<div>\r\n<p id=\"N10B82\">Is there a relationship between gender and combined scores (Math + Verbal) on the SAT exam?<\/p>\r\n<p id=\"N10B85\">Following a report on the College Board website, which showed that in 2003, males scored generally higher than females on the SAT exam (http:\/\/www.collegeboard.com\/prod_downloads\/about\/news_info\/cbsenior\/yr2003\/pdf\/2003CBSVM.pdf), an educational researcher wanted to check whether this was also the case in her school district. The researcher chose random samples of 150 males and 150 females from her school district, collected data on their SAT performance and found the following:<\/p>\r\n\r\n<table id=\"N10B88_bx\" class=\"table labeled aligncenter\">\r\n<thead>\r\n<tr>\r\n<th>Males<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>\r\n<table class=\"grid\">\r\n<thead>\r\n<tr>\r\n<th>n<\/th>\r\n<th>mean<\/th>\r\n<th>standard deviation<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>150<\/td>\r\n<td>1025<\/td>\r\n<td>212<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<table id=\"N10B9B_bx\" class=\"table labeled aligncenter\">\r\n<thead>\r\n<tr>\r\n<th>Females<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td>\r\n<table class=\"grid\" style=\"height: 28px;\">\r\n<thead>\r\n<tr style=\"height: 14px;\">\r\n<th style=\"height: 14px; width: 22.45px;\">n<\/th>\r\n<th style=\"height: 14px; width: 38.3625px;\">mean<\/th>\r\n<th style=\"height: 14px; width: 128.538px;\">standard deviation<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr style=\"height: 14px;\">\r\n<td style=\"height: 14px; width: 22.85px;\">150<\/td>\r\n<td 
style=\"height: 14px; width: 39.1625px;\">1010<\/td>\r\n<td style=\"height: 14px; width: 128.938px;\">206<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<p id=\"N10BAE\">Again, let\u2019s see how the process of hypothesis testing works for this example:<\/p>\r\n\r\n<ol>\r\n \t<li>\r\n<p id=\"N10BB3\"><em>Stating the claims:<\/em><\/p>\r\n\r\n<ul class=\"none\">\r\n \t<li>\r\n<p id=\"N10BBD\"><em>Claim 1:<\/em>\u00a0Performance on the SAT is not related to gender (males and females score the same).<\/p>\r\n<\/li>\r\n \t<li><em>Claim 2:<\/em>\u00a0Performance on the SAT is related to gender \u2013 males score higher.<\/li>\r\n<\/ul>\r\n<p id=\"N10BCA\">Note that again, claim 1 basically says: \u201cThere is nothing going on between the variables SAT and gender.\u201d Claim 2 represents what the researcher wants to check, or suspects might actually be the case.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10BCE\"><em>Choosing a sample and collecting data:<\/em><\/p>\r\n<p id=\"N10BD3\">Data were collected and summarized as given above.<\/p>\r\n<p id=\"N10BD6\">Is the fact that the sample mean score of males (1,025) is higher than the sample mean score of females (1,010) by 15 points strong enough information to reject claim 1 and conclude that in this researcher\u2019s school district, males score higher on the SAT than females?<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10BDA\"><em>Assessment of evidence:<\/em><\/p>\r\n<p id=\"N10BDE\">In order to assess whether the data provide strong enough evidence against claim 1, we need to ask ourselves: If SAT scores are in fact not related to gender (claim 1 is true), how likely is it to get data like the data we observed, in which the difference between the males\u2019 average and females\u2019 average score is as high as 15 points or higher? 
It turns out that the probability of observing such a sample result if SAT score is not related to gender is approximately .29. (Again, do not worry about how this was calculated at this point.)<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10BE2\"><em>Conclusion:<\/em><\/p>\r\n<p id=\"N10BE6\">Here, we have an example where observing a sample like the one we observed would definitely not be surprising (roughly 30% chance) if claim 1 were true (i.e., if indeed there is no difference in SAT scores between males and females). We therefore conclude that our data do not provide enough evidence for rejecting claim 1.<\/p>\r\n<\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Comment<\/span><\/h2>\r\n<p id=\"N10BF1\">Go back and read the conclusion sections of the three examples, and pay attention to the wording. Note that there are two types of conclusions:<\/p>\r\n\r\n<ul class=\"none\">\r\n \t<li>\r\n<p id=\"N10BF7\">\u201cThe data provide enough evidence to reject claim 1 and accept claim 2\u201d; or<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10BFB\">\u201cThe data do not provide enough evidence to reject claim 1.\u201d<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10BFE\">In particular, note that in the second type of conclusion\u00a0<em>we did not say:<\/em>\u00a0\u201c<em class=\"highlight\">I accept claim 1<\/em>,\u201d but only \u201c<em>I don\u2019t have enough evidence to reject claim 1<\/em>.\u201d We will come back to this issue later, but this is a good place to make you aware of this subtle difference.<\/p>\r\n<p id=\"N10C0B\">Hopefully by now, you understand the logic behind the statistical hypothesis testing process. Here is a summary:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_0\" class=\"img-responsive popimg aligncenter\" title=\"A flow chart describing the process. First, we state Claim 1 and Claim 2. 
Claim 1 says &quot;nothing special is going on&quot; and is challenged by claim 2. Second, we collect relevant data and summarize it. Third, we assess how surprising it would be to observe data like that observed if Claim 1 is true. Fourth, we draw conclusions in context.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m1_inference_one_variable\/webcontent\/image205.gif\" alt=\"A flow chart describing the process. First, we state Claim 1 and Claim 2. Claim 1 says &quot;nothing special is going on&quot; and is challenged by claim 2. Second, we collect relevant data and summarize it. Third, we assess how surprising it would be to observe data like that observed if Claim 1 is true. Fourth, we draw conclusions in context.\" \/><\/span><\/span>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Learn by Doing<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10C1E\"  >For many years \"working full-time\" has meant 40 hours per week. Nowadays it seems that corporate employers expect their employees to work more than this amount. A researcher decides to investigate this hypothesis.<\/p>\r\n\r\n<ul class=\"none\"  >\r\n \t<li  >\r\n<p id=\"N10C25\"  ><em  >Claim 1:<\/em>\u00a0The average time full-time corporate employees work per week is 40 hours.<\/p>\r\n<\/li>\r\n \t<li  >\r\n<p id=\"N10C2D\"  ><em  >Claim 2:<\/em>\u00a0The average time full-time corporate employees work per week is more than 40 hours.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10C33\"  >To substantiate his claim, the researcher randomly selects 250 corporate employees and finds that they work an average of 47 hours per week with a standard deviation of 3.2 hours.<\/p>\r\n<p  >[h5p id=\"151\"]<\/p>\r\n<p id=\"N10C55\"  >According to the Centers for Disease Control and Prevention (CDC), roughly 21.5% of all high-school seniors in the United States have used marijuana. 
(Comments: The data were collected in 2002. The figure represents those who smoked during the month prior to the survey, so the actual figure might be higher). A sociologist suspects that the rate among African-American high school seniors is lower, and wants to check that. In this case, then,<\/p>\r\n\r\n<ul class=\"none\"  >\r\n \t<li  >\r\n<p id=\"N10C5C\"  ><em  >Claim 1:<\/em>\u00a0The rate of African-American high-school seniors who have used marijuana is 21.5% (same as the overall rate of seniors).<\/p>\r\n<\/li>\r\n \t<li  >\r\n<p id=\"N10C63\"  ><em  >Claim 2:<\/em>\u00a0The rate of African-American high-school seniors who have used marijuana is lower than 21.5%.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10C69\"  >To check his claim, the sociologist chooses a random sample of 375 African-American high school seniors, and finds that 16.5% of them have used marijuana.<\/p>\r\n<p  >[h5p id=\"152\"]<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10C95\"  >The most commonly accepted tradition is that college students will study 2 hours outside of class for every hour in class. This means 30 hours\/week for a full-time student taking 15 units (hours of class). 
An educator suspects that this figure is different now than in the past.<\/p>\r\n\r\n<ul class=\"none\"  >\r\n \t<li  >\r\n<p id=\"N10C9C\"  ><em  >Claim 1:<\/em>\u00a0The average time full-time college students study outside of class per week is 30 hours.<\/p>\r\n<\/li>\r\n \t<li  >\r\n<p id=\"N10CA3\"  ><em  >Claim 2:<\/em>\u00a0The average time full-time college students study outside of class per week is not 30 hours.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10CA9\"  >To substantiate her claim, the educator randomly selects 1,500 college students and finds that they study an average of 27 hours per week with a standard deviation of 1.7 hours.<\/p>\r\n<p  >[h5p id=\"153\"]<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div>\r\n<h2><span title=\"Quick scroll up\">More Details and Terminology<\/span><\/h2>\r\nNow that we understand the general idea of how statistical hypothesis testing works, let\u2019s go back to each of the steps and delve slightly deeper, getting more details and learning some terminology.\r\n\r\n<em>Hypothesis testing step 1: Stating the claims.<\/em>\r\n\r\nIn all three examples, our aim is to decide between two opposing points of view, Claim 1 and Claim 2. In hypothesis testing,\u00a0<em>Claim 1<\/em>\u00a0is called the\u00a0<em>null hypothesis<\/em>\u00a0(denoted \u201c<em>H<sub>0<\/sub><\/em>\u201d), and\u00a0<em>Claim 2<\/em>\u00a0plays the role of the\u00a0<em>alternative hypothesis<\/em>\u00a0(denoted \u201c<em>H<sub>a<\/sub><\/em>\u201d). As we saw in the three examples, the null hypothesis suggests that nothing special is going on; in other words, there is no change from the status quo, no difference from the traditional state of affairs, no relationship. In contrast, the alternative hypothesis disagrees with this, stating that something is going on, or there is a change from the status quo, or there is a difference from the traditional state of affairs. 
The alternative hypothesis, H<sub>a<\/sub>, usually represents what we want to check or what we suspect is really going on.\r\n\r\nLet\u2019s go back to our three examples and apply the new notation:\r\n<p id=\"N10B36\"><em>In example 1:<\/em><\/p>\r\n\r\n<ul class=\"none\">\r\n \t<li>\r\n<p id=\"N10B41\"><em>H<sub>0<\/sub>:<\/em>\u00a0The proportion of smokers at Goodheart is .20.<\/p>\r\n<\/li>\r\n \t<li><em>H<sub>a<\/sub>:<\/em>\u00a0The proportion of smokers at Goodheart is less than .20.<\/li>\r\n<\/ul>\r\n<p id=\"N10B58\"><em>In example 2:<\/em><\/p>\r\n\r\n<ul class=\"none\">\r\n \t<li>\r\n<p id=\"N10B63\"><em>H<sub>0<\/sub>:<\/em>\u00a0The mean concentration in the shipment is the required 245 ppm.<\/p>\r\n<\/li>\r\n \t<li><em>H<sub>a<\/sub>:<\/em>\u00a0The mean concentration in the shipment is not the required 245 ppm.<\/li>\r\n<\/ul>\r\n<em>In example 3:<\/em>\r\n<ul class=\"none\">\r\n \t<li><em>H<sub>0<\/sub>:<\/em>\u00a0Performance on the SAT is not related to gender (males and females score the same).<\/li>\r\n \t<li>\r\n<p id=\"N10B91\"><em>H<sub>a<\/sub>:<\/em>\u00a0Performance on the SAT is related to gender \u2013 males score higher.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Learn by Doing<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nAccording to the Centers for Disease Control and Prevention, the proportion of U.S. adults age 25 or older who smoke is .22. A researcher suspects that the rate is lower among U.S. 
adults 25 or older who have a bachelor's degree or a higher level of education.\r\n\r\n[h5p id=\"154\"]\r\n\r\nA study investigated whether there is a difference between the mean IQ level of people who were reared by their biological parents and that of people who were reared by someone else.\r\n\r\n[h5p id=\"155\"]\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nData were collected in order to determine whether there is a relationship between a person's level of education and whether or not the person is a smoker.\r\n\r\n[h5p id=\"156\"]\r\n\r\n<\/div>\r\n<\/div>\r\n<p id=\"N10C18\"><em>Hypothesis testing step 2: Choosing a sample and collecting data<\/em>.<\/p>\r\n<p id=\"N10C1D\">This step is pretty obvious. This is what inference is all about. You look at sampled data in order to draw conclusions about the entire population. In the case of hypothesis testing, based on the data, you draw conclusions about whether or not there is enough evidence to reject H<sub>0<\/sub>.<\/p>\r\n<p id=\"N10C23\">There is, however, one detail that we would like to add here. In this step we collect data and\u00a0<em>summarize<\/em>\u00a0them. Go back and look at the second step in our three examples. 
Note that in order to summarize the data we used simple sample statistics such as the sample proportion ([latex]\\hat{p}[\/latex]), the sample mean ([latex]\\bar{x}[\/latex]) and the sample standard deviation (s).<\/p>\r\n<p id=\"N10C4A\">In practice, you go a step further and use these sample statistics to summarize the data with what\u2019s called a\u00a0<em>test statistic<\/em>. We are not going to go into any details right now, but we will discuss test statistics when we go through the specific tests.<\/p>\r\n<p id=\"N10C50\"><em>Hypothesis testing step 3: Assessing the evidence.<\/em><\/p>\r\n<p id=\"N10C54\">As we saw, this is the step where we calculate how likely it is to get data like those observed when H<sub>0<\/sub>\u00a0is true. In a sense, this is the heart of the process, since we draw our conclusions based on this probability. If this probability is very small (see example 2), then that means that it would be very surprising to get data like those observed if H<sub>0<\/sub>\u00a0were true. The fact that we\u00a0<em>did<\/em>\u00a0observe such data is therefore evidence against H<sub>0<\/sub>, and we should reject it. 
On the other hand, if this probability is not very small (see example 3) this means that observing data like that observed is not very surprising if H<sub>0<\/sub>\u00a0were true, so the fact that we observed such data does not provide evidence against H<sub>o<\/sub>. This crucial probability, therefore, has a special name. It is called the\u00a0<em>p-value<\/em>\u00a0of the test.<\/p>\r\n<p id=\"N10C6C\">In our three examples, the p-values were given to you (and you were reassured that you didn\u2019t need to worry about how these were derived):<\/p>\r\n\r\n<ul class=\"none\">\r\n \t<li>\r\n<p id=\"N10C74\">Example 1: p-value = .106<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10C7A\">Example 2: p-value = .0007<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10C80\">Example 3: p-value = .29<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10C85\">Obviously, the smaller the p-value, the more surprising it is to get data like ours when H<sub>0<\/sub>\u00a0is true, and therefore, the stronger the evidence the data provide against H<sub>0<\/sub>. Looking at the three p-values of our three examples, we see that the data that we observed in example 2 provide the strongest evidence against the null hypothesis, followed by example 1, while the data in example 3 provides the least evidence against H<sub>0<\/sub>.<\/p>\r\n\r\n<div id=\"N10C94\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2>Comments:<\/h2>\r\n<p id=\"N10C9B\">Right now we will not go into specific details about p-value calculations, but just mention that since the p-value is the probability of getting\u00a0<em>data<\/em>\u00a0like those observed when H<sub>0<\/sub>\u00a0is true, it would make sense that the calculation of the p-value will be based on the data summary, which, as we mentioned, is the test statistic. Indeed, this is the case. 
In practice, we will mostly use software to provide the p-value for us.<\/p>\r\n<p id=\"N10CA4\">It should be noted that in the past, before statistical software was such an integral part of intro stats courses, it was common to use critical values (rather than p-values) in order to assess the evidence provided by the data. While this course focuses on p-values, we will provide some details about the critical values approach later in this module for those students who are interested in learning more about it.<\/p>\r\n<em>Hypothesis testing step 4: Making conclusions.<\/em>\r\n<p id=\"N10B35\">Since our conclusion is based on how small the p-value is, or in other words, how surprising our data are when H<sub>o<\/sub>\u00a0is true, it would be nice to have some kind of guideline or cutoff that will help determine how small the p-value must be, or how \u201crare\u201d (unlikely) our data must be when H<sub>o<\/sub>\u00a0is true, for us to conclude that we have enough evidence to reject H<sub>o<\/sub>.<\/p>\r\nThis cutoff exists, and because it is so important, it has a special name. It is called the\u00a0<em>significance level of the test<\/em>\u00a0and is usually denoted by the Greek letter \u03b1. The most commonly used significance level is \u03b1 = .05 (or 5%). 
This means that:\r\n<ul>\r\n \t<li>if the p-value &lt; \u03b1 (usually .05), then the data we got is considered to be \u201crare (or surprising) enough\u201d when H<sub>o<\/sub>\u00a0is true, and we say that the data provide significant evidence against H<sub>o<\/sub>, so we reject H<sub>o<\/sub>\u00a0and accept H<sub>a<\/sub>.<\/li>\r\n \t<li>if the p-value &gt; \u03b1 (usually .05), then our data are not considered to be \u201csurprising enough\u201d when H<sub>o<\/sub>\u00a0is true, and we say that our data do not provide enough evidence to reject H<sub>o<\/sub>\u00a0(or, equivalently, that the data do not provide enough evidence to accept H<sub>a<\/sub>).<\/li>\r\n<\/ul>\r\n<p id=\"N10B65\"><em>Important comment about wording.<\/em><\/p>\r\n<p id=\"N10B6A\">Another common wording (mostly in scientific journals) is:<\/p>\r\n\u201cThe results are statistically significant\u201d \u2013 when the p-value &lt; \u03b1.\r\n<p id=\"N10B70\">\u201cThe results are not statistically significant\u201d \u2013 when the p-value &gt; \u03b1.<\/p>\r\n\r\n<div class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Comments<\/span><\/h2>\r\n<ol>\r\n \t<li>\r\n<p id=\"N10B7E\">Although the significance level provides a good guideline for drawing our conclusions, it should not be treated as an incontrovertible truth. There is a lot of room for personal interpretation. What if your p-value is .052? 
You might want to stick to the rules and say \u201c.052 &gt; .05 and therefore I don\u2019t have enough evidence to reject H<sub>o<\/sub>\u201d, but you might decide that .052 is small enough for you to believe that H<sub>o<\/sub>\u00a0should be rejected.<\/p>\r\nIt should be noted that scientific journals do treat .05 as the cutoff: any p-value below it indicates enough evidence against H<sub>o<\/sub>, and any p-value above it,\u00a0<em class=\"italic\">or even equal to it<\/em>, indicates there is not enough evidence against H<sub>o<\/sub>.<\/li>\r\n \t<li>\r\n<p id=\"N10B95\">It is important to draw your conclusions\u00a0<em>in context<\/em>. It is\u00a0<em>never enough<\/em>\u00a0to say:\u00a0<em class=\"italic\">\u201cp-value = \u2026, and therefore I have enough evidence to reject H<sub>o<\/sub>\u00a0at the .05 significance level.\u201d<\/em> You\u00a0<em>should always add:<\/em>\u00a0\u201c\u2026 and conclude that \u2026 (what it means in the context of the problem)\u201d.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10BAA\">Let\u2019s go back to the issue of the nature of the two types of conclusions that I can make.<\/p>\r\n<p id=\"N10BAD\"><em>Either<\/em>\u00a0<em class=\"italic\">I reject H<sub>o<\/sub>\u00a0and accept H<sub>a<\/sub>\u00a0(when the p-value is smaller than the significance level)<\/em>\u00a0<em>or<\/em>\u00a0<em class=\"italic\">I cannot reject H<sub>o<\/sub>\u00a0(when the p-value is larger than the significance level).<\/em><\/p>\r\n<\/li>\r\n<\/ol>\r\n<p id=\"N10BC7\">As we mentioned earlier, note that the second conclusion does not imply that I accept H<sub>o<\/sub>, but just that I don\u2019t have enough evidence to reject it. Saying (by mistake) \u201cI don\u2019t have enough evidence to reject H<sub>o<\/sub>\u00a0so I accept it\u201d suggests that the data provide evidence that H<sub>o<\/sub>\u00a0is true, which is\u00a0<em>not necessarily the case<\/em>. 
Consider the following slightly artificial yet effective example:<\/p>\r\n\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10BD8\">An employer claims to subscribe to an \u201cequal opportunity\u201d policy, not hiring men any more often than women for managerial positions. Is this credible? You\u2019re not sure, so you want to test the following\u00a0<em>two hypotheses:<\/em><\/p>\r\n\r\n<ul class=\"none\">\r\n \t<li><em>H<sub>o<\/sub>:<\/em>\u00a0The proportion of male managers hired is .5<\/li>\r\n \t<li>\r\n<p id=\"N10BEC\"><em>H<sub>a<\/sub>:<\/em>\u00a0The proportion of male managers hired is more than .5<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10BF6\"><em>Data:<\/em>\u00a0You choose at random three of the new managers who were hired in the last 5 years and find that all 3 are men.<\/p>\r\n<p id=\"N10BFC\"><em>Assessing Evidence:<\/em>\u00a0If the proportion of male managers hired is really .5 (H<sub>o<\/sub>\u00a0is true), then the probability that the random selection of three managers will yield three males is .5 * .5 * .5 = .125. This is the p-value.<\/p>\r\n<p id=\"N10C05\"><em>Conclusion:<\/em>\u00a0Using .05 as the significance level, you conclude that since the p-value = .125 &gt; .05, the fact that the three randomly selected managers were all males is not enough evidence to reject H<sub>o<\/sub>. 
In other words, you do not have enough evidence to reject the employer\u2019s claim of subscribing to an equal opportunity policy.<\/p>\r\n<p id=\"N10C0E\">However,\u00a0<em>the data (all three selected are males) definitely does not provide evidence to accept the employer\u2019s claim (H<sub>o<\/sub>).<\/em><\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Learn by Doing<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10C22\"  >The following two hypotheses are tested:<\/p>\r\n\r\n<ul class=\"none\"  >\r\n \t<li  >\r\n<p id=\"N10C2A\"  >H<sub  >o<\/sub>: The proportion of U.S. adults who oppose gay marriage is roughly 50%.<\/p>\r\n<\/li>\r\n \t<li  >\r\n<p id=\"N10C32\"  >H<sub  >a<\/sub>: The proportion of U.S. adults who oppose gay marriage is above 50% (i.e., the majority oppose).<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10C39\"  >Suppose a survey was conducted in which a random sample of 1,100 U.S. adults was asked about their opinions about gay marriage, and based on the data, the p-value was found to be .002.<\/p>\r\n<p id=\"N10C3C\"  >Comment: Throughout this activity use a .05 (5%) significance level (cutoff).<\/p>\r\n<p  >[h5p id=\"157\"]<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10CC5\"  >The following two hypotheses are tested:<\/p>\r\n\r\n<ul class=\"none\"  >\r\n \t<li  >\r\n<p id=\"N10CCD\"  >H<sub  >o<\/sub>: The average number of miles driven per year is 12,000.<\/p>\r\n<\/li>\r\n \t<li  >\r\n<p id=\"N10CD5\"  >H<sub  >a<\/sub>: The average number of miles driven per year is less than 12,000.<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10CDC\"  >In a survey, 1,600 randomly selected drivers were asked the number of miles they drive yearly. 
Based upon the results, the p-value = .068.<\/p>\r\n<p id=\"N10CDF\"  >Comment: Throughout this activity use a .05 (5%) significance level.<\/p>\r\n<p  >[h5p id=\"158\"]<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<h2><span title=\"Quick scroll up\">Let\u2019s summarize<\/span><\/h2>\r\n<p id=\"N10B20\">We learned quite a lot about hypothesis testing. We learned the logic behind it, what the key elements are, and what types of conclusions we can and cannot draw in hypothesis testing. Here is a quick recap:<\/p>\r\n\r\n<div class=\"figurewrap\">\r\n<div class=\"figure clearfix\">\r\n<div id=\"uwrap__i_0\" class=\"youtube\">\r\n\r\n&nbsp;\r\n\r\n[embed]https:\/\/www.youtube.com\/embed\/GzkWcsJyPH4[\/embed]\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n<em  >Background:<\/em>\u00a0Based on the National Center of Health Statistics, the proportion of babies born at low birth weight (below 2,500 grams) in the United States is roughly .078, or 7.8% (based on all the births in the United States in the year 2002). A study was done in order to check whether smoking by pregnant women increases the risk of low birth weight. In other words, the researchers wanted to check whether the proportion of babies born at low birth weight among women who smoked during their pregnancy is higher than the proportion in the general population. The researchers followed a sample of 400 women who had smoked during their pregnancy and recorded the birth weight of the newborns. 
Based on the data, the p-value was found to be .016.\r\n\r\n[h5p id=\"159\"]\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nThe same researchers also wanted to examine whether second-hand smoking (exposure to a another person smoking) by pregnant women increases the risk of low birth weight (i.e., the proportion of babies born at a low birth weight among women who were second-hand smokers during their pregnancy is higher than the proportion in the general population). The researchers obtained a sample of 175 pregnant women who were second-hand smokers, followed them during their pregnancies, and found that 10.2% of the newborns had low birth weight. Based on these data, the p-value was found to be .119.\r\n\r\n[h5p id=\"160\"]\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/section>","rendered":"<section class=\"standard post-344 chapter type-chapter status-publish hentry focusable\" data-type=\"chapter\">\n<div id=\"lobjh\" class=\"\">\n<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<h2 class=\"textbox__title\">Learning Objectives<\/h2>\n<\/header>\n<div class=\"textbox__content\">\n<ul>\n<li id=\"explain_hypothesis_testing\">Explain the logic behind and the process of hypotheses testing. In particular, explain what the p-value is and how it is used to draw conclusions.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<p>The purpose of this section is to gradually build your understanding about how statistical hypothesis testing works. We start by explaining the general logic behind the process of hypothesis testing. 
Once we are confident that you understand this logic, we will add some more details and terminology.<\/p>\n<\/div>\n<div id=\"N10AF4\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">General Idea and Logic of Hypothesis Testing<\/span><\/h2>\n<p id=\"N10AFB\">To start our discussion about the idea behind statistical hypothesis testing, consider the following example:<\/p>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"N10B00\">A case of suspected cheating on an exam is brought in front of the disciplinary committee at a certain university.<\/p>\n<p id=\"N10B03\">There are\u00a0<em>two<\/em>\u00a0opposing\u00a0<em>claims<\/em>\u00a0in this case:<\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10B10\">The\u00a0<em>student\u2019s claim:<\/em>\u00a0I did not cheat on the exam.<\/p>\n<\/li>\n<li>\n<p id=\"N10B17\">The\u00a0<em>instructor\u2019s claim:<\/em>\u00a0The student did cheat on the exam.<\/p>\n<\/li>\n<\/ul>\n<p id=\"N10B1D\">Adhering to the principle\u00a0<em>\u201cinnocent until proven guilty,\u201d<\/em>\u00a0the committee asks the instructor for\u00a0<em>evidence<\/em>\u00a0to support his claim. 
The instructor explains that the exam had two versions, and shows the committee members that on three separate exam questions, the student used in his solution numbers that were given in the other version of the exam.<\/p>\n<p id=\"N10B26\">The committee members all agree that\u00a0<em>it would be extremely unlikely to get evidence like that if the student\u2019s claim of not cheating had been true.<\/em>\u00a0In other words, the committee members all agree that the instructor brought forward strong enough evidence to reject the student\u2019s claim, and conclude that the student did cheat on the exam.<\/p>\n<\/div>\n<\/div>\n<p>What does this example have to do with statistics?<\/p>\n<\/div>\n<\/div>\n<p id=\"N10B30\">While it is true that this story seems unrelated to statistics, it captures all the elements of hypothesis testing and the logic behind it. Before you read on to understand why, it would be useful to read the example again. Please do so now.<\/p>\n<p id=\"N10B33\"><em>Statistical hypothesis testing<\/em>\u00a0is defined as:<\/p>\n<p id=\"N10B38\"><em>Assessing evidence provided by the data in favor of or against some claim about the population.<\/em><\/p>\n<p id=\"N10B3E\">Here is how the process of statistical hypothesis testing works:<\/p>\n<ol>\n<li>We have\u00a0<em>two claims<\/em>\u00a0about what is going on in the population. Let\u2019s call them for now\u00a0<em>claim 1<\/em>\u00a0and\u00a0<em>claim 2<\/em>. 
Much like the story above, where the student\u2019s claim is challenged by the instructor\u2019s claim, claim 1 is challenged by claim 2.\n<p id=\"N10B4D\">(<em>Comment:<\/em>\u00a0as you\u2019ll see in the examples that follow, these claims are usually about the value of population parameter(s) or about the existence or nonexistence of a relationship between two variables in the population).<\/p>\n<\/li>\n<li>We choose a sample, collect relevant data and summarize them (this is similar to the instructor collecting evidence from the student\u2019s exam).<\/li>\n<li>We figure out how likely it is to observe data like the data we got, had claim 1 been true. (Note that the wording \u201chow likely \u2026\u201d implies that this step requires some kind of probability calculation). In the story, the committee members assessed how likely it is to observe the evidence like that which the instructor provided, had the student\u2019s claim of not cheating been true.<\/li>\n<li>Based on what we found in the previous step, we make our decision:\n<ul>\n<li>If we find that if claim 1 were true it would be extremely unlikely to observe the data that we observed, then we have strong evidence against claim 1, and we reject it in favor of claim 2.<\/li>\n<li>If we find that if claim 1 were true observing the data that we observed is not very unlikely, then we do not have enough evidence against claim 1, and therefore we cannot reject it in favor of claim 2.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<p id=\"N10B64\">In our story, the committee decided that it would be extremely unlikely to find the evidence that the instructor provided had the student\u2019s claim of not cheating been true. In other words, the members felt that it is extremely unlikely that it is just a coincidence that the student used the numbers from the other version of the exam on three separate problems. 
The committee members therefore decided to reject the student\u2019s claim and concluded that the student had, indeed, cheated on the exam. (Wouldn\u2019t you conclude the same?)<\/p>\n<p id=\"N10B67\">Hopefully this example helped you understand the logic behind hypothesis testing. To strengthen your understanding of the process of hypothesis testing and the logic behind it, let\u2019s look at three statistical examples.<\/p>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example 1<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<div>\n<p id=\"N10B6F\">A recent study estimated that 20% of all college students in the United States smoke. The head of Health Services at Goodheart University suspects that the proportion of smokers may be lower there. In hopes of confirming her claim, the head of Health Services chooses a random sample of 400 Goodheart students, and finds that 70 of them are smokers.<\/p>\n<p id=\"N10B72\">Let\u2019s analyze this example using the 4 steps outlined above:<\/p>\n<ol>\n<li>\n<p id=\"N10B79\"><em>Stating the claims:<\/em><\/p>\n<p id=\"N10B7F\">There are two claims here:<\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10B86\"><em>claim 1:<\/em>\u00a0The proportion of smokers at Goodheart is .20.<\/p>\n<\/li>\n<li>\n<p id=\"N10B8D\"><em>claim 2:<\/em>\u00a0The proportion of smokers at Goodheart is less than .20.<\/p>\n<\/li>\n<\/ul>\n<p id=\"N10B93\">Claim 1 basically says \u201cnothing special goes on in Goodheart University; the proportion of smokers there is no different from the proportion in the entire country.\u201d This claim is challenged by the head of Health Services, who suspects that the proportion of smokers at Goodheart is lower.<\/p>\n<\/li>\n<li>\n<p id=\"N10B97\"><em>Choosing a sample and collecting data:<\/em><\/p>\n<p id=\"N10B9B\">A sample of n = 400 was chosen, and summarizing the data revealed 
that the sample proportion of smokers is\u00a0[latex]\\hat{\\mathcal{p}}=\\frac{70}{400}=.175[\/latex]<\/p>\n<p id=\"N10BC4\">While it is true that .175 is less than .20, it is not clear whether this is strong enough evidence against claim 1.<\/p>\n<\/li>\n<li>\n<p id=\"N10BC8\"><em>Assessment of evidence:<\/em><\/p>\n<p id=\"N10BCD\">In order to assess whether the data provide strong enough evidence against claim 1, we need to ask ourselves: How surprising is it to get a sample proportion as low as\u00a0<span id=\"MathJax-Element-2-Frame\" class=\"mjx-chtml MathJax_CHTML\"><span id=\"MJXc-Node-14\" class=\"mjx-math\"><span id=\"MJXc-Node-15\" class=\"mjx-mrow\"><span id=\"MJXc-Node-16\" class=\"mjx-mrow\"><span id=\"MJXc-Node-17\" class=\"mjx-mover\"><span class=\"mjx-stack\"><span class=\"mjx-over\"><span id=\"MJXc-Node-19\" class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">\u02c6<\/span><\/span><\/span><span class=\"mjx-op\"><span id=\"MJXc-Node-18\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">p<\/span><\/span><\/span><\/span><\/span><span id=\"MJXc-Node-20\" class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span id=\"MJXc-Node-21\" class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span id=\"MJXc-Node-22\" class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">175<\/span><\/span><\/span><\/span><\/span><\/span>\u00a0(or lower), assuming claim 1 is true?<\/p>\n<p id=\"N10BEA\">In other words, we need to find how likely it is that in a random sample of size n = 400 taken from a population where the proportion of smokers is p = .20 we\u2019ll get a sample proportion as low as\u00a0<span id=\"MathJax-Element-3-Frame\" class=\"mjx-chtml MathJax_CHTML\"><span id=\"MJXc-Node-23\" class=\"mjx-math\"><span id=\"MJXc-Node-24\" class=\"mjx-mrow\"><span id=\"MJXc-Node-25\" class=\"mjx-mrow\"><span id=\"MJXc-Node-26\" class=\"mjx-mover\"><span class=\"mjx-stack\"><span 
class=\"mjx-over\"><span id=\"MJXc-Node-28\" class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">\u02c6<\/span><\/span><\/span><span class=\"mjx-op\"><span id=\"MJXc-Node-27\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">p<\/span><\/span><\/span><\/span><\/span><span id=\"MJXc-Node-29\" class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span id=\"MJXc-Node-30\" class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span id=\"MJXc-Node-31\" class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">175<\/span><\/span><\/span><\/span><\/span><\/span>\u00a0(or lower).<\/p>\n<p id=\"N10C07\">It turns out that the probability that we\u2019ll get a sample proportion as low as\u00a0<span id=\"MathJax-Element-4-Frame\" class=\"mjx-chtml MathJax_CHTML\"><span id=\"MJXc-Node-32\" class=\"mjx-math\"><span id=\"MJXc-Node-33\" class=\"mjx-mrow\"><span id=\"MJXc-Node-34\" class=\"mjx-mrow\"><span id=\"MJXc-Node-35\" class=\"mjx-mover\"><span class=\"mjx-stack\"><span class=\"mjx-over\"><span id=\"MJXc-Node-37\" class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">\u02c6<\/span><\/span><\/span><span class=\"mjx-op\"><span id=\"MJXc-Node-36\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">p<\/span><\/span><\/span><\/span><\/span><span id=\"MJXc-Node-38\" class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span id=\"MJXc-Node-39\" class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span id=\"MJXc-Node-40\" class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">175<\/span><\/span><\/span><\/span><\/span><\/span>\u00a0(or lower) in such a sample is roughly .106 (do not worry about how this was calculated at this point).<\/p>\n<\/li>\n<li>\n<p id=\"N10C25\"><em>Conclusion:<\/em><\/p>\n<p id=\"N10C2A\">Well, we found that if claim 1 were true there is a probability of .106 of observing data like that 
observed.<\/p>\n<p id=\"N10C2D\">Now you have to decide \u2026<\/p>\n<p id=\"N10C30\">Do you think that a probability of .106 makes our data rare enough (surprising enough) under claim 1 so that the fact that we\u00a0<em>did<\/em>\u00a0observe it is enough evidence to reject claim 1?<\/p>\n<p id=\"N10C36\">Or do you feel that a probability of .106 means that data like we observed are not very likely when claim 1 is true, but they are not unlikely enough to conclude that getting such data is sufficient evidence to reject claim 1.<\/p>\n<p id=\"N10C39\">Basically, this is your decision. However, it would be nice to have some kind of guideline about what is generally considered surprising enough.<\/p>\n<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example 2<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<div>\n<p id=\"N10B0E\">A certain prescription allergy medicine is supposed to contain an average of 245 parts per million (ppm) of a certain chemical. If the concentration is higher than 245 ppm, the drug will likely cause unpleasant side effects, and if the concentration is below 245 ppm, the drug may be ineffective. The manufacturer wants to check whether the mean concentration in a large shipment is the required 245 ppm or not. To this end, a random sample of 64 portions from the large shipment is tested, and it is found that the sample mean concentration is 250 ppm with a sample standard deviation of 12 ppm. 
Let\u2019s analyze this example according to the four steps of hypotheses testing we outlined on the previous page:<\/p>\n<ol>\n<li>\n<p id=\"N10B13\"><em>Stating the claims:<\/em><\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10B1C\"><em>Claim 1:<\/em>\u00a0The mean concentration in the shipment is the required 245 ppm.<\/p>\n<\/li>\n<li>\n<p id=\"N10B23\"><em>Claim 2:<\/em>\u00a0The mean concentration in the shipment is not the required 245 ppm.<\/p>\n<\/li>\n<\/ul>\n<p id=\"N10B29\">Note that again, claim 1 basically says: \u201cThere is nothing unusual about this shipment, the mean concentration is the required 245 ppm.\u201d This claim is challenged by the manufacturer, who wants to check whether that is, indeed, the case or not.<\/p>\n<\/li>\n<li>\n<p id=\"N10B2D\"><em>Choosing a sample and collecting data:<\/em><\/p>\n<p id=\"N10B31\">A sample of n = 64 portions is chosen and after summarizing the data it is found that the sample concentration is\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-mover\"><span class=\"mjx-stack\"><span class=\"mjx-over\"><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><\/span><\/span><span class=\"mjx-op\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">x<\/span><\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">250<\/span><\/span><\/span><\/span><\/span>\u00a0and the sample standard deviation is s = 12.<\/p>\n<p id=\"N10B4A\">Is the fact that\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-mover\"><span class=\"mjx-stack\"><span class=\"mjx-over\"><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><\/span><\/span><span class=\"mjx-op\"><span class=\"mjx-mi\"><span class=\"mjx-char 
MJXc-TeX-math-I\">x<\/span><\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">250<\/span><\/span><\/span><\/span><\/span>\u00a0is different from 245 strong enough evidence to reject claim 1 and conclude that the mean concentration in the whole shipment is not the required 245? In other words, do the data provide strong enough evidence to reject claim 1?<\/p>\n<\/li>\n<li><em>Assessing the evidence:<\/em>\n<p id=\"N10B69\">In order to assess whether the data provide strong enough evidence against claim 1, we need to ask ourselves the following question: If the mean concentration in the whole shipment were really the required 245 ppm (i.e., if claim 1 were true), how surprising would it be to observe a sample of 64 portions where the sample mean concentration is off by 5 ppm or more (as we did)? It turns out that it would be extremely unlikely to get such a result if the mean concentration were really the required 245. There is only a probability of .0007 (i.e., 7 in 10,000) of that happening. (Do not worry about how this was calculated at this point.)<\/p>\n<\/li>\n<li>\n<p id=\"N10B6D\"><em>Making conclusions:<\/em><\/p>\n<p id=\"N10B73\">Here, it is pretty clear that a sample like the one we observed is extremely rare (or extremely unlikely) if the mean concentration in the shipment were really the required 245 ppm. The fact that we\u00a0<em>did<\/em>\u00a0observe such a sample therefore provides strong evidence against claim 1, so we reject it and conclude with very little doubt that the mean concentration in the shipment is not the required 245 ppm.<\/p>\n<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<\/div>\n<p id=\"N10B7A\">Do you think that you\u2019re getting it? 
Let\u2019s make sure, and look at another example.<\/p>\n<div class=\"examplewrap\">\n<div class=\"exHead\"><\/div>\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example 3<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<div>\n<p id=\"N10B82\">Is there a relationship between gender and combined scores (Math + Verbal) on the SAT exam?<\/p>\n<p id=\"N10B85\">Following a report on the College Board website, which showed that in 2003, males scored generally higher than females on the SAT exam (http:\/\/www.collegeboard.com\/prod_downloads\/about\/news_info\/cbsenior\/yr2003\/pdf\/2003CBSVM.pdf), an educational researcher wanted to check whether this was also the case in her school district. The researcher chose random samples of 150 males and 150 females from her school district, collected data on their SAT performance and found the following:<\/p>\n<table id=\"N10B88_bx\" class=\"table labeled aligncenter\">\n<thead>\n<tr>\n<th>Males<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>\n<table class=\"grid\">\n<thead>\n<tr>\n<th>n<\/th>\n<th>mean<\/th>\n<th>standard deviation<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>150<\/td>\n<td>1025<\/td>\n<td>212<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table id=\"N10B9B_bx\" class=\"table labeled aligncenter\">\n<thead>\n<tr>\n<th>Females<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>\n<table class=\"grid\" style=\"height: 28px;\">\n<thead>\n<tr style=\"height: 14px;\">\n<th style=\"height: 14px; width: 22.45px;\">n<\/th>\n<th style=\"height: 14px; width: 38.3625px;\">mean<\/th>\n<th style=\"height: 14px; width: 128.538px;\">standard deviation<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr style=\"height: 14px;\">\n<td style=\"height: 14px; width: 22.85px;\">150<\/td>\n<td style=\"height: 14px; width: 39.1625px;\">1010<\/td>\n<td style=\"height: 14px; width: 
128.938px;\">206<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p id=\"N10BAE\">Again, let\u2019s see how the process of hypothesis testing works for this example:<\/p>\n<ol>\n<li>\n<p id=\"N10BB3\"><em>Stating the claims:<\/em><\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10BBD\"><em>Claim 1:<\/em>\u00a0Performance on the SAT is not related to gender (males and females score the same).<\/p>\n<\/li>\n<li><em>Claim 2:<\/em>\u00a0Performance on the SAT is related to gender \u2013 males score higher.<\/li>\n<\/ul>\n<p id=\"N10BCA\">Note that again, claim 1 basically says: \u201cThere is nothing going on between the variables SAT and gender.\u201d Claim 2 represents what the researcher wants to check, or suspects might actually be the case.<\/p>\n<\/li>\n<li>\n<p id=\"N10BCE\"><em>Choosing a sample and collecting data:<\/em><\/p>\n<p id=\"N10BD3\">Data were collected and summarized as given above.<\/p>\n<p id=\"N10BD6\">Is the fact that the sample mean score of males (1,025) is higher than the sample mean score of females (1,010) by 15 points strong enough information to reject claim 1 and conclude that in this researcher\u2019s school district, males score higher on the SAT than females?<\/p>\n<\/li>\n<li>\n<p id=\"N10BDA\"><em>Assessment of evidence:<\/em><\/p>\n<p id=\"N10BDE\">In order to assess whether the data provide strong enough evidence against claim 1, we need to ask ourselves: If SAT scores are in fact not related to gender (claim 1 is true), how likely is it to get data like the data we observed, in which the difference between the males\u2019 average and females\u2019 average score is as high as 15 points or higher? 
It turns out that the probability of observing such a sample result if SAT score is not related to gender is approximately .29 (again, do not worry about how this was calculated at this point).<\/p>\n<\/li>\n<li>\n<p id=\"N10BE2\"><em>Conclusion:<\/em><\/p>\n<p id=\"N10BE6\">Here, we have an example where observing a sample like the one we observed is definitely not surprising (roughly 30% chance) if claim 1 were true (i.e., if indeed there is no difference in SAT scores between males and females). We therefore conclude that our data do not provide enough evidence for rejecting claim 1.<\/p>\n<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Comment<\/span><\/h2>\n<p id=\"N10BF1\">Go back and read the conclusion sections of the three examples, and pay attention to the wording. Note that there are two types of conclusions:<\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10BF7\">\u201cThe data provide enough evidence to reject claim 1 and accept claim 2\u201d; or<\/p>\n<\/li>\n<li>\n<p id=\"N10BFB\">\u201cThe data do not provide enough evidence to reject claim 1.\u201d<\/p>\n<\/li>\n<\/ul>\n<p id=\"N10BFE\">In particular, note that in the second type of conclusion\u00a0<em>we did not say:<\/em>\u00a0\u201c<em class=\"highlight\">I accept claim 1<\/em>,\u201d but only \u201c<em>I don\u2019t have enough evidence to reject claim 1<\/em>.\u201d We will come back to this issue later, but this is a good place to make you aware of this subtle difference.<\/p>\n<p id=\"N10C0B\">Hopefully by now, you understand the logic behind the statistical hypothesis testing process. Here is a summary:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_0\" class=\"img-responsive popimg aligncenter\" title=\"A flow chart describing the process. First, we state Claim 1 and Claim 2. 
Claim 1 says &quot;nothing special is going on&quot; and is challenged by claim 2. Second, we collect relevant data and summarize it. Third, we assess how surprising it would be to observe data like that observed if Claim 1 is true. Fourth, we draw conclusions in context.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m1_inference_one_variable\/webcontent\/image205.gif\" alt=\"A flow chart describing the process. First, we state Claim 1 and Claim 2. Claim 1 says &quot;nothing special is going on&quot; and is challenged by claim 2. Second, we collect relevant data and summarize it. Third, we assess how surprising it would be to observe data like that observed if Claim 1 is true. Fourth, we draw conclusions in context.\" \/><\/span><\/span><\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Learn by Doing<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"N10C1E\">For many years &#8220;working full-time&#8221; has meant 40 hours per week. Nowadays it seems that corporate employers expect their employees to work more than this amount. 
A researcher decides to investigate this hypothesis.<\/p>\n<ul class=\"none\">\n<li>\n<p><em>Claim 1:<\/em>\u00a0The average time full-time corporate employees work per week is 40 hours.<\/p>\n<\/li>\n<li>\n<p><em>Claim 2:<\/em>\u00a0The average time full-time corporate employees work per week is more than 40 hours.<\/p>\n<\/li>\n<\/ul>\n<p id=\"N10C33\">To substantiate his claim, the researcher randomly selects 250 corporate employees and finds that they work an average of 47 hours per week with a standard deviation of 3.2 hours.<\/p>\n<div id=\"h5p-151\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-151\" class=\"h5p-iframe\" data-content-id=\"151\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"8.1 Learn by doing 1\"><\/iframe><\/div>\n<\/div>\n<p id=\"N10C55\">According to the Centers for Disease Control and Prevention (CDC), roughly 21.5% of all high-school seniors in the United States have used marijuana. (Comments: The data were collected in 2002. The figure represents those who smoked during the month prior to the survey, so the actual figure might be higher). A sociologist suspects that the rate among African-American high school seniors is lower, and wants to check that. 
In this case, then,<\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10C5C\"><em>Claim 1:<\/em>\u00a0The rate of African-American high-school seniors who have used marijuana is 21.5% (same as the overall rate of seniors).<\/p>\n<\/li>\n<li>\n<p id=\"N10C63\"><em>Claim 2:<\/em>\u00a0The rate of African-American high-school seniors who have used marijuana is lower than 21.5%.<\/p>\n<\/li>\n<\/ul>\n<p id=\"N10C69\">To check his claim, the sociologist chooses a random sample of 375 African-American high school seniors, and finds that 16.5% of them have used marijuana.<\/p>\n<div id=\"h5p-152\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-152\" class=\"h5p-iframe\" data-content-id=\"152\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"8.1 Learn by doing 2\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"N10C95\">The most commonly accepted tradition is that college students will study 2 hours outside of class for every hour in class. This means 30 hours\/week for a full-time student taking 15 units (hours of class). 
An educator suspects that this figure is different now than in the past.<\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10C9C\"><em>Claim 1:<\/em>\u00a0The average time full-time college students study outside of class per week is 30 hours.<\/p>\n<\/li>\n<li>\n<p id=\"N10CA3\"><em>Claim 2:<\/em>\u00a0The average time full-time college students study outside of class per week is not 30 hours.<\/p>\n<\/li>\n<\/ul>\n<p id=\"N10CA9\">To substantiate her claim, the educator randomly selects 1,500 college students and finds that they study an average of 27 hours per week with a standard deviation of 1.7 hours.<\/p>\n<div id=\"h5p-153\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-153\" class=\"h5p-iframe\" data-content-id=\"153\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"8.1 Did I get this 1\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div>\n<h2><span title=\"Quick scroll up\">More Details and Terminology<\/span><\/h2>\n<p>Now that we understand the general idea of how statistical hypothesis testing works, let\u2019s go back to each of the steps and delve slightly deeper, getting more details and learning some terminology.<\/p>\n<p><em>Hypothesis testing step 1: Stating the claims.<\/em><\/p>\n<p>In all three examples, our aim is to decide between two opposing points of view, Claim 1 and Claim 2. In hypothesis testing,\u00a0<em>Claim 1<\/em>\u00a0is called the\u00a0<em>null hypothesis<\/em>\u00a0(denoted \u201c<em>H<sub>0<\/sub><\/em>\u201d), and\u00a0<em>Claim 2<\/em>\u00a0plays the role of the\u00a0<em>alternative hypothesis<\/em>\u00a0(denoted \u201c<em>H<sub>a<\/sub><\/em>\u201d). As we saw in the three examples, the null hypothesis suggests nothing special is going on; in other words, there is no change from the status quo, no difference from the traditional state of affairs, no relationship. 
In contrast, the alternative hypothesis disagrees with this, stating that something is going on, or there is a change from the status quo, or there is a difference from the traditional state of affairs. The alternative hypothesis, H<sub>a<\/sub>, usually represents what we want to check or what we suspect is really going on.<\/p>\n<p>Let\u2019s go back to our three examples and apply the new notation:<\/p>\n<p id=\"N10B36\"><em>In example 1:<\/em><\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10B41\"><em>H<sub>0<\/sub>:<\/em>\u00a0The proportion of smokers at Goodheart is .20.<\/p>\n<\/li>\n<li><em>H<sub>a<\/sub>:<\/em>\u00a0The proportion of smokers at Goodheart is less than .20.<\/li>\n<\/ul>\n<p id=\"N10B58\"><em>In example 2:<\/em><\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10B63\"><em>H<sub>0<\/sub>:<\/em>\u00a0The mean concentration in the shipment is the required 245 ppm.<\/p>\n<\/li>\n<li><em>H<sub>a<\/sub>:<\/em>\u00a0The mean concentration in the shipment is not the required 245 ppm.<\/li>\n<\/ul>\n<p><em>In example 3:<\/em><\/p>\n<ul class=\"none\">\n<li><em>H<sub>0<\/sub>:<\/em>\u00a0Performance on the SAT is not related to gender (males and females score the same).<\/li>\n<li>\n<p id=\"N10B91\"><em>H<sub>a<\/sub>:<\/em>\u00a0Performance on the SAT is related to gender \u2013 males score higher.<\/p>\n<\/li>\n<\/ul>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Learn by Doing<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p>According to the Centers for Disease Control and Prevention, the proportion of U.S. adults age 25 or older who smoke is .22. A researcher suspects that the rate is lower among U.S. 
adults 25 or older who have a bachelor&#8217;s degree or higher education level.<\/p>\n<div id=\"h5p-154\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-154\" class=\"h5p-iframe\" data-content-id=\"154\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"8.1 Learn by doing 3\"><\/iframe><\/div>\n<\/div>\n<p>A study investigated whether there are differences between the mean IQ level of people who were reared by their biological parents and those who were reared by someone else.<\/p>\n<div id=\"h5p-155\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-155\" class=\"h5p-iframe\" data-content-id=\"155\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"8.1 Learn by doing 4\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p>Data were collected in order to determine whether there is a relationship between a person&#8217;s level of education and whether or not the person is a smoker.<\/p>\n<div id=\"h5p-156\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-156\" class=\"h5p-iframe\" data-content-id=\"156\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"8.1 Learn by doing 5\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<p id=\"N10C18\"><em>Hypothesis testing step 2: Choosing a sample and collecting data<\/em>.<\/p>\n<p id=\"N10C1D\">This step is pretty obvious. This is what inference is all about. You look at sampled data in order to draw conclusions about the entire population. In the case of hypothesis testing, based on the data, you draw conclusions about whether or not there is enough evidence to reject H<sub>o<\/sub>.<\/p>\n<p id=\"N10C23\">There is, however, one detail that we would like to add here. 
In this step we collect data and\u00a0<em>summarize<\/em>\u00a0it. Go back and look at the second step in our three examples. Note that in order to summarize the data we used simple sample statistics such as the sample proportion (<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-mover\"><span class=\"mjx-stack\"><span class=\"mjx-over\"><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">\u02c6<\/span><\/span><\/span><span class=\"mjx-op\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">p<\/span><\/span><\/span><\/span><\/span><\/span><\/span><\/span>), sample mean (<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-mover\"><span class=\"mjx-stack\"><span class=\"mjx-over\"><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><\/span><\/span><span class=\"mjx-op\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">x<\/span><\/span><\/span><\/span><\/span><\/span><\/span><\/span>) and the sample standard deviation (s).<\/p>\n<p id=\"N10C4A\">In practice, you go a step further and use these sample statistics to summarize the data with what\u2019s called a\u00a0<em>test statistic<\/em>. We are not going to go into any details right now, but we will discuss test statistics when we go through the specific tests.<\/p>\n<p id=\"N10C50\"><em>Hypothesis testing step 3: Assessing the evidence.<\/em><\/p>\n<p id=\"N10C54\">As we saw, this is the step where we calculate how likely it is to get data like those observed when H<sub>o<\/sub>\u00a0is true. In a sense, this is the heart of the process, since we draw our conclusions based on this probability. If this probability is very small (see example 2), then that means that it would be very surprising to get data like those observed if H<sub>0<\/sub>\u00a0were true. 
The fact that we\u00a0<em>did<\/em>\u00a0observe such data is therefore evidence against H<sub>0<\/sub>, and we should reject it. On the other hand, if this probability is not very small (see example 3), this means that observing data like those observed would not be very surprising if H<sub>0<\/sub>\u00a0were true, so the fact that we observed such data does not provide evidence against H<sub>o<\/sub>. This crucial probability, therefore, has a special name. It is called the\u00a0<em>p-value<\/em>\u00a0of the test.<\/p>\n<p id=\"N10C6C\">In our three examples, the p-values were given to you (and you were reassured that you didn\u2019t need to worry about how these were derived):<\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10C74\">Example 1: p-value = .106<\/p>\n<\/li>\n<li>\n<p id=\"N10C7A\">Example 2: p-value = .0007<\/p>\n<\/li>\n<li>\n<p id=\"N10C80\">Example 3: p-value = .29<\/p>\n<\/li>\n<\/ul>\n<p id=\"N10C85\">Obviously, the smaller the p-value, the more surprising it is to get data like ours when H<sub>0<\/sub>\u00a0is true, and therefore, the stronger the evidence the data provide against H<sub>0<\/sub>. Looking at the three p-values of our three examples, we see that the data that we observed in example 2 provide the strongest evidence against the null hypothesis, followed by example 1, while the data in example 3 provide the least evidence against H<sub>0<\/sub>.<\/p>\n<div id=\"N10C94\" class=\"section\">\n<div class=\"sectionContain\">\n<h2>Comments:<\/h2>\n<p id=\"N10C9B\">Right now we will not go into specific details about p-value calculations, but just mention that since the p-value is the probability of getting\u00a0<em>data<\/em>\u00a0like those observed when H<sub>0<\/sub>\u00a0is true, it would make sense that the calculation of the p-value will be based on the data summary, which, as we mentioned, is the test statistic. Indeed, this is the case. 
In practice, we will mostly use software to provide the p-value for us.<\/p>\n<p id=\"N10CA4\">It should be noted that in the past, before statistical software was such an integral part of intro stats courses, it was common to use critical values (rather than p-values) in order to assess the evidence provided by the data. While this course focuses on p-values, we will provide some details about the critical values approach later in this module for those students who are interested in learning more about it.<\/p>\n<p><em>Hypothesis testing step 4: Making conclusions.<\/em><\/p>\n<p id=\"N10B35\">Since our conclusion is based on how small the p-value is, or in other words, how surprising our data are when H<sub>o<\/sub>\u00a0is true, it would be nice to have some kind of guideline or cutoff that will help determine how small the p-value must be, or how \u201crare\u201d (unlikely) our data must be when H<sub>o<\/sub>\u00a0is true, for us to conclude that we have enough evidence to reject H<sub>o<\/sub>.<\/p>\n<p>This cutoff exists, and because it is so important, it has a special name. It is called the\u00a0<em>significance level of the test<\/em>\u00a0and is usually denoted by the Greek letter \u03b1. The most commonly used significance level is \u03b1 = .05 (or 5%). 
This means that:<\/p>\n<ul>\n<li>if the p-value &lt; \u03b1 (usually .05), then the data we got are considered to be \u201crare (or surprising) enough\u201d when H<sub>o<\/sub>\u00a0is true, and we say that the data provide significant evidence against H<sub>o<\/sub>, so we reject H<sub>o<\/sub>\u00a0and accept H<sub>a<\/sub>.<\/li>\n<li>if the p-value &gt; \u03b1 (usually .05), then our data are not considered to be \u201csurprising enough\u201d when H<sub>o<\/sub>\u00a0is true, and we say that our data do not provide enough evidence to reject H<sub>o<\/sub>\u00a0(or, equivalently, that the data do not provide enough evidence to accept H<sub>a<\/sub>).<\/li>\n<\/ul>\n<p id=\"N10B65\"><em>Important comment about wording.<\/em><\/p>\n<p id=\"N10B6A\">Another common wording (mostly in scientific journals) is:<\/p>\n<p>\u201cThe results are statistically significant\u201d \u2013 when the p-value &lt; \u03b1.<\/p>\n<p id=\"N10B70\">\u201cThe results are not statistically significant\u201d \u2013 when the p-value &gt; \u03b1.<\/p>\n<div class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Comments<\/span><\/h2>\n<ol>\n<li>\n<p id=\"N10B7E\">Although the significance level provides a good guideline for drawing our conclusions, it should not be treated as an incontrovertible truth. There is a lot of room for personal interpretation. What if your p-value is .052? 
You might want to stick to the rules and say \u201c.052 &gt; .05 and therefore I don\u2019t have enough evidence to reject H<sub>o<\/sub>\u201d, but you might decide that .052 is small enough for you to believe that H<sub>o<\/sub>\u00a0should be rejected.<\/p>\n<p>It should be noted that scientific journals do consider .05 to be the cutoff point for which any p-value below the cutoff indicates enough evidence against H<sub>o<\/sub>, and any p-value above it,\u00a0<em class=\"italic\">or even equal to it<\/em>, indicates there is not enough evidence against H<sub>o<\/sub>.<\/p>\n<\/li>\n<li>\n<p id=\"N10B95\">It is important to draw your conclusions\u00a0<em>in context<\/em>. It is\u00a0<em>never enough<\/em>\u00a0to say:\u00a0<em class=\"italic\">\u201cp-value = \u2026, and therefore I have enough evidence to reject H<sub>o<\/sub>\u00a0at the .05 significance level.\u201d<\/em> You\u00a0<em>should always add:<\/em>\u00a0\u201c\u2026 and conclude that \u2026 (what it means in the context of the problem)\u201d.<\/p>\n<\/li>\n<li>\n<p id=\"N10BAA\">Let\u2019s go back to the issue of the nature of the two types of conclusions that I can make.<\/p>\n<p id=\"N10BAD\"><em>Either<\/em>\u00a0<em class=\"italic\">I reject H<sub>o<\/sub>\u00a0and accept H<sub>a<\/sub>\u00a0(when the p-value is smaller than the significance level)<\/em>\u00a0<em>or<\/em>\u00a0<em class=\"italic\">I cannot reject H<sub>o<\/sub>\u00a0(when the p-value is larger than the significance level).<\/em><\/p>\n<\/li>\n<\/ol>\n<p id=\"N10BC7\">As we mentioned earlier, note that the second conclusion does not imply that I accept H<sub>o<\/sub>, but just that I don\u2019t have enough evidence to reject it. Saying (by mistake) \u201cI don\u2019t have enough evidence to reject H<sub>o<\/sub>\u00a0so I accept it\u201d indicates that the data provide evidence that H<sub>o<\/sub>\u00a0is true, which is\u00a0<em>not necessarily the case<\/em>. 
Consider the following slightly artificial yet effective example:<\/p>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"N10BD8\">An employer claims to subscribe to an \u201cequal opportunity\u201d policy, not hiring men any more often than women for managerial positions. Is this credible? You\u2019re not sure, so you want to test the following\u00a0<em>two hypotheses:<\/em><\/p>\n<ul class=\"none\">\n<li><em>H<sub>o<\/sub>:<\/em>\u00a0The proportion of male managers hired is .5<\/li>\n<li>\n<p id=\"N10BEC\"><em>H<sub>a<\/sub>:<\/em>\u00a0The proportion of male managers hired is more than .5<\/p>\n<\/li>\n<\/ul>\n<p id=\"N10BF6\"><em>Data:<\/em>\u00a0You choose at random three of the new managers who were hired in the last 5 years and find that all 3 are men.<\/p>\n<p id=\"N10BFC\"><em>Assessing Evidence:<\/em>\u00a0If the proportion of male managers hired is really .5 (H<sub>o<\/sub>\u00a0is true), then the probability that the random selection of three managers will yield three males is .5 * .5 * .5 = .125. This is the p-value.<\/p>\n<p id=\"N10C05\"><em>Conclusion:<\/em>\u00a0Using .05 as the significance level, you conclude that since the p-value = .125 &gt; .05, the fact that the three randomly selected managers were all males is not enough evidence to reject H<sub>o<\/sub>. 
In other words, you do not have enough evidence to reject the employer\u2019s claim of subscribing to an equal opportunity policy.<\/p>\n<p id=\"N10C0E\">However,\u00a0<em>the data (all three selected are males) definitely do not provide evidence to accept the employer\u2019s claim (H<sub>o<\/sub>).<\/em><\/p>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Learn by Doing<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"N10C22\">The following two hypotheses are tested:<\/p>\n<ul class=\"none\">\n<li>\n<p>H<sub>o<\/sub>: The proportion of U.S. adults who oppose gay marriage is roughly 50%.<\/p>\n<\/li>\n<li>\n<p id=\"N10C32\">H<sub>a<\/sub>: The proportion of U.S. adults who oppose gay marriage is above 50% (i.e., the majority oppose).<\/p>\n<\/li>\n<\/ul>\n<p>Suppose a survey was conducted in which a random sample of 1,100 U.S. adults was asked about their opinions about gay marriage, and based on the data, the p-value was found to be .002.<\/p>\n<p id=\"N10C3C\">Comment: Throughout this activity use a .05 (5%) significance level (cutoff).<\/p>\n<div id=\"h5p-157\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-157\" class=\"h5p-iframe\" data-content-id=\"157\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"8.1 Learn by doing 6\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"N10CC5\">The following two hypotheses are tested:<\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10CCD\">H<sub>o<\/sub>: The average number of miles driven per year is 12,000.<\/p>\n<\/li>\n<li>\n<p id=\"N10CD5\">H<sub>a<\/sub>: The average number of miles driven per year is less than 12,000.<\/p>\n<\/li>\n<\/ul>\n<p id=\"N10CDC\">In a survey, 1,600 randomly selected drivers 
were asked the number of miles they drive yearly. Based upon the results, the p-value = .068.<\/p>\n<p id=\"N10CDF\">Comment: Throughout this activity use a .05 (5%) significance level.<\/p>\n<div id=\"h5p-158\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-158\" class=\"h5p-iframe\" data-content-id=\"158\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"8.1 Learn by doing 7\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<h2><span title=\"Quick scroll up\">Let\u2019s summarize<\/span><\/h2>\n<p id=\"N10B20\">We learned quite a lot about hypothesis testing. We learned the logic behind it, what the key elements are, and what types of conclusions we can and cannot draw in hypothesis testing. Here is a quick recap:<\/p>\n<div class=\"figurewrap\">\n<div class=\"figure clearfix\">\n<div id=\"uwrap__i_0\" class=\"youtube\">\n<p>&nbsp;<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-1\" title=\"Hypothesis Testing\" width=\"500\" height=\"375\" src=\"https:\/\/www.youtube.com\/embed\/GzkWcsJyPH4?feature=oembed&#38;rel=0&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p><em>Background:<\/em>\u00a0According to the National Center for Health Statistics, the proportion of babies born at low birth weight (below 2,500 grams) in the United States is roughly .078, or 7.8% (based on all the births in the United States in the year 2002). A study was done in order to check whether smoking by pregnant women increases the risk of low birth weight. In other words, the researchers wanted to check whether the proportion of babies born at low birth weight among women who smoked during their pregnancy is higher than the proportion in the general population. 
The researchers followed a sample of 400 women who had smoked during their pregnancy and recorded the birth weight of the newborns. Based on the data, the p-value was found to be .016.<\/p>\n<div id=\"h5p-159\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-159\" class=\"h5p-iframe\" data-content-id=\"159\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"8.1 Did I get this 2\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p>The same researchers also wanted to examine whether second-hand smoking (exposure to another person smoking) by pregnant women increases the risk of low birth weight (i.e., the proportion of babies born at a low birth weight among women who were second-hand smokers during their pregnancy is higher than the proportion in the general population). The researchers obtained a sample of 175 pregnant women who were second-hand smokers, followed them during their pregnancies, and found that 10.2% of the newborns had low birth weight. 
Based on these data, the p-value was found to be .119.<\/p>\n<div id=\"h5p-160\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-160\" class=\"h5p-iframe\" data-content-id=\"160\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"8.1 Did I get this 3\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/section>\n","protected":false},"author":150,"menu_order":2,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[48],"contributor":[],"license":[],"class_list":["post-555","chapter","type-chapter","status-publish","hentry","chapter-type-numberless"],"part":421,"_links":{"self":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters\/555","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/users\/150"}],"version-history":[{"count":9,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters\/555\/revisions"}],"predecessor-version":[{"id":908,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters\/555\/revisions\/908"}],"part":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/parts\/421"}],"metadata":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters\/555\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/media?parent=555"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.ccconlin
e.org\/mat1260\/wp-json\/pressbooks\/v2\/chapter-type?post=555"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/contributor?post=555"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/license?post=555"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}