{"id":571,"date":"2024-10-18T02:48:24","date_gmt":"2024-10-18T02:48:24","guid":{"rendered":"https:\/\/pressbooks.ccconline.org\/mat1260\/?post_type=chapter&#038;p=571"},"modified":"2025-01-24T18:49:16","modified_gmt":"2025-01-24T18:49:16","slug":"10-2-two-independent-means","status":"publish","type":"chapter","link":"https:\/\/pressbooks.ccconline.org\/mat1260\/chapter\/10-2-two-independent-means\/","title":{"raw":"10.2: Inference for Two Independent Means","rendered":"10.2: Inference for Two Independent Means"},"content":{"raw":"<div id=\"lobjh\" class=\"\">\r\n<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<h2 class=\"textbox__title\">Learning Objectives<\/h2>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ul>\r\n \t<li id=\"carry_out_inferential_method_groups\">In a given context, carry out the inferential method for comparing groups and draw the appropriate conclusions.<\/li>\r\n \t<li id=\"specify_hypotheses_for_groups\">Specify the null and alternative hypotheses for comparing groups.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div id=\"b5c4b619553c4e55ba36c14050ba4849\" class=\"section purposewrap\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Comparing Two Means\u2014Two Independent Samples (The Two-Sample t-Test)<\/span><\/h2>\r\n<div id=\"ceb0b2ffcfbf4c118deb08c5504905f9\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h3>Overview<\/h3>\r\n<p id=\"e89b27f006554c30a9d3bcfe60ae8899\">As we mentioned in the summary of the introduction to Case C\u2192Q, the first case that we will deal with is comparing two means when the two samples are independent:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"ccbe0916ce1c4a17ac7bc369ff890fd3\" class=\"img-responsive popimg aligncenter\" title=\"Sub-Population 1 has a Y Mean of \u03bc_1, and Sub-Population 2 has a Y Mean of \u03bc_2. 
From Sub-population 1 we take an SRS of size n_1, and from Sub-population 2 we take an SRS of size n_2. Both of these samples are independent.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image014.gif\" alt=\"Sub-Population 1 has a Y Mean of \u03bc_1, and Sub-Population 2 has a Y Mean of \u03bc_2. From Sub-population 1 we take an SRS of size n_1, and from Sub-population 2 we take an SRS of size n_2. Both of these samples are independent.\" \/><\/span><\/span>\r\n<p id=\"d7b09156f9a84e869b349611cdb4008e\">Recall that here we are interested in the effect of a two-valued (k = 2) categorical variable (X) on a quantitative response (Y). Samples are drawn independently from the two sub-populations (defined by the two categories of X), and we need to evaluate whether the data provide enough evidence for us to believe that the two sub-population means are different.<\/p>\r\n<p id=\"ed61630578e547b1aefff7879e709a58\">In other words, our goal is to test whether the means \u03bc<sub>1<\/sub>\u00a0and \u03bc<sub>2<\/sub>\u00a0(the means of the variable of interest in the two sub-populations) are equal. To do that, we have two samples, one from each sub-population, which were chosen independently of each other. As the title of this part suggests, the test that we will learn here is commonly known as the\u00a0<em class=\"italic\">two-sample t-test<\/em>. It is a t-test, which, as we know, means that the p-values for this test are calculated under a t distribution. 
Here is how this part is organized.<\/p>\r\n<p id=\"f9d3cd00f134493cbdc92b015088ad52\">We first introduce our leading example, and then go in detail through the four steps of the two-sample t-test, illustrating each step using our example.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div id=\"d2b04204d7c041039bc6c814df590186\" class=\"pulloutwrap note\">\r\n<div class=\"pullout clearfix\">\r\n<div>\r\n\r\n<strong><span class=\"pullout-lbl\">Note\u2026<\/span><\/strong>Up until now, we have been dividing our population into\u00a0<em class=\"italic\">sub-populations<\/em>, then sampling from these sub-populations.\r\n<p id=\"ed74e7b3f9ea4527af290e98a77ef577\">From now on, instead of calling them sub-populations, we will usually call the groups we wish to compare\u00a0<em class=\"italic\">population 1, population 2,\u00a0<\/em>and so on. These two descriptions of the groups we are comparing can be used interchangeably.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div id=\"de803c92f05c4a30828ba9066eacaf54\" class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"c16e7606130f4c3796cf7cc06e8ff642\">What is more important to you \u2014 personality or looks?<\/p>\r\n<p id=\"ea70b07524744541aadd62fd37c15fb9\">This question was asked of a random sample of 239 college students, who were to answer on a scale of 1 to 25. An answer of 1 means personality has maximum importance and looks no importance at all, whereas an answer of 25 means looks have maximum importance and personality no importance at all. The purpose of this survey was to examine whether males and females differ with respect to the importance of looks vs. 
personality.<\/p>\r\n\r\n<div class=\"altSelector\"><\/div>\r\n<div class=\"Excel2019PC altContentOn\">\r\n<div class=\"alternative\">\r\n<p id=\"edb825a6a2fe45cdbc3d3dcea809128d\">To open Excel with the data in the worksheet, right-click to download the <a href=\"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-content\/uploads\/sites\/206\/2024\/10\/looks.xls\">looks<\/a> file to your computer. Then find the downloaded file and double-click it to open it in Excel. When Excel opens, you may have to enable editing.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<p id=\"d7156d8a11cc41489cab140e427090b9\">Note that the data have the following format:<\/p>\r\n\r\n<table id=\"a5d58fb7c8f9485b9cab58780357dd23\" class=\"grid aligncenter\">\r\n<thead>\r\n<tr style=\"height: 27px\">\r\n<th style=\"height: 27px;width: 136.031px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"ad37ffe30324e4b348ac81595f39e03e2\">Score (Y)<\/p>\r\n<\/th>\r\n<th style=\"height: 27px;width: 160.719px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"abde5e22fcb374270914b9524ce7c935e\">Gender (X)<\/p>\r\n<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr style=\"height: 27px\">\r\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"addaf7a5ca1704a2cb4c37624939b8ac2\">15<\/p>\r\n<\/td>\r\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"aff39d31e08424d238c1872480333ae7a\">Male<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"e\" style=\"height: 27px\">\r\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"ad4d7bc47268a40bd928ea72082dbb9d6\">13<\/p>\r\n<\/td>\r\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"ae32166c3cc354c5db2d1c6a106391aaf\">Female<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr style=\"height: 27px\">\r\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p 
id=\"af2c4025ec7cd47bdadded2fb9a89139b\">10<\/p>\r\n<\/td>\r\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"ad17b64c929a54906a9976106ca77d542\">Female<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"e\" style=\"height: 27px\">\r\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"ae201938ba3c741fe963151df97304dff\">12<\/p>\r\n<\/td>\r\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"ab50e8397ad42409491bde442c88e0c83\">Male<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr style=\"height: 27px\">\r\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"aa89c817de4f9443a997fea77ea8bca56\">14<\/p>\r\n<\/td>\r\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"aa1ec504596e442b4a6b9382e81028883\">Female<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"e\" style=\"height: 27px\">\r\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"aa74a699bae9c491d8c857491450ddd7f\">14<\/p>\r\n<\/td>\r\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"aed45f110e92e4645b3b5790e6abeeef7\">Male<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr style=\"height: 27px\">\r\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"aa591601b2e4b4005892551f7b912d723\">6<\/p>\r\n<\/td>\r\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"acd9979189b3f412984ca1a80846347f5\">Male<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"e\" style=\"height: 10px\">\r\n<td style=\"height: 10px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"add0baa2ccf574a4b84dbe3103a437aeb\">17<\/p>\r\n<\/td>\r\n<td style=\"height: 10px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p 
id=\"abf6f57cb867a4db2a4c77f220adb64b4\">Male<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr style=\"height: 27px\">\r\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"ab61ebc7846274e31813a0f27e9dbbd12\">etc.<\/p>\r\n<\/td>\r\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\r\n<p id=\"N10C8A\"><\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<p id=\"a687dd9b2a67499f895411cd5c991578\">The format of the data reminds us that we are essentially examining the relationship between the two-valued categorical variable, gender, and the quantitative response, score. The two values of the categorical explanatory variable define the two populations that we are comparing \u2014 males and females. The comparison is with respect to the response variable score. Here is a figure that summarizes the example:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"cf9cbc8d9b2742a38c152deb41b4dddb\" class=\"img-responsive popimg aligncenter\" title=\"We have two populations, Females and Males. This is our Gender (X) Variable. For each of these populations, there is a Score (Y) mean, \u03bc_1 for Females and \u03bc_2 for Males. For the Female population we generate an SRS of size 150. For Males, we generate a SRS of size 85.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image018.gif\" alt=\"We have two populations, Females and Males. This is our Gender (X) Variable. For each of these populations, there is a Score (Y) mean, \u03bc_1 for Females and \u03bc_2 for Males. For the Female population we generate an SRS of size 150. 
For Males, we generate a SRS of size 85.\" \/><\/span><\/span>\r\n<p id=\"cba6ede8680247b99165fc345d8237d0\"><em class=\"italic\">Comments:<\/em><\/p>\r\n\r\n<ol id=\"a7f70c3359a648cea04c847e4c4b2716\">\r\n \t<li>\r\n<p id=\"ef055f8c386442f0a21f7801ca08ffe2\">Note that this figure emphasizes that, because our explanatory variable is a two-valued categorical variable, in practice we are comparing two populations (defined by these two values) with respect to our response Y.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"d85071ffd5f746b08a6054a6b5305122\">Note that even though the problem description just says that we had 239 students, the figure tells us that there were 85 males and 150 females in the sample.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"cc8f17ae69724a50b3bd2c00e231a3e7\">Following up on comment 2, note that 85 + 150 = 235 and not 239. In these data (which are real) there are four \u201cmissing observations\u201d: four students for whom we do not have the value of the response variable, \u201cimportance.\u201d This could be due to a number of reasons, such as recording error or nonresponse. The bottom line is that even though data were collected from 239 students, effectively we have data from only 235. (Recommended: Go through the data file and note that there are 4 cases of missing observations: students 34, 138, 179, and 183.)<\/p>\r\n<\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\n<div id=\"N10AFD\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">The Two-Sample t-Test<\/span><\/h2>\r\n<p id=\"N10B04\">Here again is the general situation that requires us to use the two-sample t-test:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_0\" class=\"img-responsive popimg aligncenter\" title=\"Sub-Population 1 has a Y Mean of \u03bc_1, and Sub-Population 2 has a Y Mean of \u03bc_2. From Sub-population 1 we take an SRS of size n_1, and from Sub-population 2 we take an SRS of size n_2. 
Both of these samples are independent.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image014.gif\" alt=\"Sub-Population 1 has a Y Mean of \u03bc_1, and Sub-Population 2 has a Y Mean of \u03bc_2. From Sub-population 1 we take an SRS of size n_1, and from Sub-population 2 we take an SRS of size n_2. Both of these samples are independent.\" \/><\/span><\/span>\r\n<p id=\"N10B0D\">Our goal is to compare the means \u03bc<sub>1<\/sub>\u00a0and \u03bc<sub>2<\/sub>\u00a0based on the two independent samples.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div id=\"N10B1B\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Step 1: Stating the Hypotheses<\/span><\/h2>\r\n<p id=\"N10B22\">The hypotheses represent our goal: comparing the means \u03bc<sub>1<\/sub>\u00a0and \u03bc<sub>2<\/sub>.<\/p>\r\n\r\n<ul>\r\n \t<li>\r\n<p id=\"N10B2D\">The null hypothesis has the form:<\/p>\r\n\r\n<ul class=\"none\">\r\n \t<li>\r\n<p id=\"N10B33\">[latex]H_{0}:\\mu _{1}-\\mu _{2}=0[\/latex] (which is the same as [latex] H_{0}:\\mu _{1}=\\mu _{2}[\/latex] )<\/p>\r\n<\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10B91\">The alternative hypothesis takes one of the following three forms (depending on the context):<\/p>\r\n\r\n<ul class=\"none\">\r\n \t<li>\r\n<p id=\"N10B97\">[latex]H_{a}:\\mu _{1}-\\mu _{2}&lt; 0[\/latex] (which is the same as [latex] H_{a}:\\mu _{1}&lt;\\mu _{2}[\/latex] ) (one-sided)<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10BF5\">[latex]H_{a}:\\mu _{1}-\\mu _{2}&gt; 0[\/latex] (which is the same as [latex] H_{a}:\\mu _{1}&gt;\\mu _{2}[\/latex] ) (one-sided)<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"N10C53\">[latex]H_{a}:\\mu _{1}- \\mu _{2}\\neq 0[\/latex]\u00a0(which is the same as\u00a0[latex]H_{a}:\\mu _{1}\\neq \\mu _{2}[\/latex]\u00a0) (two-sided)<\/p>\r\n<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10CB0\">Note that the 
null hypothesis claims that there is no difference between the means, which can be represented either as a difference of 0 (no difference) or as the (algebraically and conceptually) equivalent statement\u00a0[latex]\\mu _{1}= \\mu _{2}[\/latex]\u00a0(the means are equal). Either way, conceptually, H<sub>o<\/sub>\u00a0claims that there is no relationship between the two relevant variables.<\/p>\r\n<p id=\"N10CD4\">The first way of writing the hypotheses (using a difference between the means) will be easier to use when (in the future) we test for a difference other than 0.<\/p>\r\n<p id=\"N10CD7\">Each one of the three alternatives claims that there is a difference between the means. The two one-sided alternatives specify the nature of the difference: either negative, indicating that \u03bc<sub>1<\/sub>\u00a0is smaller than \u03bc<sub>2<\/sub>, or positive, indicating that \u03bc<sub>1<\/sub>\u00a0is larger than \u03bc<sub>2<\/sub>. The two-sided alternative, as usual, is more general and simply claims that a difference exists. As before, it should be clear from the context of the problem which of the three alternatives is appropriate.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div id=\"N10CEB\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Comment<\/span><\/h2>\r\n<p id=\"N10CF2\">Note that our parameter of interest in this case (the parameter about which we are making an inference) is the difference between the means,\u00a0[latex]\\mu_{1}-\\mu _{2}[\/latex], and that the null value is 0.<\/p>\r\n\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10D15\">Recall that the purpose of this survey was to examine whether the opinions of females and males\u00a0<em>differ\u00a0<\/em>with respect to the importance of looks vs. personality. 
The hypotheses in this case are therefore:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_1\" class=\"img-responsive popimg aligncenter\" title=\"H_0: \u03bc_1 - \u03bc_2 = 0, H_a: \u03bc_1 - \u03bc_2 \u2260 0\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image030.gif\" alt=\"H_0: \u03bc_1 - \u03bc_2 = 0, H_a: \u03bc_1 - \u03bc_2 \u2260 0\" \/><\/span><\/span>\r\n<p id=\"N10D21\">where \u03bc<sub>1<\/sub>\u00a0represents the mean importance for females and \u03bc<sub>2<\/sub>\u00a0represents the mean importance for males.<\/p>\r\n<p id=\"N10D2A\">It is important to understand that conceptually, the two hypotheses claim:<\/p>\r\n<p id=\"N10D2D\">H<sub>o<\/sub>: Score (of looks vs. personality) is not related to gender<\/p>\r\n<p id=\"N10D33\">H<sub>a<\/sub>: Score (of looks vs. personality) is related to gender<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10D44\">In order to check the claim that the pregnancy length of women who smoke during pregnancy is shorter, on average, than the pregnancy length of women who do not smoke, a random sample of 35 pregnant women who smoke and a random sample of 35 pregnant women who do not smoke were chosen and their pregnancy lengths were recorded. Here is a figure of this example:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_2\" class=\"img-responsive popimg aligncenter\" title=\"The Smoking (X) variable gives us our two populations. These are Population 1: Pregnant women who smoke, and Pop 2: Pregnant Women who don't smoke. For each of these populations we have the variable Length (Y) and its mean. For smokers we have \u03bc_1, and for non-smokers we have \u03bc_2. 
From the population of smokers, we create an SRS of size 35, and from the population of non-smokers we create an SRS of 35.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image156.gif\" alt=\"The Smoking (X) variable gives us our two populations. These are Population 1: Pregnant women who smoke, and Pop 2: Pregnant Women who don't smoke. For each of these populations we have the variable Length (Y) and its mean. For smokers we have \u03bc_1, and for non-smokers we have \u03bc_2. From the population of smokers, we create an SRS of size 35, and from the population of non-smokers we create an SRS of 35.\" \/><\/span><\/span>\r\n<div class=\"asx\">\r\n<div id=\"du4_m3_twosamples2_tutor1\" class=\"activitywrap sectionNest flash\">\r\n<div class=\"activityhead\">\r\n<div class=\"activityinfo\"><\/div>\r\n<\/div>\r\n<div class=\"actContain\">\r\n<div class=\"activity flash\">\r\n<div id=\"u4_m3_twosamples2_tutor1\" class=\"flash_obj asx testFlash mark_flash\">\r\n<div id=\"ou4_m3_twosamples2_tutor1\" class=\"page 2963795\">\r\n<div id=\"2963795\" class=\"question ddfb\">\r\n<div>\r\n<p id=\"N10077\">[h5p id=\"224\"]<\/p>\r\n<p id=\"N1007F\">Note that \u201cmu\u201d stands for the Greek letter \u03bc, the population mean\u2014mu1 stands for the mean of population 1 and mu2 stands for the mean of population 2.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div id=\"ad7f14fd8ed34ebb8a4fb01f9a96c2e6\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Step 2: Check Conditions, and Summarize the Data Using a Test Statistic<\/span><\/h2>\r\n<p id=\"fc79d192eed94671a30c9f62206b8c86\">The two-sample t-test can be safely used as long as the following conditions are met:<\/p>\r\n\r\n<ol id=\"f97a6abe462148c782ad0abf03782069\">\r\n \t<li>\r\n<p 
id=\"e6a1e2f4a4e241d482a98ad7875ec678\">The two samples are indeed independent.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"c823cd8259ed4d868d88a8924fafef96\">We are in one of the following two scenarios:<\/p>\r\n\r\n<ol id=\"fe491992270e4f5392699e06513930a7\" class=\"lower-roman\">\r\n \t<li>\r\n<p id=\"e0762ba45fa246b58625d4b82860f412\">Both populations are normal, or more specifically, the distribution of the response Y in both populations is normal, and both samples are random (or at least can be considered as such). In practice, checking normality in the populations is done by looking at each of the samples using a histogram and checking whether there are any signs that the populations are not normal. Such signs could be extreme skewness and\/or extreme outliers.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"dc38d4c25aa64eeb921c2db27dbe75e8\">The populations are known or discovered not to be normal, but the sample size of each of the random samples is large enough (as a rule of thumb, a sample size greater than 30 is considered large enough).<\/p>\r\n<\/li>\r\n<\/ol>\r\n<\/li>\r\n<\/ol>\r\n<p id=\"fd6e2f7b42824dd09a723a8783d13f40\">Assuming that we can safely use the two-sample t-test, we need to summarize the data and, in particular, calculate the test statistic.<\/p>\r\n<p id=\"bf0d2ea66ebe4fb9b6bdaf2e2a93f1dd\"><em class=\"italic\">The two-sample t-test statistic<\/em>\u00a0is:<\/p>\r\n\u00a0[latex]t=\\frac{(\\bar{y_{1}}-\\bar{y_{2}})-0}{\\sqrt{\\frac{s_{1}^{2}}{n_{1}}+\\frac{s_{2}^{2}}{n_{2}}}}[\/latex]\r\n<p id=\"b664119348e64ba69e3a98ca0bb6523b\">Where:<\/p>\r\n<p id=\"f09a09d7d13640c89e0df517e41b883e\">[latex]\\overline{y_{1}},\\overline{y_{2}}[\/latex]\u00a0are the sample means of the samples from population 1 and population 2 respectively.<\/p>\r\n<p id=\"bb324e17e88645608b65224d5035eb83\">[latex]s_{1},s_{2}[\/latex]\u00a0are the sample standard deviations of the samples from population 1 and population 2 respectively.<\/p>\r\n<p id=\"c38c81f4ee3e4ab185da7dc1858960be\">[latex]n_{1},n_{2}[\/latex]\u00a0are the sample sizes of the two samples.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div id=\"a0c9248df13542bfa77e06a0664b64a7\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Comment<\/span><\/h2>\r\n<p id=\"f959b35949654b7195a9d6c14bf09748\">Let\u2019s see why this test statistic makes sense, bearing in mind that our inference is about\u00a0[latex]\\mu_{1}-\\mu_{2}[\/latex].<\/p>\r\n\r\n<ul id=\"cf85e0c633e64748a6cc9c35c766f50c\">\r\n \t<li>\r\n<p id=\"d0c4fdaef9aa49c18ef5c1f0d2ac1d7b\">[latex]\\overline{y_{1}}[\/latex]\u00a0estimates \u03bc<sub>1<\/sub>\u00a0and\u00a0[latex]\\overline{y_{2}}[\/latex]\u00a0estimates \u03bc<sub>2<\/sub>, and therefore\u00a0[latex]\\overline{y_{1}}-\\overline{y_{2}}[\/latex]\u00a0is what the data tell us about (or, how the data estimate)<\/p>\r\n<p id=\"f04c54a0c4374cefb205ef3eb832a3d3\">[latex]\\mu_{1}-\\mu_{2}[\/latex].<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"b20f1197393248abb0cf5829eb5ecf1d\">0 is the \u201cnull value\u201d: what the null hypothesis, H<sub>o<\/sub>, claims that\u00a0[latex]\\mu_{1}-\\mu_{2}[\/latex]\u00a0is.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"a8c1b5b850a34d2887a4ba31d6201260\">The denominator\u00a0[latex]\\sqrt{\\frac{s_{1}^{2}}{n_{1}}+\\frac{s_{2}^{2}}{n_{2}}}[\/latex]\u00a0is the standard error of\u00a0[latex]\\overline{y_{1}}-\\overline{y_{2}}[\/latex]. 
(We will not go into the details of why this is true.)<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"cefcd75fbda94756a85985b4dcd09f2d\">We therefore see that our test statistic, like the previous test statistics we encountered, has the structure:<\/p>\r\n[latex]\\frac{sample\\ estimate-null\\ value}{standard\\ error}[\/latex]\r\n<p id=\"e92762ca21d24b278c442ff3c1e47d6d\">and therefore, like the previous test statistics, measures (in standard errors) the difference between what the data tell us about the parameter of interest\u00a0<span id=\"MathJax-Element-14-Frame\" class=\"mjx-chtml MathJax_CHTML\"><span id=\"MJXc-Node-197\" class=\"mjx-math\"><span id=\"MJXc-Node-198\" class=\"mjx-mrow\"><span id=\"MJXc-Node-199\" class=\"mjx-mrow\"><span id=\"MJXc-Node-200\" class=\"mjx-msub\"><span class=\"mjx-base\"><span id=\"MJXc-Node-201\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span id=\"MJXc-Node-202\" class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/sub><\/span><span id=\"MJXc-Node-203\" class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span id=\"MJXc-Node-204\" class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span id=\"MJXc-Node-205\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span id=\"MJXc-Node-206\" class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/sub><\/span><\/span><\/span><\/span><\/span>\u00a0(sample estimate) and what the null hypothesis claims the value of the parameter is (null value).<\/p>\r\n\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"f05ee6ed0c6c44a0bf11fe4ba40526c4\">Let\u2019s first check whether the conditions that allow us to safely use the 
two-sample t-test are met.<\/p>\r\n\r\n<ol id=\"df98f3cefcce4aa6959463a890d1e4f0\" class=\"lower-roman\">\r\n \t<li>\r\n<p id=\"ef4d7a2bca6241d49d2aac10c66e14d9\">Here, 239 students were chosen and were naturally divided into a sample of females and a sample of males. Since the students were chosen at random, the sample of females is independent of the sample of males.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"b6a8383586684f899088b14895907a66\">Here we are in the second scenario \u2014 the sample sizes (150 and 85) are definitely large enough, and so we can proceed regardless of whether the populations are normal or not.<\/p>\r\n<\/li>\r\n<\/ol>\r\n<div class=\"StatCrunch altContentOn\">\r\n<div class=\"alternative\">\r\n<p id=\"ca6ed6e22193464f82032799d4835a1b\">In order to avoid tedious calculations, we will lift the test statistic from the output. The StatCrunch output (edited) is shown below:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"d19980c9effc49f2ba294184b28ff5d4\" class=\"img-responsive popimg aligncenter\" title=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &amp;lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. 
Limit: -1.4962921\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image192_statcrunch.gif\" alt=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &amp;lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" \/><\/span><\/span>\r\n<p id=\"b4f4d2c2dae945a59d6a76295ca02d2a\">As you can see we highlighted the \u201cingredients\u201d needed to calculate the test statistic, as well as the test statistic itself. 
Just for this first example, let\u2019s make sure that we understand what these ingredients are and how to use them to find the test statistic.<\/p>\r\n\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h4 class=\"textbox__title\">Learn by Doing<\/h4>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n[h5p id=\"225\"]\r\n\r\n<\/div>\r\n<\/div>\r\n<p id=\"d5047aff759044adb4d933cdb1cb8feb\">And when we put it all together we get that indeed,<\/p>\r\n[latex]\\mathcal{t}=\\frac{\\bar{\\mathcal{y}_1}-\\bar{\\mathcal{y}_2}-0}{\\sqrt{\\frac{\\mathcal{s}_1^2}{\\mathcal{n}_1}+\\frac{\\mathcal{s}_2^2}{\\mathcal{n}_2}}}=\\ \\frac{10.73-13.33}{\\sqrt{\\frac{{4.25}^2}{150}+\\frac{{4.02}^2}{85}}}=-4.66[\/latex]\r\n<p id=\"e1dac8d831e74cd5b75e1b908018d071\">The test statistic tells us what the data tell us about\u00a0<span id=\"MathJax-Element-16-Frame\" class=\"mjx-chtml MathJax_CHTML\"><span id=\"MJXc-Node-286\" class=\"mjx-math\"><span id=\"MJXc-Node-287\" class=\"mjx-mrow\"><span id=\"MJXc-Node-288\" class=\"mjx-mrow\"><span id=\"MJXc-Node-289\" class=\"mjx-msub\"><span class=\"mjx-base\"><span id=\"MJXc-Node-290\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span id=\"MJXc-Node-291\" class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/sub><\/span><span id=\"MJXc-Node-292\" class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span id=\"MJXc-Node-293\" class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span id=\"MJXc-Node-294\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span id=\"MJXc-Node-295\" class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/sub><\/span><\/span><\/span><\/span><\/span>. 
In this case that difference (10.73 \u2013 13.33) is 4.66 standard errors below what the null hypothesis claims this difference to be (0). 4.66 standard errors is quite a lot and probably indicates that the data provide evidence against H<sub>o<\/sub>.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\nWe have completed step 2 and are ready to proceed to step 3, finding the p-value of the test.\r\n<h2><span title=\"Quick scroll up\">Step 3: Finding the p-value of the test<\/span><\/h2>\r\n<p id=\"bf615bd08efb48c7b0e9cb6280496b06\">Since our test is called the two-sample t-test, we know that the p-values are calculated under a t distribution. Indeed, it turns out that the null distribution of our test statistic is approximately t. Figuring out which one of the t distributions (in other words, how many degrees of freedom this t distribution has) is quite involved and will not be discussed here. Instead, we use a statistics package to find the p-value.<\/p>\r\n\r\n<div id=\"a3d1aa7f1b1a48738a4ce6a04142e632\" class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"StatCrunch altContentOn\">\r\n<div class=\"alternative\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<div class=\"StatCrunch altContentOn\">\r\n<div class=\"alternative\">\r\n<p id=\"ee43826ce74c4b1fbe34d003ff40f6b9\">Here again is the relevant output for our example:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"f6a2b08958c047a38845a8069d2ac604\" class=\"img-responsive popimg aligncenter\" title=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. 
= 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &amp;lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image192_statcrunch_2.gif\" alt=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &amp;lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" \/><\/span><\/span>\r\n<p id=\"da07ff0ff6f74b49a6e124b201a2c5be\">According to the output the p-value of this test is less than 0.0001. 
How do we interpret this?<\/p>\r\n<p id=\"fdc7820a59154b94b73e13e69df2e2d2\">A p-value that is practically 0 means that it would be almost impossible to get data like that observed (or even more extreme) had the null hypothesis been true.<\/p>\r\n<p id=\"e6d287dcd66c412da4784fa3b75abd7f\">More specifically, in our example, if there were no differences between females and males with respect to how they value looks vs. personality, it would be almost impossible (probability approximately 0) to get data where the difference between the sample means of females and males is -2.596 (that difference is 10.733 \u2013 13.329 = -2.596) or more extreme, that is, at least 2.596 away from 0 in either direction.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<p id=\"ec55a502861e40e48f5cd593f57e44b4\">Comment: Note that the output tells us that\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-mover\"><span class=\"mjx-stack\"><span class=\"mjx-over\"><span class=\"mjx-mo\"><span class=\"mjx-delim-h\"><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><\/span><\/span><\/span><span class=\"mjx-op\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">y<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/sub><\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-mtext MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u00a0<\/span><\/span><span class=\"mjx-mover\"><span class=\"mjx-stack\"><span class=\"mjx-over\"><span class=\"mjx-mo\"><span class=\"mjx-delim-h\"><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><span class=\"mjx-char 
MJXc-TeX-main-R\">\u00af<\/span><\/span><\/span><\/span><span class=\"mjx-op\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">y<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/sub><\/span><\/span><\/span><\/span><\/span><\/span><\/span>\u00a0is approximately -2.6. But more importantly, we want to know if this difference is significant. To answer this, we use the fact that this difference is 4.66 standard errors below the null value.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div id=\"d2984175cb644581a87ba513017c7d68\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Step 4: Conclusion in context<\/span><\/h2>\r\n<p id=\"b8f476f21b2940eeb5cce53568ae47c5\">As usual, a small p-value provides evidence against H<sub>o<\/sub>. In our case the p-value is practically 0 (which is smaller than any level of significance that we will choose). The data therefore provide very strong evidence against H<sub>o<\/sub>,\u00a0so we reject it and conclude that the mean Importance score (of looks vs. personality) of males differs from that of females. In other words, males and females differ with respect to how they value looks vs. 
personality.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div id=\"fcc366c48d0d4fec9e5167d92daccdef\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Comments<\/span><\/h2>\r\n<p id=\"af42a2ce465d447d8ee53b1ce10d1586\">You might ask yourself: \u201cWhere do we use the test statistic?\u201d<\/p>\r\n<p id=\"d533fea5cb0f48d39638a29341248bce\">It is true that for all practical purposes all we have to do is check that the conditions which allow us to use the two-sample t-test are met, lift the p-value from the output, and draw our conclusions accordingly.<\/p>\r\n<p id=\"bb3b85caacc243198954378214f789dc\">However, we feel that it is important to mention the test statistic for two reasons:<\/p>\r\n\r\n<ol id=\"caf67a2ae21149c687e0e92f0ad6282e\">\r\n \t<li>\r\n<p id=\"d0d303e608c545e6a49da2994e252d66\">The test statistic is what\u2019s behind the scenes; based on its null distribution and its value, the p-value is calculated.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"c3fb8578560d48fc86beea0ec6b1fa08\">Apart from being the key for calculating the p-value, the test statistic is also itself a measure of the evidence stored in the data against H<sub>o<\/sub>. 
As we mentioned, it measures (in standard errors) how different our data is from what is claimed in the null hypothesis.<\/p>\r\n<\/li>\r\n<\/ol>\r\n\r\n<hr \/>\r\n<p id=\"b270400a4cf340dd96d155e7c3b04c4a\">Let\u2019s look at another example, and then you\u2019ll do one yourself.<\/p>\r\n\r\n<div id=\"eb3d5832301444dfa3ec8e974b2e86e0\" class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div id=\"eff9df4b81cb4e73a84b6d6326ad00ee\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"d8bcd41a6c294bfbbafa227e139704e7\">According to the National Health And Nutrition Examination Survey (NHANES) sponsored by the U.S. government, a random sample of 712 males between 20 and 29 years of age and a random sample of 1,001 males over the age of 75 were chosen, and the weight of each of the males was recorded (in kg). Here is a summary of the results (source: http:\/\/www.cdc.gov\/nchs\/data\/ad\/ad347.pdf):<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"c8c2d0b8e7f84413af77e4265d72a1af\" class=\"img-responsive popimg aligncenter\" title=\"For males 20-29 years old, n = 712, Y-bar = 83.4, S = 18.7. For males 75+ years old, n = 1001, Y-bar = 78.5, S = 19.0\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image194.gif\" alt=\"For males 20-29 years old, n = 712, Y-bar = 83.4, S = 18.7. For males 75+ years old, n = 1001, Y-bar = 78.5, S = 19.0\" \/><\/span><\/span>\r\n<p id=\"d27d36c4fcf54a27bfbe00dbcc034612\">Do the data provide evidence that the younger male population weighs more (on average) than the older male population? 
(Note that here the data are given in a summarized form, unlike the previous problem, where the raw data were given.)<\/p>\r\n<p id=\"e411f0cb5bd448dbbd66f1d4a61de6bc\">Here is a figure that summarizes this example:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"b4394595edec4b64a55a6f2d8988c55d\" class=\"img-responsive popimg aligncenter\" title=\"We have two populations, from the two categories in the variable Age Group(X). Population 1 is Males 20-29 years old, and Population 2 is Males 75+ years old. Population 1&amp;apos;s Weight (Y) mean is \u03bc_1, and population 2&amp;apos;s weight (Y) mean is \u03bc_2. For population 1, a SRS of size 712 is generated. It has a mean of 83.4 and SD of 18.7 . For population 2, another SRS is generated of size 1001. It has a mean of 78.5 and SD of 19.0 .\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image043.gif\" alt=\"We have two populations, from the two categories in the variable Age Group(X). Population 1 is Males 20-29 years old, and Population 2 is Males 75+ years old. Population 1&amp;apos;s Weight (Y) mean is \u03bc_1, and population 2&amp;apos;s weight (Y) mean is \u03bc_2. For population 1, a SRS of size 712 is generated. It has a mean of 83.4 and SD of 18.7 . For population 2, another SRS is generated of size 1001. 
It has a mean of 78.5 and SD of 19.0 .\" \/><\/span><\/span>\r\n<p id=\"eafc5b46660e4ba7ac3af95788fcf214\">Note that we defined the younger age group and the older age group as population 1 and population 2, respectively, and \u03bc<sub>1<\/sub>\u00a0and \u03bc<sub>2<\/sub>\u00a0as the mean weight of population 1 and population 2, respectively.<\/p>\r\n<p id=\"af2db248783a4c5c9e5753d4f0a86399\"><em class=\"italic\">Step 1:<\/em><\/p>\r\n<p id=\"f9f8633456c745b6932e869de26cb262\">Since we want to test whether the older age group (population 2) weighs less on average than the younger age group (population 1), we are testing:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"ba0ecaa9314d40ca9e2a572238cd7eb8\" class=\"img-responsive popimg aligncenter\" title=\"H_0: \u03bc_1 - \u03bc_2 = 0, H_a: \u03bc_1 - \u03bc_2 &amp;gt; 0\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image044.gif\" alt=\"H_0: \u03bc_1 - \u03bc_2 = 0, H_a: \u03bc_1 - \u03bc_2 &amp;gt; 0\" \/><\/span><\/span>\r\n<p id=\"ab68c8a28da84b26beebb8c1c4f22216\">or equivalently,<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"cd0bfb7306b84e4ca183addaceb7d12e\" class=\"img-responsive popimg aligncenter\" title=\"H_0: \u03bc_1 = \u03bc_2, H_a: \u03bc_1 &amp;gt; \u03bc_2\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image045.gif\" alt=\"H_0: \u03bc_1 = \u03bc_2, H_a: \u03bc_1 &amp;gt; \u03bc_2\" \/><\/span><\/span>\r\n<p id=\"b770d52870bf4fa7b9abf9018d8c4761\"><em class=\"italic\">Step 2:<\/em><\/p>\r\n<p id=\"d53bf865dcf8466c81962e60d8d6798c\">We can safely use the two-sample t-test in this case since:<\/p>\r\n\r\n<ol id=\"cc5d2537c08446529bdc42f9f93de692\" class=\"lower-roman\">\r\n \t<li>\r\n<p id=\"ed490d0efa274047bbe7ac674396618c\">The samples are independent, since each 
of the samples was chosen at random.<\/p>\r\n<\/li>\r\n \t<li>\r\n<p id=\"e40ed184118f4582852174a0450d04e5\">Both sample sizes are very large (712 and 1,001), and therefore we can proceed regardless of whether the populations are normal or not.<\/p>\r\n<\/li>\r\n<\/ol>\r\n<p id=\"a46b62f88dc54b93a59d687801f701bc\">It is possible from these data to calculate the t-statistic of 5.31 and the p-value of 0.000. The t-value is quite large, and the p-value correspondingly small, indicating that our data are very different from what is claimed in the null hypothesis.<\/p>\r\n<p id=\"c67e98b271da452184e34acf82bc1df5\"><em class=\"italic\">Step 3:<\/em><\/p>\r\n<p id=\"e7a95f7ebfde4d1d84b0553a6b46736b\">The p-value is essentially 0, indicating that it would be nearly impossible to observe a difference between the sample mean weights of 4.9 (or more) if the mean weights in the age group populations were the same (i.e., if H<sub>o<\/sub>\u00a0were true).<\/p>\r\n<p id=\"b882b1c5b7aa4647b24713440abc7fe2\"><em class=\"italic\">Step 4:<\/em><\/p>\r\n<p id=\"b01f14ca4b3342bfada0f4a19255b91d\">A p-value of 0 (or very close to it) indicates that the data provide strong evidence against H<sub>o<\/sub>, so we reject it and conclude that the mean weight of males 20-29 years old is higher than the mean weight of males 75 years old and older. 
In other words, males in the younger age group weigh more, on average, than males in the older age group.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<h2><span title=\"Quick scroll up\">Confidence Interval for \u03bc<sub>1<\/sub>\u00a0\u2212\u00a0\u03bc<sub>2<\/sub> (Two-Sample t Confidence Interval)<\/span><\/h2>\r\n<p id=\"f53e8bc858774f5dbeb9655bcb6c460d\">So far we\u2019ve discussed the two-sample t-test, which checks whether there is enough evidence stored in the data to reject the claim that\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">0<\/span><\/span><\/span><\/span><\/span>\u00a0(or equivalently, that\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-msub MJXc-space3\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char 
MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/span><\/span><\/span><\/span>\u00a0) in favor of one of the three possible alternatives.<\/p>\r\n<p id=\"aff7008c161d46c9b63dbd520ea38410\">If we would like to estimate\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/sub><\/span><\/span><\/span><\/span>\u00a0we can use the natural point estimate,\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-mover\"><span class=\"mjx-stack\"><span class=\"mjx-over\"><span class=\"mjx-mo\"><span class=\"mjx-delim-h\"><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><\/span><\/span><\/span><span class=\"mjx-op\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">y<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/sub><\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span 
class=\"mjx-mtext MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u00a0<\/span><\/span><span class=\"mjx-mover\"><span class=\"mjx-stack\"><span class=\"mjx-over\"><span class=\"mjx-mo\"><span class=\"mjx-delim-h\"><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><span class=\"mjx-char MJXc-TeX-main-R\">\u00af<\/span><\/span><\/span><\/span><span class=\"mjx-op\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">y<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/sub><\/span><\/span><\/span><\/span><\/span><\/span><\/span>\u00a0, or preferably, a 95% confidence interval which will provide us with a set of plausible values for the difference between the population means\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/sub><\/span><\/span><\/span><\/span>\u00a0.<\/p>\r\n<p id=\"b52b90992ff34fe9b79da0dd0e667952\">In particular, if the test has rejected\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char 
MJXc-TeX-math-I\">H<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">o<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">:<\/span><\/span><span class=\"mjx-msub MJXc-space3\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">0<\/span><\/span><\/span><\/span><\/span>\u00a0, a confidence interval for\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/span><\/span><\/span><\/span>\u00a0can be insightful since it quantifies the 
effect that the categorical explanatory variable has on the response.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div id=\"f9136214344044bf9e3304edcc6855fa\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Comment<\/span><\/h2>\r\n<p id=\"a93f7fd93cb34b20b0cfc6c2eada6451\">We will not go into the formula and calculation of the confidence interval, but rather ask our software to do it for us, and focus on interpretation.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div id=\"afdfb70e4fc64177b2e744344306204a\" class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"a8312cf778e7448aba4821a5817057cf\">Recall our leading example about the looks vs. personality score of females and males:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"e0a85bbb6c2e4d40933b82bc4b487958\" class=\"img-responsive popimg aligncenter\" title=\"The Gender(X) Variable has two categories, which gives us Population 1: Females and Population 2: Males. Each population has its own Y-Mean \u03bc, so population 1&amp;apos;s mean is \u03bc_1 and population 2&amp;apos;s mean is \u03bc_2. For each population we take an SRS. For Population 1, an SRS of size 150 is taken, and for population 2 an SRS of size 85 is taken.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image047.gif\" alt=\"The Gender(X) Variable has two categories, which gives us Population 1: Females and Population 2: Males. Each population has its own Y-Mean \u03bc, so population 1&amp;apos;s mean is \u03bc_1 and population 2&amp;apos;s mean is \u03bc_2. For each population we take an SRS. 
For Population 1, an SRS of size 150 is taken, and for population 2 an SRS of size 85 is taken.\" \/><\/span><\/span>\r\n<div class=\"altSelector\"><\/div>\r\n<div class=\"StatCrunch altContentOn\">\r\n<div class=\"alternative\">\r\n<p id=\"d34df1f9495c44f4813597bab6208d5c\">Here again is the output:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"f233ade69c0e4161819ee92641ed3c50\" class=\"img-responsive popimg aligncenter\" title=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &amp;lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image196_statcrunch.gif\" alt=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. 
Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &amp;lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" \/><\/span><\/span>\r\n<p id=\"ac039cbe56fbe4a328d403c6961a7fad5\"><\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<p id=\"c6b4811423b8489ba405019f28d1e1f1\">Recall that we rejected the null hypothesis in favor of the two-sided alternative and concluded that the mean score of females is different from the mean score of males. It would be interesting to supplement this conclusion with more details about this difference between the means, and the 95% confidence interval for\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/span><\/span><\/span><\/span>\u00a0does exactly that.<\/p>\r\n<p id=\"e21766556fe148468e77e1b4c80fbe9b\">According to the output the 95% confidence interval for\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space2\"><span 
class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/span><\/span><\/span><\/span>\u00a0is roughly (-3.7, -1.5). First, note that the confidence interval is strictly negative, suggesting that \u03bc<sub>1<\/sub>\u00a0is lower than \u03bc<sub>2<\/sub>. Furthermore, the confidence interval tells us that we are 95% confident that the mean \u201clooks vs. personality score\u201d of females (\u03bc<sub>1<\/sub>) is between 1.5 and 3.7 points lower than the mean looks vs. personality score of males (\u03bc<sub>2<\/sub>). The confidence interval therefore quantifies the effect that the explanatory variable (gender) has on the response (looks vs. personality score).<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<h2><span title=\"Quick scroll up\">Comment<\/span><\/h2>\r\n<p id=\"N10B10\">As we\u2019ve seen in previous tests, as well as in the two-sample case, the 95% confidence interval for\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/sub><\/span><\/span><\/span><\/span>\u00a0can be used for testing in the 
two-sided case (<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">H<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">o<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">:<\/span><\/span><span class=\"mjx-msub MJXc-space3\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">0<\/span><\/span><\/span><\/span><\/span>\u00a0vs.\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">H<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">a<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">:<\/span><\/span><span class=\"mjx-msub MJXc-space3\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span 
class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2260<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">0<\/span><\/span><\/span><\/span><\/span>\u00a0):<\/p>\r\nIf the null value, 0, falls outside the confidence interval, H<sub>o<\/sub>\u00a0is rejected\r\n\r\nIf the null value, 0, falls inside the confidence interval, H<sub>o<\/sub>\u00a0is not rejected\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10B9F\">Let\u2019s go back to our leading example of the looks vs. personality score where we had a two-sided test.<\/p>\r\n\r\n<div class=\"altSelector\"><\/div>\r\n<div class=\"statcrunch altContentOn\">\r\n<div class=\"alternative\"><span class=\"imagewrap\"><span class=\"image\"><img class=\"img-responsive popimg aligncenter\" title=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. 
H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/statcrunch_output.png\" alt=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" \/><\/span><\/span><\/div>\r\n<\/div>\r\n<p id=\"N10BF1\">We used the fact that the p-value is so small to conclude that H<sub>o<\/sub> can be rejected. We can also use the confidence interval to reach the same conclusion, since 0 falls outside the confidence interval. 
In other words, since 0 is not a plausible value for\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/sub><\/span><\/span><\/span><\/span>\u00a0we can reject H<sub>o<\/sub>, which claims that\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space2\"><span class=\"mjx-char MJXc-TeX-main-R\">\u2212<\/span><\/span><span class=\"mjx-msub MJXc-space2\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><sub><span class=\"mjx-sub\"><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><\/span><\/sub><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">0<\/span><\/span><\/span><\/span><\/span> .<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 
class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10C4A\">Below you'll find three sample outputs of the two-sided two-sample t-test:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_4\" class=\"img-responsive popimg aligncenter\" title=\"H_0: \u03bc_1 - \u03bc_2 = 0 vs. H_a: \u03bc_1 - \u03bc_2 \u2260 0\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image157.gif\" alt=\"H_0: \u03bc_1 - \u03bc_2 = 0 vs. H_a: \u03bc_1 - \u03bc_2 \u2260 0\" \/><\/span><\/span>\r\n<p id=\"N10C53\">However, only one of the outputs could be correct (the other two contain an inconsistency). Your task is to decide which of the following outputs is the correct one (<em>Hint:<\/em>\u00a0No calculations are necessary in order to answer this question. Instead pay attention to the p-value and confidence interval).<\/p>\r\n\r\n<ul class=\"none\">\r\n \t<li><em>Output A:<\/em>\r\n<ul class=\"none\">\r\n \t<li>p-value: 0.289<\/li>\r\n \t<li>95% Confidence Interval: (-5.93090, -1.78572)<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\n<ul class=\"none\">\r\n \t<li><em>Output B:<\/em>\r\n<ul class=\"none\">\r\n \t<li>p-value: 0.003<\/li>\r\n \t<li>95% Confidence Interval: (-13.97384, 2.89733)<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\n<ul class=\"none\">\r\n \t<li><em>Output C:<\/em>\r\n<ul class=\"none\">\r\n \t<li>p-value: 0.223<\/li>\r\n \t<li>95% Confidence Interval: (-9.31432, 2.20505)<\/li>\r\n<\/ul>\r\n<\/li>\r\n<\/ul>\r\n[h5p id=\"226\"]\r\n\r\n<\/div>\r\n<\/div>\r\n<h2><span title=\"Quick scroll up\">Let\u2019s Summarize<\/span><\/h2>\r\n<p id=\"N10CAD\">We have completed our discussion of the two-sample t-test for comparing two populations\u2019 means when the samples are independent. 
Let\u2019s summarize what we have learned.<\/p>\r\n\r\n<ul>\r\n \t<li>The two-sample t-test is used for comparing the means of a quantitative variable (Y) in two populations (which we initially called sub-populations).<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_5\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image523a.jpg\" alt=\"\" width=\"400\" \/><\/span><\/span><\/li>\r\n \t<li>Our goal is to compare \u03bc<sub>1<\/sub>\u00a0and \u03bc<sub>2<\/sub>\u00a0(which in practice is done by making inference on the difference \u03bc<sub>1<\/sub>\u00a0\u2013 \u03bc<sub>2<\/sub>). The null hypothesis is\r\n<ul class=\"none\">\r\n \t<li>H<sub>o<\/sub>: \u03bc<sub>1<\/sub>\u00a0\u2013 \u03bc<sub>2<\/sub>\u00a0= 0<\/li>\r\n<\/ul>\r\nand the alternative hypothesis is one of the following (depending on the context of the problem):\r\n<ul class=\"none\">\r\n \t<li>H<sub>a<\/sub>: \u03bc<sub>1<\/sub>\u00a0\u2013 \u03bc<sub>2<\/sub>\u00a0&lt; 0<\/li>\r\n \t<li>H<sub>a<\/sub>: \u03bc<sub>1<\/sub>\u00a0\u2013 \u03bc<sub>2<\/sub>\u00a0&gt; 0<\/li>\r\n \t<li>H<sub>a<\/sub>: \u03bc<sub>1<\/sub>\u00a0\u2013 \u03bc<sub>2<\/sub>\u00a0\u2260 0<\/li>\r\n<\/ul>\r\n<\/li>\r\n \t<li>The two-sample t-test can be safely used when the samples are independent and at least one of the following two conditions holds:\r\n<ul>\r\n \t<li>The variable Y is known to have a normal distribution in both populations.<\/li>\r\n \t<li>The two sample sizes are large.<\/li>\r\n<\/ul>\r\nWhen the sample sizes are not large (and we therefore need to check the normality of Y in both populations), what we do in practice is look at the histograms of the two samples and make sure that there are no signs of non-normality such as extreme skewness and\/or outliers.<\/li>\r\n \t<li>The test statistic is as follows and has a t distribution when the null hypothesis is true:<span 
class=\"imagewrap\"><span class=\"image\"><img id=\"_i_6\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image523b.png\" alt=\"\" \/><\/span><\/span><\/li>\r\n \t<li>P-values are obtained from the output, and conclusions are drawn as usual, comparing the p-value to the significance level alpha.<\/li>\r\n \t<li>If H<sub>o<\/sub>\u00a0is rejected, a 95% confidence interval for \u03bc<sub>1<\/sub>\u00a0\u2013 \u03bc<sub>2<\/sub>\u00a0can be very insightful and can also be used for the two-sided test.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>","rendered":"<div id=\"lobjh\" class=\"\">\n<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<h2 class=\"textbox__title\">Learning Objectives<\/h2>\n<\/header>\n<div class=\"textbox__content\">\n<ul>\n<li id=\"carry_out_inferential_method_groups\">In a given context, carry out the inferential method for comparing groups and draw the appropriate conclusions.<\/li>\n<li id=\"specify_hypotheses_for_groups\">Specify the null and alternative hypotheses for comparing groups.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"b5c4b619553c4e55ba36c14050ba4849\" class=\"section purposewrap\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Comparing Two Means\u2014Two Independent Samples (The Two-Sample t-Test)<\/span><\/h2>\n<div id=\"ceb0b2ffcfbf4c118deb08c5504905f9\" class=\"section\">\n<div class=\"sectionContain\">\n<h3>Overview<\/h3>\n<p id=\"e89b27f006554c30a9d3bcfe60ae8899\">As we mentioned in the summary of the introduction to Case C\u2192Q, the first case that we will deal with is comparing two means 
when the two samples are independent:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"ccbe0916ce1c4a17ac7bc369ff890fd3\" class=\"img-responsive popimg aligncenter\" title=\"Sub-Population 1 has a Y Mean of \u03bc_1, and Sub-Population 2 has a Y Mean of \u03bc_2. From Sub-population 1 we take an SRS of size n_1, and from Sub-population 2 we take an SRS of size n_2. Both of these samples are independent.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image014.gif\" alt=\"Sub-Population 1 has a Y Mean of \u03bc_1, and Sub-Population 2 has a Y Mean of \u03bc_2. From Sub-population 1 we take an SRS of size n_1, and from Sub-population 2 we take an SRS of size n_2. Both of these samples are independent.\" \/><\/span><\/span><\/p>\n<p id=\"d7b09156f9a84e869b349611cdb4008e\">Recall that here we are interested in the effect of a two-valued (k = 2) categorical variable (X) on a quantitative response (Y). Samples are drawn independently from the two sub-populations (defined by the two categories of X), and we need to evaluate whether or not the data provide enough evidence for us to believe that the two sub-population means are different.<\/p>\n<p id=\"ed61630578e547b1aefff7879e709a58\">In other words, our goal is to test whether the means \u03bc<sub>1<\/sub>\u00a0and \u03bc<sub>2<\/sub>\u00a0(which are the means of the variable of interest in the two sub-populations) are equal or not, and in order to do that we have two samples, one from each sub-population, which were chosen independently of each other. As the title of this part suggests, the test that we will learn here is commonly known as the\u00a0<em class=\"italic\">two-sample t-test<\/em>. As the name suggests, this is a t-test, which as we know means that the p-values for this test are calculated under some t distribution. 
Here is how this part is organized.<\/p>\n<p id=\"f9d3cd00f134493cbdc92b015088ad52\">We first introduce our leading example, and then go in detail through the four steps of the two-sample t-test, illustrating each step using our example.<\/p>\n<\/div>\n<\/div>\n<div id=\"d2b04204d7c041039bc6c814df590186\" class=\"pulloutwrap note\">\n<div class=\"pullout clearfix\">\n<div>\n<p><strong><span class=\"pullout-lbl\">Note\u2026<\/span><\/strong>Up until now, we have been dividing our population into\u00a0<em class=\"italic\">sub-populations<\/em>, then sampling from these sub-populations.<\/p>\n<p id=\"ed74e7b3f9ea4527af290e98a77ef577\">From now on, instead of calling them sub-populations, we will usually call the groups we wish to compare\u00a0<em class=\"italic\">population 1, population 2,\u00a0<\/em>and so on. These two descriptions of the groups we are comparing can be used interchangeably.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"de803c92f05c4a30828ba9066eacaf54\" class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"c16e7606130f4c3796cf7cc06e8ff642\">What is more important to you \u2014 personality or looks?<\/p>\n<p id=\"ea70b07524744541aadd62fd37c15fb9\">This question was asked of a random sample of 239 college students, who were to answer on a scale of 1 to 25. An answer of 1 means personality has maximum importance and looks no importance at all, whereas an answer of 25 means looks have maximum importance and personality no importance at all. The purpose of this survey was to examine whether males and females differ with respect to the importance of looks vs. 
personality.<\/p>\n<div class=\"altSelector\"><\/div>\n<div class=\"Excel2019PC altContentOn\">\n<div class=\"alternative\">\n<p id=\"edb825a6a2fe45cdbc3d3dcea809128d\">To open Excel with the data in the worksheet, right-click to download the <a href=\"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-content\/uploads\/sites\/206\/2024\/10\/looks.xls\">looks<\/a> file to your computer. Then find the downloaded file and double-click it to open it in Excel. When Excel opens, you may have to enable editing.<\/p>\n<\/div>\n<\/div>\n<p id=\"d7156d8a11cc41489cab140e427090b9\">Note that the data have the following format:<\/p>\n<table id=\"a5d58fb7c8f9485b9cab58780357dd23\" class=\"grid aligncenter\">\n<thead>\n<tr style=\"height: 27px\">\n<th style=\"height: 27px;width: 136.031px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"ad37ffe30324e4b348ac81595f39e03e2\">Score (Y)<\/p>\n<\/th>\n<th style=\"height: 27px;width: 160.719px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"abde5e22fcb374270914b9524ce7c935e\">Gender (X)<\/p>\n<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr style=\"height: 27px\">\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"addaf7a5ca1704a2cb4c37624939b8ac2\">15<\/p>\n<\/td>\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"aff39d31e08424d238c1872480333ae7a\">Male<\/p>\n<\/td>\n<\/tr>\n<tr class=\"e\" style=\"height: 27px\">\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"ad4d7bc47268a40bd928ea72082dbb9d6\">13<\/p>\n<\/td>\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"ae32166c3cc354c5db2d1c6a106391aaf\">Female<\/p>\n<\/td>\n<\/tr>\n<tr style=\"height: 27px\">\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"af2c4025ec7cd47bdadded2fb9a89139b\">10<\/p>\n<\/td>\n<td style=\"height: 27px;width: 161.219px\" 
colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"ad17b64c929a54906a9976106ca77d542\">Female<\/p>\n<\/td>\n<\/tr>\n<tr class=\"e\" style=\"height: 27px\">\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"ae201938ba3c741fe963151df97304dff\">12<\/p>\n<\/td>\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"ab50e8397ad42409491bde442c88e0c83\">Male<\/p>\n<\/td>\n<\/tr>\n<tr style=\"height: 27px\">\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"aa89c817de4f9443a997fea77ea8bca56\">14<\/p>\n<\/td>\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"aa1ec504596e442b4a6b9382e81028883\">Female<\/p>\n<\/td>\n<\/tr>\n<tr class=\"e\" style=\"height: 27px\">\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"aa74a699bae9c491d8c857491450ddd7f\">14<\/p>\n<\/td>\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"aed45f110e92e4645b3b5790e6abeeef7\">Male<\/p>\n<\/td>\n<\/tr>\n<tr style=\"height: 27px\">\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"aa591601b2e4b4005892551f7b912d723\">6<\/p>\n<\/td>\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"acd9979189b3f412984ca1a80846347f5\">Male<\/p>\n<\/td>\n<\/tr>\n<tr class=\"e\" style=\"height: 10px\">\n<td style=\"height: 10px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"add0baa2ccf574a4b84dbe3103a437aeb\">17<\/p>\n<\/td>\n<td style=\"height: 10px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"abf6f57cb867a4db2a4c77f220adb64b4\">Male<\/p>\n<\/td>\n<\/tr>\n<tr style=\"height: 27px\">\n<td style=\"height: 27px;width: 136.531px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p 
id=\"ab61ebc7846274e31813a0f27e9dbbd12\">etc.<\/p>\n<\/td>\n<td style=\"height: 27px;width: 161.219px\" colspan=\"1\" rowspan=\"1\" align=\"left\">\n<p id=\"N10C8A\">\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p id=\"a687dd9b2a67499f895411cd5c991578\">The format of the data reminds us that we are essentially examining the relationship between the two-valued categorical variable, gender, and the quantitative response, score. The two values of the categorical explanatory variable define the two populations that we are comparing \u2014 males and females. The comparison is with respect to the response variable score. Here is a figure that summarizes the example:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"cf9cbc8d9b2742a38c152deb41b4dddb\" class=\"img-responsive popimg aligncenter\" title=\"We have two populations, Females and Males. This is our Gender (X) Variable. For each of these populations, there is a Score (Y) mean, \u03bc_1 for Females and \u03bc_2 for Males. For the Female population we generate an SRS of size 150. For Males, we generate a SRS of size 85.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image018.gif\" alt=\"We have two populations, Females and Males. This is our Gender (X) Variable. For each of these populations, there is a Score (Y) mean, \u03bc_1 for Females and \u03bc_2 for Males. For the Female population we generate an SRS of size 150. 
For Males, we generate a SRS of size 85.\" \/><\/span><\/span><\/p>\n<p id=\"cba6ede8680247b99165fc345d8237d0\"><em class=\"italic\">Comments:<\/em><\/p>\n<ol id=\"a7f70c3359a648cea04c847e4c4b2716\">\n<li>\n<p id=\"ef055f8c386442f0a21f7801ca08ffe2\">Note that this figure emphasizes that, because our explanatory variable is a two-valued categorical variable, in practice we are comparing two populations (defined by these two values) with respect to our response Y.<\/p>\n<\/li>\n<li>\n<p id=\"d85071ffd5f746b08a6054a6b5305122\">Note that even though the problem description just says that we had 239 students, the figure tells us that there were 85 males in the sample, and 150 females.<\/p>\n<\/li>\n<li>\n<p id=\"cc8f17ae69724a50b3bd2c00e231a3e7\">Following up on comment 2, note that 85 + 150 = 235 and not 239. In these data (which are real) there are four \u201cmissing observations\u201d\u20144 students for whom we do not have the value of the response variable, \u201cimportance.\u201d This could be due to a number of reasons, such as recording error or nonresponse. The bottom line is that even though data were collected from 239 students, effectively we have data from only 235. (Recommended: Go through the data file and note that there are 4 cases of missing observations: students 34, 138, 179, and 183.)<\/p>\n<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<div id=\"N10AFD\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">The Two-Sample t-Test<\/span><\/h2>\n<p id=\"N10B04\">Here again is the general situation which requires us to use the two-sample t-test:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_0\" class=\"img-responsive popimg aligncenter\" title=\"Sub-Population 1 has a Y Mean of \u03bc_1, and Sub-Population 2 has a Y Mean of \u03bc_2. From Sub-population 1 we take an SRS of size n_1, and from Sub-population 2 we take an SRS of size n_2. 
Both of these samples are independent.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image014.gif\" alt=\"Sub-Population 1 has a Y Mean of \u03bc_1, and Sub-Population 2 has a Y Mean of \u03bc_2. From Sub-population 1 we take an SRS of size n_1, and from Sub-population 2 we take an SRS of size n_2. Both of these samples are independent.\" \/><\/span><\/span><\/p>\n<p id=\"N10B0D\">Our goal is to compare the means \u03bc<sub>1<\/sub>\u00a0and \u03bc<sub>2<\/sub>\u00a0based on the two independent samples.<\/p>\n<\/div>\n<\/div>\n<div id=\"N10B1B\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Step 1: Stating the Hypotheses<\/span><\/h2>\n<p id=\"N10B22\">The hypotheses represent our goal, comparing the means \u03bc<sub>1<\/sub>\u00a0and \u03bc<sub>2<\/sub>.<\/p>\n<ul>\n<li>\n<p id=\"N10B2D\">The null hypothesis has the form:<\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10B33\">[latex]H_{0}:\\mu _{1}-\\mu _{2}=0[\/latex] (which is the same as [latex]H_{0}:\\mu _{1}=\\mu _{2}[\/latex] )<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<li>\n<p id=\"N10B91\">The alternative hypothesis takes one of the following three forms (depending on the context):<\/p>\n<ul class=\"none\">\n<li>\n<p id=\"N10B97\">[latex]H_{a}:\\mu _{1}-\\mu _{2}< 0[\/latex] (which is the same as [latex]H_{a}:\\mu _{1}<\\mu _{2}[\/latex] ) (one-sided)<\/p>\n<\/li>\n<li>\n<p id=\"N10BF5\">[latex]H_{a}:\\mu _{1}-\\mu _{2}> 0[\/latex] (which is the same as [latex]H_{a}:\\mu _{1}>\\mu _{2}[\/latex] ) (one-sided)<\/p>\n<\/li>\n<li>\n<p id=\"N10C53\">[latex]H_{a}:\\mu _{1}- \\mu _{2}\\neq 0[\/latex]\u00a0(which is the same as\u00a0[latex]H_{a}:\\mu _{1}\\neq \\mu _{2}[\/latex]\u00a0) (two-sided)<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p id=\"N10CB0\">Note that the null hypothesis claims that there is no difference between the means, which can either be represented as the 
difference is 0 (no difference), or as its (algebraically and conceptually) equivalent,\u00a0[latex]\\mu _{1}= \\mu _{2}[\/latex]\u00a0(the means are equal). Either way, conceptually, H<sub>o<\/sub>\u00a0claims that there is no relationship between the two relevant variables.<\/p>\n<p id=\"N10CD4\">The first way of writing the hypotheses (using a difference between the means) will be easier to use when (in the future) we look for a difference that is not 0.<\/p>\n<p id=\"N10CD7\">Each one of the three alternatives claims that there is a difference between the means. The two one-sided alternatives specify the nature of the difference; either negative, indicating that \u03bc<sub>1<\/sub>\u00a0is smaller than \u03bc<sub>2<\/sub>, or positive, indicating that \u03bc<sub>1<\/sub>\u00a0is larger than \u03bc<sub>2<\/sub>. The two-sided alternative, as usual, is more general and simply claims that a difference exists. As before, it should be clear from the context of the problem which of the three alternatives is appropriate.<\/p>\n<\/div>\n<\/div>\n<div id=\"N10CEB\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Comment<\/span><\/h2>\n<p id=\"N10CF2\">Note that our parameter of interest in this case (the parameter about which we are making an inference) is the difference between the means\u00a0[latex]\\mu_{1}-\\mu _{2}[\/latex]\u00a0, and that the null value is 0.<\/p>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"N10D15\">Recall that the purpose of this survey was to examine whether the opinions of females and males\u00a0<em>differ\u00a0<\/em>with respect to the importance of looks vs. personality. 
The hypotheses in this case are therefore:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_1\" class=\"img-responsive popimg aligncenter\" title=\"H_0: \u03bc_1 - \u03bc_2 = 0, H_a: \u03bc_1 - \u03bc_2 \u2260 0\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image030.gif\" alt=\"H_0: \u03bc_1 - \u03bc_2 = 0, H_a: \u03bc_1 - \u03bc_2 \u2260 0\" \/><\/span><\/span><\/p>\n<p id=\"N10D21\">where \u03bc<sub>1<\/sub>\u00a0represents the mean importance for females and \u03bc<sub>2<\/sub>\u00a0represents the mean importance for males.<\/p>\n<p id=\"N10D2A\">It is important to understand that conceptually, the two hypotheses claim:<\/p>\n<p id=\"N10D2D\">H<sub>o<\/sub>: Score (of looks vs. personality) is not related to gender<\/p>\n<p id=\"N10D33\">H<sub>a<\/sub>: Score (of looks vs. personality) is related to gender<\/p>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"N10D44\">In order to check the claim that the pregnancy length of women who smoke during pregnancy is shorter, on average, than the pregnancy length of women who do not smoke, a random sample of 35 pregnant women who smoke and a random sample of 35 pregnant women who do not smoke were chosen and their pregnancy lengths were recorded. Here is a figure of this example:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_2\" class=\"img-responsive popimg aligncenter\" title=\"The Smoking (X) variable gives us our two populations. These are Population 1: Pregnant women who smoke, and Pop 2: Pregnant Women who don't smoke. For each of these populations we have the variable Length (Y) and its mean. For smokers we have \u03bc_1, and for non-smokers we have \u03bc_2. 
From the population of smokers, we create an SRS of size 35, and from the population of non-smokers we create an SRS of 35.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image156.gif\" alt=\"The Smoking (X) variable gives us our two populations. These are Population 1: Pregnant women who smoke, and Pop 2: Pregnant Women who don't smoke. For each of these populations we have the variable Length (Y) and its mean. For smokers we have \u03bc_1, and for non-smokers we have \u03bc_2. From the population of smokers, we create an SRS of size 35, and from the population of non-smokers we create an SRS of 35.\" \/><\/span><\/span><\/p>\n<div class=\"asx\">\n<div id=\"du4_m3_twosamples2_tutor1\" class=\"activitywrap sectionNest flash\">\n<div class=\"activityhead\">\n<div class=\"activityinfo\"><\/div>\n<\/div>\n<div class=\"actContain\">\n<div class=\"activity flash\">\n<div id=\"u4_m3_twosamples2_tutor1\" class=\"flash_obj asx testFlash mark_flash\">\n<div id=\"ou4_m3_twosamples2_tutor1\" class=\"page 2963795\">\n<div id=\"2963795\" class=\"question ddfb\">\n<div>\n<p id=\"N10077\">\n<div id=\"h5p-224\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-224\" class=\"h5p-iframe\" data-content-id=\"224\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"10.2 Did I get this 1\"><\/iframe><\/div>\n<\/div>\n<p id=\"N1007F\">Note that \u201cmu\u201d stands for the Greek letter \u03bc, the population mean\u2014mu1 stands for the mean of population 1 and mu2 stands for the mean of population 2.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"ad7f14fd8ed34ebb8a4fb01f9a96c2e6\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Step 2: Check Conditions, and Summarize the Data Using a Test Statistic<\/span><\/h2>\n<p 
id=\"fc79d192eed94671a30c9f62206b8c86\">The two-sample t-test can be safely used as long as the following conditions are met:<\/p>\n<ol id=\"f97a6abe462148c782ad0abf03782069\">\n<li>\n<p id=\"e6a1e2f4a4e241d482a98ad7875ec678\">The two samples are indeed independent.<\/p>\n<\/li>\n<li>\n<p id=\"c823cd8259ed4d868d88a8924fafef96\">We are in one of the following two scenarios:<\/p>\n<ol id=\"fe491992270e4f5392699e06513930a7\" class=\"lower-roman\">\n<li>\n<p id=\"e0762ba45fa246b58625d4b82860f412\">Both populations are normal, or more specifically, the distribution of the response Y in both populations is normal, and both samples are random (or at least can be considered as such). In practice, checking normality in the populations is done by looking at each of the samples using a histogram and checking whether there are any signs that the populations are not normal. Such signs could be extreme skewness and\/or extreme outliers.<\/p>\n<\/li>\n<li>\n<p id=\"dc38d4c25aa64eeb921c2db27dbe75e8\">The populations are known or discovered not to be normal, but the sample size of each of the random samples is large enough (as a rule of thumb, a sample size greater than 30 is considered large enough).<\/p>\n<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p id=\"fd6e2f7b42824dd09a723a8783d13f40\">Assuming that we can safely use the two-sample t-test, we need to summarize the data, and in particular, calculate our data summary: the test statistic.<\/p>\n<p id=\"bf0d2ea66ebe4fb9b6bdaf2e2a93f1dd\"><em class=\"italic\">The two-sample t-test statistic<\/em>\u00a0is:<\/p>\n<p>\u00a0[latex]t=\\frac{(\\bar{y_{1}}-\\bar{y_{2}})-0}{\\sqrt{\\frac{s_{1}^{2}}{n_{1}}+\\frac{s_{2}^{2}}{n_{2}}}}[\/latex]<\/p>\n<p id=\"b664119348e64ba69e3a98ca0bb6523b\">Where:<\/p>\n<p id=\"f09a09d7d13640c89e0df517e41b883e\">[latex]\\overline{y_{1}},\\overline{y_{2}}[\/latex]\u00a0are the sample means of the samples from population 1 and population 2 respectively.<\/p>\n<p id=\"bb324e17e88645608b65224d5035eb83\">[latex]s_{1},s_{2}[\/latex]\u00a0are the sample standard deviations of the samples from population 1 and population 2 respectively.<\/p>\n<p id=\"c38c81f4ee3e4ab185da7dc1858960be\">[latex]n_{1},n_{2}[\/latex]\u00a0are the sample sizes of the two samples.<\/p>\n<\/div>\n<\/div>\n<div id=\"a0c9248df13542bfa77e06a0664b64a7\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Comment<\/span><\/h2>\n<p id=\"f959b35949654b7195a9d6c14bf09748\">Let\u2019s see why this test statistic makes sense, bearing in mind that our inference is about\u00a0[latex]\\mu_{1}-\\mu_{2}[\/latex].<\/p>\n<ul id=\"cf85e0c633e64748a6cc9c35c766f50c\">\n<li>\n<p id=\"d0c4fdaef9aa49c18ef5c1f0d2ac1d7b\">[latex]\\bar{y_{1}}[\/latex]\u00a0estimates \u03bc<sub>1<\/sub>\u00a0and\u00a0[latex]\\bar{y_{2}}[\/latex]\u00a0estimates \u03bc<sub>2<\/sub>, and therefore\u00a0[latex]\\bar{y_{1}}-\\bar{y_{2}}[\/latex]\u00a0is what the data tell us about (or, how the data estimate)<\/p>\n<p id=\"f04c54a0c4374cefb205ef3eb832a3d3\">[latex]\\mu_{1}-\\mu_{2}[\/latex].<\/p>\n<\/li>\n<li>\n<p id=\"b20f1197393248abb0cf5829eb5ecf1d\">0 is the \u201cnull value\u201d: what the null hypothesis, H<sub>o<\/sub>, claims that\u00a0[latex]\\mu_{1}-\\mu_{2}[\/latex]\u00a0is.<\/p>\n<\/li>\n<li>\n<p id=\"a8c1b5b850a34d2887a4ba31d6201260\">The denominator\u00a0[latex]\\sqrt{\\frac{s_{1}^{2}}{n_{1}}+\\frac{s_{2}^{2}}{n_{2}}}[\/latex]\u00a0is the standard error of\u00a0[latex]\\bar{y_{1}}-\\bar{y_{2}}[\/latex]. (We will not go into the details of why this is true.)<\/p>\n<\/li>\n<\/ul>\n<p id=\"cefcd75fbda94756a85985b4dcd09f2d\">We therefore see that our test statistic, like the previous test statistics we encountered, has the structure:<\/p>\n<p>[latex]\\frac{sample\\ estimate-null\\ value}{standard\\ error}[\/latex]<\/p>\n<p id=\"e92762ca21d24b278c442ff3c1e47d6d\">and therefore, like the previous test statistics, measures (in standard errors) the difference between what the data tell us about the parameter of interest\u00a0[latex]\\mu_{1}-\\mu_{2}[\/latex]\u00a0(sample estimate) and what the null hypothesis claims the value of the parameter is (null value).<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"f05ee6ed0c6c44a0bf11fe4ba40526c4\">Let\u2019s first check whether the conditions that allow us to safely use the two-sample t-test are
met.<\/p>\n<ol id=\"df98f3cefcce4aa6959463a890d1e4f0\" class=\"lower-roman\">\n<li>\n<p id=\"ef4d7a2bca6241d49d2aac10c66e14d9\">Here, 239 students were chosen and were naturally divided into a sample of females and a sample of males. Since the students were chosen at random, the sample of females is independent of the sample of males.<\/p>\n<\/li>\n<li>\n<p id=\"b6a8383586684f899088b14895907a66\">Here we are in the second scenario \u2014 the sample sizes (150 and 85), are definitely large enough, and so we can proceed regardless of whether the populations are normal or not.<\/p>\n<\/li>\n<\/ol>\n<div class=\"StatCrunch altContentOn\">\n<div class=\"alternative\">\n<p id=\"ca6ed6e22193464f82032799d4835a1b\">In order to avoid tedious calculations, we will lift the test statistic from the output. The StatCrunch output (edited) is shown below:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"d19980c9effc49f2ba294184b28ff5d4\" class=\"img-responsive popimg aligncenter\" title=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &amp;lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. 
Limit: -1.4962921\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image192_statcrunch.gif\" alt=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &amp;lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" \/><\/span><\/span><\/p>\n<p id=\"b4f4d2c2dae945a59d6a76295ca02d2a\">As you can see we highlighted the \u201cingredients\u201d needed to calculate the test statistic, as well as the test statistic itself. 
Just for this first example, let\u2019s make sure that we understand what these ingredients are and how to use them to find the test statistic.<\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h4 class=\"textbox__title\">Learn by Doing<\/h4>\n<\/header>\n<div class=\"textbox__content\">\n<div id=\"h5p-225\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-225\" class=\"h5p-iframe\" data-content-id=\"225\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"10.2 Learn by doing 1\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<p id=\"d5047aff759044adb4d933cdb1cb8feb\">And when we put it all together we get that indeed,<\/p>\n<p>[latex]t=\\frac{(\\bar{y_{1}}-\\bar{y_{2}})-0}{\\sqrt{\\frac{s_{1}^{2}}{n_{1}}+\\frac{s_{2}^{2}}{n_{2}}}}=\\frac{10.73-13.33}{\\sqrt{\\frac{4.25^{2}}{150}+\\frac{4.02^{2}}{85}}}=-4.66[\/latex]<\/p>\n<p id=\"e1dac8d831e74cd5b75e1b908018d071\">The test statistic tells us what the data tell us about\u00a0[latex]\\mu_{1}-\\mu_{2}[\/latex]. In this case that difference (10.73 \u2013 13.33) is 4.66 standard errors below what the null hypothesis claims this difference to be (0). 4.66 standard errors is quite a lot and probably indicates that the data provide evidence against H<sub>o<\/sub>.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>We have completed step 2 and are ready to proceed to step 3, finding the p-value of the test.<\/p>\n<h2><span title=\"Quick scroll up\">Step 3: Finding the p-value of the test<\/span><\/h2>\n<p id=\"bf615bd08efb48c7b0e9cb6280496b06\">Since our test is called the two-sample t-test, we know that the p-values are calculated under a t distribution. Indeed, it turns out that the null distribution of our test statistic is approximately t. Figuring out which one of the t distributions (in other words, how many degrees of freedom this t distribution has) is quite involved and will not be discussed here. Instead, we use a statistics package to find the p-value, which in this case is practically 0.<\/p>\n<div id=\"a3d1aa7f1b1a48738a4ce6a04142e632\" class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"StatCrunch altContentOn\">\n<div class=\"alternative\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<div class=\"StatCrunch altContentOn\">\n<div class=\"alternative\">\n<p id=\"ee43826ce74c4b1fbe34d003ff40f6b9\">Here, again, is the relevant output for our example:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"f6a2b08958c047a38845a8069d2ac604\" class=\"img-responsive popimg aligncenter\" title=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err.
= 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &amp;lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image192_statcrunch_2.gif\" alt=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &amp;lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" \/><\/span><\/span><\/p>\n<p id=\"da07ff0ff6f74b49a6e124b201a2c5be\">According to the output the p-value of this test is less than 0.0001. 
How do we interpret this?<\/p>\n<p id=\"fdc7820a59154b94b73e13e69df2e2d2\">A p-value which is practically 0 means that it would be almost impossible to get data like those observed (or even more extreme) had the null hypothesis been true.<\/p>\n<p id=\"e6d287dcd66c412da4784fa3b75abd7f\">More specifically to our example, if there were no differences between females and males with respect to how they value looks vs. personality, it would be almost impossible (probability approximately 0) to get data where the difference between the sample means of females and males is -2.596 (that difference is 10.733 \u2013 13.329 = -2.596) or more extreme in either direction.<\/p>\n<\/div>\n<\/div>\n<p id=\"ec55a502861e40e48f5cd593f57e44b4\">Comment: Note that the output tells us that\u00a0[latex]\\bar{y_{1}}-\\bar{y_{2}}[\/latex]\u00a0is approximately -2.6. But more importantly, we want to know if this difference is significant. To answer this, we use the fact that this difference is 4.66 standard errors below the null value.<\/p>\n<\/div>\n<\/div>\n<div id=\"d2984175cb644581a87ba513017c7d68\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Step 4: Conclusion in context<\/span><\/h2>\n<p id=\"b8f476f21b2940eeb5cce53568ae47c5\">As usual, a small p-value provides evidence against H<sub>o<\/sub>. In our case our p-value is practically 0 (which is smaller than any level of significance that we will choose). The data therefore provide very strong evidence against H<sub>o<\/sub>\u00a0so we reject it and conclude that the mean Importance score (of looks vs. personality) of males differs from that of females. In other words, males and females differ with respect to how they value looks vs.
personality.<\/p>\n<\/div>\n<\/div>\n<div id=\"fcc366c48d0d4fec9e5167d92daccdef\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Comments<\/span><\/h2>\n<p id=\"af42a2ce465d447d8ee53b1ce10d1586\">You might ask yourself: \u201cWhere do we use the test statistic?\u201d<\/p>\n<p id=\"d533fea5cb0f48d39638a29341248bce\">It is true that for all practical purposes all we have to do is check that the conditions which allow us to use the two-sample t-test are met, lift the p-value from the output, and draw our conclusions accordingly.<\/p>\n<p id=\"bb3b85caacc243198954378214f789dc\">However, we feel that it is important to mention the test statistic for two reasons:<\/p>\n<ol id=\"caf67a2ae21149c687e0e92f0ad6282e\">\n<li>\n<p id=\"d0d303e608c545e6a49da2994e252d66\">The test statistic is what\u2019s behind the scenes; based on its null distribution and its value, the p-value is calculated.<\/p>\n<\/li>\n<li>\n<p id=\"c3fb8578560d48fc86beea0ec6b1fa08\">Apart from being the key for calculating the p-value, the test statistic is also itself a measure of the evidence stored in the data against H<sub>o<\/sub>. As we mentioned, it measures (in standard errors) how different our data is from what is claimed in the null hypothesis.<\/p>\n<\/li>\n<\/ol>\n<hr \/>\n<p id=\"b270400a4cf340dd96d155e7c3b04c4a\">Let\u2019s look at another example, and then you\u2019ll do one yourself.<\/p>\n<div id=\"eb3d5832301444dfa3ec8e974b2e86e0\" class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div id=\"eff9df4b81cb4e73a84b6d6326ad00ee\" class=\"section\">\n<div class=\"sectionContain\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"d8bcd41a6c294bfbbafa227e139704e7\">According to the National Health And Nutrition Examination Survey (NHANES) sponsored by the U.S. 
government, a random sample of 712 males between 20 and 29 years of age and a random sample of 1,001 males over the age of 75 were chosen, and the weight of each of the males was recorded (in kg). Here is a summary of the results (source: http:\/\/www.cdc.gov\/nchs\/data\/ad\/ad347.pdf):<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"c8c2d0b8e7f84413af77e4265d72a1af\" class=\"img-responsive popimg aligncenter\" title=\"For males 20-29 years old, n = 712, Y-bar = 83.4, S = 18.7. For males 75+ years old, n = 1001, Y-bar = 78.5, S = 19.0\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image194.gif\" alt=\"For males 20-29 years old, n = 712, Y-bar = 83.4, S = 18.7. For males 75+ years old, n = 1001, Y-bar = 78.5, S = 19.0\" \/><\/span><\/span><\/p>\n<p id=\"d27d36c4fcf54a27bfbe00dbcc034612\">Do the data provide evidence that the younger male population weighs more (on average) than the older male population? (Note that here the data are given in a summarized form, unlike the previous problem, where the raw data were given.)<\/p>\n<p id=\"e411f0cb5bd448dbbd66f1d4a61de6bc\">Here is a figure that summarizes this example:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"b4394595edec4b64a55a6f2d8988c55d\" class=\"img-responsive popimg aligncenter\" title=\"We have two populations, from the two categories in the variable Age Group(X). Population 1 is Males 20-29 years old, and Population 2 is Males 75+ years old. Population 1&amp;apos;s Weight (Y) mean is \u03bc_1, and population 2&amp;apos;s weight (Y) mean is \u03bc_2. For population 1, a SRS of size 712 is generated. It has a mean of 83.4 and SD of 18.7 . For population 2, another SRS is generated of size 1001. 
It has a mean of 78.5 and SD of 19.0.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image043.gif\" alt=\"We have two populations, from the two categories in the variable Age Group(X). Population 1 is Males 20-29 years old, and Population 2 is Males 75+ years old. Population 1&#8217;s Weight (Y) mean is \u03bc_1, and population 2&#8217;s weight (Y) mean is \u03bc_2. For population 1, an SRS of size 712 is generated. It has a mean of 83.4 and SD of 18.7. For population 2, another SRS is generated of size 1001. It has a mean of 78.5 and SD of 19.0.\" \/><\/span><\/span><\/p>\n<p id=\"eafc5b46660e4ba7ac3af95788fcf214\">Note that we defined the younger age group and the older age group as population 1 and population 2, respectively, and \u03bc<sub>1<\/sub>\u00a0and \u03bc<sub>2<\/sub>\u00a0as the mean weight of population 1 and population 2, respectively.<\/p>\n<p id=\"af2db248783a4c5c9e5753d4f0a86399\"><em class=\"italic\">Step 1:<\/em><\/p>\n<p id=\"f9f8633456c745b6932e869de26cb262\">Since we want to test whether the older age group (population 2) weighs less on average than the younger age group (population 1), we are testing:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"ba0ecaa9314d40ca9e2a572238cd7eb8\" class=\"img-responsive popimg aligncenter\" title=\"H_0: \u03bc_1 - \u03bc_2 = 0, H_a: \u03bc_1 - \u03bc_2 &gt; 0\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image044.gif\" alt=\"H_0: \u03bc_1 - \u03bc_2 = 0, H_a: \u03bc_1 - \u03bc_2 &gt; 0\" \/><\/span><\/span><\/p>\n<p id=\"ab68c8a28da84b26beebb8c1c4f22216\">or equivalently,<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"cd0bfb7306b84e4ca183addaceb7d12e\" class=\"img-responsive popimg aligncenter\" title=\"H_0: 
\u03bc_1 = \u03bc_2, H_a: \u03bc_1 &gt; \u03bc_2\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image045.gif\" alt=\"H_0: \u03bc_1 = \u03bc_2, H_a: \u03bc_1 &gt; \u03bc_2\" \/><\/span><\/span><\/p>\n<p id=\"b770d52870bf4fa7b9abf9018d8c4761\"><em class=\"italic\">Step 2:<\/em><\/p>\n<p id=\"d53bf865dcf8466c81962e60d8d6798c\">We can safely use the two-sample t-test in this case since:<\/p>\n<ol id=\"cc5d2537c08446529bdc42f9f93de692\" class=\"lower-roman\">\n<li>\n<p id=\"ed490d0efa274047bbe7ac674396618c\">The samples are independent, since each of the samples was chosen at random.<\/p>\n<\/li>\n<li>\n<p id=\"e40ed184118f4582852174a0450d04e5\">Both sample sizes are very large (712 and 1,001), and therefore we can proceed regardless of whether the populations are normal or not.<\/p>\n<\/li>\n<\/ol>\n<p id=\"a46b62f88dc54b93a59d687801f701bc\">From these summary statistics we obtain a t-statistic of 5.31 and a p-value of 0.000 (to three decimal places). The t-value is quite large (the observed difference of 83.4 \u2212 78.5 = 4.9 kg is more than 5 standard errors above 0), and the p-value is correspondingly small, indicating that our data are very different from what is claimed in the null hypothesis.<\/p>\n<p id=\"c67e98b271da452184e34acf82bc1df5\"><em class=\"italic\">Step 3:<\/em><\/p>\n<p id=\"e7a95f7ebfde4d1d84b0553a6b46736b\">The p-value is essentially 0, indicating that it would be nearly impossible to observe a difference between the sample mean weights of 4.9 kg (or more) if the mean weights in the age group populations were the same (i.e., if H<sub>o<\/sub>\u00a0were true).<\/p>\n<p id=\"b882b1c5b7aa4647b24713440abc7fe2\"><em class=\"italic\">Step 4:<\/em><\/p>\n<p id=\"b01f14ca4b3342bfada0f4a19255b91d\">A p-value of 0 (or very close to it) indicates that the data provide strong evidence against H<sub>o<\/sub>, so we reject it and conclude that the mean weight of males 20-29 years old is higher than the mean weight of males 75 years old and older. 
In other words, males in the younger age group weigh more, on average, than males in the older age group.<\/p>\n<\/div>\n<\/div>\n<h2><span title=\"Quick scroll up\">Confidence Interval for \u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub> (Two-Sample t Confidence Interval)<\/span><\/h2>\n<p id=\"f53e8bc858774f5dbeb9655bcb6c460d\">So far we\u2019ve discussed the two-sample t-test, which checks whether there is enough evidence stored in the data to reject the claim that\u00a0\u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub> = 0\u00a0(or equivalently, that\u00a0\u03bc<sub>1<\/sub> = \u03bc<sub>2<\/sub>) in favor of one of the three possible alternatives.<\/p>\n<p id=\"aff7008c161d46c9b63dbd520ea38410\">If we would like to estimate\u00a0\u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub>, we can use the natural point estimate,\u00a0y\u0304<sub>1<\/sub> \u2212 y\u0304<sub>2<\/sub>, or preferably, a 95% confidence interval, which will provide us with a set of plausible values for the difference between the population means\u00a0\u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub>.<\/p>\n<p id=\"b52b90992ff34fe9b79da0dd0e667952\">In particular, if the test has rejected\u00a0H<sub>o<\/sub>: \u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub> = 0, a confidence interval for\u00a0\u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub>\u00a0can be insightful since it quantifies the 
effect that the categorical explanatory variable has on the response.<\/p>\n<\/div>\n<\/div>\n<div id=\"f9136214344044bf9e3304edcc6855fa\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Comment<\/span><\/h2>\n<p id=\"a93f7fd93cb34b20b0cfc6c2eada6451\">We will not go into the formula and calculation of the confidence interval, but rather ask our software to do it for us, and focus on interpretation.<\/p>\n<\/div>\n<\/div>\n<div id=\"afdfb70e4fc64177b2e744344306204a\" class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"a8312cf778e7448aba4821a5817057cf\">Recall our leading example about the looks vs. personality score of females and males:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"e0a85bbb6c2e4d40933b82bc4b487958\" class=\"img-responsive popimg aligncenter\" title=\"The Gender(X) Variable has two categories, which gives us Population 1: Females and Population 2: Males. Each population has its own Y-Mean \u03bc, so population 1&#8217;s mean is \u03bc_1 and population 2&#8217;s mean is \u03bc_2. For each population we take an SRS. For Population 1, an SRS of size 150 is taken, and for population 2 an SRS of size 85 is taken.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image047.gif\" alt=\"The Gender(X) Variable has two categories, which gives us Population 1: Females and Population 2: Males. Each population has its own Y-Mean \u03bc, so population 1&#8217;s mean is \u03bc_1 and population 2&#8217;s mean is \u03bc_2. For each population we take an SRS. 
For Population 1, an SRS of size 150 is taken, and for population 2 an SRS of size 85 is taken.\" \/><\/span><\/span><\/p>\n<div class=\"altSelector\"><\/div>\n<div class=\"StatCrunch altContentOn\">\n<div class=\"alternative\">\n<p id=\"d34df1f9495c44f4813597bab6208d5c\">Here again is the output:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"f233ade69c0e4161819ee92641ed3c50\" class=\"img-responsive popimg aligncenter\" title=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image196_statcrunch.gif\" alt=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. 
Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" \/><\/span><\/span><\/p>\n<p id=\"ac039cbe56fbe4a328d403c6961a7fad5\">\n<\/div>\n<\/div>\n<p id=\"c6b4811423b8489ba405019f28d1e1f1\">Recall that we rejected the null hypothesis in favor of the two-sided alternative and concluded that the mean score of females is different from the mean score of males. It would be interesting to supplement this conclusion with more details about this difference between the means, and the 95% confidence interval for\u00a0\u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub>\u00a0does exactly that.<\/p>\n<p id=\"e21766556fe148468e77e1b4c80fbe9b\">According to the output, the 95% confidence interval for\u00a0\u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub>\u00a0is roughly (-3.7, -1.5). First, note that the confidence interval is strictly negative, suggesting that \u03bc<sub>1<\/sub>\u00a0is lower than \u03bc<sub>2<\/sub>. Furthermore, the confidence interval tells us that we are 95% confident that the mean \u201clooks vs. personality score\u201d of females (\u03bc<sub>1<\/sub>) is between 1.5 and 3.7 points lower than the mean looks vs. personality score of males (\u03bc<sub>2<\/sub>). The confidence interval therefore quantifies the effect that the explanatory variable (gender) has on the response (looks vs. personality score).<\/p>\n<\/div>\n<\/div>\n<h2><span title=\"Quick scroll up\">Comment<\/span><\/h2>\n<p id=\"N10B10\">As we\u2019ve seen in previous tests, as well as in the two-sample case, the 95% confidence interval for\u00a0\u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub>\u00a0can be used for testing in the two-sided 
case (H<sub>o<\/sub>: \u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub> = 0\u00a0vs.\u00a0H<sub>a<\/sub>: \u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub> \u2260 0):<\/p>\n<p>If the null value, 0, falls outside the confidence interval, H<sub>o<\/sub>\u00a0is rejected.<\/p>\n<p>If the null value, 0, falls inside the confidence interval, H<sub>o<\/sub>\u00a0is not rejected.<\/p>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"N10B9F\">Let\u2019s go back to our leading example of the looks vs. personality score where we had a two-sided test.<\/p>\n<div class=\"altSelector\"><\/div>\n<div class=\"statcrunch altContentOn\">\n<div class=\"alternative\"><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" class=\"img-responsive popimg aligncenter\" title=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. 
H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/statcrunch_output.png\" alt=\"Two Sample T - Test and CI: Score(Y),Gender (X) Summary statistics for Score (Y): For Gender(X) = Female: n = 150, Mean = 10.733334, Std. Dev. = 4.254751, Std. Err. = 0.347399 For Gender(X) = Male: n = 85, Mean = 13.3294115, Std. Dev. = 4.0189676, Std. Err. = 0.43591824 Hypothesis test results: \u03bc_1: mean of score (Y) where X = Female. \u03bc_2: mean of score (Y) where X = Male. \u03bc_1 - \u03bc_2: mean difference. H_0: \u03bc_1 - \u03bc_2 = 0, H_A: \u03bc_1 - \u03bc_2 \u2260 0 Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 T-Stat: -4.657358 P-Value: &lt; 0.0001 95% Confidence Interval Results: Difference: \u03bc_1 - \u03bc_2 Sample Mean: -2.5960784 Std. Err.: 0.55741435 DF: 182.97267 L. Limit: -3.6958647 U. Limit: -1.4962921\" \/><\/span><\/span><\/div>\n<\/div>\n<p id=\"N10BF1\">We used the fact that the p-value is so small to conclude that H<sub>o<\/sub> can be rejected. We can also use the confidence interval to reach the same conclusion, since 0 falls outside the confidence interval. 
In other words, since 0 is not a plausible value for\u00a0\u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub>, we can reject H<sub>o<\/sub>, which claims that\u00a0\u03bc<sub>1<\/sub> \u2212 \u03bc<sub>2<\/sub> = 0.<\/p>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did 
I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"N10C4A\">Below you&#8217;ll find three sample outputs of the two-sided two-sample t-test:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_4\" class=\"img-responsive popimg aligncenter\" title=\"H_0: \u03bc_1 - \u03bc_2 = 0 vs. H_a: \u03bc_1 - \u03bc_2 \u2260 0\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image157.gif\" alt=\"H_0: \u03bc_1 - \u03bc_2 = 0 vs. H_a: \u03bc_1 - \u03bc_2 \u2260 0\" \/><\/span><\/span><\/p>\n<p>However, only one of the outputs could be correct (the other two contain an inconsistency). Your task is to decide which of the following outputs is the correct one (<em>Hint:<\/em>\u00a0No calculations are necessary in order to answer this question. Instead pay attention to the p-value and confidence interval).<\/p>\n<ul class=\"none\">\n<li><em>Output A:<\/em>\n<ul class=\"none\">\n<li>p-value: 0.289<\/li>\n<li>95% Confidence Interval: (-5.93090, -1.78572)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul class=\"none\">\n<li><em>Output B:<\/em>\n<ul class=\"none\">\n<li>p-value: 0.003<\/li>\n<li>95% Confidence Interval: (-13.97384, 2.89733)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<ul class=\"none\">\n<li><em>Output C:<\/em>\n<ul class=\"none\">\n<li>p-value: 0.223<\/li>\n<li>95% Confidence Interval: (-9.31432, 2.20505)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<div id=\"h5p-226\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-226\" class=\"h5p-iframe\" data-content-id=\"226\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"10.2 Did I get this 2\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<h2><span title=\"Quick scroll up\">Let\u2019s Summarize<\/span><\/h2>\n<p id=\"N10CAD\">We have completed our discussion of the two-sample t-test for comparing two populations\u2019 means when the samples are independent. 
Let\u2019s summarize what we have learned.<\/p>\n<ul>\n<li>The two-sample t-test is used for comparing the means of a quantitative variable (Y) in two populations (which we initially called sub-populations).<span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_5\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image523a.jpg\" alt=\"\" width=\"400\" \/><\/span><\/span><\/li>\n<li>Our goal is to compare \u03bc<sub>1<\/sub>\u00a0and \u03bc<sub>2<\/sub>\u00a0(which in practice is done by making inferences on the difference \u03bc<sub>1<\/sub>\u00a0\u2013 \u03bc<sub>2<\/sub>). The null hypothesis is\n<ul class=\"none\">\n<li>H<sub>o<\/sub>: \u03bc<sub>1<\/sub>\u00a0\u2013 \u03bc<sub>2<\/sub>\u00a0= 0<\/li>\n<\/ul>\n<p>and the alternative hypothesis is one of the following (depending on the context of the problem):<\/p>\n<ul class=\"none\">\n<li>H<sub>a<\/sub>: \u03bc<sub>1<\/sub>\u00a0\u2013 \u03bc<sub>2<\/sub>\u00a0&lt; 0<\/li>\n<li>H<sub>a<\/sub>: \u03bc<sub>1<\/sub>\u00a0\u2013 \u03bc<sub>2<\/sub>\u00a0&gt; 0<\/li>\n<li>H<sub>a<\/sub>: \u03bc<sub>1<\/sub>\u00a0\u2013 \u03bc<sub>2<\/sub>\u00a0\u2260 0<\/li>\n<\/ul>\n<\/li>\n<li>The two-sample t-test can be safely used when the samples are independent and at least one of the following two conditions holds:\n<ul>\n<li>The variable Y is known to have a normal distribution in both populations.<\/li>\n<li>The two sample sizes are large.<\/li>\n<\/ul>\n<p>When the sample sizes are not large (and we therefore need to check the normality of Y in both populations), what we do in practice is look at the histograms of the two samples and make sure that there are no signs of non-normality such as extreme skewness and\/or outliers.<\/p>\n<\/li>\n<li>The test statistic is as follows and has a t distribution when the null hypothesis is true:<span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" 
id=\"_i_6\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u5_inference\/_m2_inference_for_relationships\/webcontent\/image523b.png\" alt=\"\" \/><\/span><\/span><\/li>\n<li>P-values are obtained from the output, and conclusions are drawn as usual, comparing the p-value to the significance level alpha.<\/li>\n<li>If H<sub>o<\/sub>\u00a0is rejected, a 95% confidence interval for \u03bc<sub>1<\/sub>\u00a0\u2013 \u03bc<sub>2<\/sub>\u00a0can be very insightful and can also be used for the two-sided test.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"author":150,"menu_order":14,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[48],"contributor":[],"license":[],"class_list":["post-571","chapter","type-chapter","status-publish","hentry","chapter-type-numberless"],"part":421,"_links":{"self":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters\/571","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/users\/150"}],"version-history":[{"count":14,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters\/571\/revisions"}],"predecessor-version":[{"id":1120,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters\/571\/revisions\/1120"}],"part":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/parts\/421"}],"met
adata":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters\/571\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/media?parent=571"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapter-type?post=571"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/contributor?post=571"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/license?post=571"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}