{"id":510,"date":"2024-10-18T02:15:29","date_gmt":"2024-10-18T02:15:29","guid":{"rendered":"https:\/\/pressbooks.ccconline.org\/mat1260\/?post_type=chapter&#038;p=510"},"modified":"2024-12-12T20:47:28","modified_gmt":"2024-12-12T20:47:28","slug":"5-3-mean-and-variance-of-a-discrete-random-variable","status":"publish","type":"chapter","link":"https:\/\/pressbooks.ccconline.org\/mat1260\/chapter\/5-3-mean-and-variance-of-a-discrete-random-variable\/","title":{"raw":"5.3: Mean and Variance of a Discrete Random Variable","rendered":"5.3: Mean and Variance of a Discrete Random Variable"},"content":{"raw":"<div id=\"lobjh\" class=\"\">\r\n<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Learning Objective<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ul>\r\n \t<li id=\"find_mean_variance_discrete_random\">Find the mean and variance of a discrete random variable, and apply these concepts to solve real-world problems.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<\/div>\r\nIn the Exploratory Data Analysis (EDA) section, we displayed the distribution of one quantitative variable with a histogram, and supplemented it with numerical measures of center and spread. We are doing the same thing here. We display the probability distribution of a discrete random variable with a table, formula or histogram, and supplement it with numerical measures of the center and spread of the probability distribution. These measures are the\u00a0<em>mean and standard deviation of the random variable.<\/em>\r\n\r\n<\/div>\r\n<p id=\"N10B0E\">This section will be devoted to introducing these measures. As before, we\u2019ll start with the numerical measure of center, the mean. 
Let\u2019s begin by revisiting an example we saw in EDA.<\/p>\r\n\r\n<div id=\"N10B11\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">World Cup Soccer<\/span><\/h2>\r\n<p id=\"N10B18\">Recall that we used the following data from 3 World Cup tournaments (a total of 192 games) to introduce the idea of a\u00a0<em>weighted average<\/em>.<\/p>\r\n<p id=\"N10B1E\">We\u2019ve added a third column to our table that gives us relative frequencies.<\/p>\r\n\r\n<table class=\"grid\">\r\n<thead>\r\n<tr>\r\n<th>total # goals\/game<\/th>\r\n<th>frequency<\/th>\r\n<th>relative frequency<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td align=\"center\">0<\/td>\r\n<td align=\"center\">17<\/td>\r\n<td align=\"center\">17 \/ 192 = .089<\/td>\r\n<\/tr>\r\n<tr class=\"e\">\r\n<td align=\"center\">1<\/td>\r\n<td align=\"center\">45<\/td>\r\n<td align=\"center\">45 \/ 192 = .234<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"center\">2<\/td>\r\n<td align=\"center\">51<\/td>\r\n<td align=\"center\">51 \/ 192 = .266<\/td>\r\n<\/tr>\r\n<tr class=\"e\">\r\n<td align=\"center\">3<\/td>\r\n<td align=\"center\">37<\/td>\r\n<td align=\"center\">37 \/ 192 = .193<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"center\">4<\/td>\r\n<td align=\"center\">25<\/td>\r\n<td align=\"center\">25 \/ 192 = .130<\/td>\r\n<\/tr>\r\n<tr class=\"e\">\r\n<td align=\"center\">5<\/td>\r\n<td align=\"center\">11<\/td>\r\n<td align=\"center\">11 \/ 192 = .057<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"center\">6<\/td>\r\n<td align=\"center\">3<\/td>\r\n<td align=\"center\">3 \/ 192 = .016<\/td>\r\n<\/tr>\r\n<tr class=\"e\">\r\n<td align=\"center\">7<\/td>\r\n<td align=\"center\">2<\/td>\r\n<td align=\"center\">2 \/ 192 = .010<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"center\">8<\/td>\r\n<td align=\"center\">1<\/td>\r\n<td align=\"center\">1 \/ 192 = .005<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<table class=\"wbtable plain\">\r\n<tbody>\r\n<tr class=\"e\">\r\n<td align=\"right\">the mean 
for this data<\/td>\r\n<td align=\"center\">=<\/td>\r\n<td align=\"left\">[latex]\\frac{0\\left(17\\right)+1\\left(45\\right)+2\\left(51\\right)+3\\left(37\\right)+4\\left(25\\right)+5\\left(11\\right)+6\\left(3\\right)+7\\left(2\\right)+8\\left(1\\right)}{192}[\/latex]<\/td>\r\n<\/tr>\r\n<tr>\r\n<td align=\"right\">distributing the division by 192, we get:<\/td>\r\n<td align=\"center\">=<\/td>\r\n<td align=\"left\">[latex]0\\left(\\frac{17}{192}\\right)+1\\left(\\frac{45}{192}\\right)+2\\left(\\frac{51}{192}\\right)+3\\left(\\frac{37}{192}\\right)+4\\left(\\frac{25}{192}\\right)+5\\left(\\frac{11}{192}\\right)+6\\left(\\frac{3}{192}\\right)+7\\left(\\frac{2}{192}\\right)+8\\left(\\frac{1}{192}\\right)[\/latex]<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<p id=\"N10D61\">Notice that the mean is the sum of the possible numbers of goals per game, each multiplied by its relative frequency. Since we usually write the relative frequencies as decimals, we can see that:<\/p>\r\n\r\n<table class=\"wbtable plain\">\r\n<tbody>\r\n<tr class=\"e\">\r\n<td align=\"right\">mean number of goals per game<\/td>\r\n<td align=\"center\">=<\/td>\r\n<td align=\"left\">0(.089) + 1(.234) + 2(.266) + 3(.193) + 4(.130) + 5(.057) + 6(.016) + 7(.010) + 8(.005)<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><\/td>\r\n<td align=\"center\">=<\/td>\r\n<td align=\"left\">2.36, rounded to two decimal places<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<\/div>\r\n<div id=\"N10D86\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Mean of a Random Variable<\/span><\/h2>\r\n<p id=\"N10D8D\">In Exploratory Data Analysis, we used the mean of a sample of quantitative values (their arithmetic average) to tell the center of their distribution. We also saw how a weighted mean was used when we had a frequency table. These frequencies can be changed to relative frequencies. So we are essentially using the relative frequency approach to find probabilities. 
We can use the same idea to describe the center of a probability distribution for a random variable by reporting its mean, which is a weighted average of its values; the more probable a value is, the more weight it gets. As always, it is important to distinguish between a concrete sample of observed values for a variable and an abstract population of all values taken by a random variable in the long run.<\/p>\r\n<p id=\"N10D90\">Whereas we denoted the mean of a sample as\u00a0[latex]\\bar{x}[\/latex], we now denote the mean of a random variable as\u00a0[latex]\\mu_{x}[\/latex]. Let\u2019s see how this is done by looking at a specific example.<\/p>\r\n\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<h4>Xavier\u2019s Production Line<\/h4>\r\n<div>\r\n<p id=\"N10DB9\">Xavier\u2019s production line produces a variable number of defective parts in an hour, with probabilities shown in this table:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_0\" class=\"img-responsive popimg aligncenter\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in columns (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image012.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in columns (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10;\" \/><\/span><\/span>\r\n<p id=\"N10DC2\"  >How many defective parts are typically produced in an hour on Xavier's production line? 
If we sum up the possible values of X, each weighted with its probability, we have<\/p>\r\n<p id=\"N10DC5\"  >[latex]\\mu_{x}=0(0.15)+1(0.30)+2(0.25)+3(0.20)+4(0.10)=1.8[\/latex]<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<p id=\"N10E5E\"  >Here is the general definition of the mean of a discrete random variable:<\/p>\r\n\r\n<dl>\r\n \t<dt>mean of a discrete random variable<\/dt>\r\n \t<dd>\r\n<div class=\"meaning\">In general, for any discrete random variable X with probability distribution<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_1\" class=\" img-responsive popimg\" title=\"white space\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image011.gif\" alt=\"white space\" \/><\/span><\/span><span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_2\" class=\" img-responsive popimg\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x)&quot;. Here is the data in the table, given in column format (X: P(X=x)): x_1: p_1; x_2: p_2; x_3: p_3; ... x_n: p_n;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image020.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x)&quot;. Here is the data in the table, given in column format (X: P(X=x)): x_1: p_1; x_2: p_2; x_3: p_3; ... x_n: p_n;\" \/><\/span><\/span>the\u00a0<em  >mean<\/em>\u00a0of X is defined to be<\/div>\r\n<div class=\"meaning\">[latex]\\mu_{x}=x_{1}p_{1}+x_{2}p_{2}+...+x_{n}p_{n}=\\sum_{i=1}^{n}x_{i}p_{i}[\/latex]<\/div><\/dd>\r\n<\/dl>\r\n<p id=\"N10F32\"  >In general, the mean of a random variable tells us its \"long-run\" average value. It is sometimes referred to as the\u00a0<em  >expected value<\/em>\u00a0of the random variable. 
But this expression may be somewhat misleading, because in many cases it is impossible for a random variable to actually equal its expected value. For example, the mean number of goals for a World Cup soccer game is 2.36. But we can never expect any single game to result in 2.36 goals, since it is not possible to score a fraction of a goal. Rather, 2.36 is the long-run average of all World Cup soccer games. In the case of Xavier's production line, the mean number of defective parts produced in an hour is 1.8. But the actual number of defective parts produced in any given hour can never equal 1.8, since it must take whole number values.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<p id=\"N10F38\">To get a better feel for the mean of a random variable, let\u2019s extend the defective parts example:<\/p>\r\n\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<h4>Xavier\u2019s and Yves\u2019 Production Lines<\/h4>\r\n<div>\r\n<p id=\"N10F40\">Recall the probability distribution of the random variable X, representing the number of defective parts in an hour produced by Xavier\u2019s production line.<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_4\" class=\"img-responsive popimg aligncenter\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in columns (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image012.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in columns (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10;\" \/><\/span><\/span>\r\n<p id=\"N10F49\">The number of defective parts produced each hour by Yves\u2019 production line is a random variable Y with the following probability distribution:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_5\" class=\"img-responsive popimg aligncenter\" title=\"A probability distribution table with two rows, labeled &quot;Y&quot; and &quot;P(Y=y).&quot; The data in column format (Y: P(Y=y)): 0: .05; 1: .05; 2: .10; 3: .75; 4: .05;\" 
src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image022.gif\" alt=\"A probability distribution table with two rows, labeled &quot;Y&quot; and &quot;P(Y=y).&quot; The data in column format (Y: P(Y=y)): 0: .05; 1: .05; 2: .10; 3: .75; 4: .05;\" \/><\/span><\/span>\r\n<p id=\"N10F52\">Look at both probability distributions. Both X and Y take the same possible values (0, 1, 2, 3, 4).<\/p>\r\n<p id=\"N10F55\">However, they are very different in the way the probability is distributed among these values.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n[h5p id=\"111\"]\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10075\"  >Here again is the probability distribution of Y, the number of defective parts in an hour in Yves' production line:<\/p>\r\n\r\n<div class=\"image shouldbeleft\"><img id=\"N10077\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/dig003.gif\" alt=\"\" \/><\/div>\r\n<div>[h5p id=\"112\"]<\/div>\r\n<\/div>\r\n<\/div>\r\n<div id=\"N10B10\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Applications of the Mean<\/span><\/h2>\r\n<p id=\"N10B17\">Means of random variables are useful for describing long-run gains, for example in sales or for insurance companies.<\/p>\r\n<p id=\"N10B1A\">Here are some examples:<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox 
textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<h4>Pizza Delivery #1<\/h4>\r\n<div>\r\n<p id=\"N10B23\">Your favorite pizza place delivers only one kind of pizza, which is sold for $10, and costs the pizza place $6 to make. The pizza place has the following policy regarding delivery: if the pizza takes longer than half an hour to arrive, there is no charge. Let the random variable X be the pizza place\u2019s gain for any one pizza.<\/p>\r\n<p id=\"N10B26\">Experience has shown that delivery takes longer than half an hour only 10 percent of the time.<\/p>\r\n<p id=\"N10B29\">Find the mean gain per pizza, [latex]\\mu_{x}[\/latex].<\/p>\r\n<p id=\"N10B3C\">In order to find the mean of X, we first need to establish its probability distribution\u2014the possible values and their probabilities.<\/p>\r\n<p id=\"N10B3F\">The random variable X has two possible values: either the pizza costs them $6 to make and they sell it for $10, in which case X takes the value $10 \u2013 $6 = $4, or it costs them $6 to make and they give it away, in which case X takes the value $0 \u2013 $6 = -$6. The probability of the latter case is given to be 10 percent, or .1, so using complements, the former has probability .9. Here, then is the probability distribution of X:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img class=\"img-responsive popimg aligncenter\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; &quot;P(X=x).&quot; Here is the data in columns (X: P(X=x)): +4: .9; -6: .1; In other words, when pizza delivery is not longer than half an hour, X = +4, and P(X = +4) = .9 . 
When pizza delivery takes longer than half an hour, X=-6, and P(X = -6) = .1 .\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image024a.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; &quot;P(X=x).&quot; Here is the data in columns (X: P(X=x)): +4: .9; -6: .1; In other words, when pizza delivery is not longer than half an hour, X = +4, and P(X = +4) = .9 . When pizza delivery takes longer than half an hour, X=-6, and P(X = -6) = .1 .\" \/><\/span><\/span>\r\n<p id=\"N10B48\">Therefore, [latex]\\mu _{x}=(+4)(.9)+(-6)(.1)=+3[\/latex]<\/p>\r\n<p id=\"N10B9D\">In the long run, the pizza place gains an average of $3 per pizza delivered.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<h4>Pizza Delivery #2<\/h4>\r\n<div>\r\n<p id=\"N10BA5\">If the pizza place wants to increase its mean gain per pizza to $3.90, what should the new price be (instead of $10)? We need to replace the original price of $10 with an as-yet-to-be-determined new price N, resulting in this probability distribution table:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img class=\"img-responsive popimg\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; Here is the data in columns (X: P(X=x)): N-6: .9; -6: .1; In other words, when pizza delivery is not longer than half an hour, X = N-6, and P(X = N-6) = .9 . 
When pizza delivery takes longer than half an hour, X=-6, and P(X = -6) = .1 .\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image026a.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; Here is the data in columns (X: P(X=x)): N-6: .9; -6: .1; In other words, when pizza delivery is not longer than half an hour, X = N-6, and P(X = N-6) = .9 . When pizza delivery takes longer than half an hour, X=-6, and P(X = -6) = .1 .\" \/><\/span><\/span>\r\n<p id=\"N10BAE\">Next, setting [latex]\\mu_{x}[\/latex] equal to +3.90 instead of +3, we solve<\/p>\r\n<p id=\"N10BC1\">[latex]3.9=(N-6)(.9)+(-6)(.1)=.9N-6[\/latex] or<\/p>\r\n<p id=\"N10C1C\">[latex].9N=9.9[\/latex]<\/p>\r\n<p id=\"N10C38\">Therefore, the new price must be 11 dollars.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n[h5p id=\"113\"]\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<h4>Raffle<\/h4>\r\n<div>\r\n<p id=\"N10CAD\"  >In order to raise money, a charity decides to raffle off some prizes. The charity sells 2,000 raffle tickets for $5 each. The prizes are:<\/p>\r\n\r\n<ul  >\r\n \t<li  >10 movie packages (two tickets plus popcorn) worth $25 each<\/li>\r\n \t<li  >5 dinners for two worth $50 each<\/li>\r\n \t<li  >2 smart phones worth $200 each<\/li>\r\n \t<li  >1 flat-screen TV worth $1,500<\/li>\r\n<\/ul>\r\n<p id=\"N10CBF\"  >What is the expected gain or loss if you buy a single raffle ticket? 
The expected value can be written as E(X).<\/p>\r\n<p id=\"N10CC2\"  >There are 5 possible outcomes when you buy a ticket: win movie package, win dinner for two, win smart phone, win TV, win nothing.<\/p>\r\n\r\n<table class=\"wbtable alternating\"   cellspacing=\"0\" align=\"center\">\r\n<thead>\r\n<tr>\r\n<th  >prize<\/th>\r\n<th  >net gain or loss<\/th>\r\n<th  >probability<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td  >movie package<\/td>\r\n<td  >25 - 5<\/td>\r\n<td  >10 \/ 2000<\/td>\r\n<\/tr>\r\n<tr class=\"e\">\r\n<td  >dinner for two<\/td>\r\n<td  >50 - 5<\/td>\r\n<td  >5 \/ 2000<\/td>\r\n<\/tr>\r\n<tr>\r\n<td  >smart phone<\/td>\r\n<td  >200 - 5<\/td>\r\n<td  >2 \/ 2000<\/td>\r\n<\/tr>\r\n<tr class=\"e\">\r\n<td  >TV<\/td>\r\n<td  >1500 - 5<\/td>\r\n<td  >1 \/ 2000<\/td>\r\n<\/tr>\r\n<tr>\r\n<td  >nothing<\/td>\r\n<td  >0 - 5<\/td>\r\n<td  >(2000 - 10 - 5 - 2 - 1) \/ 2000<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<p id=\"N10D00\"  >The previous information is summarized below in a probability distribution:<\/p>\r\n\r\n<div class=\"image shouldbeleft\"><img id=\"_i_2\" class=\"img-responsive popimg aligncenter\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; Here is the data in column oriented format (X: P(X=x), comment): 20: 10\/2000 (movie package); 45: 5\/2000 (dinner for two); 195: 2\/2000 (smart phone); 1495: 1\/2000 (TV); -5: 1982\/2000 (Nothing);\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image_raffle.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; Here is the data in column oriented format (X: P(X=x), comment): 20: 10\/2000 (movie package); 45: 5\/2000 (dinner for two); 195: 2\/2000 (smart phone); 1495: 1\/2000 (TV); -5: 1982\/2000 (Nothing);\" \/><\/div>\r\n<p id=\"N10D08\"  >[latex]E\\left ( X \\right 
)=\\mu_{x}=20\\left ( \\frac{10}{2000} \\right )+45\\left ( \\frac{5}{2000} \\right )+195\\left ( \\frac{2}{2000} \\right )+1495\\left ( \\frac{1}{2000} \\right )+\\left ( -5 \\right )\\left ( \\frac{1982}{2000} \\right )[\/latex]<\/p>\r\n<p id=\"N10D93\"  >[latex]E\\left ( X \\right )=\\frac{-7600}{2000}=-3.80[\/latex]<\/p>\r\n<p id=\"N10DB6\"  >Since we got a negative number, we have an expected loss of $3.80 for each raffle ticket purchased. Recall that this is based upon a long-run average.<\/p>\r\n<p id=\"N10DB9\"  >Each raffle ticket has only 5 possible outcomes:<\/p>\r\n\r\n<ul  >\r\n \t<li  >\r\n<p id=\"N10DBF\"  >$20 net gain if you win the movie package<\/p>\r\n<\/li>\r\n \t<li  >\r\n<p id=\"N10DC3\"  >$45 net gain if you win the dinner for two<\/p>\r\n<\/li>\r\n \t<li  >\r\n<p id=\"N10DC7\"  >$195 net gain if you win the smart phone<\/p>\r\n<\/li>\r\n \t<li  >\r\n<p id=\"N10DCB\"  >$1,495 net gain if you win the TV<\/p>\r\n<\/li>\r\n \t<li  >\r\n<p id=\"N10DCF\"  >$5 net loss if you do not win a prize<\/p>\r\n<\/li>\r\n<\/ul>\r\n<p id=\"N10DD3\"  >It should not be surprising that you have an expected loss. After all, the charity's goal is to raise money. If you have an expected loss of $3.80 per ticket, they will have an expected gain of $3.80 per ticket. Each ticket gives the charity +5 (it was -5 for you). The prizes are reversed, too. For example, the movie package is -25 + 5 for the charity (it was 25 - 5 for you).<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\nHere is another example:\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<h4>Life Insurance #1<\/h4>\r\n<div>\r\n\r\nSuppose you work for an insurance company, and you sell a $100,000 whole-life insurance policy at an annual premium of $1,200. 
(This means that the person who bought this policy pays $1,200 per year so that in the event that he or she dies, the policy beneficiaries will get $100,000.) Actuarial tables show that the probability of death during the next year for a person of your customer\u2019s age, sex, health, etc. is .005. Let the random variable X be the company\u2019s gain from such a policy.\r\n\r\nWhat is the expected or mean gain (amount of money made by the company) for a policy of this type?\r\n<p id=\"N10B1D\">In other words, we need to find\u00a0[latex] \\mu _{x}[\/latex].<\/p>\r\n<p id=\"N10B30\">Since this is a whole-life policy, there are two possibilities here: either the customer dies this year (which you are given will happen with probability .005), or the customer does not die this year (which, by the complement rule, must have probability .995).<\/p>\r\n<p id=\"N10B33\">In both cases, the company gets the $1,200 premium. If the customer lives, the company just gains the $1,200, but if the customer dies, the company also needs to pay $100,000 to the customer\u2019s beneficiaries, for a net gain of $1,200 \u2013 $100,000 = -$98,800. Therefore, here is the probability distribution of X:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img class=\"img-responsive popimg aligncenter\" title=\"A two row probability distribution table, in which the rows are labeled &quot;X&quot; and &quot;P(X=x)&quot;. Here is the data in column oriented format (X: P(X=x), comment): +1200: .995 (live); 1200-100,000: .005 (die);\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image029a.gif\" alt=\"A two row probability distribution table, in which the rows are labeled &quot;X&quot; and &quot;P(X=x)&quot;. 
Here is the data in column oriented format (X: P(X=x), comment): +1200: .995 (live); 1200-100,000: .005 (die);\" \/><\/span><\/span>\r\n\r\nTheir average, or expected, gain overall is\r\n\r\n[latex] \\mu _{x}[\/latex]\u00a0= 1200(.995) + (1200 \u2013 100,000)(.005) = 700 dollars.\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<h4>Life Insurance #2<\/h4>\r\n<div>\r\n<p id=\"N10B57\">Suppose that five years have passed and your actuarial tables indicate that the probability of death during the next year for a person of your customer\u2019s current age has gone up to .0075. Obviously, this change in probability should be reflected in the annual premium (since it is slightly more risky for the insurance company to insure the customer).<\/p>\r\n<p id=\"N10B5A\">What should the annual premium be (instead of $1,200) if the company wants to keep the same expected gain?<\/p>\r\n<p id=\"N10B5D\">Now we substitute .0075 for .005, replace 1,200 with an unknown new premium N, and set the mean gain equal to 700, as it was before:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img class=\"img-responsive popimg\" title=\"A two-row probability distribution table, in which the rows are labeled &quot;X&quot; and &quot;P(X=x)&quot;. Here is the data in column oriented format (X: P(X=x), comment): N: .9925 (live); N-100,000: .0075 (die);\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image030a.gif\" alt=\"A two-row probability distribution table, in which the rows are labeled &quot;X&quot; and &quot;P(X=x)&quot;. 
Here is the data in column oriented format (X: P(X=x), comment): N: .9925 (live); N-100,000: .0075 (die);\" \/><\/span><\/span>\r\n<table class=\"grid\" style=\"height: 40px;\">\r\n<tbody>\r\n<tr class=\"e\" style=\"height: 15px;\">\r\n<td style=\"height: 15px; width: 212.297px;\" align=\"left\">We need to solve:<\/td>\r\n<td style=\"height: 15px; width: 44.3906px;\" align=\"right\">700<\/td>\r\n<td style=\"height: 15px; width: 18.5469px;\" align=\"center\">=<\/td>\r\n<td style=\"height: 15px; width: 332.078px;\" align=\"left\">(N)(.9925) + (N \u2013 100,000)(.0075)<\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px;\">\r\n<td style=\"height: 15px; width: 212.297px;\" align=\"left\">Using some algebra:<\/td>\r\n<td style=\"height: 15px; width: 44.3906px;\" align=\"right\">700<\/td>\r\n<td style=\"height: 15px; width: 18.5469px;\" align=\"center\">=<\/td>\r\n<td style=\"height: 15px; width: 332.078px;\" align=\"left\">N \u2013 750<\/td>\r\n<\/tr>\r\n<tr class=\"e\" style=\"height: 10px;\">\r\n<td style=\"height: 10px; width: 212.297px;\" align=\"left\">Finally<\/td>\r\n<td style=\"height: 10px; width: 44.3906px;\" align=\"right\">N<\/td>\r\n<td style=\"height: 10px; width: 18.5469px;\" align=\"center\">=<\/td>\r\n<td style=\"height: 10px; width: 332.078px;\" align=\"left\">1450<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<p id=\"N10B9C\">In order to keep the same expected gain of $700, the company should increase that customer\u2019s premium to $1,450.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\nThe purpose of this next activity is to give you guided practice in solving practical problems whose solution is based on the mean of random variables.\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n[h5p id=\"114\"]\r\n\r\n<\/div>\r\n<\/div>\r\n<div id=\"N10B08\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span 
title=\"Quick scroll up\">Variance and Standard Deviation of a Discrete Random Variable<\/span><\/h2>\r\n<p id=\"N10B0F\">In Exploratory Data Analysis, we used the mean of a sample of quantitative values (their arithmetic average,\u00a0[latex]\\bar{x}[\/latex]) to tell the center of their distribution, and the standard deviation (s) to tell the typical distance of sample values from their mean. We described the center of a probability distribution for a random variable by reporting its mean\u00a0[latex] \\mu _{x}[\/latex], and now we would like to establish an accompanying measure of spread. Our measure of spread will still report the typical distance of values from their means, but in order to distinguish the spread of a population of all of a random variable\u2019s values from the spread (s) of sample values, we will denote the standard deviation of the random variable X with the Greek lower case \u201csigma,\u201d and use a subscript to remind us what is the variable of interest (there may be more than one in later problems):<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div id=\"N10B35\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h2><span title=\"Quick scroll up\">Notation:\u00a0[latex]\\sigma _{X}[\/latex]<\/span><\/h2>\r\n<p id=\"N10B4C\">We will also focus more frequently than before on the squared standard deviation, called the\u00a0<em>variance<\/em>, because some important rules we need to invoke are in terms of variance [latex]\\sigma _{X}^{2}[\/latex] rather than standard deviation [latex]\\sigma _{X}[\/latex].<\/p>\r\n\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<h4>Xavier\u2019s Production Line<\/h4>\r\n<div>\r\n<p id=\"N10B7C\"  >Recall that the number of defective parts produced each hour by Xavier's production line is a random variable X with the 
following probability distribution:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_0\" class=\" img-responsive popimg\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in column format (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image012.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in column format (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10;\" \/><\/span><\/span>\r\n<p id=\"N10B85\"  >We found the mean number of defective parts produced per hour to be\u00a0[latex] \\mu _{x}[\/latex]\u00a0= 1.8. Obviously, there is variation about this mean: some hours as few as 0 defective parts are produced, whereas in other hours as many as 4 are produced. Typically, how far does the number of defective parts fall from the mean of 1.8? As we did for the spread of sample values, we measure the spread of a random variable by calculating the square root of the average squared deviation from the mean. Now \"average\" is a weighted average, where more probable values of the random variable are accordingly given more weight. Let's begin with the variance, or average squared deviation from the mean, and then take its square root to find the standard deviation:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_1\" class=\" img-responsive popimg\" title=\"A table describing for several characteristics of of possible values of X. For X=0, Dev. from mean = (0-1.8), Sq. deviation = (0-1.8)\u00b2, and P(X=0) = .15 . For X=1, Dev. from mean = (1-1.8), Sq. deviation = (1-1.8)\u00b2, and P(X=1) = .30. For X=2, Dev. from mean = (2-1.8), Sq. deviations = (2-1.8)\u00b2, and P(X=2) = .25 . For X=3, Dev. from mean = (3-1.8), Sq. deviations = (3-1.8)\u00b2, and P(X=3) = .20 . 
For X=4, Dev. from mean = (4-1.8), Sq. deviations = (4-1.8)\u00b2, and P(X=4) = .10 .\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image033.gif\" alt=\"A table describing several characteristics of the possible values of X. For X=0, Dev. from mean = (0-1.8), Sq. deviation = (0-1.8)\u00b2, and P(X=0) = .15 . For X=1, Dev. from mean = (1-1.8), Sq. deviation = (1-1.8)\u00b2, and P(X=1) = .30. For X=2, Dev. from mean = (2-1.8), Sq. deviations = (2-1.8)\u00b2, and P(X=2) = .25 . For X=3, Dev. from mean = (3-1.8), Sq. deviations = (3-1.8)\u00b2, and P(X=3) = .20 . For X=4, Dev. from mean = (4-1.8), Sq. deviations = (4-1.8)\u00b2, and P(X=4) = .10 .\" \/><\/span><\/span>\r\n<p id=\"N10B9E\"  >Variance = [latex]\\sigma ^{2}_{X}=(0-1.8)^{2}(0.15)+(1-1.8)^{2}(0.30)+(2-1.8)^{2}(0.25)+(3-1.8)^{2}(0.20)+(4-1.8)^{2}(0.10)=1.46[\/latex]<\/p>\r\n<p id=\"N10CC8\"  >Standard deviation = [latex]\\sigma _{X}=\\sqrt{1.46}\\approx 1.21[\/latex]<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\nHow do we interpret the standard deviation of X?\r\n\r\n<\/div>\r\n<\/div>\r\n<p id=\"N10D0D\">Xavier\u2019s production line produces an average of 1.80 defective parts per hour. The number of defective parts varies from hour to hour; typically (or, on average), it is about 1.21 away from 1.80.<\/p>\r\n<p id=\"N10D10\">Here is the formal definition:<\/p>\r\n\r\n<dl>\r\n \t<dt>standard deviation of a discrete random variable<\/dt>\r\n \t<dd>\r\n<div class=\"meaning\">For any discrete random variable X with a probability distribution of<\/div>\r\n<div><\/div>\r\n<div class=\"meaning\"><span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_3\" class=\"img-responsive popimg aligncenter\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x)&quot;. Here is the data in the table, given in column format (X: P(X=x)): x_1: p_1; x_2: p_2; x_3: p_3; ... 
x_n: p_n;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image020.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x)&quot;. Here is the data in the table, given in column format (X: P(X=x)): x_1: p_1; x_2: p_2; x_3: p_3; ... x_n: p_n;\" \/><\/span><\/span><\/div>\r\n<div><\/div>\r\n<div><\/div>\r\nthe <em  >variance<\/em>\u00a0of X is defined to be\u00a0[latex]\\sigma _{X}^{2}=(x_{1}-\\mu _{X})^{2}p_{1}+(x_{2}-\\mu _{X})^{2}p_{2}+\\cdots +(x_{n}-\\mu _{X})^{2}p_{n}=\\sum_{i=1}^{n}(x_{i}-\\mu _{X})^{2}p_{i}[\/latex]\r\n\r\nand the\u00a0<em  >standard deviation<\/em>\u00a0is\u00a0[latex]\\sigma _{X}=\\sqrt{\\sigma ^{2}_{X}}[\/latex]\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10075\"  >Here again is the probability distribution of Y\u2014the number of defective parts in an hour in Yves' production line:<\/p>\r\n\r\n<div class=\"image shouldbeleft\"><img id=\"N10077\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/dig003.gif\" alt=\"\" \/><\/div>\r\n<div><\/div>\r\n<div>Review the following expressions to answer the question below:<\/div>\r\n<\/div>\r\n<div><img id=\"N1007D\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/webcontent\/flash\/_u4_m3_digtutor5a_image1.gif\" alt=\"\" \/><\/div>\r\n<div><\/div>\r\n<div class=\"textbox__content\">\r\n<div>[h5p id=\"115\"]<\/div>\r\n<\/div>\r\n<\/div>\r\nThe purpose of the next activity is to give you better intuition about the mean and standard deviation of a random variable.\r\n<div 
class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nKeeping in mind that the mean describes where a histogram is centered, and the standard deviation describes spread by reporting the typical distance of values from their mean, compare the histograms in the four exercises here and match each to the correct combination of mean and standard deviation.\r\n\r\n<img id=\"N1007D\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/webcontent\/flash\/_u4_m3_lbdtutor4a_image1.jpg\" alt=\"\" \/>\r\n\r\n[h5p id=\"116\"]\r\n\r\n<img id=\"N1007D\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/webcontent\/flash\/_u4_m3_lbdtutor4b_image1.jpg\" alt=\"\" \/>\r\n\r\n[h5p id=\"117\"]\r\n\r\n<img id=\"N1007D\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/webcontent\/flash\/_u4_m3_lbdtutor4c_image1.jpg\" alt=\"\" \/>\r\n\r\n[h5p id=\"118\"]\r\n\r\n<img id=\"N1007D\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/webcontent\/flash\/_u4_m3_lbdtutor4d_image1.jpg\" alt=\"\" \/>\r\n\r\n[h5p id=\"119\"]\r\n\r\n<\/div>\r\n<\/div>\r\n<div>\r\n<p id=\"N10B20\">The concept of standard deviation is a bit harder to grasp than that of the mean. 
The purpose of the following examples and activities is to help you gain a better feel for the standard deviation of a random variable:<\/p>\r\n\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<h4>Xavier\u2019s and Yves\u2019 Production Lines<\/h4>\r\n<div>\r\n<p id=\"N10B27\">Recall the probability distribution of the random variable X, representing the number of defective parts per hour produced by Xavier\u2019s production line, and the probability distribution of the random variable Y, representing the number of defective parts per hour produced by Yves\u2019 production line:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img class=\"img-responsive popimg aligncenter\" title=\"Two probability distribution tables. The first has two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in column format (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10; The second table also has two rows, labeled &quot;Y&quot; and &quot;P(Y=y).&quot; The data in column format (Y: P(Y=y)): 0: .05; 1: .05; 2: .10; 3: .75; 4: .05;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image039.gif\" alt=\"Two probability distribution tables. The first has two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in column format (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10; The second table also has two rows, labeled &quot;Y&quot; and &quot;P(Y=y).&quot; The data in column format (Y: P(Y=y)): 0: .05; 1: .05; 2: .10; 3: .75; 4: .05;\" \/><\/span><\/span>\r\n\r\nLook carefully at both probability distributions. Both X and Y take the same possible values (0, 1, 2, 3, 4). However, they are very different in the way the probability is distributed among these values. 
We saw before that this makes a difference in means:\r\n\r\n<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">X<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">8<\/span><\/span><\/span><\/span><\/span>\r\n<p id=\"N10B52\"><span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">Y<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">7<\/span><\/span><\/span><\/span><\/span><\/p>\r\n<p id=\"N10B71\">We now want to get a sense about how the different probability distributions impact their standard deviations.<\/p>\r\n<p id=\"N10B74\">Recall that the standard deviation of a random variable can be interpreted as a typical (or the long-run average) distance between the value of X and its mean.<\/p>\r\n[h5p id=\"120\"]\r\n<p id=\"N10B99\">So, 75% of the time Y will assume a value (3) that is very close to its 
mean (2.7), while X will assume a value (2) that is close to its mean (1.8) much less often\u2014only 25% of the time. The long-run average, then, of the distance between the values of Y and their mean will be much smaller than the long-run average of the distance between the values of X and their mean.<\/p>\r\nTherefore,\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03c3<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">Y<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">&lt;<\/span><\/span><span class=\"mjx-msub MJXc-space3\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03c3<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">X<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/span><\/span>\u00a0Actually,\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03c3<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">Y<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span 
class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">0<\/span><\/span><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">8<\/span><\/span><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">4<\/span><\/span><\/span><\/span><\/span>, so we can draw the following conclusion:\r\n<p id=\"N10BEC\">Yves\u2019 production line produces an average of 2.70 defective parts per hour. The number of defective parts varies from hour to hour; typically (or, on average), it is about .84 away from 2.70.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<h2><span title=\"Quick scroll up\">Summary<\/span><\/h2>\r\n<p id=\"N10BF6\">Here are the histograms for the production lines:<\/p>\r\n<img class=\"aligncenter\" title=\"Histogram for Xavier's production line. The vertical axis is labeled &quot;Probability&quot; and the horizontal axis is labeled &quot;X.&quot; The data in the histogram is the same as the data in the probability table for Xavier's line. Moving from left to right across the horizontal axis we see that a peak in probability is reached at X=1, but it is not much higher than X=0. In addition, going right from X=1, the values decay, ultimately to 0.10 at X=4. The mean for Xavier's line is at X=1.8 .\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image_xavier_histo_mean.jpg\" alt=\"Histogram for Xavier's production line. The vertical axis is labeled &quot;Probability&quot; and the horizontal axis is labeled &quot;X.&quot; The data in the histogram is the same as the data in the probability table for Xavier's line. Moving from left to right across the horizontal axis we see that a peak in probability is reached at X=1, but it is not much higher than X=0. In addition, going right from X=1, the values decay, ultimately to 0.10 at X=4. 
The mean for Xavier's line is at X=1.8 .\" width=\"346\" height=\"233\" \/>\u00a0<img class=\"aligncenter\" title=\"For Yves's line is another histogram with the same axes. Going left to right, we see a peak at X=3, which is much higher than the other values. All of the other values are roughly the same. The mean is at X=2.7 .\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image_yves_histo_mean.jpg\" alt=\"For Yves's line is another histogram with the same axes. Going left to right, we see a peak at X=3, which is much higher than the other values. All of the other values are roughly the same. The mean is at X=2.7 .\" width=\"346\" height=\"233\" \/>\r\n<p id=\"N10C09\">When we compare distributions, the distribution in which it is\u00a0<em>more likely<\/em>\u00a0to find values that are further from the mean will have a\u00a0<em>larger<\/em>\u00a0standard deviation. Likewise, the distribution in which it is\u00a0<em>less likely<\/em>\u00a0to find values that are further from the mean will have the\u00a0<em>smaller<\/em>\u00a0standard deviation.<\/p>\r\n\r\n<div class=\"asx\">\r\n<div id=\"du3_m3_meanvariance5_tutor2\" class=\"activitywrap purpose learnbydoing flash\">\r\n<div class=\"activityhead\">\r\n<div class=\"purposeType purposelearnbydoing\" title=\"\">\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<div class=\"actContain\">\r\n<div class=\"activity flash\">\r\n<div id=\"u3_m3_meanvariance5_tutor2\" class=\"flash_obj asx testFlash mark_flash\">\r\n<div id=\"ou3_m3_meanvariance5_tutor2\" class=\"page 2963997\">\r\n<div id=\"2963997\" class=\"question ddfb\">\r\n<p id=\"N1007A\">[h5p id=\"121\"]<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<p id=\"N10C37\"  >The following graphs will be 
used in the next \"Did I Get This?\" exercise.<\/p>\r\n<img id=\"_i_3\" title=\"A histogram titled &quot;Graph A&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .07; 2: .10; 3: .12; 4: .13; 5: .16; 6: .13; 7: .12; 8: .10; 9: .07;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/u4_m3_meanvariance5_digt_imagea.jpg\" alt=\"A histogram titled &quot;Graph A&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .07; 2: .10; 3: .12; 4: .13; 5: .16; 6: .13; 7: .12; 8: .10; 9: .07;\" width=\"288\" height=\"195\" \/>\u00a0<img id=\"_i_4\" title=\"A histogram titled &quot;Graph B&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .02; 2: .08; 3: .10; 4: .15; 5: .30; 6: .15; 7: .10; 8: .08; 9: .02;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/u4_m3_meanvariance5_digt_imageb.jpg\" alt=\"A histogram titled &quot;Graph B&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .02; 2: .08; 3: .10; 4: .15; 5: .30; 6: .15; 7: .10; 8: .08; 9: .02;\" width=\"288\" height=\"195\" \/>\u00a0<img id=\"_i_5\" title=\"A histogram titled &quot;Graph C&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .01; 2: .01; 3: .10; 4: .18; 5: .40; 6: .18; 7: .10; 8: .01; 9: .01;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/u4_m3_meanvariance5_digt_imagec.jpg\" alt=\"A histogram titled &quot;Graph C&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .01; 2: .01; 3: .10; 4: .18; 5: .40; 6: .18; 7: .10; 8: .01; 9: .01;\" 
width=\"288\" height=\"195\" \/>\u00a0<img id=\"_i_6\" title=\"A histogram titled &quot;Graph D&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .01; 2: .01; 3: .02; 4: .11; 5: .70; 6: .11; 7: .02; 8: .01; 9: .01;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/u4_m3_meanvariance5_digt_imaged.jpg\" alt=\"A histogram titled &quot;Graph D&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .01; 2: .01; 3: .02; 4: .11; 5: .70; 6: .11; 7: .02; 8: .01; 9: .01;\" width=\"288\" height=\"195\" \/>\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n[h5p id=\"122\"]\r\n\r\n<\/div>\r\n<\/div>\r\n<div id=\"N10C79\" class=\"section\">\r\n<div class=\"sectionContain\">\r\n<h3>Comment<\/h3>\r\n<p id=\"N10C7F\">As we have stated before, using the mean and standard deviation gives us another way to assess which values of a random variable are unusual. Any values of a random variable that fall within 2 standard deviations of the mean would be considered ordinary (not unusual).<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<h4>Xavier\u2019s Production Line\u2014Unusual or Not?<\/h4>\r\n<div>\r\n<p id=\"N10C87\">Looking once again at the probability distribution for Xavier\u2019s production line:<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_7\" class=\"img-responsive popimg aligncenter\" title=\"Histogram for Xavier's production line. 
The vertical axis is labeled &quot;Probability&quot; and the horizontal axis is labeled &quot;X.&quot; The data in the histogram is the same as the data in the probability table for Xavier's line. Moving from left to right across the horizontal axis we see that a peak in probability is reached at X=1, but it is not much higher than X=0. In addition, going right from X=1, the values decay, to about 0.10 at X=4. The mean for Xavier's line is at X=1.8 .\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image_xavier_histo_mean.jpg\" alt=\"Histogram for Xavier's production line. The vertical axis is labeled &quot;Probability&quot; and the horizontal axis is labeled &quot;X.&quot; The data in the histogram is the same as the data in the probability table for Xavier's line. Moving from left to right across the horizontal axis we see that a peak in probability is reached at X=1, but it is not much higher than X=0. In addition, going right from X=1, the values decay, to about 0.10 at X=4. 
The mean for Xavier's line is at X=1.8 .\" width=\"374\" height=\"253\" \/><\/span><\/span>\r\n<p id=\"N10C92\">Would it be considered unusual to have 4 defective parts per hour?<\/p>\r\n<p id=\"N10C95\">We know that\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">X<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">8<\/span><\/span><\/span><\/span><\/span>\u00a0and\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03c3<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">X<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/span><\/span>.<\/p>\r\n<p id=\"N10CD2\">Ordinary values are within 2 standard deviations of the mean. 1.8 \u2013 2(1.21) = -.62 and 1.8 + 2(1.21) = 4.22. This gives us an interval from -.62 to 4.22. 
Since we cannot have a negative number of defective parts, the interval is essentially from 0 to 4.22. Because 4 is within this interval, it would be considered ordinary. Therefore, it is\u00a0<em>not unusual<\/em>.<\/p>\r\n<p id=\"N10CD8\">Would it be considered unusual to have no defective parts? Zero is within 2 standard deviations of the mean, so it would not be considered unusual to have no defective parts.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\nThe following activity will reinforce this idea.\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<p id=\"N10CE7\"  >Recall the probability distribution for changing majors.<\/p>\r\n<p id=\"N10CEC\"  >We have made the following calculations for the mean and standard deviation. For some extra practice, feel free to verify our calculations.<\/p>\r\n<p id=\"N10CEF\"  ><span id=\"MathJax-Element-7-Frame\" class=\"mjx-chtml MathJax_CHTML\"><span id=\"MJXc-Node-69\" class=\"mjx-math\"><span id=\"MJXc-Node-70\" class=\"mjx-mrow\"><span id=\"MJXc-Node-71\" class=\"mjx-msub\"><span class=\"mjx-base\"><span id=\"MJXc-Node-72\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span id=\"MJXc-Node-73\" class=\"mjx-mrow\"><span id=\"MJXc-Node-74\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">X<\/span><\/span><\/span><\/span><\/span><span id=\"MJXc-Node-75\" class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span id=\"MJXc-Node-76\" class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">1.23<\/span><\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-8-Frame\" class=\"mjx-chtml MathJax_CHTML\"><span id=\"MJXc-Node-77\" class=\"mjx-math\"><span id=\"MJXc-Node-78\" class=\"mjx-mrow\"><span id=\"MJXc-Node-79\" class=\"mjx-msub\"><span 
class=\"mjx-base\"><span id=\"MJXc-Node-80\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03c3<\/span><\/span><\/span><span class=\"mjx-sub\"><span id=\"MJXc-Node-81\" class=\"mjx-mrow\"><span id=\"MJXc-Node-82\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">X<\/span><\/span><\/span><\/span><\/span><span id=\"MJXc-Node-83\" class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span id=\"MJXc-Node-84\" class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">1.08<\/span><\/span><\/span><\/span><\/span><\/p>\r\n<p  >[h5p id=\"123\"]<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<p id=\"N10D3C\">\u201cRisk\u201d in investments provides a useful application for the concept of variability. If there is no variability at all in possible outcomes, then the outcome is something we can count on, with no risk involved. At the other extreme, if there is a large amount of variability with possibilities for either tremendous loss or gain, then the associated risk is quite high.<\/p>\r\n<p id=\"N10D3F\">If a variable\u2019s possible values just differ somewhat, with some only marginally favorable and others unfavorable, then the underlying random experiment entails just a moderate amount of risk. 
The following example demonstrates how differing values of standard deviation reflect the amount of risk in a situation.<\/p>\r\n\r\n<div class=\"examplewrap\">\r\n<div class=\"example clearfix\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<h3 class=\"textbox__title\">Example<\/h3>\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<h4>Comparing Investments<\/h4>\r\n<div>\r\n<p id=\"N10D46\">Consider three possible investments, with returns denoted as X, Y, and Z, respectively, and probability distributions outlined in the tables below.<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_8\" class=\"img-responsive popimg aligncenter\" title=\"A probability table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in column format (X: P(X=x)): 14,000: 1; In other words, X only has one value, 14,000, and P(X=14,000) = 1.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image043.gif\" alt=\"A probability table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in column format (X: P(X=x)): 14,000: 1; In other words, X only has one value, 14,000, and P(X=14,000) = 1.\" \/><\/span><\/span>\r\n<p id=\"N10D4F\">Investment X is what we\u2019d call a \u201csure thing,\u201d with a guaranteed return of $14,000: there is no risk involved at all.<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_9\" class=\"img-responsive popimg aligncenter\" title=\"A probability table with two rows, labeled &quot;Y&quot; and &quot;P(Y=y).&quot; The data in column format (Y: P(Y=y)): 0: .98; 1,000,000: .02; In other words, P(Y = 0) = .98 and P(Y = 1,000,000) = .02\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image044.gif\" alt=\"A probability table with two rows, labeled &quot;Y&quot; and 
&quot;P(Y=y).&quot; The data in column format (Y: P(Y=y)): 0: .98; 1,000,000: .02; In other words, P(Y = 0) = .98 and P(Y = 1,000,000) = .02\" \/><\/span><\/span>\r\n<p id=\"N10D58\">Investment Y is extremely risky, with a high probability (.98) of no gain at all, contrasted with a slight probability (.02) of \u201cmaking a killing\u201d with a return of a million dollars.<\/p>\r\n<span class=\"imagewrap\"><span class=\"image\"><img id=\"_i_10\" class=\"img-responsive popimg aligncenter\" title=\"A probability table with two rows, labeled &quot;Z&quot; and &quot;P(Z=z).&quot; The data in column format (Z: P(Z=z)): 10,000: .5; 20,000: .5; In other words, P(Z = 10,000) = .5 and P(Z = 20,000) = .5\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image045.gif\" alt=\"A probability table with two rows, labeled &quot;Z&quot; and &quot;P(Z=z).&quot; The data in column format (Z: P(Z=z)): 10,000: .5; 20,000: .5; In other words, P(Z = 10,000) = .5 and P(Z = 20,000) = .5\" \/><\/span><\/span>\r\n\r\nInvestment Z is somewhere in between: there is an equal chance of either a return that\u2019s on the low side or a return that\u2019s on the high side.\r\n<p id=\"N10D64\">If you only consider the mean return on each investment, would you prefer X, Y, or Z? 
The means for X, Y, and Z are calculated as follows:<\/p>\r\n<p id=\"N10D67\">[latex]\\mu _{X} = 14000(1)=14000[\/latex]<\/p>\r\n<p id=\"N10DA7\">[latex]\\mu _{Y} = 0(0.98)+1000000(0.02)=20000[\/latex]<\/p>\r\n<p id=\"N10E0E\">[latex]\\mu _{Z} = 10000(0.5)+20000(0.5)=15000[\/latex]<\/p>\r\n<p id=\"N10E75\">Clearly, the mean return for Y is highest, and so investment in Y would seem to be preferable.<\/p>\r\n<p id=\"N10E78\">Now consider the standard deviations, and consider which investment you\u2019d prefer\u2014X, Y, or Z.<\/p>\r\n<p id=\"N10E7B\">The variances and standard deviations are:<\/p>\r\n<p id=\"N10E7E\">[latex]\\sigma _{X}^{2}=(14000-14000)^{2}(1)=0[\/latex]<\/p>\r\n<p id=\"N10ED9\">[latex]\\sigma _{X}=0[\/latex]<\/p>\r\n<p id=\"N10EF2\">[latex]\\sigma _{Y}^{2}=(0-20000)^{2}(0.98)+(1000000-20000)^{2}(0.02)=1.96\\times 10^{10}[\/latex]<\/p>\r\n<p id=\"N10FB9\">[latex]\\sigma _{Y}=140000[\/latex]<\/p>\r\n<p id=\"N10FE4\">[latex]\\sigma _{Z}^{2}=(10000-15000)^{2}(0.5)+(20000-15000)^{2}(0.5)=25000000[\/latex]<\/p>\r\n<p id=\"N110A2\">[latex]\\sigma _{Z}=5000[\/latex]<\/p>\r\n<p id=\"N110C4\">Granted, the mean returns suggest that investment X is least profitable and investment Y is most profitable. On the other hand, the standard deviations tell us that the return for X is a sure thing; for Y, the remote chance of making a huge profit is offset by a high risk of losing the investment entirely; for Z, there is a modest amount of risk involved. If you can\u2019t afford to lose any money, then investment X would be the way to go. If you have enough assets to take a chance, then investment Y would be worthwhile. In particular, if a large company routinely makes many such investments, then in the long run there will occasionally be such enormous gains that the company is willing to absorb many smaller losses. 
Investment Z represents the middle ground, somewhere between the other two.<\/p>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div><\/dd>\r\n<\/dl>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>","rendered":"<div id=\"lobjh\" class=\"\">\n<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Learning Objective<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<ul>\n<li id=\"find_mean_variance_discrete_random\">Find the mean and variance of a discrete random variable, and apply these concepts to solve real-world problems.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<p>In the Exploratory Data Analysis (EDA) section, we displayed the distribution of one quantitative variable with a histogram, and supplemented it with numerical measures of center and spread. We are doing the same thing here. We display the probability distribution of a discrete random variable with a table, formula or histogram, and supplement it with numerical measures of the center and spread of the probability distribution. These measures are the\u00a0<em>mean and standard deviation of the random variable.<\/em><\/p>\n<\/div>\n<p id=\"N10B0E\">This section will be devoted to introducing these measures. As before, we\u2019ll start with the numerical measure of center, the mean. 
Let\u2019s begin by revisiting an example we saw in EDA.<\/p>\n<div id=\"N10B11\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">World Cup Soccer<\/span><\/h2>\n<p id=\"N10B18\">Recall that we used the following data from 3 World Cup tournaments (a total of 192 games) to introduce the idea of a\u00a0<em>weighted average<\/em>.<\/p>\n<p id=\"N10B1E\">We\u2019ve added a third column to our table that gives us relative frequencies.<\/p>\n<table class=\"grid\">\n<thead>\n<tr>\n<th>total # goals\/game<\/th>\n<th>frequency<\/th>\n<th>relative frequency<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td align=\"center\">0<\/td>\n<td align=\"center\">17<\/td>\n<td align=\"center\">17 \/ 192 = .089<\/td>\n<\/tr>\n<tr class=\"e\">\n<td align=\"center\">1<\/td>\n<td align=\"center\">45<\/td>\n<td align=\"center\">45 \/ 192 = .234<\/td>\n<\/tr>\n<tr>\n<td align=\"center\">2<\/td>\n<td align=\"center\">51<\/td>\n<td align=\"center\">51 \/ 192 = .266<\/td>\n<\/tr>\n<tr class=\"e\">\n<td align=\"center\">3<\/td>\n<td align=\"center\">37<\/td>\n<td align=\"center\">37 \/ 192 = .193<\/td>\n<\/tr>\n<tr>\n<td align=\"center\">4<\/td>\n<td align=\"center\">25<\/td>\n<td align=\"center\">25 \/ 192 = .130<\/td>\n<\/tr>\n<tr class=\"e\">\n<td align=\"center\">5<\/td>\n<td align=\"center\">11<\/td>\n<td align=\"center\">11 \/ 192 = .057<\/td>\n<\/tr>\n<tr>\n<td align=\"center\">6<\/td>\n<td align=\"center\">3<\/td>\n<td align=\"center\">3 \/ 192 = .016<\/td>\n<\/tr>\n<tr class=\"e\">\n<td align=\"center\">7<\/td>\n<td align=\"center\">2<\/td>\n<td align=\"center\">2 \/ 192 = .010<\/td>\n<\/tr>\n<tr>\n<td align=\"center\">8<\/td>\n<td align=\"center\">1<\/td>\n<td align=\"center\">1 \/ 192 = .005<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table class=\"wbtable plain\">\n<tbody>\n<tr class=\"e\">\n<td align=\"right\">the mean for this data<\/td>\n<td align=\"center\">=<\/td>\n<td 
align=\"left\">[latex]\\frac{0\\left(17\\right)+1\\left(45\\right)+2\\left(51\\right)+3\\left(37\\right)+4\\left(25\\right)+5\\left(11\\right)+6\\left(3\\right)+7\\left(2\\right)+8\\left(1\\right)}{192}[\/latex]<\/td>\n<\/tr>\n<tr>\n<td align=\"right\">distributing the division by 192 we get:<\/td>\n<td align=\"center\">=<\/td>\n<td align=\"left\">[latex]0\\left(\\frac{17}{192}\\right)+1\\left(\\frac{45}{192}\\right)+2\\left(\\frac{51}{192}\\right)+3\\left(\\frac{37}{192}\\right)+4\\left(\\frac{25}{192}\\right)+5\\left(\\frac{11}{192}\\right)+6\\left(\\frac{3}{192}\\right)+7\\left(\\frac{2}{192}\\right)+8\\left(\\frac{1}{192}\\right)[\/latex]<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p id=\"N10D61\">Notice that the mean is each number of goals per game multiplied by its relative frequency. Since we usually write the relative frequencies as decimals, we can see that:<\/p>\n<table class=\"wbtable plain\">\n<tbody>\n<tr class=\"e\">\n<td align=\"right\">mean number of goals per game<\/td>\n<td align=\"center\">=<\/td>\n<td align=\"left\">0(.089) + 1(.234) + 2(.266) + 3(.193) + 4(.130) + 5(.057) + 6(.016) + 7(.010) + 8(.005)<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td align=\"center\">=<\/td>\n<td align=\"left\">2.36, rounded to two decimal places<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<\/div>\n<div id=\"N10D86\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Mean of a Random Variable<\/span><\/h2>\n<p id=\"N10D8D\">In Exploratory Data Analysis, we used the mean of a sample of quantitative values\u2014their arithmetic average\u2014to tell the center of their distribution. We also saw how a weighted mean was used when we had a frequency table. These frequencies can be changed to relative frequencies. So we are essentially using the relative frequency approach to find probabilities. 
In the same way, we describe the center of a random variable\u2019s probability distribution by reporting its mean, a weighted average of its values: the more probable a value is, the more weight it gets. As always, it is important to distinguish between a concrete sample of observed values for a variable and an abstract population of all values taken by a random variable in the long run.<\/p>\n<p id=\"N10D90\">Whereas we denoted the mean of a sample as\u00a0[latex]\\bar{x}[\/latex], we now denote the mean of a random variable as\u00a0[latex]\\mu_{x}[\/latex]. Let\u2019s see how this is done by looking at a specific example.<\/p>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<h4>Xavier\u2019s Production Line<\/h4>\n<div>\n<p id=\"N10DB9\">Xavier\u2019s production line produces a variable number of defective parts in an hour, with probabilities shown in this table:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_0\" class=\"img-responsive popimg aligncenter\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in columns (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image012.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in columns (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10;\" \/><\/span><\/span><\/p>\n<p id=\"N10DC2\">How many defective parts are typically produced in an hour on Xavier&#8217;s production line? 
If we sum up the possible values of X, each weighted with its probability, we have<\/p>\n<p id=\"N10DC5\">[latex]\\mu_{x}=0(0.15)+1(0.30)+2(0.25)+3(0.20)+4(0.10)=1.8[\/latex]<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p id=\"N10E5E\">Here is the general definition of the mean of a discrete random variable:<\/p>\n<dl>\n<dt>mean of a discrete random variable<\/dt>\n<dd>\n<div class=\"meaning\">In general, for any discrete random variable X with probability distribution<span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_1\" class=\"img-responsive popimg\" title=\"white space\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image011.gif\" alt=\"white space\" \/><\/span><\/span><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_2\" class=\"img-responsive popimg\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x)&quot;. Here is the data in the table, given in column format (X: P(X=x)): x_1: p_1; x_2: p_2; x_3: p_3; ... x_n: p_n;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image020.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x)&quot;. Here is the data in the table, given in column format (X: P(X=x)): x_1: p_1; x_2: p_2; x_3: p_3; ... x_n: p_n;\" \/><\/span><\/span>the\u00a0<em>mean<\/em>\u00a0of X is defined to be<\/div>\n<div class=\"meaning\">[latex]\\mu_{x}=x_{1}p_{1}+x_{2}p_{2}+\\cdots+x_{n}p_{n}=\\sum_{i=1}^{n}x_{i}p_{i}[\/latex]<\/div>\n<\/dd>\n<\/dl>\n<p id=\"N10F32\">In general, the mean of a random variable tells us its &#8220;long-run&#8221; average value. It is sometimes referred to as the\u00a0<em>expected value<\/em>\u00a0of the random variable. 
But this expression may be somewhat misleading, because in many cases it is impossible for a random variable to actually equal its expected value. For example, the mean number of goals for a World Cup soccer game is 2.36. But we can never expect any single game to result in 2.36 goals, since it is not possible to score a fraction of a goal. Rather, 2.36 is the long-run average of all World Cup soccer games. In the case of Xavier&#8217;s production line, the mean number of defective parts produced in an hour is 1.8. But the actual number of defective parts produced in any given hour can never equal 1.8, since it must take whole number values.<\/p>\n<p id=\"N10F38\">To get a better feel for the mean of a random variable, let&#8217;s extend the defective parts example:<\/p>\n<\/div>\n<\/div>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<h4>Xavier\u2019s and Yves\u2019 Production Lines<\/h4>\n<div>\n<p id=\"N10F40\">Recall the probability distribution of the random variable X, representing the number of defective parts in an hour produced by Xavier\u2019s production line.<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_4\" class=\"img-responsive popimg aligncenter\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in columns (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image012.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in columns (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10;\" \/><\/span><\/span><\/p>\n<p id=\"N10F49\">The number of defective parts produced each hour by Yves\u2019 production line is a random variable Y with the following probability distribution:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_5\" class=\"img-responsive popimg aligncenter\" title=\"A probability distribution table with two rows, labeled &quot;Y&quot; and &quot;P(Y=y).&quot; The data in column format (Y: P(Y=y)): 0: .05; 1: .05; 2: .10; 3: .75; 4: .05;\" 
src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image022.gif\" alt=\"A probability distribution table with two rows, labeled &quot;Y&quot; and &quot;P(Y=y).&quot; The data in column format (Y: P(Y=y)): 0: .05; 1: .05; 2: .10; 3: .75; 4: .05;\" \/><\/span><\/span><\/p>\n<p id=\"N10F52\">Look at both probability distributions. Both X and Y take the same possible values (0, 1, 2, 3, 4).<\/p>\n<p id=\"N10F55\">However, they are very different in the way the probability is distributed among these values.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<div id=\"h5p-111\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-111\" class=\"h5p-iframe\" data-content-id=\"111\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"5.3 Learn by doing 1\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"N10075\">Here again is the probability distribution of Y, the number of defective parts in an hour in Yves&#8217; production line:<\/p>\n<div class=\"image shouldbeleft\"><img decoding=\"async\" id=\"N10077\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/dig003.gif\" alt=\"\" \/><\/div>\n<div>\n<div id=\"h5p-112\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-112\" class=\"h5p-iframe\" data-content-id=\"112\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"5.3 Did I get this? 
1\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"N10B10\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Applications of the Mean<\/span><\/h2>\n<p id=\"N10B17\">Means of random variables are useful for telling us about long-run gains in sales, or for insurance companies.<\/p>\n<p id=\"N10B1A\">Here are two examples:<\/p>\n<\/div>\n<\/div>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<h4>Pizza Delivery #1<\/h4>\n<div>\n<p id=\"N10B23\">Your favorite pizza place delivers only one kind of pizza, which is sold for $10, and costs the pizza place $6 to make. The pizza place has the following policy regarding delivery: if the pizza takes longer than half an hour to arrive, there is no charge. Let the random variable X be the pizza place\u2019s gain for any one pizza.<\/p>\n<p id=\"N10B26\">Experience has shown that delivery takes longer than half an hour only 10 percent of the time.<\/p>\n<p id=\"N10B29\">Find the mean gain per pizza, [latex]\\mu_{x}[\/latex].<\/p>\n<p id=\"N10B3C\">In order to find the mean of X, we first need to establish its probability distribution\u2014the possible values and their probabilities.<\/p>\n<p id=\"N10B3F\">The random variable X has two possible values: either the pizza costs them $6 to make and they sell it for $10, in which case X takes the value $10 \u2013 $6 = $4, or it costs them $6 to make and they give it away, in which case X takes the value $0 \u2013 $6 = -$6. The probability of the latter case is given to be 10 percent, or .1, so using complements, the former has probability .9. 
Here, then is the probability distribution of X:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" class=\"img-responsive popimg aligncenter\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; &quot;P(X=x).&quot; Here is the data in columns (X: P(X=x)): +4: .9; -6: .1; In other words, when pizza delivery is not longer than half an hour, X = +4, and P(X = +4) = .9 . When pizza delivery takes longer than half an hour, X=-6, and P(X = -6) = .1 .\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image024a.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; &quot;P(X=x).&quot; Here is the data in columns (X: P(X=x)): +4: .9; -6: .1; In other words, when pizza delivery is not longer than half an hour, X = +4, and P(X = +4) = .9 . When pizza delivery takes longer than half an hour, X=-6, and P(X = -6) = .1 .\" \/><\/span><\/span><\/p>\n<p id=\"N10B48\">Therefore, [latex]\\mu _{x}=(+4)(.9)+(-6)(.1)=+3[\/latex]<\/p>\n<p id=\"N10B9D\">In the long run, the pizza place gains an average of $3 per pizza delivered.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<h4>Pizza Delivery #2<\/h4>\n<div>\n<p id=\"N10BA5\">If the pizza place wants to increase its mean gain per pizza to $3.90, how much should it raise the price from $10? 
We need to replace the original price of $10 with an as-yet-to-be-determined new price N, resulting in this probability distribution table:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" class=\"img-responsive popimg\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; Here is the data in columns (X: P(X=x)): N-6: .9; -6: .1; In other words, when pizza delivery is not longer than half an hour, X = N-6, and P(X = N-6) = .9 . When pizza delivery takes longer than half an hour, X=-6, and P(X = -6) = .1 .\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image026a.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; Here is the data in columns (X: P(X=x)): N-6: .9; -6: .1; In other words, when pizza delivery is not longer than half an hour, X = N-6, and P(X = N-6) = .9 . 
When pizza delivery takes longer than half an hour, X=-6, and P(X = -6) = .1 .\" \/><\/span><\/span><\/p>\n<p id=\"N10BAE\">Next, setting [latex]\\mu_{x}[\/latex] equal to +3.90 instead of +3, we solve<\/p>\n<p id=\"N10BC1\">[latex]3.9=(N-6)(.9)+(-6)(.1)=.9N-6[\/latex] or<\/p>\n<p id=\"N10C1C\">[latex].9N=9.9[\/latex]<\/p>\n<p id=\"N10C38\">Therefore, the new price must be 11 dollars.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<div id=\"h5p-113\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-113\" class=\"h5p-iframe\" data-content-id=\"113\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"5.3 Learn by doing 2\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<h4>Raffle<\/h4>\n<div>\n<p id=\"N10CAD\">In order to raise money, a charity decides to raffle off some prizes. The charity sells 2,000 raffle tickets for $5 each. The prizes are:<\/p>\n<ul>\n<li>10 movie packages (two tickets plus popcorn) worth $25 each<\/li>\n<li>5 dinners for two worth $50 each<\/li>\n<li>2 smart phones worth $200 each<\/li>\n<li>1 flat-screen TV worth $1,500<\/li>\n<\/ul>\n<p id=\"N10CBF\">What is the expected gain or loss if you buy a single raffle ticket? 
The expected value can be written as E(X).<\/p>\n<p id=\"N10CC2\">There are 5 possible outcomes when you buy a ticket: win movie package, win dinner for two, win smart phone, win TV, win nothing.<\/p>\n<table class=\"wbtable alternating\" style=\"border-spacing: 0px; margin: auto;\">\n<thead>\n<tr>\n<th>prize<\/th>\n<th>net gain or loss<\/th>\n<th>probability<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>movie package<\/td>\n<td>25 &#8211; 5<\/td>\n<td>10 \/ 2000<\/td>\n<\/tr>\n<tr class=\"e\">\n<td>dinner for two<\/td>\n<td>50 &#8211; 5<\/td>\n<td>5 \/ 2000<\/td>\n<\/tr>\n<tr>\n<td>smart phone<\/td>\n<td>200 &#8211; 5<\/td>\n<td>2 \/ 2000<\/td>\n<\/tr>\n<tr class=\"e\">\n<td>TV<\/td>\n<td>1500 &#8211; 5<\/td>\n<td>1 \/ 2000<\/td>\n<\/tr>\n<tr>\n<td>nothing<\/td>\n<td>0 &#8211; 5<\/td>\n<td>(2000 &#8211; 10 &#8211; 5 &#8211; 2 &#8211; 1) \/ 2000<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p id=\"N10D00\">The previous information is summarized below in a probability distribution:<\/p>\n<div class=\"image shouldbeleft\"><img decoding=\"async\" class=\"img-responsive popimg aligncenter\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; Here is the data in column oriented format (X: P(X=x), comment): 20: 10\/2000 (movie package); 45: 5\/2000 (dinner for two); 195: 2\/2000 (smart phone); 1495: 1\/2000 (TV); -5: 1982\/2000 (Nothing);\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image_raffle.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; Here is the data in column oriented format (X: P(X=x), comment): 20: 10\/2000 (movie package); 45: 5\/2000 (dinner for two); 195: 2\/2000 (smart phone); 1495: 1\/2000 (TV); -5: 1982\/2000 (Nothing);\" \/><\/div>\n<p id=\"N10D08\">[latex]\\mu_{x}=E\\left ( X \\right )=20\\left ( \\frac{10}{2000} \\right )+45\\left ( \\frac{5}{2000} \\right )+195\\left ( \\frac{2}{2000} \\right )+1495\\left ( \\frac{1}{2000} \\right )+\\left ( -5 \\right )\\left ( \\frac{1982}{2000} \\right )[\/latex]<\/p>\n<p id=\"N10D93\">[latex]E\\left ( X \\right )=\\frac{-7600}{2000}=-3.80[\/latex]<\/p>\n<p id=\"N10DB6\">Since we got a negative number, we have an expected loss of $3.80 for each raffle ticket purchased. Recall that this is based upon a long-run average.<\/p>\n<p>Each raffle ticket has only 5 possible outcomes:<\/p>\n<ul>\n<li>\n<p id=\"N10DBF\">$20 net gain if you win the movie package<\/p>\n<\/li>\n<li>\n<p id=\"N10DC3\">$45 net gain if you win the dinner for two<\/p>\n<\/li>\n<li>\n<p id=\"N10DC7\">$195 net gain if you win the smart phone<\/p>\n<\/li>\n<li>\n<p id=\"N10DCB\">$1,495 net gain if you win the TV<\/p>\n<\/li>\n<li>\n<p id=\"N10DCF\">$5 net loss if you do not win a prize<\/p>\n<\/li>\n<\/ul>\n<p id=\"N10DD3\">It should not be surprising that you have an expected loss. After all, the charity&#8217;s goal is to raise money. If you have an expected loss of $3.80 per ticket, they will have an expected gain of $3.80 per ticket. Each ticket gives the charity +5 (it was -5 for you). The prizes are reversed, too. For example, the movie package is -25 + 5 for the charity (it was 25 &#8211; 5 for you).<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>Here is another example:<\/p>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<h4>Life Insurance #1<\/h4>\n<div>\n<p>Suppose you work for an insurance company, and you sell a $100,000 whole-life insurance policy at an annual premium of $1,200. (This means that the person who bought this policy pays $1,200 per year so that in the event that he or she dies, the policy beneficiaries will get $100,000.) 
Actuarial tables show that the probability of death during the next year for a person of your customer\u2019s age, sex, health, etc. is .005. Let the random variable X be the company\u2019s gain from such a policy.<\/p>\n<p>What is the expected or mean gain (amount of money made by the company) for a policy of this type?<\/p>\n<p id=\"N10B1D\">In other words, we need to find\u00a0[latex]\\mu _{x}[\/latex].<\/p>\n<p id=\"N10B30\">Since this is a whole-life policy, there are two possibilities here: either the customer dies this year (which we are told happens with probability .005), or the customer does not die this year (which, by the complement rule, happens with probability .995).<\/p>\n<p id=\"N10B33\">In both cases, the company gets the $1,200 premium. If the customer lives, the company just gains the $1,200, but if the customer dies, the company needs to pay $100,000 to the customer\u2019s beneficiaries. Therefore, here is the probability distribution of X:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" class=\"img-responsive popimg aligncenter\" title=\"A two row probability distribution table, in which the rows are labeled &quot;X&quot; and &quot;P(X=x)&quot;. Here is the data in column oriented format (X: P(X=x), comment): +1200: .995 (live); 1200-100,000: .005 (die);\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image029a.gif\" alt=\"A two row probability distribution table, in which the rows are labeled &quot;X&quot; and &quot;P(X=x)&quot;. 
Here is the data in column oriented format (X: P(X=x), comment): +1200: .995 (live); 1200-100,000: .005 (die);\" \/><\/span><\/span><\/p>\n<p>Their average, or expected, gain overall is<\/p>\n<p>[latex]\\mu _{x}[\/latex]\u00a0= 1200(.995) + (1200 \u2013 100,000)(.005) = 700 dollars.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<h4>Life Insurance #2<\/h4>\n<div>\n<p id=\"N10B57\">Suppose that five years have passed and your actuarial tables indicate that the probability of death during the next year for a person of your customer\u2019s current age has gone up to .0075. Obviously, this change in probability should be reflected in the annual premium (since it is slightly more risky for the insurance company to insure the customer).<\/p>\n<p id=\"N10B5A\">What should the annual premium be (instead of $1,200) if the company wants to keep the same expected gain?<\/p>\n<p id=\"N10B5D\">Now we substitute .0075 for .005, replace 1,200 with an unknown new premium N, and set the mean gain equal to 700, as it was before:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" class=\"img-responsive popimg\" title=\"A two-row probability distribution table, in which the rows are labeled &quot;X&quot; and &quot;P(X=x)&quot;. Here is the data in column oriented format (X: P(X=x), comment): N: .9925 (live); N-100,000: .0075 (die);\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image030a.gif\" alt=\"A two-row probability distribution table, in which the rows are labeled &quot;X&quot; and &quot;P(X=x)&quot;. 
Here is the data in column oriented format (X: P(X=x), comment): N: .9925 (live); N-100,000: .0075 (die);\" \/><\/span><\/span><\/p>\n<table class=\"grid\" style=\"height: 40px;\">\n<tbody>\n<tr class=\"e\" style=\"height: 15px;\">\n<td style=\"height: 15px; width: 212.297px;\" align=\"left\">We need to solve:<\/td>\n<td style=\"height: 15px; width: 44.3906px;\" align=\"right\">700<\/td>\n<td style=\"height: 15px; width: 18.5469px;\" align=\"center\">=<\/td>\n<td style=\"height: 15px; width: 332.078px;\" align=\"left\">(N)(.9925) + (N \u2013 100,000)(.0075)<\/td>\n<\/tr>\n<tr style=\"height: 15px;\">\n<td style=\"height: 15px; width: 212.297px;\" align=\"left\">Using some algebra:<\/td>\n<td style=\"height: 15px; width: 44.3906px;\" align=\"right\">700<\/td>\n<td style=\"height: 15px; width: 18.5469px;\" align=\"center\">=<\/td>\n<td style=\"height: 15px; width: 332.078px;\" align=\"left\">N \u2013 750<\/td>\n<\/tr>\n<tr class=\"e\" style=\"height: 10px;\">\n<td style=\"height: 10px; width: 212.297px;\" align=\"left\">Finally<\/td>\n<td style=\"height: 10px; width: 44.3906px;\" align=\"right\">N<\/td>\n<td style=\"height: 10px; width: 18.5469px;\" align=\"center\">=<\/td>\n<td style=\"height: 10px; width: 332.078px;\" align=\"left\">1450<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p id=\"N10B9C\">In order to keep the same expected gain of $700, the company should increase that customer\u2019s premium to $1,450.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>The purpose of this next activity is to give you guided practice in solving practical problems whose solution is based on the mean of random variables.<\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<div id=\"h5p-114\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-114\" class=\"h5p-iframe\" data-content-id=\"114\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" 
scrolling=\"no\" title=\"5.3 Learn by doing 4\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"N10B08\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Variance and Standard Deviation of a Discrete Random Variable<\/span><\/h2>\n<p id=\"N10B0F\">In Exploratory Data Analysis, we used the mean of a sample of quantitative values (their arithmetic average,\u00a0[latex]\\bar{x}[\/latex]) to tell the center of their distribution, and the standard deviation (s) to tell the typical distance of sample values from their mean. We described the center of a probability distribution for a random variable by reporting its mean\u00a0[latex]\\mu _{x}[\/latex], and now we would like to establish an accompanying measure of spread. Our measure of spread will still report the typical distance of values from their means, but in order to distinguish the spread of a population of all of a random variable\u2019s values from the spread (s) of sample values, we will denote the standard deviation of the random variable X with the Greek lower case \u201csigma,\u201d and use a subscript to remind us what is the variable of interest (there may be more than one in later problems):<\/p>\n<\/div>\n<\/div>\n<div id=\"N10B35\" class=\"section\">\n<div class=\"sectionContain\">\n<h2><span title=\"Quick scroll up\">Notation:\u00a0[latex]\\sigma _{X}[\/latex]<\/span><\/h2>\n<p id=\"N10B4C\">We will also focus more frequently than before on the squared standard deviation, called the\u00a0<em>variance<\/em>, because some important rules we need to invoke are in terms of variance [latex]\\sigma _{X}^{2}[\/latex] rather than standard deviation [latex]\\sigma _{X}[\/latex].<\/p>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<h4>Xavier\u2019s Production Line<\/h4>\n<div>\n<p 
id=\"N10B7C\">Recall that the number of defective parts produced each hour by Xavier&#8217;s production line is a random variable X with the following probability distribution:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" class=\"img-responsive popimg\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in column format (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image012.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in column format (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10;\" \/><\/span><\/span><\/p>\n<p id=\"N10B85\">We found the mean number of defective parts produced per hour to be\u00a0[latex]\\mu _{X}[\/latex]\u00a0= 1.8. There is variation about this mean: in some hours as few as 0 defective parts are produced, whereas in other hours as many as 4 are produced. Typically, how far does the number of defective parts fall from the mean of 1.8? As we did for the spread of sample values, we measure the spread of a random variable by calculating the square root of the average squared deviation from the mean. Here, &#8220;average&#8221; is a weighted average, in which more probable values of the random variable are given more weight. Let&#8217;s begin with the variance, or average squared deviation from the mean, and then take its square root to find the standard deviation:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" class=\"img-responsive popimg\" title=\"A table describing several characteristics of the possible values of X. For X=0, Dev. from mean = (0-1.8), Sq. deviation = (0-1.8)\u00b2, and P(X=0) = .15. For X=1, Dev. from mean = (1-1.8), Sq. deviation = (1-1.8)\u00b2, and P(X=1) = .30. 
For X=2, Dev. from mean = (2-1.8), Sq. deviation = (2-1.8)\u00b2, and P(X=2) = .25. For X=3, Dev. from mean = (3-1.8), Sq. deviation = (3-1.8)\u00b2, and P(X=3) = .20. For X=4, Dev. from mean = (4-1.8), Sq. deviation = (4-1.8)\u00b2, and P(X=4) = .10.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image033.gif\" alt=\"A table describing several characteristics of the possible values of X. For X=0, Dev. from mean = (0-1.8), Sq. deviation = (0-1.8)\u00b2, and P(X=0) = .15. For X=1, Dev. from mean = (1-1.8), Sq. deviation = (1-1.8)\u00b2, and P(X=1) = .30. For X=2, Dev. from mean = (2-1.8), Sq. deviation = (2-1.8)\u00b2, and P(X=2) = .25. For X=3, Dev. from mean = (3-1.8), Sq. deviation = (3-1.8)\u00b2, and P(X=3) = .20. For X=4, Dev. from mean = (4-1.8), Sq. deviation = (4-1.8)\u00b2, and P(X=4) = .10.\" \/><\/span><\/span><\/p>\n<p id=\"N10B9E\">Variance = [latex]\\sigma ^{2}_{X}=(0-1.8)^{2}(0.15)+(1-1.8)^{2}(0.30)+(2-1.8)^{2}(0.25)+(3-1.8)^{2}(0.20)+(4-1.8)^{2}(0.10)=1.46[\/latex]<\/p>\n<p id=\"N10CC8\">Standard deviation = [latex]\\sigma _{X}=\\sqrt{1.46}\\approx 1.21[\/latex]<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>How do we interpret the standard deviation of X?<\/p>\n<\/div>\n<\/div>\n<p id=\"N10D0D\">Xavier\u2019s production line produces an average of 1.80 defective parts per hour. 
The number of defective parts varies from hour to hour; typically (or, on average), it is about 1.21 away from 1.80.<\/p>\n<p id=\"N10D10\">Here is the formal definition:<\/p>\n<dl>\n<dt>standard deviation of a discrete random variable<\/dt>\n<dd>\n<div class=\"meaning\">For any discrete random variable X with a probability distribution of<\/div>\n<div><\/div>\n<div class=\"meaning\"><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_3\" class=\"img-responsive popimg aligncenter\" title=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x)&quot;. Here is the data in the table, given in column format (X: P(X=x)): x_1: p_1; x_2: p_2; x_3: p_3; ... x_n: p_n;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image020.gif\" alt=\"A probability distribution table with two rows, labeled &quot;X&quot; and &quot;P(X=x)&quot;. Here is the data in the table, given in column format (X: P(X=x)): x_1: p_1; x_2: p_2; x_3: p_3; ... 
x_n: p_n;\" \/><\/span><\/span><\/div>\n<div><\/div>\n<div><\/div>\n<p>the <em>variance<\/em>\u00a0of X is defined to be\u00a0[latex]\\sigma _{X}^{2}=(x_{1}-\\mu _{X})^{2}p_{1}+(x_{2}-\\mu _{X})^{2}p_{2}+...+(x_{n}-\\mu _{X})^{2}p_{n}=\\sum_{i=1}^{n}(x_{i}-\\mu _{X})^{2}p_{i}[\/latex]<\/p>\n<p>and the\u00a0<em>standard deviation<\/em>\u00a0is\u00a0[latex]\\sigma _{X}=\\sqrt{\\sigma ^{2}_{X}}[\/latex]<\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p>Here again is the probability distribution of Y, the number of defective parts produced in an hour by Yves&#8217; production line:<\/p>\n<div class=\"image shouldbeleft\"><img decoding=\"async\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/dig003.gif\" alt=\"\" \/><\/div>\n<div><\/div>\n<div>Review the following expressions to answer the question below:<\/div>\n<\/div>\n<div><img decoding=\"async\" id=\"N1007D\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/webcontent\/flash\/_u4_m3_digtutor5a_image1.gif\" alt=\"\" \/><\/div>\n<div><\/div>\n<div class=\"textbox__content\">\n<div>\n<div id=\"h5p-115\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-115\" class=\"h5p-iframe\" data-content-id=\"115\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"5.3 Did I get this? 
2\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>The purpose of the next activity is to give you better intuition about the mean and standard deviation of a random variable.<\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p>Keeping in mind that the mean describes where a histogram is centered, and the standard deviation describes spread by reporting the typical distance of values from their mean, compare the histograms in the four exercises here and match each to the correct combination of mean and standard deviation.<\/p>\n<p><img decoding=\"async\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/webcontent\/flash\/_u4_m3_lbdtutor4a_image1.jpg\" alt=\"\" \/><\/p>\n<div id=\"h5p-116\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-116\" class=\"h5p-iframe\" data-content-id=\"116\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"5.3 Learn by doing 5\"><\/iframe><\/div>\n<\/div>\n<p><img decoding=\"async\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/webcontent\/flash\/_u4_m3_lbdtutor4b_image1.jpg\" alt=\"\" \/><\/p>\n<div id=\"h5p-117\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-117\" class=\"h5p-iframe\" data-content-id=\"117\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"5.3 Learn by doing 6\"><\/iframe><\/div>\n<\/div>\n<p><img decoding=\"async\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/webcontent\/flash\/_u4_m3_lbdtutor4c_image1.jpg\" alt=\"\" \/><\/p>\n<div 
id=\"h5p-118\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-118\" class=\"h5p-iframe\" data-content-id=\"118\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"5.3 Learn by doing 7\"><\/iframe><\/div>\n<\/div>\n<p><img decoding=\"async\" class=\"img-responsive popimg aligncenter\" title=\"\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/webcontent\/flash\/_u4_m3_lbdtutor4d_image1.jpg\" alt=\"\" \/><\/p>\n<div id=\"h5p-119\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-119\" class=\"h5p-iframe\" data-content-id=\"119\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"5.3 Learn by doing 8\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<div>\n<p id=\"N10B20\">The concept of standard deviation is a bit harder to grasp than that of the mean. The purpose of the following examples and activities is to help you gain a better feel for the standard deviation of a random variable:<\/p>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<h4>Xavier\u2019s and Yves\u2019 Production Lines<\/h4>\n<div>\n<p id=\"N10B27\">Recall the probability distribution of the random variable X, representing the number of defective parts per hour produced by Xavier\u2019s production line, and the probability distribution of the random variable Y, representing the number of defective parts per hour produced by Yves\u2019 production line:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" class=\"img-responsive popimg aligncenter\" title=\"Two probability distribution tables. 
The first has two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in column format (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10; The second table also has two rows, labeled &quot;Y&quot; and &quot;P(Y=y).&quot; The data in column format (Y: P(Y=y)): 0: .05; 1: .05; 2: .10; 3: .75; 4: .05;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image039.gif\" alt=\"Two probability distribution tables. The first has two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in column format (X: P(X=x)): 0: .15; 1: .30; 2: .25; 3: .20; 4: .10; The second table also has two rows, labeled &quot;Y&quot; and &quot;P(Y=y).&quot; The data in column format (Y: P(Y=y)): 0: .05; 1: .05; 2: .10; 3: .75; 4: .05;\" \/><\/span><\/span><\/p>\n<p>Look carefully at both probability distributions. Both X and Y take the same possible values (0, 1, 2, 3, 4). However, they are very different in the way the probability is distributed among these values. 
We saw before that this makes a difference in means:<\/p>\n<p><span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">X<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">8<\/span><\/span><\/span><\/span><\/span><\/p>\n<p id=\"N10B52\"><span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">Y<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">7<\/span><\/span><\/span><\/span><\/span><\/p>\n<p id=\"N10B71\">We now want to get a sense about how the different probability distributions impact their standard deviations.<\/p>\n<p id=\"N10B74\">Recall that the standard deviation of a random variable can be interpreted as a typical (or the long-run average) distance between the value of X and its mean.<\/p>\n<div id=\"h5p-120\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-120\" class=\"h5p-iframe\" 
data-content-id=\"120\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"5.3 Learn by doing 9\"><\/iframe><\/div>\n<\/div>\n<p id=\"N10B99\">So, 75% of the time Y will assume a value (3) that is very close to its mean (2.7), while X will assume a value (2) that is close to its mean (1.8) much less often\u2014only 25% of the time. The long-run average, then, of the distance between the values of Y and their mean will be much smaller than the long-run average of the distance between the values of X and their mean.<\/p>\n<p>Therefore,\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03c3<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">Y<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">&lt;<\/span><\/span><span class=\"mjx-msub MJXc-space3\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03c3<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">X<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/span><\/span>\u00a0Actually,\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char 
MJXc-TeX-math-I\">\u03c3<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">Y<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">0<\/span><\/span><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">8<\/span><\/span><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">5<\/span><\/span><\/span><\/span><\/span>, so we can draw the following conclusion:<\/p>\n<p id=\"N10BEC\">Yves\u2019 production line produces an average of 2.70 defective parts per hour. The number of defective parts varies from hour to hour; typically (or, on average), it is about .85 away from 2.70.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<h2><span title=\"Quick scroll up\">Summary<\/span><\/h2>\n<p id=\"N10BF6\">Here are the histograms for the production lines:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" title=\"Histogram for Xavier's production line. The vertical axis is labeled &quot;Probability&quot; and the horizontal axis is labeled &quot;X.&quot; The data in the histogram is the same as the data in the probability table for Xavier's line. Moving from left to right across the horizontal axis we see that a peak in probability is reached at X=1, but it is not much higher than X=0. In addition, going right from X=1, the values decay, ultimately to 0.10 at X=4. The mean for Xavier's line is at X=1.8 .\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image_xavier_histo_mean.jpg\" alt=\"Histogram for Xavier's production line. 
The vertical axis is labeled &quot;Probability&quot; and the horizontal axis is labeled &quot;X.&quot; The data in the histogram is the same as the data in the probability table for Xavier's line. Moving from left to right across the horizontal axis we see that a peak in probability is reached at X=1, but it is not much higher than X=0. In addition, going right from X=1, the values decay, ultimately to 0.10 at X=4. The mean for Xavier's line is at X=1.8 .\" width=\"346\" height=\"233\" \/>\u00a0<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" title=\"For Yves's line is another histogram with the same axes. Going left to right, we see a peak at X=3, which is much higher than the other values. All of the other values are roughly the same. The mean is at X=2.7 .\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image_yves_histo_mean.jpg\" alt=\"For Yves's line is another histogram with the same axes. Going left to right, we see a peak at X=3, which is much higher than the other values. All of the other values are roughly the same. The mean is at X=2.7 .\" width=\"346\" height=\"233\" \/><\/p>\n<p id=\"N10C09\">When we compare distributions, the distribution in which it is\u00a0<em>more likely<\/em>\u00a0to find values that are further from the mean will have a\u00a0<em>larger<\/em>\u00a0standard deviation. 
Likewise, the distribution in which it is\u00a0<em>less likely<\/em>\u00a0to find values that are further from the mean will have the\u00a0<em>smaller<\/em>\u00a0standard deviation.<\/p>\n<div class=\"asx\">\n<div id=\"du3_m3_meanvariance5_tutor2\" class=\"activitywrap purpose learnbydoing flash\">\n<div class=\"activityhead\">\n<div class=\"purposeType purposelearnbydoing\" title=\"\">\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<div class=\"actContain\">\n<div class=\"activity flash\">\n<div id=\"u3_m3_meanvariance5_tutor2\" class=\"flash_obj asx testFlash mark_flash\">\n<div id=\"ou3_m3_meanvariance5_tutor2\" class=\"page 2963997\">\n<div id=\"2963997\" class=\"question ddfb\">\n<p id=\"N1007A\">\n<div id=\"h5p-121\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-121\" class=\"h5p-iframe\" data-content-id=\"121\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"5.3 Learn by doing 10\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p id=\"N10C37\">The following graphs will be used in the next &#8220;Did I Get This?&#8221; exercise.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" title=\"A histogram titled &quot;Graph A&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .07; 2: .10; 3: .12; 4: .13; 5: .16; 6: .13; 7: .12; 8: .10; 9: .07;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/u4_m3_meanvariance5_digt_imagea.jpg\" alt=\"A histogram titled &quot;Graph A&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .07; 2: .10; 3: .12; 4: .13; 5: .16; 6: .13; 7: .12; 8: .10; 9: .07;\" width=\"288\" height=\"195\" \/>\u00a0<img loading=\"lazy\" 
decoding=\"async\" title=\"A histogram titled &quot;Graph B&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .02; 2: .08; 3: .10; 4: .15; 5: .30; 6: .15; 7: .10; 8: .08; 9: .02;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/u4_m3_meanvariance5_digt_imageb.jpg\" alt=\"A histogram titled &quot;Graph B&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .02; 2: .08; 3: .10; 4: .15; 5: .30; 6: .15; 7: .10; 8: .08; 9: .02;\" width=\"288\" height=\"195\" \/>\u00a0<img loading=\"lazy\" decoding=\"async\" title=\"A histogram titled &quot;Graph C&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .01; 2: .01; 3: .10; 4: .18; 5: .40; 6: .18; 7: .10; 8: .01; 9: .01;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/u4_m3_meanvariance5_digt_imagec.jpg\" alt=\"A histogram titled &quot;Graph C&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .01; 2: .01; 3: .10; 4: .18; 5: .40; 6: .18; 7: .10; 8: .01; 9: .01;\" width=\"288\" height=\"195\" \/>\u00a0<img loading=\"lazy\" decoding=\"async\" id=\"_i_6\" title=\"A histogram titled &quot;Graph D&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .01; 2: .01; 3: .02; 4: .11; 5: .70; 6: .11; 7: .02; 8: .01; 9: .01;\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/u4_m3_meanvariance5_digt_imaged.jpg\" alt=\"A histogram titled &quot;Graph D&quot; showing the following data (presented in &quot;horizontal value: vertical value format&quot;): 1: .01; 2: .01; 3: .02; 4: .11; 5: .70; 6: .11; 7: .02; 8: .01; 9: 
.01;\" width=\"288\" height=\"195\" \/><\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<div id=\"h5p-122\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-122\" class=\"h5p-iframe\" data-content-id=\"122\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"5.3 Did I get this? 3\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"N10C79\" class=\"section\">\n<div class=\"sectionContain\">\n<h3>Comment<\/h3>\n<p id=\"N10C7F\">As we have stated before, using the mean and standard deviation gives us another way to assess which values of a random variable are unusual. Any values of a random variable that fall within 2 standard deviations of the mean would be considered ordinary (not unusual).<\/p>\n<\/div>\n<\/div>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<h4>Xavier\u2019s Production Line\u2014Unusual or Not?<\/h4>\n<div>\n<p id=\"N10C87\">Looking once again at the probability distribution for Xavier\u2019s production line:<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img loading=\"lazy\" decoding=\"async\" id=\"_i_7\" class=\"img-responsive popimg aligncenter\" title=\"Histogram for Xavier's production line. The vertical axis is labeled &quot;Probability&quot; and the horizontal axis is labeled &quot;X.&quot; The data in the histogram is the same as the data in the probability table for Xavier's line. Moving from left to right across the horizontal axis we see that a peak in probability is reached at X=1, but it is not much higher than X=0. In addition, going right from X=1, the values decay, to about 0.10 at X=4. 
The mean for Xavier's line is at X=1.8 .\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image_xavier_histo_mean.jpg\" alt=\"Histogram for Xavier's production line. The vertical axis is labeled &quot;Probability&quot; and the horizontal axis is labeled &quot;X.&quot; The data in the histogram is the same as the data in the probability table for Xavier's line. Moving from left to right across the horizontal axis we see that a peak in probability is reached at X=1, but it is not much higher than X=0. In addition, going right from X=1, the values decay, to about 0.10 at X=4. The mean for Xavier's line is at X=1.8 .\" width=\"374\" height=\"253\" \/><\/span><\/span><\/p>\n<p id=\"N10C92\">Would it be considered unusual to have 4 defective parts per hour?<\/p>\n<p id=\"N10C95\">We know that\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">X<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">8<\/span><\/span><\/span><\/span><\/span>\u00a0and\u00a0<span class=\"mjx-chtml MathJax_CHTML\"><span class=\"mjx-math\"><span class=\"mjx-mrow\"><span class=\"mjx-msub\"><span class=\"mjx-base\"><span class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03c3<\/span><\/span><\/span><span class=\"mjx-sub\"><span class=\"mjx-mi\"><span class=\"mjx-char 
MJXc-TeX-math-I\">X<\/span><\/span><\/span><\/span><span class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><span class=\"mjx-mo\"><span class=\"mjx-char MJXc-TeX-main-R\">.<\/span><\/span><span class=\"mjx-mn MJXc-space1\"><span class=\"mjx-char MJXc-TeX-main-R\">2<\/span><\/span><span class=\"mjx-mn\"><span class=\"mjx-char MJXc-TeX-main-R\">1<\/span><\/span><\/span><\/span><\/span>.<\/p>\n<p id=\"N10CD2\">Ordinary values are within 2 standard deviations of the mean. 1.8 \u2013 2(1.21) = -.62 and 1.8 + 2(1.21) = 4.22. This gives us an interval from -.62 to 4.22. Since we cannot have a negative number of defective parts, the interval is essentially from 0 to 4.22. Because 4 is within this interval, it would be considered ordinary. Therefore, it is\u00a0<em>not unusual<\/em>.<\/p>\n<p id=\"N10CD8\">Would it be considered unusual to have no defective parts? Zero is within 2 standard deviations of the mean, so it would not be considered unusual to have no defective parts.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>The following activity will reinforce this idea.<\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Did I get this?<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<p id=\"N10CE7\">Recall the probability distribution for changing majors.<\/p>\n<p id=\"N10CEC\">We have made the following calculations for the mean and standard deviation. 
For some extra practice, feel free to verify our calculations.<\/p>\n<p id=\"N10CEF\"><span id=\"MathJax-Element-7-Frame\" class=\"mjx-chtml MathJax_CHTML\"><span id=\"MJXc-Node-69\" class=\"mjx-math\"><span id=\"MJXc-Node-70\" class=\"mjx-mrow\"><span id=\"MJXc-Node-71\" class=\"mjx-msub\"><span class=\"mjx-base\"><span id=\"MJXc-Node-72\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03bc<\/span><\/span><\/span><span class=\"mjx-sub\"><span id=\"MJXc-Node-73\" class=\"mjx-mrow\"><span id=\"MJXc-Node-74\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">X<\/span><\/span><\/span><\/span><\/span><span id=\"MJXc-Node-75\" class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span id=\"MJXc-Node-76\" class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">1.23<\/span><\/span><\/span><\/span><\/span>\u00a0and\u00a0<span id=\"MathJax-Element-8-Frame\" class=\"mjx-chtml MathJax_CHTML\"><span id=\"MJXc-Node-77\" class=\"mjx-math\"><span id=\"MJXc-Node-78\" class=\"mjx-mrow\"><span id=\"MJXc-Node-79\" class=\"mjx-msub\"><span class=\"mjx-base\"><span id=\"MJXc-Node-80\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">\u03c3<\/span><\/span><\/span><span class=\"mjx-sub\"><span id=\"MJXc-Node-81\" class=\"mjx-mrow\"><span id=\"MJXc-Node-82\" class=\"mjx-mi\"><span class=\"mjx-char MJXc-TeX-math-I\">X<\/span><\/span><\/span><\/span><\/span><span id=\"MJXc-Node-83\" class=\"mjx-mo MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">=<\/span><\/span><span id=\"MJXc-Node-84\" class=\"mjx-mn MJXc-space3\"><span class=\"mjx-char MJXc-TeX-main-R\">1.08<\/span><\/span><\/span><\/span><\/span><\/p>\n<div id=\"h5p-123\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-123\" class=\"h5p-iframe\" data-content-id=\"123\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"5.3 Learn by doing 11\"><\/iframe><\/div>\n<\/div>\n<\/div>\n<\/div>\n<p 
id=\"N10D3C\">\u201cRisk\u201d in investments provides a useful application for the concept of variability. If there is no variability at all in possible outcomes, then the outcome is something we can count on, with no risk involved. At the other extreme, if there is a large amount of variability with possibilities for either tremendous loss or gain, then the associated risk is quite high.<\/p>\n<p id=\"N10D3F\">If a variable\u2019s possible values just differ somewhat, with some only marginally favorable and others unfavorable, then the underlying random experiment entails just a moderate amount of risk. The following example demonstrates how differing values of standard deviation reflect the amount of risk in a situation.<\/p>\n<div class=\"examplewrap\">\n<div class=\"example clearfix\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<h3 class=\"textbox__title\">Example<\/h3>\n<\/header>\n<div class=\"textbox__content\">\n<h4>Comparing Investments<\/h4>\n<div>\n<p id=\"N10D46\">Consider three possible investments, with returns denoted as X, Y, and Z, respectively, and probability distributions outlined in the tables below.<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_8\" class=\"img-responsive popimg aligncenter\" title=\"A probability table with has two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in column format (X: P(X=x)): 14,000: 1; In other words, X only has one value, 14,000, and P(X=14,000) = 1.\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image043.gif\" alt=\"A probability table with has two rows, labeled &quot;X&quot; and &quot;P(X=x).&quot; The data in column format (X: P(X=x)): 14,000: 1; In other words, X only has one value, 14,000, and P(X=14,000) = 1.\" \/><\/span><\/span><\/p>\n<p id=\"N10D4F\">Investment X is what we\u2019d call a \u201csure thing,\u201d with 
a guaranteed return of $14,000: there is no risk involved at all.<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_9\" class=\"img-responsive popimg aligncenter\" title=\"A probability table with two rows, labeled &quot;Y&quot; and &quot;P(Y=y).&quot; The data in column format (Y: P(Y=y)): 0: .98; 1,000,000: .02; In other words, P(Y = 0) = .98 and P(Y = 1,000,000) = .02\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image044.gif\" alt=\"A probability table with two rows, labeled &quot;Y&quot; and &quot;P(Y=y).&quot; The data in column format (Y: P(Y=y)): 0: .98; 1,000,000: .02; In other words, P(Y = 0) = .98 and P(Y = 1,000,000) = .02\" \/><\/span><\/span><\/p>\n<p id=\"N10D58\">Investment Y is extremely risky, with a high probability (.98) of no gain at all, contrasted by a slight probability (.02) of \u201cmaking a killing\u201d with a return of a million dollars.<\/p>\n<p><span class=\"imagewrap\"><span class=\"image\"><img decoding=\"async\" id=\"_i_10\" class=\"img-responsive popimg aligncenter\" title=\"A probability table with two rows, labeled &quot;Z&quot; and &quot;P(Z=z).&quot; The data in column format (Y: P(Z=z)): 10,000: .5; 20,000: .5; In other words, P(Z = 10,000) = .5 and P(Z = 20,000) = .5\" src=\"https:\/\/oli.cmu.edu\/repository\/webcontent\/72712ec00a0001dc418a87e73e8ebb77\/_u4_probability\/_m3_random_variables\/webcontent\/image045.gif\" alt=\"A probability table with two rows, labeled &quot;Z&quot; and &quot;P(Z=z).&quot; The data in column format (Y: P(Z=z)): 10,000: .5; 20,000: .5; In other words, P(Z = 10,000) = .5 and P(Z = 20,000) = .5\" \/><\/span><\/span><\/p>\n<p>Investment Z is somewhere in between: there is an equal chance for either a return that\u2019s on the low side or a return that\u2019s on the high side.<\/p>\n<p id=\"N10D64\">If you only consider the mean return on each investment, would you 
prefer X, Y, or Z? The means for X, Y, and Z are calculated as follows:<\/p>\n<p id=\"N10D67\">[latex]\\mu _{X} = 14000(1)=14000[\/latex]<\/p>\n<p id=\"N10DA7\">[latex]\\mu _{Y} = 0(0.98)+1000000(0.02)=20000[\/latex]<\/p>\n<p id=\"N10E0E\">[latex]\\mu _{Z} = 10000(0.5)+20000(0.5)=15000[\/latex]<\/p>\n<p id=\"N10E75\">Clearly, the mean return for Y is highest, and so investment in Y would seem to be preferable.<\/p>\n<p id=\"N10E78\">Now consider the standard deviations, and decide which investment you\u2019d prefer: X, Y, or Z.<\/p>\n<p id=\"N10E7B\">The variances and standard deviations are:<\/p>\n<p id=\"N10E7E\">[latex]\\sigma _{X}^{2}=(14000-14000)^{2}(1)=0[\/latex]<\/p>\n<p id=\"N10ED9\">[latex]\\sigma _{X}=0[\/latex]<\/p>\n<p id=\"N10EF2\">[latex]\\sigma _{Y}^{2}=(0-20000)^{2}(0.98)+(1000000-20000)^{2}(0.02)=1.96\\times 10^{10}[\/latex]<\/p>\n<p id=\"N10FB9\">[latex]\\sigma _{Y}=140,000[\/latex]<\/p>\n<p id=\"N10FE4\">[latex]\\sigma _{Z}^{2}=(10000-15000)^{2}(0.5)+(20000-15000)^{2}(0.5)=25,000,000[\/latex]<\/p>\n<p id=\"N110A2\">[latex]\\sigma _{Z}=5000[\/latex]<\/p>\n<p id=\"N110C4\">Granted, the mean returns suggest that investment X is least profitable and investment Y is most profitable. On the other hand, the standard deviations are telling us that the return for X is a sure thing; for Y, the remote chance of making a huge profit is offset by a high risk of losing the investment entirely; for Z, there is a modest amount of risk involved. If you can\u2019t afford to lose any money, then investment X would be the way to go. If you have enough assets to take a chance, then investment Y would be worthwhile. In particular, if a large company routinely makes many such investments, then in the long run there will occasionally be such enormous gains that the company is willing to absorb many smaller losses. 
Investment Z represents the middle ground, somewhere between the other two.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/dd>\n<\/dl>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"author":150,"menu_order":8,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[48],"contributor":[],"license":[],"class_list":["post-510","chapter","type-chapter","status-publish","hentry","chapter-type-numberless"],"part":419,"_links":{"self":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters\/510","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/users\/150"}],"version-history":[{"count":36,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters\/510\/revisions"}],"predecessor-version":[{"id":866,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters\/510\/revisions\/866"}],"part":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/parts\/419"}],"metadata":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapters\/510\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/media?parent=510"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/pressbooks\/v2\/chapter-type?post=510"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\
/contributor?post=510"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.ccconline.org\/mat1260\/wp-json\/wp\/v2\/license?post=510"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}