<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta content="text/html; charset=UTF-8" http-equiv="content-type" />
<meta http-equiv="Content-Language" content="en" />
<meta name="generator" content="Pressbooks 5.15.0" />
<meta name="pb-authors" content="Kelli McCarthy" />
<meta name="pb-editors" content="" />
<meta name="pb-translators" content="" />
<meta name="pb-reviewers" content="" />
<meta name="pb-illustrators" content="" />
<meta name="pb-contributors" content="" />
<meta name="pb-title" content="Introduction to Statistics" />
<meta name="pb-language" content="en" />
<meta name="pb-cover-image" content="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/APR0775-Edit-copy-2.jpg" />
<meta name="pb-copyright-year" content="2022" />
<meta name="pb-primary-subject" content="PB" />
<meta name="pb-publisher" content="St. Clair College AA&amp;T" />
<meta name="pb-publisher-city" content="Windsor, ON" />
<meta name="pb-publication-date" content="1656633600" />
<meta name="pb-copyright-holder" content="St. Clair College" />
<meta name="pb-book-license" content="cc-by-nc-sa" />
<meta name="pb-custom-copyright" content="College Statistics is an adaptation of Introductory Statistics by Barbara Illowsky and Susan Dean, which is licensed under a Creative Commons Attribution 4.0 International (CC BY) License. The new and revised material in this adaptation is copyrighted 2022 by the adapting author Mike LePine and is released under a Creative Commons License (CC BY-NC-SA). This adaptation has seen substantial reordering and reformatting of the original texts, minor wording adjustments, the addition of new content, replacement of images, and deletions." />
<meta name="pb-is-based-on" content="https://ecampusontario.pressbooks.pub/sccstatistics" />
<meta name="pb-additional-subjects" content="PBT" />
<title>Introduction to Statistics</title>
</head>
<body lang='en' >
<div id="half-title-page"><h1 class="title">Introduction to Statistics</h1></div>
<div id="title-page"><h1 class="title">Introduction to Statistics</h1><h2 class="subtitle"></h2><h3 class="author">Kelli McCarthy</h3><h3 class="author"></h3><h4 class="publisher">St. Clair College AA&amp;T</h4><h5 class="publisher-city">Windsor, ON</h5></div>
<div id="copyright-page"><div class="ugc">
<div class="license-attribution"><p><img src="https://pressbooks.ccconline.org/accintrostats/wp-content/themes/pressbooks-book/packages/buckram/assets/images/cc-by-nc-sa.svg" alt="Icon for the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License" /></p><p>Introduction to Statistics by St. Clair College is licensed under a <a rel="license" href="https://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>, except where otherwise noted.</p></div>
<p>College Statistics is an adaptation of <a href="https://opentextbc.ca/introstatopenstax/">Introductory Statistics</a> by Barbara Illowsky and Susan Dean, which is licensed under a <a href="https://creativecommons.org/licenses/by/4.0/" rel="license">Creative Commons Attribution 4.0 International (CC BY) License</a>.</p><p>The new and revised material in this adaptation is copyrighted 2022 by the adapting author Mike LePine and is released under a <a href="https://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons License (CC BY-NC-SA)</a>.</p><p>This adaptation has seen substantial reordering and reformatting of the original texts, minor wording adjustments, the addition of new content, replacement of images, and deletions.</p></div></div>
<div id="toc"><h1>Contents</h1><ul><li class="front-matter miscellaneous"><a href="#front-matter-acknowledgments"><span class="toc-chapter-title">Acknowledgments</span></a></li><li class="front-matter introduction"><a href="#front-matter-introduction"><span class="toc-chapter-title">Introduction</span></a></li><li class="front-matter miscellaneous post-introduction"><a href="#front-matter-preface"><span class="toc-chapter-title">Preface</span></a></li><li class="part"><a href="#part-sampling-and-data">Chapter 1: Sampling and Data</a></li><li class="chapter standard"><a href="#chapter-introduction"><span class="toc-chapter-title">Chapter 1.1: Introduction</span></a></li><li class="chapter standard"><a href="#chapter-definitions-of-statistics-probability-and-key-terms"><span class="toc-chapter-title">Chapter 1.2: Definitions of Statistics, Probability, and Key Terms</span></a></li><li class="chapter standard"><a href="#chapter-data-sampling-and-variation-in-data-and-sampling"><span class="toc-chapter-title">Chapter 1.3: Data, Sampling, and Variation in Data and Sampling</span></a></li><li class="chapter standard"><a href="#chapter-experimental-design-and-ethics"><span class="toc-chapter-title">Chapter 1.4: Experimental Design and Ethics</span></a></li><li class="chapter standard"><a href="#chapter-data-collection-experiment"><span class="toc-chapter-title">Activity 1.5: Data Collection Experiment</span></a></li><li class="chapter standard"><a href="#chapter-sampling-experiment"><span class="toc-chapter-title">Activity 1.6: Sampling Experiment</span></a></li><li class="part"><a href="#part-descriptive-statistics">Chapter 2: Descriptive Statistics</a></li><li class="chapter standard"><a href="#chapter-introduction-14"><span class="toc-chapter-title">Chapter 2.1: Introduction</span></a></li><li class="chapter standard"><a href="#chapter-frequency-frequency-tables-and-levels-of-measurement"><span class="toc-chapter-title">Chapter 2.2: Frequency, Frequency Tables, and 
Levels of Measurement</span></a></li><li class="chapter standard"><a href="#chapter-stem-and-leaf-graphs-stemplots-line-graphs-and-bar-graphs"><span class="toc-chapter-title">Chapter 2.3: Bar Graphs, Histograms, and Stem-and-Leaf Graphs (Stemplots)</span></a></li><li class="chapter standard"><a href="#chapter-measures-of-the-center-of-the-data"><span class="toc-chapter-title">Chapter 2.4: Measures of the Center of the Data</span></a></li><li class="chapter standard"><a href="#chapter-skewness-and-the-mean-median-and-mode"><span class="toc-chapter-title">Chapter 2.5: Skewness and the Mean, Median, and Mode</span></a></li><li class="chapter standard"><a href="#chapter-measures-of-the-spread-of-the-data"><span class="toc-chapter-title">Chapter 2.6: Measures of the Spread of the Data</span></a></li><li class="chapter standard"><a href="#chapter-measures-of-the-location-of-the-data"><span class="toc-chapter-title">Chapter 2.7: Measures of Position</span></a></li><li class="chapter standard"><a href="#chapter-box-plots"><span class="toc-chapter-title">Chapter 2.8: Box Plots</span></a></li><li class="chapter standard"><a href="#chapter-descriptive-statistics"><span class="toc-chapter-title">Activity 2.9: Descriptive Statistics</span></a></li><li class="part"><a href="#part-linear-regression-and-correlation">Chapter 3: Linear Regression and Correlation</a></li><li class="chapter standard"><a href="#chapter-introduction-24"><span class="toc-chapter-title">Chapter 3.1: Introduction</span></a></li><li class="chapter standard"><a href="#chapter-linear-equations"><span class="toc-chapter-title">Chapter 3.2: Linear Equations</span></a></li><li class="chapter standard"><a href="#chapter-scatter-plots"><span class="toc-chapter-title">Chapter 3.3: Scatter Plots</span></a></li><li class="chapter standard"><a href="#chapter-the-regression-equation"><span class="toc-chapter-title">Chapter 3.4: The Regression Equation</span></a></li><li class="chapter standard"><a 
href="#chapter-prediction"><span class="toc-chapter-title">Chapter 3.5: Prediction</span></a></li><li class="chapter standard"><a href="#chapter-regression-distance-from-school"><span class="toc-chapter-title">Activity 3.6: Regression (Distance from School)</span></a></li><li class="chapter standard"><a href="#chapter-regression-textbook-cost"><span class="toc-chapter-title">Activity 3.7: Regression (Textbook Cost)</span></a></li><li class="chapter standard"><a href="#chapter-regression-fuel-efficiency"><span class="toc-chapter-title">Activity 3.8: Regression (Fuel Efficiency)</span></a></li><li class="part"><a href="#part-probability-topics">Chapter 4: Probability Topics</a></li><li class="chapter standard"><a href="#chapter-introduction-15"><span class="toc-chapter-title">Chapter 4.1: Introduction</span></a></li><li class="chapter standard"><a href="#chapter-terminology"><span class="toc-chapter-title">Chapter 4.2: Terminology</span></a></li><li class="chapter standard"><a href="#chapter-independent-and-mutually-exclusive-events"><span class="toc-chapter-title">Chapter 4.3: Independent and Mutually Exclusive Events</span></a></li><li class="chapter standard"><a href="#chapter-two-basic-rules-of-probability"><span class="toc-chapter-title">Chapter 4.4: Two Basic Rules of Probability</span></a></li><li class="chapter standard"><a href="#chapter-contingency-tables"><span class="toc-chapter-title">Chapter 4.5: Contingency Tables</span></a></li><li class="chapter standard"><a href="#chapter-tree-and-venn-diagrams"><span class="toc-chapter-title">Chapter 4.6: Tree and Venn Diagrams</span></a></li><li class="chapter standard"><a href="#chapter-probability-topics"><span class="toc-chapter-title">Activity 4.7: Probability Topics</span></a></li><li class="part"><a href="#part-discrete-random-variables">Chapter 5: Discrete Random Variables</a></li><li class="chapter standard"><a href="#chapter-introduction-16"><span class="toc-chapter-title">Chapter 5.1: 
Introduction</span></a></li><li class="chapter standard"><a href="#chapter-probability-distribution-function-pdf-for-a-discrete-random-variable"><span class="toc-chapter-title">Chapter 5.2: Probability Distribution Function (PDF) for a Discrete Random Variable</span></a></li><li class="chapter standard"><a href="#chapter-mean-or-expected-value-and-standard-deviation"><span class="toc-chapter-title">Chapter 5.3: Mean or Expected Value and Standard Deviation</span></a></li><li class="chapter standard"><a href="#chapter-binomial-distribution"><span class="toc-chapter-title">Chapter 5.4: Binomial Distribution</span></a></li><li class="chapter standard"><a href="#chapter-discrete-distribution-playing-card-experiment"><span class="toc-chapter-title">Activity 5.5: Discrete Distribution (Playing Card Experiment)</span></a></li><li class="chapter standard"><a href="#chapter-discrete-distribution-lucky-dice-experiment"><span class="toc-chapter-title">Activity 5.6: Discrete Distribution (Lucky Dice Experiment)</span></a></li><li class="part"><a href="#part-continuous-random-variables">Chapter 6: Continuous Random Variables</a></li><li class="chapter standard"><a href="#chapter-introduction-17"><span class="toc-chapter-title">Chapter 6.1: Introduction</span></a></li><li class="chapter standard"><a href="#chapter-continuous-probability-functions"><span class="toc-chapter-title">Chapter 6.2: Continuous Probability Functions</span></a></li><li class="chapter standard"><a href="#chapter-the-uniform-distribution"><span class="toc-chapter-title">Chapter 6.3: The Uniform Distribution</span></a></li><li class="chapter standard"><a href="#chapter-continuous-distribution"><span class="toc-chapter-title">Activity 6.4: Continuous Distribution</span></a></li><li class="part"><a href="#part-the-normal-distribution">Chapter 7: The Normal Distribution</a></li><li class="chapter standard"><a href="#chapter-introduction-18"><span class="toc-chapter-title">Chapter 7.1: 
Introduction</span></a></li><li class="chapter standard"><a href="#chapter-the-standard-normal-distribution"><span class="toc-chapter-title">Chapter 7.2: The Standard Normal Distribution</span></a></li><li class="chapter standard"><a href="#chapter-using-the-normal-distribution"><span class="toc-chapter-title">Chapter 7.3: Using the Normal Distribution</span></a></li><li class="chapter standard"><a href="#chapter-normal-distribution-lap-times"><span class="toc-chapter-title">Activity 7.4: Normal Distribution (Lap Times)</span></a></li><li class="chapter standard"><a href="#chapter-normal-distribution-pinkie-length"><span class="toc-chapter-title">Activity 7.5: Normal Distribution (Pinkie Length)</span></a></li><li class="part"><a href="#part-the-central-limit-theorem">Chapter 8: The Central Limit Theorem</a></li><li class="chapter standard"><a href="#chapter-introduction-19"><span class="toc-chapter-title">Chapter 8.1: Introduction</span></a></li><li class="chapter standard"><a href="#chapter-the-central-limit-theorem-for-sample-means-averages"><span class="toc-chapter-title">Chapter 8.2: The Central Limit Theorem for Sample Means (Averages)</span></a></li><li class="chapter standard"><a href="#chapter-central-limit-theorem-pocket-change"><span class="toc-chapter-title">Activity 8.3: Central Limit Theorem (Pocket Change)</span></a></li><li class="chapter standard"><a href="#chapter-central-limit-theorem-cookie-recipes"><span class="toc-chapter-title">Activity 8.4: Central Limit Theorem (Cookie Recipes)</span></a></li><li class="part"><a href="#part-confidence-intervals">Chapter 9: Confidence Intervals</a></li><li class="chapter standard"><a href="#chapter-introduction-20"><span class="toc-chapter-title">Chapter 9.1: Introduction</span></a></li><li class="chapter standard"><a href="#chapter-a-population-proportion"><span class="toc-chapter-title">Chapter 9.2: A Population Proportion</span></a></li><li class="chapter standard"><a 
href="#chapter-a-single-population-mean-using-the-student-t-distribution"><span class="toc-chapter-title">Chapter 9.3: A Single Population Mean using the Student t Distribution</span></a></li><li class="chapter standard"><a href="#chapter-confidence-interval-place-of-birth"><span class="toc-chapter-title">Activity 9.4: Confidence Interval (Place of Birth)</span></a></li><li class="chapter standard"><a href="#chapter-confidence-interval-home-costs"><span class="toc-chapter-title">Activity 9.5: Confidence Interval (Home Costs)</span></a></li><li class="part"><a href="#part-hypothesis-testing-with-one-sample">Chapter 10: Hypothesis Testing with One Sample</a></li><li class="chapter standard"><a href="#chapter-introduction-21"><span class="toc-chapter-title">Chapter 10.1: Introduction</span></a></li><li class="chapter standard"><a href="#chapter-null-and-alternative-hypotheses"><span class="toc-chapter-title">Chapter 10.2: Null and Alternative Hypotheses</span></a></li><li class="chapter standard"><a href="#chapter-outcomes-and-the-type-i-and-type-ii-errors"><span class="toc-chapter-title">Chapter 10.3: Outcomes and the Type I and Type II Errors</span></a></li><li class="chapter standard"><a href="#chapter-distribution-needed-for-hypothesis-testing"><span class="toc-chapter-title">Chapter 10.4: Distribution Needed for Hypothesis Testing</span></a></li><li class="chapter standard"><a href="#chapter-rare-events-the-sample-decision-and-conclusion"><span class="toc-chapter-title">Chapter 10.5: Rare Events, the Sample, Decision and Conclusion</span></a></li><li class="chapter standard"><a href="#chapter-additional-information-and-full-hypothesis-test-examples"><span class="toc-chapter-title">Chapter 10.6: Additional Information and Full Hypothesis Test Examples</span></a></li><li class="chapter standard"><a href="#chapter-hypothesis-testing-of-a-single-mean-and-single-proportion"><span class="toc-chapter-title">Activity 10.7: Hypothesis Testing of a Single Mean and Single 
Proportion</span></a></li><li class="part"><a href="#part-hypothesis-testing-with-two-samples">Chapter 11: Hypothesis Testing with Two Samples</a></li><li class="chapter standard"><a href="#chapter-introduction-22"><span class="toc-chapter-title">Chapter 11.1: Introduction</span></a></li><li class="chapter standard"><a href="#chapter-comparing-two-independent-population-proportions"><span class="toc-chapter-title">Chapter 11.2: Comparing Two Independent Population Proportions</span></a></li><li class="chapter standard"><a href="#chapter-matched-or-paired-samples"><span class="toc-chapter-title">Chapter 11.3: Matched or Paired Samples</span></a></li><li class="chapter standard"><a href="#chapter-two-population-means-with-unknown-standard-deviations"><span class="toc-chapter-title">Chapter 11.4: Two Population Means with Unknown Standard Deviations</span></a></li><li class="chapter standard"><a href="#chapter-hypothesis-testing-for-two-means-and-two-proportions"><span class="toc-chapter-title">Activity 11.5: Hypothesis Testing for Two Means and Two Proportions</span></a></li><li class="back-matter appendix"><a href="#back-matter-appendix"><span class="toc-chapter-title">Appendix</span></a></li><li class="back-matter miscellaneous"><a href="#back-matter-group-and-partner-projects"><span class="toc-chapter-title">Group and Partner Projects</span></a></li><li class="back-matter miscellaneous"><a href="#back-matter-data-sets"><span class="toc-chapter-title">Data Sets</span></a></li><li class="back-matter miscellaneous"><a href="#back-matter-solution-sheets"><span class="toc-chapter-title">Solution Sheets</span></a></li><li class="back-matter miscellaneous"><a href="#back-matter-mathematical-phrases-symbols-and-formulas"><span class="toc-chapter-title">Mathematical Phrases, Symbols, and Formulas</span></a></li><li class="back-matter miscellaneous"><a href="#back-matter-notes-for-the-ti-83-83-84-84-calculators"><span class="toc-chapter-title">Notes for the TI-83, 83+, 84, 
84+ Calculators</span></a></li><li class="back-matter miscellaneous"><a href="#back-matter-tables"><span class="toc-chapter-title">Tables</span></a></li></ul></div>
<div class="front-matter miscellaneous" id="front-matter-acknowledgments" title="Acknowledgments"><div class="front-matter-title-wrap"><h3 class="front-matter-number">1</h3><h1 class="front-matter-title">Acknowledgments</h1></div><div class="ugc front-matter-ugc"><p>This material was adapted from College Statistics by St. Clair College, which is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.</p> <p>&nbsp;</p> <p>&nbsp;</p> <p><a href="http://creativecommons.org/licenses/by-nc/4.0/" rel="license"><img style="border-width: 0;" src="https://i.creativecommons.org/l/by-nc/4.0/88x31.png" alt="Creative Commons License" /></a><br /> This work is licensed under a <a href="http://creativecommons.org/licenses/by-nc/4.0/" rel="license">Creative Commons Attribution-NonCommercial 4.0 International License</a>.</p> </div></div>
<div class="front-matter introduction" id="front-matter-introduction" title="Introduction"><div class="front-matter-title-wrap"><h3 class="front-matter-number">2</h3><h1 class="front-matter-title"><span class="display-none">Introduction</span></h1></div><div class="ugc front-matter-ugc"><p>Welcome to MAT 1260 Introduction to Statistics!</p> <p>Statistics is a wonderful course, full of concepts that you can take with you down any career path. You will learn skills to present data in a meaningful way and techniques to test conjectures about populations.</p> </div></div>
<div class="front-matter miscellaneous post-introduction" id="front-matter-preface" title="Preface"><div class="front-matter-title-wrap"><h3 class="front-matter-number">3</h3><h1 class="front-matter-title"><span class="display-none">Preface</span></h1></div><div class="ugc front-matter-ugc"><div class="textbox textbox--learning-objectives"><h3>Learning Objectives</h3> <p>Introductory Statistics is intended for the one-semester introduction to statistics course for students who are not mathematics or engineering majors. It focuses on the interpretation of statistical results, especially in real-world settings, and assumes that students have an understanding of intermediate algebra. In addition to end-of-section practice and homework sets, examples of each topic are explained step-by-step throughout the text and followed by a Try It problem that is designed as extra practice for students. This book also includes collaborative exercises and statistics labs designed to give students the opportunity to work together and explore key concepts. To support today’s student in understanding technology, this book features TI 83, 83+, 84, or 84+ calculator instructions at strategic points throughout. While the book has been built so that each chapter builds on the previous, it can be rearranged to accommodate any instructor’s particular needs.</p> </div> <p id="eip-109">Welcome to <em data-effect="italics">Introductory Statistics</em>, an OpenStax resource. This textbook was written to increase student access to high-quality learning materials, maintaining the highest standards of academic rigor at little to no cost.</p> <p id="eip-970">The foundation of this textbook is <em data-effect="italics">Collaborative Statistics</em>, by Barbara Illowsky and Susan Dean. 
Additional topics, examples, and innovations in terminology and practical applications have been added, all with a goal of increasing relevance and accessibility for students.</p> <div id="eip-10" class="bc-section section" data-depth="1"><h3 data-type="title">About OpenStax</h3> <p id="eip-479">OpenStax is a nonprofit based at Rice University, and it’s our mission to improve student access to education. Our first openly licensed college textbook was published in 2012, and our library has since scaled to over 25 books for college and AP<sup>®</sup> courses used by hundreds of thousands of students. OpenStax Tutor, our low-cost personalized learning tool, is being used in college courses throughout the country. Through our partnerships with philanthropic foundations and our alliance with other educational resource organizations, OpenStax is breaking down the most common barriers to learning and empowering students and instructors to succeed.</p> </div> <div id="eip-242" class="bc-section section" data-depth="1"><h3 data-type="title">About OpenStax&#8217;s resources</h3> <div id="eip-id1172001427154" class="bc-section section" data-depth="2"><p id="eip-507"><span style="font-family: 'Cormorant Garamond', serif;font-size: 1.42425em;font-style: italic">Errata</span></p> </div> <div id="eip-621" class="bc-section section" data-depth="2"><p id="eip-555">All OpenStax textbooks undergo a rigorous review process. However, like any professional-grade textbook, errors sometimes occur. Since our books are web based, we can make updates periodically when deemed pedagogically necessary. If you have a correction to suggest, submit it through the link on your book page on OpenStax.org. Subject matter experts review all errata suggestions. 
OpenStax is committed to remaining transparent about all updates, so you will also find a list of past errata changes on your book page on OpenStax.org.</p> </div> <div id="eip-107" class="bc-section section" data-depth="2"><h4 data-type="title">Format</h4> <p id="eip-911">You can access this textbook for free in web view or PDF through Pressbooks.</p> </div> </div> <div id="fs-idm526890688" class="bc-section section" data-depth="1"><h3 data-type="title">About <em data-effect="italics">Introductory Statistics</em></h3> <p id="fs-idm498143472"><em data-effect="italics">Introductory Statistics</em> follows scope and sequence requirements of a one-semester introduction to statistics course and is geared toward students majoring in fields other than math or engineering. The text assumes some knowledge of intermediate algebra and focuses on statistics application over theory. <em data-effect="italics">Introductory Statistics</em> includes innovative practical applications that make the text relevant and accessible, as well as collaborative exercises, technology integration problems, and statistics labs.</p> <div id="eip-663" class="bc-section section" data-depth="2"><h4 data-type="title">Coverage and scope</h4> <p id="eip-890"><span style="text-align: initial;font-size: 1em">Chapter 1 Sampling and Data</span></p> <div id="eip-876" class="bc-section section" data-depth="2"><p id="eip-23">Chapter 2 Descriptive Statistics<span data-type="newline"><br /> </span> Chapter 3 Linear Regression and Correlation<span data-type="newline"><br /> </span> Chapter 4 Probability Topics<span data-type="newline"><br /> </span> Chapter 5 Discrete Random Variables<span data-type="newline"><br /> </span> Chapter 6 Continuous Random Variables<span data-type="newline"><br /> </span> Chapter 7 The Normal Distribution<span data-type="newline"><br /> </span> Chapter 8 The Central Limit Theorem<span data-type="newline"><br /> </span> Chapter 9 Confidence Intervals<span data-type="newline"><br /> 
</span> Chapter 10 Hypothesis Testing with One Sample<span data-type="newline"><br /> </span> Chapter 11 Hypothesis Testing with Two Samples<span data-type="newline"><br /> </span></p> </div> <div id="eip-930" class="bc-section section" data-depth="2"><h4 data-type="title">Pedagogical foundation and features</h4> <ul id="eip-303"><li><strong>Examples</strong> are placed strategically throughout the text to show students the step-by-step process of interpreting and solving statistical problems. To keep the text relevant for students, the examples are drawn from a broad spectrum of practical topics, including examples about college life and learning, health and medicine, retail and business, and sports and entertainment.</li> <li><strong>Try It</strong> practice problems immediately follow many examples and give students the opportunity to practice as they read the text. <strong>They are usually based on practical and familiar topics, like the Examples themselves</strong>.</li> <li><strong>Collaborative Exercises</strong> provide an in-class scenario for students to work together to explore presented concepts.</li> <li><strong>Using the TI-83, 83+, 84, 84+ Calculator</strong> shows students step-by-step instructions to input problems into their calculator.</li> <li><strong>The Technology Icon</strong> indicates where the use of a TI calculator or computer software is recommended.</li> <li><strong>Practice, Homework, and Bringing It Together</strong> problems give the students problems at various degrees of difficulty while also including real-world scenarios to engage students.</li> </ul> </div> <div id="eip-600" class="bc-section section" data-depth="2"><h4 data-type="title">Statistics labs</h4> <p id="eip-48">These innovative activities were developed by Barbara Illowsky and Susan Dean in order to offer students the experience of designing, implementing, and interpreting statistical analyses. 
They are drawn from actual experiments and data-gathering processes and offer a unique hands-on and collaborative experience. The labs provide a foundation for further learning and classroom interaction that will produce a meaningful application of statistics.</p> <p id="eip-id1166298040766">Statistics Labs appear at the end of each chapter and begin with student learning outcomes, general estimates for time on task, and any global implementation notes. Students are then provided with step-by-step guidance, including sample data tables and calculation prompts. The detailed assistance will help the students successfully apply the concepts in the text and lay the groundwork for future collaborative or individual work.</p> </div> </div> <div id="eip-382" class="bc-section section" data-depth="1"><div id="eip-id8911316" class="bc-section section" data-depth="2"><p>&nbsp;</p> <p id="eip-id1168173254982"><span style="font-family: 'Cormorant Garamond', serif;font-size: 1.42425em;font-style: italic">Partner resources</span></p> </div> <div id="eip-833" class="bc-section section" data-depth="2"><p id="eip-837">OpenStax Partners are our allies in the mission to make high-quality learning materials affordable and accessible to students and instructors everywhere. Their tools integrate seamlessly with our OpenStax titles at a low cost. 
To access the partner resources for your text, visit your book page on OpenStax.org.</p> </div> </div> <div id="eip-441" class="bc-section section" data-depth="1"><h3 data-type="title">About the authors</h3> <div id="eip-id1165226216199" class="sr-contrib-auth" data-depth="2"><h4 data-type="title">Senior contributing authors</h4> <p id="eip-157"><strong>Barbara Illowsky, De Anza College</strong><span data-type="newline"><br /> </span> <strong>Susan Dean, De Anza College</strong></p> </div> <div id="eip-id1165231659584" class="contrib-auth" data-depth="2"><h4 data-type="title">Contributing authors</h4> <p id="eip-14">Birgit Aquilonius, West Valley College<span data-type="newline"><br /> </span> Charles Ashbacher, Upper Iowa University, Cedar Rapids<span data-type="newline"><br /> </span> Abraham Biggs, Broward Community College<span data-type="newline"><br /> </span> Daniel Birmajer, Nazareth College<span data-type="newline"><br /> </span> Roberta Bloom, De Anza College<span data-type="newline"><br /> </span> Bryan Blount, Kentucky Wesleyan College<span data-type="newline"><br /> </span> Ernest Bonat, Portland Community College<span data-type="newline"><br /> </span> Sarah Boslaugh, Kennesaw State University<span data-type="newline"><br /> </span> David Bosworth, Hutchinson Community College<span data-type="newline"><br /> </span> Sheri Boyd, Rollins College<span data-type="newline"><br /> </span> George Bratton, University of Central Arkansas<span data-type="newline"><br /> </span> Jing Chang, College of Saint Mary <span data-type="newline"><br /> </span> Laurel Chiappetta, University of Pittsburgh<span data-type="newline"><br /> </span> Lenore Desilets, De Anza College<span data-type="newline"><br /> </span> Matthew Einsohn, Prescott College<span data-type="newline"><br /> </span> Ann Flanigan, Kapiolani Community College<span data-type="newline"><br /> </span> David French, Tidewater Community College<span data-type="newline"><br /> </span> Mo Geraghty, De Anza 
College<span data-type="newline"><br /> </span> Larry Green, Lake Tahoe Community College<span data-type="newline"><br /> </span> Michael Greenwich, College of Southern Nevada<span data-type="newline"><br /> </span> Inna Grushko, De Anza College<span data-type="newline"><br /> </span> Valier Hauber, De Anza College<span data-type="newline"><br /> </span> Janice Hector, De Anza College<span data-type="newline"><br /> </span> Jim Helmreich, Marist College<span data-type="newline"><br /> </span> Robert Henderson, Stephen F. Austin State University<span data-type="newline"><br /> </span> Mel Jacobsen, Snow College<span data-type="newline"><br /> </span> Mary Jo Kane, De Anza College<span data-type="newline"><br /> </span> Lynette Kenyon, Collin County Community College<span data-type="newline"><br /> </span> Charles Klein, De Anza College<span data-type="newline"><br /> </span> Alexander Kolovos<span data-type="newline"><br /> </span> Sheldon Lee, Viterbo University<span data-type="newline"><br /> </span> Sara Lenhart, Christopher Newport University<span data-type="newline"><br /> </span> Wendy Lightheart, Lane Community College<span data-type="newline"><br /> </span> Vladimir Logvenenko, De Anza College<span data-type="newline"><br /> </span> Jim Lucas, De Anza College<span data-type="newline"><br /> </span> Lisa Markus, De Anza College<span data-type="newline"><br /> </span> Miriam Masullo, SUNY Purchase<span data-type="newline"><br /> </span> Diane Mathios, De Anza College<span data-type="newline"><br /> </span> Robert McDevitt, Germanna Community College<span data-type="newline"><br /> </span> Mark Mills, Central College<span data-type="newline"><br /> </span> Cindy Moss, Skyline College<span data-type="newline"><br /> </span> Nydia Nelson, St. 
Petersburg College<span data-type="newline"><br /> </span> Benjamin Ngwudike, Jackson State University<span data-type="newline"><br /> </span> Jonathan Oaks, Macomb Community College<span data-type="newline"><br /> </span> Carol Olmstead, De Anza College<span data-type="newline"><br /> </span> Adam Pennell, Greensboro College<span data-type="newline"><br /> </span> Kathy Plum, De Anza College<span data-type="newline"><br /> </span> Lisa Rosenberg, Elon University<span data-type="newline"><br /> </span> Sudipta Roy, Kankakee Community College<span data-type="newline"><br /> </span> Javier Rueda, De Anza College<span data-type="newline"><br /> </span> Yvonne Sandoval, Pima Community College<span data-type="newline"><br /> </span> Rupinder Sekhon, De Anza College<span data-type="newline"><br /> </span> Travis Short, St. Petersburg College<span data-type="newline"><br /> </span> Frank Snow, De Anza College<span data-type="newline"><br /> </span> Abdulhamid Sukar, Cameron University<span data-type="newline"><br /> </span> Jeffery Taub, Maine Maritime Academy<span data-type="newline"><br /> </span> Mary Teegarden, San Diego Mesa College<span data-type="newline"><br /> </span> John Thomas, College of Lake County<span data-type="newline"><br /> </span> Philip J. Verrecchia, York College of Pennsylvania<span data-type="newline"><br /> </span> Dennis Walsh, Middle Tennessee State University<span data-type="newline"><br /> </span> Cheryl Wartman, University of Prince Edward Island<span data-type="newline"><br /> </span> Carol Weideman, St. Petersburg College<span data-type="newline"><br /> </span> Andrew Wiesner, Pennsylvania State University</p> <p>Kelli McCarthy, Arapahoe Community College</p> </div> </div> </div> </div></div>
<div class="part " id="part-sampling-and-data"><div class="part-title-wrap"><h3 class="part-number">I</h3><h1 class="part-title">Chapter 1: Sampling and Data</h1></div><div class="ugc part-ugc"></div></div>
<div class="chapter standard" id="chapter-introduction" title="Chapter 1.1: Introduction"><div class="chapter-title-wrap"><h3 class="chapter-number">1</h3><h2 class="chapter-title"><span class="display-none">Chapter 1.1: Introduction</span></h2></div><div class="ugc chapter-ugc"><p>[latexpage]</p> <div class="textbox textbox--learning-objectives"><h3>Learning Objectives</h3> <p>Introduction</p> </div> <div id="fig-ch01_00_01" class="splash"><div class="bc-figcaption figcaption">We encounter statistics in our daily lives more often than we probably realize and from many different sources, like the news. (credit: David Sim)</div> <p><span id="fs-id2354221" data-type="media" data-alt="This photo shows a large open news room with enough space to seat about 200 employees."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C01_COs-1-scaled-1.jpg" alt="This photo shows a large open news room with enough space to seat about 200 employees." width="600" data-media-type="image/jpg" /></span></p> </div> <div id="fs-idp22683264" class="chapter-objectives" data-type="note" data-has-label="true" data-label=""><div data-type="title">Chapter Objectives</div> <p id="fs-idm17092352">By the end of this chapter, the student should be able to:</p> <ul id="objectives-list"><li>Recognize and differentiate between key terms.</li> <li>Apply various types of sampling methods to data collection.</li> <li>Create and interpret frequency tables.</li> </ul> </div> <p>You are probably asking yourself the question, &#8220;When and where will I use statistics?&#8221; If you read any newspaper, watch television, or use the Internet, you will see statistical information. There are statistics about crime, sports, education, politics, and real estate. Typically, when you read a newspaper article or watch a television news program, you are given sample information. 
With this information, you may make a decision about the correctness of a statement, claim, or &#8220;fact.&#8221; Statistical methods can help you make the &#8220;best educated guess.&#8221;</p> <p id="eip-1000">Since you will undoubtedly be given statistical information at some point in your life, you need to know some techniques for analyzing the information thoughtfully. Think about buying a house or managing a budget. Think about your chosen profession. The fields of economics, business, psychology, education, biology, law, computer science, police science, and early childhood development require at least one course in statistics.</p> <p id="eip-829">Included in this chapter are the basic ideas and words of probability and statistics. You will soon understand that statistics and probability work together. You will also learn how data are gathered and how &#8220;good&#8221; data can be distinguished from &#8220;bad.&#8221;</p> </div></div>
<div class="chapter standard" id="chapter-definitions-of-statistics-probability-and-key-terms" title="Chapter 1.2: Definitions of Statistics, Probability, and Key Terms"><div class="chapter-title-wrap"><h3 class="chapter-number">2</h3><h2 class="chapter-title"><span class="display-none">Chapter 1.2: Definitions of Statistics, Probability, and Key Terms</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="id8112751">The science of <span data-type="term">statistics</span> deals with the collection, analysis, interpretation, and presentation of <span data-type="term">data</span>. We see and use data in our everyday lives.</p> <div id="fs-idm27544368" class="statistics collab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Exercise</div> <p id="eip-34">In your classroom, try this exercise. Have class members write down the average time (in hours, to the nearest half-hour) they sleep per night. Your instructor will record the data. Then create a simple graph (called a <strong>dot plot</strong>) of the data. A dot plot consists of a number line and dots (or points) positioned above the number line. 
For example, consider the following data:</p> <p id="eip-idm80945824"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">5 </span><span data-type="item">5.5 </span><span data-type="item">6 </span><span data-type="item">6 </span><span data-type="item">6 </span><span data-type="item">6.5 </span><span data-type="item">6.5 </span><span data-type="item">6.5 </span><span data-type="item">6.5 </span><span data-type="item">7 </span><span data-type="item">7 </span><span data-type="item">8 </span><span data-type="item">8 </span><span data-type="item">9</span></span></p> <p id="eip-71">The dot plot for this data would be as follows:</p> <div id="eip-idp25549888" class="bc-figure figure"><span id="id44761709a" data-type="media" data-alt="This is a dot plot showing average hours of sleep. The number line is marked in intervals of 1 from 5 to 9. Dots above the line show 1 person reporting 5 hours, 1 with 5.5, 3 with 6, 4 with 6.5, 2 with 7, 2 with 8, and 1 with 9 hours." data-longdesc="m16020_DotPlot_description.html"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch01_02_01n-1.png" alt="This is a dot plot showing average hours of sleep. The number line is marked in intervals of 1 from 5 to 9. Dots above the line show 1 person reporting 5 hours, 1 with 5.5, 3 with 6, 4 with 6.5, 2 with 7, 2 with 8, and 1 with 9 hours." width="380" longdesc="m16020_DotPlot_description.html" data-media-type="image/png" data-longdesc="m16020_DotPlot_description.html" /></span></div> <p id="id10326643">Does your dot plot look the same as or different from the example? Why? If you did the same example in an English class with the same number of students, do you think the results would be the same? Why or why not?</p> <p id="id9336363">Where do your data appear to cluster? How might you interpret the clustering?</p> <p id="id9969457">The questions above ask you to analyze and interpret your data. 
With this example, you have begun your study of statistics.</p> </div> <p id="eip-624">In this course, you will learn how to organize and summarize data. Organizing and summarizing data is called <span data-type="term">descriptive statistics</span>. Two ways to summarize data are by graphing and by using numbers (for example, finding an average). After you have studied probability and probability distributions, you will use formal methods for drawing conclusions from &#8220;good&#8221; data. The formal methods are called <span data-type="term">inferential statistics</span>. Statistical inference uses probability to determine how confident we can be that our conclusions are correct.</p> <p>Effective interpretation of data (inference) is based on good procedures for producing data and thoughtful examination of the data. You will encounter what will seem to be too many mathematical formulas for interpreting data. The goal of statistics is not to perform numerous calculations using the formulas, but to gain an understanding of your data. The calculations can be done using a calculator or a computer. The understanding must come from you. If you can thoroughly grasp the basics of statistics, you can be more confident in the decisions you make in life.</p> <div id="eip-177" class="bc-section section" data-depth="1"><h3 data-type="title">Probability</h3> <p id="fs-idm37641920"><span data-type="term">Probability</span> is a mathematical tool used to study randomness. It deals with the chance (the likelihood) of an event occurring. For example, if you toss a <strong>fair</strong> coin four times, the outcomes may not be two heads and two tails. However, if you toss the same coin 4,000 times, the outcomes will be close to half heads and half tails. The expected theoretical probability of heads in any one toss is [latex]\frac{1}{2}[/latex] or 0.5. Even though the outcomes of a few repetitions are uncertain, there is a regular pattern of outcomes when there are many repetitions. 
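</p> <p>As a rough sketch of why a few tosses tell us little (the term &#8220;relative frequency&#8221; is our own label here, not one the text has introduced so far), compare the observed fraction of heads with the theoretical value: [latex]\text{relative frequency of heads} = \frac{\text{number of heads}}{\text{number of tosses}}[/latex]. With four tosses this fraction can only be 0, 0.25, 0.5, 0.75, or 1, so a result well away from 0.5 is unremarkable. With thousands of tosses the fraction can take many values close to 0.5, and it tends to land near that value.</p> <p>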
After reading about the English statistician Karl <span data-type="term">Pearson</span> who tossed a coin 24,000 times with a result of 12,012 heads, one of the authors tossed a coin 2,000 times. The results were 996 heads. The fraction [latex]\frac{996}{2000}[/latex] is equal to 0.498 which is very close to 0.5, the expected probability.</p> <p id="fs-idm51841168">The theory of probability began with the study of games of chance such as poker. Predictions take the form of probabilities. To predict the likelihood of an earthquake, of rain, or whether you will get an A in this course, we use probabilities. Doctors use probability to determine the chance of a vaccination causing the disease the vaccination is supposed to prevent. A stockbroker uses probability to determine the rate of return on a client&#8217;s investments. You might use probability to decide to buy a lottery ticket or not. In your study of statistics, you will use the power of mathematics through probability calculations to analyze and interpret your data.</p> </div> <div class="bc-section section" data-depth="1"><h3 data-type="title">Key Terms</h3> <p id="fs-idm63797184">In statistics, we generally want to study a <span data-type="term">population</span>. You can think of a population as a collection of persons, things, or objects under study. To study the population, we select a <span data-type="term">sample</span>. The idea of <span data-type="term">sampling</span> is to select a portion (or subset) of the larger population and study that portion (the sample) to gain information about the population. Data are the result of sampling from a population.</p> <p id="fs-idp7610128">Because it takes a lot of time and money to examine an entire population, sampling is a very practical technique. If you wished to compute the overall grade point average at your school, it would make sense to select a sample of students who attend the school. 
The data collected from the sample would be the students&#8217; grade point averages. In presidential elections, opinion poll samples of 1,000–2,000 people are taken. The opinion poll is supposed to represent the views of the people in the entire country. Manufacturers of canned carbonated drinks take samples to determine if a 16 ounce can contains 16 ounces of carbonated drink.</p> <p id="fs-idm52897344">From the sample data, we can calculate a statistic. A <span data-type="term">statistic</span> is a number that represents a property of the sample. For example, if we consider one math class to be a sample of the population of all math classes, then the average number of points earned by students in that one math class at the end of the term is an example of a statistic. The statistic is an estimate of a population parameter. A <span data-type="term">parameter</span> is a numerical characteristic of the whole population that can be estimated by a statistic. Since we considered all math classes to be the population, then the average number of points earned per student over all the math classes is an example of a parameter.</p> <p id="fs-idm69156368">One of the main concerns in the field of statistics is how accurately a statistic estimates a parameter. The accuracy really depends on how well the sample represents the population. The sample must contain the characteristics of the population in order to be a <span data-type="term">representative sample</span>. We are interested in both the sample statistic and the population parameter in inferential statistics. In a later chapter, we will use the sample statistic to test the validity of the established population parameter.</p> <p id="fs-idm62450928">A <span data-type="term">variable</span>, usually notated by capital letters such as <em data-effect="italics">X</em> and <em data-effect="italics">Y</em>, is a characteristic or measurement that can be determined for each member of a population. 
Variables may be <strong>numerical</strong> or <strong>categorical</strong>. <span data-type="term">Numerical variables</span> take on values with equal units such as weight in pounds and time in hours. <span data-type="term">Categorical variables</span> place the person or thing into a category. If we let <em data-effect="italics">X</em> equal the number of points earned by one math student at the end of a term, then <em data-effect="italics">X</em> is a numerical variable. If we let <em data-effect="italics">Y</em> be a person&#8217;s party affiliation, then some examples of <em data-effect="italics">Y</em> include Republican, Democrat, and Independent. <em data-effect="italics">Y</em> is a categorical variable. We could do some math with values of <em data-effect="italics">X</em> (calculate the average number of points earned, for example), but it makes no sense to do math with values of <em data-effect="italics">Y</em> (calculating an average party affiliation makes no sense).</p> <p id="fs-idm69507040"><span data-type="term">Data</span> are the actual values of the variable. They may be numbers or they may be words. <strong>Datum</strong> is a single value.</p> <p id="fs-idm150607232">Two words that come up often in statistics are <span data-type="term">mean</span> and <span data-type="term">proportion</span>. If you were to take three exams in your math classes and obtain scores of 86, 75, and 92, you would calculate your mean score by adding the three exam scores and dividing by three (your mean score would be 84.3 to one decimal place). If, in your math class, there are 30 students and 22 are men and 8 are women, then the proportion of men students is  [latex]\frac{22}{30}[/latex] and the proportion of women students is  [latex]\frac{8}{30}[/latex]. 
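</p> <p>The arithmetic behind these two examples can be written out as a short worked calculation (the decimal roundings shown are ours, not part of the original example). For the mean: [latex]\frac{86 + 75 + 92}{3} = \frac{253}{3} \approx 84.3[/latex]. For the proportions: [latex]\frac{22}{30} \approx 0.733[/latex] and [latex]\frac{8}{30} \approx 0.267[/latex]. The two proportions add to 1 because every student in the class is counted in exactly one of the two groups.</p> <p>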
Mean and proportion are discussed in more detail in later chapters.</p> <div id="fs-idp118463488" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idm25880032">The words &#8220;<span data-type="term">mean</span>&#8221; and &#8220;<span data-type="term">average</span>&#8221; are often used interchangeably. The substitution of one word for the other is common practice. The technical term is &#8220;arithmetic mean,&#8221; and &#8220;average&#8221; is technically a center location. However, in practice among non-statisticians, &#8220;average&#8221; is commonly accepted for &#8220;arithmetic mean.&#8221;</p> </div> <div id="fs-idm41492224" class="textbox textbox--examples" data-type="example"><div id="fs-idm10444032" data-type="exercise"><div id="fs-idm63060112" data-type="problem"><p id="fs-idm51690528">Determine what the key terms refer to in the following study. We want to know the average (mean) amount of money first year college students spend at ABC College on school supplies that do not include books. We randomly surveyed 100 first year students at the college. 
Three of those students spent 150, 200, and 225, respectively.</p> </div> <div id="fs-idm78922624" data-type="solution"><p id="fs-idm66686064">The <strong>population</strong> is all first year students attending ABC College this term.</p> <p id="fs-idm40196160">The <strong>sample</strong> could be all students enrolled in one section of a beginning statistics course at ABC College (although this sample may not represent the entire population).</p> <p id="fs-idm58257248">The <strong>parameter</strong> is the average (mean) amount of money spent (excluding books) by first year college students at ABC College this term.</p> <p id="fs-idm56787136">The <strong>statistic</strong> is the average (mean) amount of money spent (excluding books) by first year college students in the sample.</p> <p id="fs-idm39343328">The <strong>variable</strong> could be the amount of money spent (excluding books) by one first year student. Let <em data-effect="italics">X</em> = the amount of money spent (excluding books) by one first year student attending ABC College.</p> <p id="fs-idm9924960">The <strong>data</strong> are the dollar amounts spent by the first year students. Examples of the data are 150, 200, and 225.</p> </div> </div> </div> <div id="fs-idp138853216" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp65729424" data-type="exercise"><div id="fs-idp121971984" data-type="problem"><p id="fs-idm19818720">Determine what the key terms refer to in the following study. We want to know the average (mean) amount of money spent on school uniforms each year by families with children at Knoll Academy. We randomly survey 100 families with children in the school. 
Three of the families spent 65, 75, and 95, respectively.</p> </div> </div> </div> <div id="fs-idm42503168" class="textbox textbox--examples" data-type="example"><div id="fs-idm125077712" data-type="exercise"><div id="fs-idm36699520" data-type="problem"><p id="fs-idm41922176">Determine what the key terms refer to in the following study.</p> <p id="fs-idm14780304">A study was conducted at a local college to analyze the average cumulative GPAs of students who graduated last year. Fill in the letter of the phrase that best describes each of the items below.</p> <p id="fs-idm49702528">1. Population _____ 2. Statistic _____ 3. Parameter _____ 4. Sample _____ 5. Variable _____ 6. Data _____</p> <ol id="fs-idm9097856" type="a"><li>all students who attended the college last year</li> <li>the cumulative GPA of one student who graduated from the college last year</li> <li>3.65, 2.80, 1.50, 3.90</li> <li>a group of students who graduated from the college last year, randomly selected</li> <li>the average cumulative GPA of students who graduated from the college last year</li> <li>all students who graduated from the college last year</li> <li>the average cumulative GPA of students in the study who graduated from the college last year</li> </ol> </div> <div id="fs-idm17652592" data-type="solution"><p id="fs-idm53173376"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">1. f </span><span data-type="item">2. g </span><span data-type="item">3. e </span><span data-type="item">4. d </span><span data-type="item">5. b </span><span data-type="item">6. 
c</span></span></p> </div> </div> </div> <div id="fs-idm70107152" class="textbox textbox--examples" data-type="example"><div id="fs-idm53269072" data-type="exercise"><div id="fs-idm41679344" data-type="problem"><p id="fs-idm53163856">Determine what the key terms refer to in the following study.</p> <p id="fs-idm19702944">As part of a study designed to test the safety of automobiles, the National Transportation Safety Board collected and reviewed data about the effects of an automobile crash on test dummies. Here is the criterion they used:</p> <table id="ch01_mod01_tbl003" summary=""><tbody><tr><td>Speed at which Cars Crashed</td> <td>Location of “drive” (i.e. dummies)</td> </tr> <tr><td>35 miles/hour</td> <td>Front Seat</td> </tr> </tbody> </table> <p id="fs-idm69168288">Cars with dummies in the front seats were crashed into a wall at a speed of 35 miles per hour. We want to know the proportion of dummies in the driver’s seat that would have had head injuries, if they had been actual drivers. 
We start with a simple random sample of 75 cars.</p> </div> <div id="fs-idm69138080" data-type="solution"><p id="fs-idm32368">The <strong>population</strong> is all cars containing dummies in the front seat.</p> <p id="fs-idm42507664">The <strong>sample</strong> is the 75 cars, selected by a simple random sample.</p> <p id="fs-idm45479952">The <strong>parameter</strong> is the proportion of driver dummies (if they had been real people) who would have suffered head injuries in the population.</p> <p id="fs-idm59321456">The <strong>statistic</strong> is the proportion of driver dummies (if they had been real people) who would have suffered head injuries in the sample.</p> <p id="fs-idm41923744">The <strong>variable</strong> <em data-effect="italics">X</em> = the number of driver dummies (if they had been real people) who would have suffered head injuries.</p> <p id="fs-idm77048688">The <strong>data</strong> are either: yes, had head injury, or no, did not.</p> </div> </div> </div> <div id="fs-idm64455456" class="textbox textbox--examples" data-type="example"><div id="fs-idm55175232" data-type="exercise"><div id="fs-idm56962416" data-type="problem"><p id="fs-idm74506992">Determine what the key terms refer to in the following study.</p> <p id="fs-idm35536880">An insurance company would like to determine the proportion of all medical doctors who have been involved in one or more malpractice lawsuits. 
The company selects 500 doctors at random from a professional directory and determines the number in the sample who have been involved in a malpractice lawsuit.</p> </div> <div id="fs-idm36890624" data-type="solution"><p id="fs-idm43486032">The <strong>population</strong> is all medical doctors listed in the professional directory.</p> <p id="fs-idp3077328">The <strong>parameter</strong> is the proportion of medical doctors who have been involved in one or more malpractice suits in the population.</p> <p id="fs-idm45173104">The <strong>sample</strong> is the 500 doctors selected at random from the professional directory.</p> <p id="fs-idp4875616">The <strong>statistic</strong> is the proportion of medical doctors who have been involved in one or more malpractice suits in the sample.</p> <p id="fs-idm52360480">The <strong>variable</strong> <em data-effect="italics">X</em> = the number of medical doctors who have been involved in one or more malpractice suits.</p> <p id="fs-idm41698096">The <strong>data</strong> are either: yes, was involved in one or more malpractice lawsuits, or no, was not.</p> </div> </div> </div> <div id="fs-idm62184048" class="statistics collab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Exercise</div> <p id="fs-idp44556000">Do the following exercise collaboratively with up to four people per group. Find a population, a sample, the parameter, the statistic, a variable, and data for the following study: You want to determine the average (mean) number of glasses of milk college students drink per day. Suppose yesterday, in your English class, you asked five students how many glasses of milk they drank the day before. 
The answers were 1, 0, 1, 3, and 4 glasses of milk.</p> </div> </div> <div id="eip-525" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="eip-idm8127520">The Data and Story Library, http://lib.stat.cmu.edu/DASL/Stories/CrashTestDummies.html (accessed May 1, 2013).</p> </div> <div id="fs-idp10393488" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm52853984">The mathematical theory of statistics is easier to learn when you know the language. This module presents important terms that will be used throughout the text.</p> </div> <div id="fs-idm17793024" class="practice" data-depth="1"><h3 data-type="title">Practice</h3> <p id="id6060529"><em data-effect="italics">Use the following information to answer the next five exercises.</em> Studies are often done by pharmaceutical companies to determine the effectiveness of a treatment program. Suppose that a new AIDS antibody drug is currently under study. It is given to patients once the AIDS symptoms have revealed themselves. Of interest is the average (mean) length of time in months patients live once they start the treatment. Two researchers each follow a different set of 40 patients with AIDS from the start of treatment until their deaths. 
The following data (in months) are collected.</p> <p id="element-388"><span data-type="title">Researcher A: </span><span id="set-element-743" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">3  </span><span data-type="item">4  </span><span data-type="item">11  </span><span data-type="item">15  </span><span data-type="item">16  </span><span data-type="item">17  </span><span data-type="item">22  </span><span data-type="item">44  </span><span data-type="item">37  </span><span data-type="item">16  </span><span data-type="item">14  </span><span data-type="item">24  </span><span data-type="item">25  </span><span data-type="item">15  </span><span data-type="item">26  </span><span data-type="item">27  </span><span data-type="item">33  </span><span data-type="item">29  </span><span data-type="item">35  </span><span data-type="item">44  </span><span data-type="item">13  </span><span data-type="item">21  </span><span data-type="item">22  </span><span data-type="item">10  </span><span data-type="item">12  </span><span data-type="item">8  </span><span data-type="item">40  </span><span data-type="item">32  </span><span data-type="item">26  </span><span data-type="item">27  </span><span data-type="item">31  </span><span data-type="item">34  </span><span data-type="item">29  </span><span data-type="item">17  </span><span data-type="item">8  </span><span data-type="item">24  </span><span data-type="item">18  </span><span data-type="item">47  </span><span data-type="item">33  </span><span data-type="item">34</span></span></p> <p id="fs-idp53299072"><span data-type="title">Researcher B:  </span><span id="set-element-556" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">3  </span><span data-type="item">14  </span><span data-type="item">11  </span><span data-type="item">5  </span><span data-type="item">16  </span><span 
data-type="item">17  </span><span data-type="item">28  </span><span data-type="item">41  </span><span data-type="item">31  </span><span data-type="item">18  </span><span data-type="item">14  </span><span data-type="item">14  </span><span data-type="item">26  </span><span data-type="item">25  </span><span data-type="item">21  </span><span data-type="item">22  </span><span data-type="item">31  </span><span data-type="item">2  </span><span data-type="item">35  </span><span data-type="item">44  </span><span data-type="item">23  </span><span data-type="item">21  </span><span data-type="item">21  </span><span data-type="item">16  </span><span data-type="item">12  </span><span data-type="item">18  </span><span data-type="item">41  </span><span data-type="item">22  </span><span data-type="item">16  </span><span data-type="item">25  </span><span data-type="item">33  </span><span data-type="item">34  </span><span data-type="item">29  </span><span data-type="item">13  </span><span data-type="item">18  </span><span data-type="item">24  </span><span data-type="item">23  </span><span data-type="item">42  </span><span data-type="item">33  </span><span data-type="item">29</span></span></p> <p id="id606052900">Determine what the key terms refer to in the example for Researcher A.</p> <div id="exerciseone" data-type="exercise"><div id="id14666923" data-type="problem"><p id="prob_1">population</p> </div> <div id="eip-idp43397312" data-type="solution"><p id="eip-idp50975232">AIDS patients.</p> </div> </div> <div id="exercisetwo" data-type="exercise"><div id="id14666952" data-type="problem"><p id="prob_2">sample</p> </div> <p>AIDS patients were sampled from researcher A and researcher B.</p> </div> <div id="exercisethree" data-type="exercise"><div id="id14666982" data-type="problem"><p id="prob_3">parameter</p> </div> <div id="eip-idm55102528" data-type="solution"><p id="eip-idm55102272">The average length of time (in 
months) AIDS patients live after treatment.</p> </div> </div> <div id="exercisefour" data-type="exercise"><div id="id14667012" data-type="problem"><p id="prob_4">statistic</p> </div> <p>The average length of time (in months) AIDS patients from the sample live after treatment.</p> </div> <div id="exercisefive" data-type="exercise"><div id="id14667042" data-type="problem"><p id="prob_5">variable</p> </div> <div id="eip-idp32103664" data-type="solution"><p id="eip-idm64123264"><em data-effect="italics">X</em> = the length of time (in months) AIDS patients live after treatment</p> </div> </div> </div> <div id="fs-idm57628896" class="free-response" data-depth="1"><h3 data-type="title">HOMEWORK</h3> <p id="eip-idm102156112"><em data-effect="italics"> For each of the following eight exercises, identify: a. the population, b. the sample, c. the parameter, d. the statistic, e. the variable, and f. the data. Give examples where appropriate.</em></p> <div id="element-188" data-type="exercise"><div id="id30125588" data-type="problem"><p id="element-424">1) A fitness center is interested in the mean amount of time a client exercises in the center each week.</p> <p>&nbsp;</p> </div> </div> <div id="element-951" data-type="exercise"><div id="id30142690" data-type="problem"><p id="element-812">2) Ski resorts are interested in the mean age that children take their first ski and snowboard lessons. 
They need this information to plan their ski classes optimally.</p> <p>&nbsp;</p> </div> </div> <div id="element-776" data-type="exercise"><div id="id30142935" data-type="problem"><p id="element-85">3) A cardiologist is interested in the mean recovery period of her patients who have had heart attacks.</p> <p>&nbsp;</p> </div> </div> <div id="element-54" data-type="exercise"><div id="id30143063" data-type="problem"><p id="element-530">4) Insurance companies are interested in the mean health costs each year of their clients, so that they can determine the costs of health insurance.</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id30200997" data-type="problem"><p id="element-540">5) A politician is interested in the proportion of voters in his district who think he is doing a good job.</p> <p>&nbsp;</p> </div> </div> <div id="element-430" data-type="exercise"><div id="id30201124" data-type="problem"><p id="element-145">6) A marriage counselor is interested in the proportion of clients she counsels who stay married.</p> <p>&nbsp;</p> </div> </div> <div id="element-417" data-type="exercise"><div id="id30201364" data-type="problem"><p>7) Political pollsters may be interested in the proportion of people who will vote for a particular cause.</p> <p>&nbsp;</p> </div> </div> <div id="element-467" data-type="exercise"><div id="id30201491" data-type="problem"><p id="element-404">8) A marketing company is interested in the proportion of people who will buy a particular product.</p> <p>&nbsp;</p> </div> </div> <p id="fs-idm1484432"><em data-effect="italics">Use the following information to answer the next three exercises:</em> A Lake Tahoe Community College instructor is interested in the mean number of days Lake Tahoe Community College math students are absent from class during a quarter.</p> <p>&nbsp;</p> <div id="element-230" data-type="exercise"><div id="id30103712" data-type="problem"><p id="fs-idp15729072">9) What is the population she is interested 
in?</p> <ol id="element-978" type="a" data-mark-suffix="."><li>all Lake Tahoe Community College students</li> <li>all Lake Tahoe Community College English students</li> <li>all Lake Tahoe Community College students in her classes</li> <li>all Lake Tahoe Community College math students</li> </ol> <p>&nbsp;</p> </div> </div> <div id="element-590" data-type="exercise"><div id="id30103831" data-type="problem"><p id="element-311">10) Consider the following:</p> <p id="element-131"><em data-effect="italics">X</em> = number of days a Lake Tahoe Community College math student is absent</p> <p id="element-561">In this case, <em data-effect="italics">X</em> is an example of a:</p> <ol id="element-959" type="a" data-mark-suffix="."><li>variable.</li> <li>population.</li> <li>statistic.</li> <li>data.</li> </ol> </div> </div> <div id="element-992" data-type="exercise"><div id="id30104101" data-type="problem"><p>&nbsp;</p> <p id="element-827">11) The instructor’s sample produces a mean number of days absent of 3.5 days. This value is an example of a:</p> <ol id="element-303" type="a" data-mark-suffix="."><li>parameter.</li> <li>data.</li> <li>statistic.</li> <li>variable.</li> </ol> </div> <p><strong>Answers to odd questions</strong></p> <p>1) a. all of the clients of the fitness center; b. a sample of the clients who use the fitness center in a given week; c. the average amount of time that all clients exercise in one week; d. the average amount of time that the sample of clients exercises in one week; e. <em data-effect="italics">X</em> = the amount of time that a client exercises in one week; f. values of <em data-effect="italics">X</em>, such as 2 hours, 5 hours, and 7.5 hours</p> <p>&nbsp;</p> <p>3) a. the cardiologist’s patients; b. a group of the cardiologist’s patients; c. the mean recovery period of all of the cardiologist’s patients; d. the mean recovery period of the group of the cardiologist’s patients; e. <em data-effect="italics">X</em> = the recovery period of one patient; f. values of <em data-effect="italics">X</em>, such as 10 days, 14 days, 20 days, and so on</p> <p>&nbsp;</p> <p>5) a. all voters in the politician’s district; b. a random selection of voters in the politician’s district; c. the proportion of voters in this district who think this politician is doing a good job; d. the proportion of voters in the sample who think this politician is doing a good job; e. <em data-effect="italics">X</em> = the number of voters in the district who think this politician is doing a good job; f. Yes, he is doing a good job. No, he is not doing a good job.</p> <p>&nbsp;</p> <p>7) a. all voters (in a certain geographic area); b. a random selection of all the voters; c. the proportion of voters who are interested in this particular cause; d. the proportion of voters in the sample who are interested in this particular cause; e. <em data-effect="italics">X</em> = the number of voters who are interested in this particular cause; f. yes, no</p> <p>&nbsp;</p> <p>9) d</p> <p>&nbsp;</p> <p>11)  c</p> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="average"><dt>Average</dt> <dd id="id16316921">also called mean; a number that describes the central tendency of the data</dd> </dl> <dl id="fs-idm97064528"><dt>Categorical Variable</dt> <dd id="fs-idm30253968">variables that take on values that are names or labels</dd> </dl> <dl id="data"><dt>Data</dt> <dd id="id15539900">a set of observations (a set of possible outcomes); most data can be put into two groups: <strong>qualitative</strong> (an attribute whose value is indicated by a label) or <strong>quantitative</strong> (an attribute whose value is indicated by a number). 
Quantitative data can be separated into two subgroups: <strong>discrete</strong> and <strong>continuous</strong>. Data is discrete if it is the result of counting (such as the number of students of a given ethnic group in a class or the number of books on a shelf). Data is continuous if it is the result of measuring (such as distance traveled or weight of luggage)</dd> </dl> <dl id="fs-idm26909648"><dt>Numerical Variable</dt> <dd id="fs-idm96926144">variables that take on values that are indicated by numbers</dd> </dl> <dl id="fs-idm26606704"><dt>Parameter</dt> <dd id="fs-idm15244864">a number that is used to represent a population characteristic and that generally cannot be determined easily</dd> </dl> <dl id="fs-idm7877680"><dt>Population</dt> <dd id="fs-idp571488">all individuals, objects, or measurements whose properties are being studied</dd> </dl> <dl id="prob"><dt>Probability</dt> <dd id="id17934331">a number between zero and one, inclusive, that gives the likelihood that a specific event will occur</dd> </dl> <dl id="proportion"><dt>Proportion</dt> <dd id="id15701010">the number of successes divided by the total number in the sample</dd> </dl> <dl id="fs-idm64138416"><dt>Representative Sample</dt> <dd id="fs-idm26214992">a subset of the population that has the same characteristics as the population</dd> </dl> <dl id="fs-idm100365008"><dt>Sample</dt> <dd id="fs-idm51261392">a subset of the population studied</dd> </dl> <dl id="stat"><dt>Statistic</dt> <dd id="id17366233">a numerical characteristic of the sample; a statistic estimates the corresponding population parameter.</dd> </dl> <dl id="fs-idm19697904"><dt>Variable</dt> <dd id="fs-idm90760848">a characteristic of interest for each person or object in a population</dd> </dl> </div> </div></div>
<div class="chapter standard" id="chapter-data-sampling-and-variation-in-data-and-sampling" title="Chapter 1.3: Data, Sampling, and Variation in Data and Sampling"><div class="chapter-title-wrap"><h3 class="chapter-number">3</h3><h2 class="chapter-title"><span class="display-none">Chapter 1.3: Data, Sampling, and Variation in Data and Sampling</span></h2></div><div class="ugc chapter-ugc"> <p>&nbsp;</p> <p id="id7862377">Data may come from a population or from a sample. Lowercase letters like <em data-effect="italics">x</em> or <em data-effect="italics">y</em> are generally used to represent data values. Most data can be put into the following categories:</p> <ul id="id10607985"><li>Qualitative</li> <li>Quantitative</li> </ul> <p id="id9602938"><span data-type="term">Qualitative data</span> are the result of categorizing or describing attributes of a population. Qualitative data are also often called <span data-type="term">categorical data</span>. Hair color, blood type, ethnic group, the car a person drives, and the street a person lives on are examples of qualitative data. Qualitative data are generally described by words or letters. For instance, hair color might be black, dark brown, light brown, blonde, gray, or red. Blood type might be AB+, O-, or B+. Researchers often prefer to use quantitative data over qualitative data because they lend themselves more easily to mathematical analysis. For example, it does not make sense to find an average hair color or blood type.</p> <p id="id3365343"><span data-type="term">Quantitative data</span> are always numbers. Quantitative data are the result of <strong>counting</strong> or <strong>measuring</strong> attributes of a population. Amount of money, pulse rate, weight, number of people living in your town, and number of students who take statistics are examples of quantitative data. 
Quantitative data may be either <span data-type="term">discrete</span> or <span data-type="term">continuous</span>.</p> <p id="id9750754">All data that are the result of counting are called <span data-type="term">quantitative discrete data</span>. These data take on only certain numerical values. If you count the number of phone calls you receive for each day of the week, you might get values such as zero, one, two, or three.</p> <p id="id5023881">Data that are not only made up of counting numbers, but that may include fractions, decimals, or irrational numbers, are called <span data-type="term">quantitative continuous data</span>. Continuous data are often the results of measurements like lengths, weights, or times. A list of the lengths in minutes for all the phone calls that you make in a week, with numbers like 2.4, 7.5, or 11.0, would be quantitative continuous data.</p> <div id="fs-idm37037280" class="textbox textbox--examples" data-type="example"><div data-type="title">Data Sample of Quantitative Discrete Data</div> <p id="id3406867">The data are the number of books students carry in their backpacks. You sample five students. Two students carry three books, one student carries four books, one student carries two books, and one student carries one book. The numbers of books (three, four, two, and one) are the quantitative discrete data.</p> </div> <div id="fs-idp82396944" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm53037056" data-type="exercise"><div id="fs-idp7436128" data-type="problem"><p id="fs-idp14474960">The data are the number of machines in a gym. You sample five gyms. One gym has 12 machines, one gym has 15 machines, one gym has ten machines, one gym has 22 machines, and the other gym has 20 machines. 
What type of data is this?</p> </div> </div> </div> <div id="fs-idm43599584" class="textbox textbox--examples" data-type="example"><div data-type="title">Data Sample of Quantitative Continuous Data</div> <p id="id3944554">The data are the weights of backpacks with books in them. You sample the same five students. The weights (in pounds) of their backpacks are 6.2, 7, 6.8, 9.1, and 4.3. Notice that backpacks carrying three books can have different weights. Weights are quantitative continuous data.</p> </div> <div id="fs-idp29656272" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm6646528" data-type="exercise"><div id="fs-idm11686144" data-type="problem"><p id="fs-idm8677600">The data are the areas of lawns in square feet. You sample five houses. The areas of the lawns are 144 sq. feet, 160 sq. feet, 190 sq. feet, 180 sq. feet, and 210 sq. feet. What type of data is this?</p> </div> </div> </div> <div id="fs-idm68548176" class="textbox textbox--examples" data-type="example"><p id="fs-idm35908224">You go to the supermarket and purchase three cans of soup (19 ounces tomato bisque, 14.1 ounces lentil, and 19 ounces Italian wedding), two packages of nuts (walnuts and peanuts), four different kinds of vegetables (broccoli, cauliflower, spinach, and carrots), and two desserts (16 ounces pistachio ice cream and 32 ounces chocolate chip cookies).</p> <div id="eip-idm51320448" data-type="exercise"><div id="eip-idm51320192" data-type="problem"><p id="fs-idp7536032">Name data sets that are quantitative discrete, quantitative continuous, and qualitative.</p> </div> <div id="eip-idp9281200" data-type="solution"><p id="fs-idm84978080">One Possible Solution:</p> <ul id="fs-idp14197904"><li>The three cans of soup, two packages of nuts, four kinds of vegetables, and two desserts are quantitative discrete data because you count them.</li> <li>The weights of the soups (19 ounces, 14.1 ounces, 19 ounces) are 
quantitative continuous data because you measure weights as precisely as possible.</li> <li>Types of soups, nuts, vegetables and desserts are qualitative data because they are categorical.</li> </ul> </div> </div> <p id="fs-idm59417696">Try to identify additional data sets in this example.</p> </div> <div id="fs-idm18169904" class="textbox textbox--examples" data-type="example"><p id="id11979238">The data are the colors of backpacks. Again, you sample the same five students. One student has a red backpack, two students have black backpacks, one student has a green backpack, and one student has a gray backpack. The colors red, black, black, green, and gray are qualitative data.</p> </div> <div id="fs-idp21253600" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm55991520" data-type="exercise"><div id="fs-idm137553376" data-type="problem"><p id="fs-idm81062896">The data are the colors of houses. You sample five houses. The colors of the houses are white, yellow, white, red, and white. What type of data is this?</p> </div> </div> </div> <div id="fs-idp153448320" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idp9388048">You may collect data as numbers and report it categorically. For example, the quiz scores for each student are recorded throughout the term. At the end of the term, the quiz scores are reported as A, B, C, D, or F.</p> </div> <div id="fs-idm36372528" class="textbox textbox--examples" data-type="example"><div id="element-652" data-type="exercise"><div id="id14888019" data-type="problem"><p id="element-318">Work collaboratively to determine the correct data type (quantitative or qualitative). Indicate whether quantitative data are continuous or discrete. 
Hint: Data that are discrete often start with the words “the number of.”</p> <ol id="element-861" type="a"><li>the number of pairs of shoes you own</li> <li>the type of car you drive</li> <li>the distance it is from your home to the nearest grocery store</li> <li>the number of classes you take per school year.</li> <li>the type of calculator you use</li> <li>weights of sumo wrestlers</li> <li>number of correct answers on a quiz</li> <li>IQ scores (This may cause some discussion.)</li> </ol> </div> <div id="id14523850" data-type="solution"><p id="exercise-solutions-1">Items a, d, and g are quantitative discrete; items c, f, and h are quantitative continuous; items b and e are qualitative, or categorical.</p> </div> </div> </div> <div id="fs-idp17156448" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm22853072" data-type="exercise"><div id="fs-idm26183776" data-type="problem"><p id="fs-idm81268256">Determine the correct data type (quantitative or qualitative) for the number of cars in a parking lot. Indicate whether quantitative data are continuous or discrete.</p> </div> </div> </div> <div id="fs-idm43655440" class="textbox textbox--examples" data-type="example"><div id="fs-idm43609536" data-type="exercise"><div id="fs-idm57798048" data-type="problem"><p id="fs-idm69332192">A statistics professor collects information about the classification of her students as freshmen, sophomores, juniors, or seniors. The data she collects are summarized in the pie chart <a class="autogenerated-content" href="#ch01_mod02_fig001" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#ch01_mod02_fig001">(Figure)</a>. What type of data does this graph show?</p> <div id="ch01_mod02_fig001" class="bc-figure figure"><span id="fs-idm79319760" data-type="media" data-alt="This is a pie chart showing the class classification of statistics students. 
The chart has 4 sections labeled Freshman, Sophomore, Junior, Senior. A question is asked below the pie chart: what type of data does this graph show?"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C01_M05_001-1.jpg" alt="This is a pie chart showing the class classification of statistics students. The chart has 4 sections labeled Freshman, Sophomore, Junior, Senior. A question is asked below the pie chart: what type of data does this graph show?" width="350" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idm52167008" data-type="solution"><p id="fs-idm66868784">This pie chart shows the students in each year, which is <strong>qualitative (or categorical) data</strong>.</p> </div> </div> </div> <div id="fs-idm89068416" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm140347520" data-type="exercise"><div id="fs-idm111614288" data-type="problem"><p id="fs-idm90836000">The registrar at State University keeps records of the number of credit hours students complete each semester. The data he collects are summarized in the histogram. The class boundaries are 10 to less than 13, 13 to less than 16, 16 to less than 19, 19 to less than 22, and 22 to less than 25. <span data-type="newline"><br /> </span></p> <div id="ch01_mod02_fig002" class="bc-figure figure"><span id="fs-idm92736960" data-type="media" data-alt="This histogram consists of 5 bars with the x-axis marked at intervals of 3 from 10 - 25, and the y-axis in increments of 100 from 0 - 800. The height of bars shows the number of students in each interval."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C01_M05_002-1.png" alt="This histogram consists of 5 bars with the x-axis marked at intervals of 3 from 10 - 25, and the y-axis in increments of 100 from 0 - 800. 
The height of bars shows the number of students in each interval." width="380" data-media-type="image/png" /></span></div> <p><span data-type="newline"><br /> </span> What type of data does this graph show?</p> </div> </div> </div> <div id="eip-312" class="bc-section section" data-depth="1"><h3 data-type="title">Qualitative Data Discussion</h3> <p id="eip-298">Below are tables comparing the number of part-time and full-time students at De Anza College and Foothill College enrolled for the spring 2010 quarter. The tables display counts (frequencies) and percentages or proportions (relative frequencies). The percent columns make comparing the same categories in the colleges easier. Displaying percentages along with the numbers is often helpful, but it is particularly important when comparing sets of data that do not have the same totals, such as the total enrollments for both colleges in this example. Notice how much larger the percentage of part-time students at Foothill College is than at De Anza College.</p> <table id="eip-953" summary="Spring 2010"><caption><span data-type="title">Spring 2010</span></caption> <thead><tr><th colspan="3" data-align="center">De Anza College</th> <th></th> <th colspan="3" data-align="center">Foothill College</th> </tr> </thead> <tbody><tr><td></td> <td>Number</td> <td>Percent</td> <td></td> <td></td> <td>Number</td> <td>Percent</td> </tr> <tr><td>Full-time</td> <td>9,200</td> <td>40.9%</td> <td></td> <td>Full-time</td> <td>4,059</td> <td>28.6%</td> </tr> <tr><td>Part-time</td> <td>13,296</td> <td>59.1%</td> <td></td> <td>Part-time</td> <td>10,124</td> <td>71.4%</td> </tr> <tr><td>Total</td> <td>22,496</td> <td>100%</td> <td></td> <td>Total</td> <td>14,183</td> <td>100%</td> </tr> </tbody> </table> <p id="eip-962">Tables are a good way of organizing and displaying data, but graphs can be even more helpful in understanding the data. There are no strict rules concerning which graphs to use. 
Two graphs that are used to display qualitative data are pie charts and bar graphs.</p> <p id="eip-884">In a <span data-type="term">pie chart</span>, categories of data are represented by wedges in a circle and are proportional in size to the percent of individuals in each category.</p> <p id="eip-317">In a <span data-type="term">bar graph</span>, the length of the bar for each category is proportional to the number or percent of individuals in each category. Bars may be vertical or horizontal.</p> <p id="eip-60">A <span data-type="term">Pareto chart</span> consists of bars that are sorted into order by category size (largest to smallest).</p> <p id="eip-378">Look at <a class="autogenerated-content" href="#ch01_mod02_fig003" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#ch01_mod02_fig003">(Figure)</a> and <a class="autogenerated-content" href="#ch01_mod02_fig004" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#ch01_mod02_fig004">(Figure)</a> and determine which graph (pie or bar) you think displays the comparisons better.</p> <p id="eip-510">It is a good idea to look at a variety of graphs to see which is the most helpful in displaying the data. We might make different choices of what we think is the “best” graph depending on the data and the context. 
Our choice also depends on what we are using the data for.</p> <div id="ch01_mod02_fig003" class="bc-figure figure" data-orient="horizontal"><div id="eip-idp58600416" class="bc-figure figure"><span id="eip-idp58600672" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch01_patchfile_01-1.jpg" alt="" width="300" data-media-type="image/jpg" /></span></div> <div id="eip-idm57620352" class="bc-figure figure"><span id="eip-idm57620096" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch01_patchfile_02-1.jpg" alt="" width="300" data-media-type="image/jpg" /></span></div> </div> <div id="ch01_mod02_fig004" class="bc-figure figure"><span id="eip-idp50918176" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch01_patchfile_03-1.jpg" alt="" width="380" data-media-type="image/jpg" /></span></div> <div id="eip-558" class="bc-section section" data-depth="2"><h4 data-type="title">Percentages That Add to More (or Less) Than 100%</h4> <p id="eip-613">Sometimes percentages add up to be more than 100% (or less than 100%). In the graph, the percentages add to more than 100% because students can be in more than one category. A bar graph is appropriate to compare the relative size of the categories. A pie chart cannot be used. 
It also could not be used if the percentages added to less than 100%.</p> <table id="eip-451" summary="De Anza College Spring 2010"><caption><span data-type="title">De Anza College Spring 2010</span></caption> <thead><tr><th>Characteristic/Category</th> <th>Percent</th> </tr> </thead> <tbody><tr><td>Full-Time Students</td> <td>40.9%</td> </tr> <tr><td>Students who intend to transfer to a 4-year educational institution</td> <td>48.6%</td> </tr> <tr><td>Students under age 25</td> <td>61.0%</td> </tr> <tr><td>TOTAL</td> <td>150.5%</td> </tr> </tbody> </table> <div id="ch01_mod02_fig005" class="bc-figure figure"><span id="eip-idm13986336" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch01_patchfile_04-1.jpg" alt="" width="380" data-media-type="image/jpg" /></span></div> </div> <div id="eip-993" class="bc-section section" data-depth="2"><h4 data-type="title">Omitting Categories/Missing Data</h4> <p id="eip-145">The table displays Ethnicity of Students but is missing the “Other/Unknown” category. This category contains people who did not feel they fit into any of the ethnicity categories or declined to respond. Notice that the frequencies do not add up to the total number of students. 
In this situation, create a bar graph and not a pie chart.</p> <table id="eip-251" summary="The table displays Ethnicity of Students"><caption><span data-type="title">Ethnicity of Students at De Anza College Fall Term 2007 (Census Day)</span></caption> <thead><tr><th></th> <th>Frequency</th> <th>Percent</th> </tr> </thead> <tbody><tr><td>Asian</td> <td>8,794</td> <td>36.1%</td> </tr> <tr><td>Black</td> <td>1,412</td> <td>5.8%</td> </tr> <tr><td>Filipino</td> <td>1,298</td> <td>5.3%</td> </tr> <tr><td>Hispanic</td> <td>4,180</td> <td>17.1%</td> </tr> <tr><td>Native American</td> <td>146</td> <td>0.6%</td> </tr> <tr><td>Pacific Islander</td> <td>236</td> <td>1.0%</td> </tr> <tr><td>White</td> <td>5,978</td> <td>24.5%</td> </tr> <tr><td>TOTAL</td> <td>22,044 out of 24,382</td> <td>90.4% out of 100%</td> </tr> </tbody> </table> <div id="ch01_mod02_fig006" class="bc-figure figure"><span id="eip-idm93533648" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch01_patchfile_05-1.jpg" alt="" width="480" data-media-type="image/jpg" /></span></div> <p id="eip-756">The following graph is the same as the previous graph but the “Other/Unknown” percent (9.6%) has been included. The “Other/Unknown” category is large compared to some of the other categories (Native American, 0.6%, Pacific Islander 1.0%). This is important to know when we think about what the data are telling us.</p> <p id="eip-187">This particular bar graph in <a class="autogenerated-content" href="#ch01_mod02_fig007" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#ch01_mod02_fig007">(Figure)</a> can be difficult to understand visually. The graph in <a class="autogenerated-content" href="#ch01_mod02_fig008" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#ch01_mod02_fig008">(Figure)</a> is a Pareto chart. 
The Pareto chart has the bars sorted from largest to smallest and is easier to read and interpret.</p> <div id="ch01_mod02_fig007" class="bc-figure figure"><div data-type="title">Bar Graph with Other/Unknown Category</div> <p><span id="eip-idp18575104" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch01_patchfile_06-1.jpg" alt="" width="480" data-media-type="image/jpg" /></span></p> </div> <div id="ch01_mod02_fig008" class="bc-figure figure"><div data-type="title">Pareto Chart With Bars Sorted by Size</div> <p><span id="eip-idp91563536" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch01_patchfile_07-1.jpg" alt="" width="480" data-media-type="image/jpg" /></span></p> </div> </div> <div id="eip-47" class="bc-section section" data-depth="2"><h4 data-type="title">Pie Charts: No Missing Data</h4> <p id="eip-7">The following pie charts have the “Other/Unknown” category included (since the percentages must add to 100%). 
The chart in <a class="autogenerated-content" href="#ch01_mod02_fig009b" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#ch01_mod02_fig009b">(Figure)</a> is organized by the size of each wedge, which makes it a more visually informative graph than the unsorted, alphabetical graph in <a class="autogenerated-content" href="#ch01_mod02_fig009a" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#ch01_mod02_fig009a">(Figure)</a>.</p> <div id="ch01_mod02_fig009" class="bc-figure figure" data-orient="horizontal"><div id="ch01_mod02_fig009a" class="bc-figure figure"><span id="eip-idm48032080" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch01_patchfile_08-1.jpg" alt="" width="300" data-media-type="image/jpg" /></span></div> <div id="ch01_mod02_fig009b" class="bc-figure figure"><span id="eip-idm51653728" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch01_patchfile_09-1.jpg" alt="" width="300" data-media-type="image/jpg" /></span></div> </div> </div> </div> <div id="eip-49" class="bc-section section" data-depth="1"><h3 data-type="title">Sampling</h3> <p id="id12361474">Gathering information about an entire population often costs too much or is virtually impossible. Instead, we use a sample of the population. <strong>A sample should have the same characteristics as the population it is representing.</strong> Most statisticians use various methods of random sampling in an attempt to achieve this goal. This section will describe a few of the most common methods. There are several different methods of <strong>random sampling</strong>. In each form of random sampling, each member of a population initially has an equal chance of being selected for the sample. Each method has pros and cons. 
The easiest method to describe is called a <strong>simple random sample</strong>. Any group of <em data-effect="italics">n</em> individuals is equally likely to be chosen as any other group of <em data-effect="italics">n</em> individuals if the simple random sampling technique is used. In other words, each sample of the same size has an equal chance of being selected. For example, suppose Lisa wants to form a four-person study group (herself and three other people) from her pre-calculus class, which has 31 members not including Lisa. To choose a simple random sample of size three from the other members of her class, Lisa could put all 31 names in a hat, shake the hat, close her eyes, and pick out three names. A more technological way is for Lisa to first list the last names of the members of her class together with a two-digit number, as in <a class="autogenerated-content" href="#element-621" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#element-621">(Figure)</a>:</p> <table id="element-621" summary="This table presents a class roster arranged in alphabetical order. 
The first column lists a unique two-digit ID number for each student, with the second column displaying the student's last name."><caption><span data-type="title">Class Roster</span></caption> <thead><tr><th>ID</th> <th>Name</th> <th>ID</th> <th>Name</th> <th>ID</th> <th>Name</th> </tr> </thead> <tbody><tr><td>00</td> <td>Anselmo</td> <td>11</td> <td>King</td> <td>21</td> <td>Roquero</td> </tr> <tr><td>01</td> <td>Bautista</td> <td>12</td> <td>Legeny</td> <td>22</td> <td>Roth</td> </tr> <tr><td>02</td> <td>Bayani</td> <td>13</td> <td>Lundquist</td> <td>23</td> <td>Rowell</td> </tr> <tr><td>03</td> <td>Cheng</td> <td>14</td> <td>Macierz</td> <td>24</td> <td>Salangsang</td> </tr> <tr><td>04</td> <td>Cuarismo</td> <td>15</td> <td>Motogawa</td> <td>25</td> <td>Slade</td> </tr> <tr><td>05</td> <td>Cuningham</td> <td>16</td> <td>Okimoto</td> <td>26</td> <td>Stratcher</td> </tr> <tr><td>06</td> <td>Fontecha</td> <td>17</td> <td>Patel</td> <td>27</td> <td>Tallai</td> </tr> <tr><td>07</td> <td>Hong</td> <td>18</td> <td>Price</td> <td>28</td> <td>Tran</td> </tr> <tr><td>08</td> <td>Hoobler</td> <td>19</td> <td>Quizon</td> <td>29</td> <td>Wai</td> </tr> <tr><td>09</td> <td>Jiao</td> <td>20</td> <td>Reyes</td> <td>30</td> <td>Wood</td> </tr> <tr><td>10</td> <td>Khan</td> <td></td> <td></td> <td></td> <td></td> </tr> </tbody> </table> <p id="id10904793">Lisa can use a table of random numbers (found in many statistics books and mathematical handbooks), a calculator, or a computer to generate random numbers. For this example, suppose Lisa chooses to generate random numbers from a calculator. 
The numbers generated are as follows:</p> <p id="element-250"><span id="set-element-428" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">0.94360&nbsp; </span><span data-type="item">0.99832&nbsp; </span><span data-type="item">0.14669&nbsp; </span><span data-type="item">0.51470&nbsp; </span><span data-type="item">0.40581&nbsp; </span><span data-type="item">0.73381&nbsp; </span><span data-type="item">0.04399</span></span></p> <p id="id12648586">Lisa reads two-digit groups until she has chosen three class members (that is, she reads 0.94360 as the groups 94, 43, 36, and 60). Each random number may contribute only one class member. If she needed to, Lisa could have generated more random numbers.</p> <p id="id12688561">The random numbers 0.94360 and 0.99832 do not contain any two-digit numbers between 00 and 30, so they match no one on the roster. However, the third random number, 0.14669, contains 14 (the fourth random number also contains 14, which has already been chosen), the fifth random number contains 05, and the seventh random number contains 04. The two-digit number 14 corresponds to Macierz, 05 corresponds to Cuningham, and 04 corresponds to Cuarismo. Besides herself, Lisa’s group will consist of Macierz, Cuningham, and Cuarismo.</p> <p id="id11076554">Besides simple random sampling, there are other forms of sampling that involve a chance process for getting the sample. <strong>Other well-known random sampling methods are the stratified sample, the cluster sample, and the systematic sample.</strong></p> <p id="id12511076">To choose a <strong>stratified sample</strong>, divide the population into groups called strata and then take a <strong>proportionate</strong> number from each stratum. For example, you could stratify (group) your college population by department and then choose a proportionate simple random sample from each stratum (each department) to get a stratified random sample. 
To choose a simple random sample from each department, number each member of the first department, number each member of the second department, and do the same for the remaining departments. Then use simple random sampling to choose proportionate numbers from the first department and do the same for each of the remaining departments. Those numbers picked from the first department, picked from the second department, and so on represent the members who make up the stratified sample.</p> <p id="id13017093">To choose a <strong>cluster sample</strong>, divide the population into clusters (groups) and then randomly select some of the clusters. All the members from these clusters are in the cluster sample. For example, if you randomly sample four departments from your college population, the four departments make up the cluster sample. Divide your college faculty by department. The departments are the clusters. Number each department, and then choose four different numbers using simple random sampling. All members of the four departments with those numbers are the cluster sample.</p> <p id="id12769433">To choose a <strong>systematic sample</strong>, randomly select a starting point and take every <em data-effect="italics">n</em><sup>th</sup> piece of data from a listing of the population. For example, suppose you have to do a phone survey. Your phone book contains 20,000 residence listings. You must choose 400 names for the sample. Number the population 1–20,000 and then use a simple random sample to pick a number that represents the first name in the sample. Then choose every fiftieth name thereafter until you have a total of 400 names (you might have to go back to the beginning of your phone list). Systematic sampling is frequently chosen because it is a simple method.</p> <p id="id12385449">A type of sampling that is non-random is convenience sampling. <strong>Convenience sampling</strong> involves using results that are readily available. 
For example, a computer software store conducts a marketing study by interviewing potential customers who happen to be in the store browsing through the available software. The results of convenience sampling may be very good in some cases and highly biased (favoring certain outcomes) in others.</p> <p id="id10814456">Sampling data should be done very carefully. Collecting data carelessly can have devastating results. Surveys mailed to households and then returned may be very biased (they may favor a certain group). It is better for the person conducting the survey to select the sample respondents.</p> <p id="eip-781">True random sampling is done <strong>with replacement</strong>. That is, once a member is picked, that member goes back into the population and thus may be chosen more than once. However, for practical reasons, in most populations, simple random sampling is done <strong>without replacement</strong>. Surveys are typically done without replacement. That is, a member of the population may be chosen only once. Most samples are taken from large populations and the sample tends to be small in comparison to the population. Since this is the case, sampling without replacement is approximately the same as sampling with replacement because the chance of picking the same individual more than once with replacement is very low.</p> <p id="eip-277">In a college population of 10,000 people, suppose you want to pick a sample of 1,000 randomly for a survey. 
<strong>For any particular sample of 1,000</strong>, if you are sampling <strong>with replacement</strong>,</p> <ul id="eip-447"><li>the chance of picking the first person is 1,000 out of 10,000 (0.1000);</li> <li>the chance of picking a different second person for this sample is 999 out of 10,000 (0.0999);</li> <li>the chance of picking the same person again is 1 out of 10,000 (very low).</li> </ul> <p id="eip-646">If you are sampling <strong>without replacement</strong>,</p> <ul id="eip-546"><li>the chance of picking the first person for any particular sample is 1,000 out of 10,000 (0.1000);</li> <li>the chance of picking a different second person is 999 out of 9,999 (0.0999);</li> <li>you do not replace the first person before picking the next person.</li> </ul> <p id="eip-854">Compare the fractions 999/10,000 and 999/9,999. For accuracy, carry the decimal answers to four decimal places. To four decimal places, these numbers are equivalent (0.0999).</p> <p id="eip-211">Sampling without replacement instead of sampling with replacement becomes a mathematical issue only when the population is small. For example, if the population is 25 people, the sample is ten, and you are sampling <strong>with replacement for any particular sample</strong>, then the chance of picking the first person is ten out of 25, and the chance of picking a different second person is nine out of 25 (you replace the first person).</p> <p id="eip-647">If you sample <strong>without replacement</strong>, then the chance of picking the first person is ten out of 25, and then the chance of picking the second person (who is different) is nine out of 24 (you do not replace the first person).</p> <p id="eip-891">Compare the fractions 9/25 and 9/24. To four decimal places, 9/25 = 0.3600 and 9/24 = 0.3750. To four decimal places, these numbers are not equivalent.</p> <p id="id11715554">When you analyze data, it is important to be aware of <strong>sampling errors</strong> and nonsampling errors. 
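</p> <p>The with-replacement and without-replacement fractions compared earlier for the two populations (10,000 and 25) can be reproduced with a short Python sketch. The helper name <code>second_pick_chance</code> is hypothetical and introduced only for illustration; it assumes, as the text does, that one member of a particular sample has already been picked.</p>

```python
from fractions import Fraction


def second_pick_chance(population, sample_size, with_replacement):
    """Chance that the second pick is a different member of a particular sample."""
    remaining_in_sample = sample_size - 1
    # With replacement the first person goes back, so the pool keeps its size;
    # without replacement the pool shrinks by one.
    pool = population if with_replacement else population - 1
    return Fraction(remaining_in_sample, pool)


for population, sample_size in [(10_000, 1_000), (25, 10)]:
    with_r = second_pick_chance(population, sample_size, True)
    without_r = second_pick_chance(population, sample_size, False)
    # Carry both answers to four decimal places, as in the text.
    print(population, f"{float(with_r):.4f}", f"{float(without_r):.4f}")
```

<p>For the population of 10,000 both chances round to 0.0999, while for the population of 25 they are 0.3600 and 0.3750, which is why replacement matters only when the population is small.</p> <p>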
The actual process of sampling causes sampling errors. For example, the sample may not be large enough. Factors not related to the sampling process cause <strong>nonsampling errors</strong>. A defective counting device can cause a nonsampling error.</p> <p id="fs-idp101680848">In reality, a sample will never be exactly representative of the population so there will always be some sampling error. As a rule, the larger the sample, the smaller the sampling error.</p> <p id="fs-idp40043264">In statistics, <strong>a sampling bias</strong> is created when a sample is collected from a population and some members of the population are not as likely to be chosen as others (remember, each member of the population should have an equally likely chance of being chosen). When a sampling bias happens, there can be incorrect conclusions drawn about the population that is being studied.</p> <div id="eip-318" class="bc-section section" data-depth="2"><h4 data-type="title">Critical Evaluation</h4> <p id="eip-612">We need to evaluate the statistical studies we read about critically and analyze them before accepting the results of the studies. Common problems to be aware of include</p> <ul><li>Problems with samples: A sample must be representative of the population. A sample that is not representative of the population is biased. Biased samples that are not representative of the population give results that are inaccurate and not valid.</li> <li>Self-selected samples: Responses only by people who choose to respond, such as call-in surveys, are often unreliable.</li> <li>Sample size issues: Samples that are too small may be unreliable. Larger samples are better, if possible. In some situations, having small samples is unavoidable and can still be used to draw conclusions. 
Examples: crash testing cars or medical testing for rare conditions</li> <li>Undue influence: collecting data or asking questions in a way that influences the response</li> <li>Non-response or refusal of subject to participate: The collected responses may no longer be representative of the population. Often, people with strong positive or negative opinions may answer surveys, which can affect the results.</li> <li>Causality: A relationship between two variables does not mean that one causes the other to occur. They may be related (correlated) because of their relationship through a different variable.</li> <li>Self-funded or self-interest studies: A study performed by a person or organization in order to support their claim. Is the study impartial? Read the study carefully to evaluate the work. Do not automatically assume that the study is good, but do not automatically assume the study is bad either. Evaluate it on its merits and the work done.</li> <li>Misleading use of data: improperly displayed graphs, incomplete data, or lack of context</li> <li>Confounding: When the effects of multiple factors on a response cannot be separated. Confounding makes it difficult or impossible to draw valid conclusions about the effect of each factor.</li> </ul> </div> <div id="eip-302" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Exercise</div> <p id="id7878578">As a class, determine whether or not the following samples are representative. 
If they are not, discuss the reasons.</p> <ol id="element-785"><li>To find the average GPA of all students in a university, use all honor students at the university as the sample.</li> <li>To find out the most popular cereal among young people under the age of ten, stand outside a large supermarket for three hours and speak to every twentieth child under age ten who enters the supermarket.</li> <li>To find the average annual income of all adults in the United States, sample U.S. congressmen. Create a cluster sample by considering each state as a stratum (group). By using simple random sampling, select states to be part of the cluster. Then survey every U.S. congressman in the cluster.</li> <li>To determine the proportion of people taking public transportation to work, survey 20 people in New York City. Conduct the survey by sitting in Central Park on a bench and interviewing every person who sits next to you.</li> <li>To determine the average cost of a two-day stay in a hospital in Massachusetts, survey 100 hospitals across the state using simple random sampling.</li> </ol> </div> <div id="fs-idp39929584" class="textbox textbox--examples" data-type="example"><div id="fs-idp52654944" data-type="exercise"><div id="fs-idp44701008" data-type="problem"><p id="fs-idp47080192">A study is done to determine the average tuition that San Jose State undergraduate students pay per semester. Each student in the following samples is asked how much tuition he or she paid for the Fall semester. What is the type of sampling in each case?</p> <ol id="fs-idp94402832" type="a"><li>A sample of 100 undergraduate San Jose State students is taken by organizing the students’ names by classification (freshman, sophomore, junior, or senior), and then selecting 25 students from each.</li> <li>A random number generator is used to select a student from the alphabetical listing of all undergraduate students in the Fall semester. 
Starting with that student, every 50th student is chosen until 75 students are included in the sample.</li> <li>A completely random method is used to select 75 students. Each undergraduate student in the fall semester has the same probability of being chosen at any stage of the sampling process.</li> <li>The freshman, sophomore, junior, and senior years are numbered one, two, three, and four, respectively. A random number generator is used to pick two of those years. All students in those two years are in the sample.</li> <li>An administrative assistant is asked to stand in front of the library one Wednesday and to ask the first 100 undergraduate students he encounters what they paid for tuition the Fall semester. Those 100 students are the sample.</li> </ol> </div> <div id="fs-idp47713216" data-type="solution"><p id="fs-idp66558896">a. stratified; b. systematic; c. simple random; d. cluster; e. convenience</p> </div> </div> </div> <div id="fs-idm68463584" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <p id="eip-idm101249376">You are going to use the random number generator to generate different types of samples from the data.</p> <p id="fs-idm65546768">This table displays six sets of quiz scores (each quiz counts 10 points) for an elementary statistics class.</p> <table id="fs-idm49741168" summary=""><thead><tr><th>#1</th> <th>#2</th> <th>#3</th> <th>#4</th> <th>#5</th> <th>#6</th> </tr> </thead> <tbody><tr><td>5</td> <td>7</td> <td>10</td> <td>9</td> <td>8</td> <td>3</td> </tr> <tr><td>10</td> <td>5</td> <td>9</td> <td>8</td> <td>7</td> <td>6</td> </tr> <tr><td>9</td> <td>10</td> <td>8</td> <td>6</td> <td>7</td> <td>9</td> </tr> <tr><td>9</td> <td>10</td> <td>10</td> <td>9</td> <td>8</td> <td>9</td> </tr> <tr><td>7</td> <td>8</td> <td>9</td> <td>5</td> <td>7</td> <td>4</td> </tr> <tr><td>9</td> <td>9</td> <td>9</td> <td>10</td> <td>8</td> <td>7</td> </tr> <tr><td>7</td> <td>7</td> <td>10</td> 
<td>9</td> <td>8</td> <td>8</td> </tr> <tr><td>8</td> <td>8</td> <td>9</td> <td>10</td> <td>8</td> <td>8</td> </tr> <tr><td>9</td> <td>7</td> <td>8</td> <td>7</td> <td>7</td> <td>8</td> </tr> <tr><td>8</td> <td>8</td> <td>10</td> <td>9</td> <td>8</td> <td>7</td> </tr> </tbody> </table> <div id="fs-idm83079024" data-type="exercise"><div id="fs-idm133839952" data-type="problem"></div> </div> </div> <div class="textbox textbox--examples" data-type="example"><div id="element-3770" data-type="exercise"><div id="id1170185735110" data-type="problem"><p id="element-669">Determine the type of sampling used (simple random, stratified, systematic, cluster, or convenience).</p> <ol id="element-187" type="a"><li>A soccer coach selects six players from a group of boys aged eight to ten, seven players from a group of boys aged 11 to 12, and three players from a group of boys aged 13 to 14 to form a recreational soccer team.</li> <li>A pollster interviews all human resource personnel in five different high tech companies.</li> <li>A high school educational researcher interviews 50 high school female teachers and 50 high school male teachers.</li> <li>A medical researcher interviews every third cancer patient from a list of cancer patients at a local hospital.</li> <li>A high school counselor uses a computer to generate 50 random numbers and then picks students whose names correspond to the numbers.</li> <li>A student interviews classmates in his algebra class to determine how many pairs of jeans a student owns, on the average.</li> </ol> </div> <div id="id2242997" data-type="solution"><p id="exercise-solution-1">a. stratified; b. cluster; c. stratified; d. systematic; e. 
simple random; f. convenience</p> </div> </div> </div> <div id="fs-idp29434960" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-idm58698240" data-type="exercise"><div id="eip-idm40923952" data-type="problem"><p id="fs-idp164822480">Determine the type of sampling used (simple random, stratified, systematic, cluster, or convenience).</p> <p id="fs-idp86918064">A high school principal polls 50 freshmen, 50 sophomores, 50 juniors, and 50 seniors regarding policy changes for after school activities.</p> </div> </div> </div> <p id="id7645179">If we were to examine two samples representing the same population, even if we used random sampling methods for the samples, they would not be exactly the same. Just as there is variation in data, there is variation in samples. As you become accustomed to sampling, the variability will begin to seem natural.</p> <div id="element-575" class="textbox textbox--examples" data-type="example"><p>Suppose ABC College has 10,000 part-time students (the population). We are interested in the average amount of money a part-time student spends on books in the fall term. Asking all 10,000 students is an almost impossible task.</p> <p>Suppose we take two different samples.</p> <p id="element-499">First, we use convenience sampling and survey ten students from a first term organic chemistry class. Many of these students are taking first term calculus in addition to the organic chemistry class. 
The amount of money they spend on books is as follows:</p> <p id="element-25001"><span id="set-element-195" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">&nbsp;128&nbsp; </span><span data-type="item">87&nbsp; </span><span data-type="item">173&nbsp; </span><span data-type="item">116&nbsp; </span><span data-type="item">130&nbsp; </span><span data-type="item">204&nbsp; </span><span data-type="item">147&nbsp; </span><span data-type="item">189&nbsp; </span><span data-type="item">93&nbsp; </span><span data-type="item">153</span></span></p> <p id="element-849">The second sample is taken using a list of senior citizens who take P.E. classes and taking every fifth senior citizen on the list, for a total of ten senior citizens. They spend:</p> <p id="element-25002"><span id="set-element-865" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">50&nbsp; </span><span data-type="item">40&nbsp; </span><span data-type="item">36 </span><span data-type="item">&nbsp;15&nbsp; </span><span data-type="item">50&nbsp; </span><span data-type="item">100&nbsp; </span><span data-type="item">40&nbsp; </span><span data-type="item">53&nbsp; </span><span data-type="item">22&nbsp; &nbsp;</span><span data-type="item">22</span></span></p> <p>It is unlikely that any student is in both samples.</p> <div data-type="exercise"><div id="id1170185361067" data-type="problem"><p id="eip-idp1467504">a. Do you think that either of these samples is representative of (or is characteristic of) the entire 10,000 part-time student population?</p> </div> <div id="id2000336" data-type="solution"><p id="eip-idp24214256">a. No. The first sample probably consists of science-oriented students. Besides the chemistry course, some of them are also taking first-term calculus. Books for these classes tend to be expensive. Most of these students are, more than likely, paying more than the average part-time student for their books. 
The second sample is a group of senior citizens who are, more than likely, taking courses for health and interest. The amount of money they spend on books is probably much less than the average part-time student. Both samples are biased. Also, in both cases, not all students have a chance to be in either sample.</p> </div> </div> <div id="element-179" data-type="exercise"><div id="id1170187341164" data-type="problem"><p id="eip-idp141882144">b. Since these samples are not representative of the entire population, is it wise to use the results to describe the entire population?</p> </div> <div id="id2072958" data-type="solution"><p id="eip-idp142060688">b. No. For these samples, each member of the population did not have an equally likely chance of being chosen.</p> </div> </div> <p id="element-513">Now, suppose we take a third sample. We choose ten different part-time students from the disciplines of chemistry, math, English, psychology, sociology, history, nursing, physical education, art, and early childhood development. (We assume that these are the only disciplines in which part-time students at ABC College are enrolled and that an equal number of part-time students are enrolled in each of the disciplines.) Each student is chosen using simple random sampling. Using a calculator, random numbers are generated and a student from a particular discipline is selected if he or she has a corresponding number. 
The students spend the following amounts:</p> <p id="element-25003"><span id="set-element-651" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">&nbsp;180&nbsp; </span><span data-type="item">50&nbsp; </span><span data-type="item">150&nbsp; </span><span data-type="item">85&nbsp; </span><span data-type="item">260&nbsp; </span><span data-type="item">75&nbsp; </span><span data-type="item">180&nbsp; </span><span data-type="item">200&nbsp; </span><span data-type="item">200&nbsp; </span><span data-type="item">150</span></span></p> <div id="element-887" data-type="exercise"><div id="id1170187890896" data-type="problem"><p id="element-666">c. Is the sample biased?</p> </div> <div id="id1170187183972" data-type="solution"><p id="element-971">c. The sample is unbiased, but a larger sample would be recommended to increase the likelihood that the sample will be close to representative of the population. However, for a biased sampling technique, even a large sample runs the risk of not being representative of the population.</p> </div> </div> <p id="element-577">Students often ask if it is “good enough” to take a sample, instead of surveying the entire population. If the survey is done well, the answer is yes.</p> </div> <div id="fs-idp1075328" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-idm70522384" data-type="exercise"><div id="eip-idm52377904" data-type="problem"><p id="fs-idp35687664">A local radio station has a fan base of 20,000 listeners. The station wants to know if its audience would prefer more music or more talk shows. Asking all 20,000 listeners is an almost impossible task.</p> <p id="fs-idp43318816">The station uses convenience sampling and surveys the first 200 people they meet at one of the station’s music concert events. 
Twenty-four people said they’d prefer more talk shows, and 176 people said they’d prefer more music.</p> <p id="fs-idp24594000">Do you think that this sample is representative of (or is characteristic of) the entire 20,000 listener population?</p> </div> </div> </div> </div> <div class="bc-section section" data-depth="1"><h3 data-type="title">Variation in Data</h3> <p id="id8219993"><span data-type="term">Variation</span> is present in any set of data. For example, 16-ounce cans of beverage may contain more or less than 16 ounces of liquid. In one study, eight 16-ounce cans were measured and produced the following amounts (in ounces) of beverage:</p> <p id="element-25004"><span id="set-element-984" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">15.8&nbsp; </span><span data-type="item">16.1&nbsp; </span><span data-type="item">15.2&nbsp; </span><span data-type="item">14.8&nbsp; </span><span data-type="item">15.8&nbsp; </span><span data-type="item">15.9&nbsp; </span><span data-type="item">16.0&nbsp; </span><span data-type="item">15.5</span></span></p> <p id="id6856137">Measurements of the amount of beverage in a 16-ounce can may vary because different people make the measurements or because the exact amount, 16 ounces of liquid, was not put into the cans. Manufacturers regularly run tests to determine if the amount of beverage in a 16-ounce can falls within the desired range.</p> <p id="id5172174">Be aware that as you take data, your data may vary somewhat from the data someone else is taking for the same purpose. This is completely natural. 
However, if two or more of you are taking the same data and get very different results, it is time for you and the others to reevaluate your data-taking methods and your accuracy.</p> </div> <div id="eip-735" class="bc-section section" data-depth="1"><h3 data-type="title">Variation in Samples</h3> <p id="id11414550">It was mentioned previously that two or more <span data-type="term">samples</span> from the same <span data-type="term">population</span>, taken randomly, and having close to the same characteristics of the population will likely be different from each other. Suppose Doreen and Jung both decide to study the average amount of time students at their college sleep each night. Doreen and Jung each take samples of 500 students. Doreen uses systematic sampling and Jung uses cluster sampling. Doreen’s sample will be different from Jung’s sample. Even if Doreen and Jung used the same sampling method, in all likelihood their samples would be different. Neither would be wrong, however.</p> <p id="id11414555">Think about what contributes to making Doreen’s and Jung’s samples different.</p> <p id="id10715475">If Doreen and Jung took larger samples (i.e. the number of data values is increased), their sample results (the average amount of time a student sleeps) might be closer to the actual population average. But still, their samples would be, in all likelihood, different from each other. This <strong>variability in samples</strong> cannot be stressed enough.</p> <div id="id-191351579127" class="bc-section section" data-depth="2"><h4 data-type="title">Size of a Sample</h4> <p id="id10291271">The size of a sample (often called the number of observations) is important. The examples you have seen in this book so far have been small. Samples of only a few hundred observations, or even smaller, are sufficient for many purposes. 
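</p> <p>To see how sample size interacts with the sample-to-sample variation described for Doreen and Jung, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the population of 10,000 students, the normally distributed sleep times (mean 7.0 hours, standard deviation 1.2 hours), the seed, and the helper name <code>sample_mean</code>. It uses simple random sampling for both samples, not the systematic and cluster methods mentioned in the text.</p>

```python
import random
import statistics


def sample_mean(population, n, rng):
    """Mean of a simple random sample of size n drawn without replacement."""
    return statistics.mean(rng.sample(population, n))


rng = random.Random(42)

# Hypothetical population: nightly sleep (in hours) for 10,000 students.
population = [rng.gauss(7.0, 1.2) for _ in range(10_000)]
true_mean = statistics.mean(population)

for n in (20, 500, 5_000):
    # Two independent samples of the same size, one per researcher.
    first = sample_mean(population, n, rng)
    second = sample_mean(population, n, rng)
    print(n, round(first, 2), round(second, 2), round(true_mean, 2))
```

<p>In runs of this kind the two sample means typically differ from each other at every size, but they tend to sit closer together and closer to the population mean as the sample size grows.</p> <p>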
In polling, samples that are from 1,200 to 1,500 observations are considered large enough and good enough if the survey is random and is well done. You will learn why when you study confidence intervals.</p> <p id="eip-idp48192112">Be aware that many large samples are biased. For example, call-in surveys are invariably biased, because people choose to respond or not.</p> <div id="fs-idm85528320" class="statistics collab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Exercise</div> <p id="fs-idp60280064">Divide into groups of two, three, or four. Your instructor will give each group one six-sided die. Try this experiment twice. Roll one fair die (six-sided) 20 times. Record the number of ones, twos, threes, fours, fives, and sixes you get in <a class="autogenerated-content" href="#element-497" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#element-497">(Figure)</a> and <a class="autogenerated-content" href="#fs-idm76511248" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#fs-idm76511248">(Figure)</a> (“frequency” is the number of times a particular face of the die occurs):</p> <table id="element-497" summary="This table provides a blank template for recording the results of an experimental trial involving the roll of a die. The first column contains the values 1 through 6, representing the possible outcomes of a single throw of a die. 
The second column is to be used by the student to tally the number of times the die lands on that value during the experiment."><caption><span data-type="title">First Experiment (20 rolls)</span></caption> <thead><tr><th>Face on Die</th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>1</td> <td></td> </tr> <tr><td>2</td> <td></td> </tr> <tr><td>3</td> <td></td> </tr> <tr><td>4</td> <td></td> </tr> <tr><td>5</td> <td></td> </tr> <tr><td>6</td> <td></td> </tr> </tbody> </table> <table id="fs-idm76511248" summary="A duplicate of the previous table, this table provides a blank template for recording the results of an experimental trial involving the roll of a die. The first column contains the values 1 through 6, representing the possible outcomes of a single throw of a die. The second column is to be used by the student to tally the number of times the die lands on that value during the experiment."><caption><span data-type="title">Second Experiment (20 rolls)</span></caption> <thead><tr><th>Face on Die</th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>1</td> <td></td> </tr> <tr><td>2</td> <td></td> </tr> <tr><td>3</td> <td></td> </tr> <tr><td>4</td> <td></td> </tr> <tr><td>5</td> <td></td> </tr> <tr><td>6</td> <td></td> </tr> </tbody> </table> <p id="element-663">Did the two experiments have the same results? Probably not. If you did the experiment a third time, do you expect the results to be identical to the first or second experiment? Why or why not?</p> <p id="element-440">Which experiment had the correct results? They both did. The job of the statistician is to see through the variability and draw appropriate conclusions.</p> </div> </div> </div> <div id="eip-787" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="eip-idp72168976">Gallup-Healthways Well-Being Index. http://www.well-beingindex.com/default.asp (accessed May 1, 2013).</p> <p id="eip-idp7078560">Gallup-Healthways Well-Being Index. 
http://www.well-beingindex.com/methodology.asp (accessed May 1, 2013).</p> <p id="eip-idp35327680">Gallup-Healthways Well-Being Index. http://www.gallup.com/poll/146822/gallup-healthways-index-questions.aspx (accessed May 1, 2013).</p> <p id="eip-idp35328208">Data from http://www.bookofodds.com/Relationships-Society/Articles/A0374-How-George-Gallup-Picked-the-President</p> <p id="eip-idm46671824">Dominic Lusinchi, “‘President’ Landon and the 1936 <em data-effect="italics">Literary Digest</em> Poll: Were Automobile and Telephone Owners to Blame?” Social Science History 36, no. 1: 23–54 (2012), http://ssh.dukejournals.org/content/36/1/23.abstract (accessed May 1, 2013).</p> <p id="eip-idp72182368">“The Literary Digest Poll,” Virtual Laboratories in Probability and Statistics http://www.math.uah.edu/stat/data/LiteraryDigest.html (accessed May 1, 2013).</p> <p id="eip-idp2773216">“Gallup Presidential Election Trial-Heat Trends, 1936–2008,” Gallup Politics http://www.gallup.com/poll/110548/gallup-presidential-election-trialheat-trends-19362004.aspx#4 (accessed May 1, 2013).</p> <p id="eip-idm55424592">The Data and Story Library, http://lib.stat.cmu.edu/DASL/Datafiles/USCrime.html (accessed May 1, 2013).</p> <p id="eip-idm81943888">LBCC Distance Learning (DL) program data in 2010-2011, http://de.lbcc.edu/reports/2010-11/future/highlights.html#focus (accessed May 1, 2013).</p> <p id="eip-idm81943344">Data from San Jose Mercury News</p> </div> <div id="fs-idm44463152" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm39216544">Data are individual items of information that come from a population or sample. Data may be classified as qualitative (categorical), quantitative continuous, or quantitative discrete.</p> <p>Because it is not practical to measure the entire population in a study, researchers use samples to represent the population. 
A random sample is a representative group from the population chosen by using a method that gives each individual in the population an equal chance of being included in the sample. Random sampling methods include simple random sampling, stratified sampling, cluster sampling, and systematic sampling. Convenience sampling is a nonrandom method of choosing a sample that often produces biased data.</p> <p>Samples that contain different individuals result in different data. This is true even when the samples are well-chosen and representative of the population. When properly selected, larger samples model the population more closely than smaller samples. There are many different potential problems that can affect the reliability of a sample. Statistical data needs to be critically analyzed, not simply accepted.</p> </div> <div id="fs-idm55283840" class="practice" data-depth="1"><h3 data-type="title">Practice</h3> <div id="eip-455" data-type="exercise"><div id="fs-idm60065488" data-type="problem"><p id="fs-idm78617856">“Number of times per week” is what type of data?</p> <p id="eip-idm49300272"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">a. qualitative (categorical)<br /> </span><span data-type="item">b. quantitative discrete<br /> </span><span data-type="item">c. quantitative continuous</span></span></p> </div> <p>b</p> </div> <p id="fs-idm42775552"><em data-effect="italics">Use the following information to answer the next four exercises:</em> A study was done to determine the age, number of times per week, and the duration (amount of time) of residents using a local park in San Antonio, Texas. 
The first house in the neighborhood around the park was selected randomly, and then the resident of every eighth house in the neighborhood around the park was interviewed.</p> <div id="fs-idm67132096" data-type="exercise"><div id="fs-idm60885840" data-type="problem"><p id="fs-idm58359376">The sampling method was</p> <p id="fs-idm13619408"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">a. simple random<br /> </span><span data-type="item">b. systematic<br /> </span><span data-type="item">c. stratified<br /> </span><span data-type="item">d. cluster</span></span></p> </div> <div id="fs-idm20056272" data-type="solution"><p id="fs-idm36580912">b</p> </div> </div> <div id="fs-idm13982640" data-type="exercise"><div id="fs-idm80593456" data-type="problem"><p id="fs-idm3496784">“Duration (amount of time)” is what type of data?</p> <p id="fs-idm13028176"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">a. qualitative (categorical)<br /> </span><span data-type="item">b. quantitative discrete<br /> </span><span data-type="item">c. quantitative continuous</span></span></p> </div> <p>c</p> </div> <div id="fs-idm63417136" data-type="exercise"><div id="fs-idm58900112" data-type="problem"><p id="fs-idm37392208">The colors of the houses around the park are what kind of data?</p> <p id="fs-idm14485248"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">a. qualitative (categorical)<br /> </span><span data-type="item">b. quantitative discrete<br /> </span><span data-type="item">c. 
quantitative continuous</span></span></p> </div> <div id="fs-idm39895808" data-type="solution"><p id="fs-idm55101488">a</p> </div> </div> <div id="fs-idm77797872" data-type="exercise"><div id="fs-idm71703824" data-type="problem"><p id="fs-idm36831184">The population is ______________________</p> </div> </div> <div id="fs-idp31767488" data-type="exercise"><div id="fs-idm56980144" data-type="problem"><p id="fs-idm65646832"><a class="autogenerated-content" href="#fs-idm39599824" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#fs-idm39599824">(Figure)</a> contains the total number of deaths worldwide as a result of earthquakes from 2000 to 2012.</p> <table id="fs-idm39599824" summary=""><thead><tr><th>Year</th> <th>Total Number of Deaths</th> </tr> </thead> <tbody><tr><td>2000</td> <td>231</td> </tr> <tr><td>2001</td> <td>21,357</td> </tr> <tr><td>2002</td> <td>11,685</td> </tr> <tr><td>2003</td> <td>33,819</td> </tr> <tr><td>2004</td> <td>228,802</td> </tr> <tr><td>2005</td> <td>88,003</td> </tr> <tr><td>2006</td> <td>6,605</td> </tr> <tr><td>2007</td> <td>712</td> </tr> <tr><td>2008</td> <td>88,011</td> </tr> <tr><td>2009</td> <td>1,790</td> </tr> <tr><td>2010</td> <td>320,120</td> </tr> <tr><td>2011</td> <td>21,953</td> </tr> <tr><td>2012</td> <td>768</td> </tr> <tr><td><strong>Total</strong></td> <td><strong>823,856</strong></td> </tr> </tbody> </table> <p id="fs-idm35478848">Use <a class="autogenerated-content" href="#fs-idm39599824" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#fs-idm39599824">(Figure)</a> to answer the following questions.</p> <ol id="fs-idm39651536" type="a"><li>What is the proportion of deaths between 2007 and 2012?</li> <li>What percent of deaths occurred before 2001?</li> <li>What is the percent of deaths that occurred in 2003 or after 2010?</li> <li>What is the fraction of deaths that happened before 
2012?</li> <li>What kind of data is the number of deaths?</li> <li>Earthquakes are quantified according to the amount of energy they produce (examples are 2.1, 5.0, 6.7). What type of data is that?</li> <li>What contributed to the large number of deaths in 2010? In 2004? Explain.</li> </ol> </div> <div id="fs-idm60571216" data-type="solution"><ol id="fs-idm80641104" type="a"><li>0.5242</li> <li>0.03%</li> <li>6.86%</li> <li>\(\frac{823,088}{823,856}\)</li> <li>quantitative discrete</li> <li>quantitative continuous</li> <li>The 2004 total reflects the Indian Ocean earthquake, which produced a massive tsunami; the 2010 total reflects the earthquake that struck Haiti.</li> </ol> </div> </div> <p id="eip-230"><em data-effect="italics">For the following four exercises, determine the type of sampling used (simple random, stratified, systematic, cluster, or convenience).</em></p> <div id="eip-310" data-type="exercise"><div id="fs-idm47756144" data-type="problem"><p id="fs-idm39414608">A group of test subjects is divided into twelve groups; then four of the groups are chosen at random.</p> </div> </div> <div id="eip-537" data-type="exercise"><div id="fs-idm55711296" data-type="problem"><p id="fs-idm74741424">A market researcher polls every tenth person who walks into a store.</p> </div> <div id="fs-idm54803760" data-type="solution"><p id="fs-idm63960800">systematic</p> </div> </div> <div id="eip-916" data-type="exercise"><div id="fs-idm44583104" data-type="problem"><p id="fs-idm42533376">The first 50 people who walk into a sporting event are polled on their television preferences.</p> </div> </div> <div id="eip-403" data-type="exercise"><div id="fs-idm44571536" data-type="problem"><p id="fs-idp13288656">A computer generates 100 random numbers, and 100 people whose names correspond with the numbers on the list are chosen.</p> </div> <div id="fs-idp166299360" data-type="solution"><p id="fs-idp38717600">simple random</p> </div> </div> <p id="eip-604"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer 
the next seven exercises:</em> Studies are often done by pharmaceutical companies to determine the effectiveness of a treatment program. Suppose that a new AIDS antibody drug is currently under study. It is given to patients once the AIDS symptoms have revealed themselves. Of interest is the average (mean) length of time in months patients live once starting the treatment. Two researchers each follow a different set of 40 AIDS patients from the start of treatment until their deaths. The following data (in months) are collected. <span data-type="newline"><br /> </span></p> <p><strong>Researcher A:</strong> 3; 4; 11; 15; 16; 17; 22; 44; 37; 16; 14; 24; 25; 15; 26; 27; 33; 29; 35; 44; 13; 21; 22; 10; 12; 8; 40; 32; 26; 27; 31; 34; 29; 17; 8; 24; 18; 47; 33; 34</p> <p id="eip-936"><strong>Researcher B:</strong> 3; 14; 11; 5; 16; 17; 28; 41; 31; 18; 14; 14; 26; 25; 21; 22; 31; 2; 35; 44; 23; 21; 21; 16; 12; 18; 41; 22; 16; 25; 33; 34; 29; 13; 18; 24; 23; 42; 33; 29</p> <div id="eip-641" data-type="exercise"><div id="eip-523" data-type="problem"><p id="eip-233">Complete the tables using the data provided:</p> <table id="id6060586" summary="This table provides a blank template for calculating the results of a study using the data set provided. 
For each survival length range provided in the first column, students are to calculate and write down the frequency (second column), relative frequency (third column), and cumulative relative frequency (fourth column)."><caption><span data-type="title">Researcher A</span></caption> <colgroup><col /> <col data-width="4*" /> <col data-width="4*" /> <col data-width="4*" /></colgroup> <thead><tr><th>Survival Length (in months)</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td><span class="normal" data-type="emphasis" data-effect="normal">0.5–6.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">6.5–12.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">12.5–18.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">18.5–24.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">24.5–30.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">30.5–36.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">36.5–42.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">42.5–48.5</span></td> <td></td> <td></td> <td></td> </tr> </tbody> </table> <table id="id6990964" summary="A duplicate of the previous table, this table provides a blank template for calculating the results of a study using the data set provided. 
For each survival length range provided in the first column, students are to calculate and write down the frequency (second column), relative frequency (third column), and cumulative relative frequency (fourth column)."><caption><span data-type="title">Researcher B</span></caption> <colgroup><col data-width="2*" /> <col data-width="1*" /> <col data-width="2*" /> <col data-width="2*" /></colgroup> <thead><tr><th>Survival Length (in months)</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td><span class="normal" data-type="emphasis" data-effect="normal">0.5–6.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">6.5–12.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">12.5–18.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">18.5–24.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">24.5–30.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">30.5–36.5</span></td> <td></td> <td></td> <td></td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">36.5–45.5</span></td> <td></td> <td></td> <td></td> </tr> </tbody> </table> </div> </div> <div id="eip-885" data-type="exercise"><div id="id14667071" data-type="problem"><p id="eip-idp45627600">Determine what the key term <em data-effect="italics">data</em> refers to in the above example for Researcher A.</p> </div> <div id="eip-126" data-type="solution"><p id="eip-259">values for <em data-effect="italics">X</em>, such as 3, 4, 11, and so on</p> </div> </div> <div id="eip-812" data-type="exercise"><div id="id14670683" data-type="problem"><p id="prob_7">List two reasons why the data may differ.</p> </div> <p>Answers will 
vary. Sample answer: One reason may be a difference in the average age of the individuals in the two samples. Or perhaps the drug affects men and women differently; if the ratio of men to women isn’t the same in both sample groups, then the data would differ.</p> </div> <div data-type="exercise"><div id="id14670710" data-type="problem"><p id="prob_8">Can you tell if one researcher is correct and the other one is incorrect? Why?</p> </div> <div id="eip-337" data-type="solution"><p id="eip-33">No, we do not have enough information to make such a claim.</p> </div> </div> <div id="eip-322" data-type="exercise"><div id="id14670739" data-type="problem"><p id="prob_9">Would you expect the data to be identical? Why or why not?</p> </div> <p>No. Although both researchers studied the same treatment, they followed different sets of patients, so some variation between the two data sets is expected.</p> </div> <div id="eip-471" data-type="exercise"><div id="id14670767" data-type="problem"><p id="prob_10">Suggest at least two methods the researchers might use to gather random data.</p> </div> <div id="eip-683" data-type="solution"><p>Take a simple random sample from each group. One way is by assigning a number to each patient and using a random number generator to randomly select patients.</p> </div> </div> <div id="eip-581" data-type="exercise"><div id="eip-idm59020688" data-type="problem"><p id="prob_11">Suppose that the first researcher conducted his survey by randomly choosing one state in the nation and then randomly picking 40 patients from that state. What sampling method would that researcher have used?</p> </div> <p>He has used a simple random sample method.</p> </div> <div id="eip-183" data-type="exercise"><div id="id14670826" data-type="problem"><p id="prob_12">Suppose that the second researcher conducted his survey by choosing 40 patients he knew. What sampling method would that researcher have used? 
What concerns would you have about this data set, based upon the data collection method?</p> </div> <div id="eip-247" data-type="solution"><p id="eip-944">This would be convenience sampling and is not random.</p> </div> </div> <p id="eip-idm417504"><em data-effect="italics">Use the following data to answer the next five exercises:</em> Two researchers are gathering data on hours of video games played by school-aged children and young adults. They each randomly sample different groups of 150 students from the same school. They collect the following data.</p> <table id="fs-idp67465568" summary="Researcher A Table"><caption><span data-type="title">Researcher A</span></caption> <thead><tr><th>Hours Played per Week</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td>0–2</td> <td>26</td> <td>0.17</td> <td>0.17</td> </tr> <tr><td>2–4</td> <td>30</td> <td>0.20</td> <td>0.37</td> </tr> <tr><td>4–6</td> <td>49</td> <td>0.33</td> <td>0.70</td> </tr> <tr><td>6–8</td> <td>25</td> <td>0.17</td> <td>0.87</td> </tr> <tr><td>8–10</td> <td>12</td> <td>0.08</td> <td>0.95</td> </tr> <tr><td>10–12</td> <td>8</td> <td>0.05</td> <td>1</td> </tr> </tbody> </table> <table id="fs-idm16196000" summary="Researcher B Table"><caption><span data-type="title">Researcher B</span></caption> <thead><tr><th>Hours Played per Week</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td>0–2</td> <td>48</td> <td>0.32</td> <td>0.32</td> </tr> <tr><td>2–4</td> <td>51</td> <td>0.34</td> <td>0.66</td> </tr> <tr><td>4–6</td> <td>24</td> <td>0.16</td> <td>0.82</td> </tr> <tr><td>6–8</td> <td>12</td> <td>0.08</td> <td>0.90</td> </tr> <tr><td>8–10</td> <td>11</td> <td>0.07</td> <td>0.97</td> </tr> <tr><td>10–12</td> <td>4</td> <td>0.03</td> <td>1</td> </tr> </tbody> </table> <div id="eip-695" data-type="exercise"><div id="fs-idm18010208" data-type="problem"><p 
id="fs-idp17780256">Give a reason why the data may differ.</p> <p>The researchers are studying different groups, so there will be some variation in the data.</p> </div> </div> <div id="eip-381" data-type="exercise"><div id="fs-idp50411184" data-type="problem"><p id="fs-idm5625760">Would the sample size be large enough if the population is the students in the school?</p> </div> <div id="fs-idm5086640" data-type="solution"><p id="fs-idp44993584">Yes, the sample size of 150 would be large enough to reflect a population of one school.</p> </div> </div> <div id="eip-680" data-type="exercise"><div id="fs-idp13795744" data-type="problem"><p id="fs-idm14907952">Would the sample size be large enough if the population is school-aged children and young adults in the United States?</p> </div> <p>There are many school-aged children and young adults in the United States, and the study was done at only one school, so the sample size is not large enough to reflect the population.</p> </div> <div data-type="exercise"><div id="fs-idm17330624" data-type="problem"><p id="fs-idm16129040">Researcher A concludes that most students play video games between four and six hours each week. Researcher B concludes that most students play video games between two and four hours each week. Who is correct?</p> </div> <div id="fs-idp17911216" data-type="solution"><p id="fs-idp15262096">Even though the specific data support each researcher’s conclusions, the different results suggest that more data need to be collected before the researchers can reach a conclusion.</p> </div> </div> <div id="eip-528" data-type="exercise"><div id="fs-idp19886544" data-type="problem"><p id="fs-idm46843168">As part of a way to reward students for participating in the survey, the researchers gave each student a gift card to a video game store. 
Would this affect the data if students knew about the award before the study?</p> </div> <p>Yes, people who play games more might be more likely to participate, since they would want the gift card more than a student who does not play video games. This would leave out many students who do not play games at all and skew the data.</p> </div> <p id="fs-idm7660016"><em data-effect="italics">Use the following data to answer the next five exercises:</em> A pair of studies was performed to measure the effectiveness of a new software program designed to help stroke patients regain their problem-solving skills. Patients were asked to use the software program twice a day, once in the morning and once in the evening. The studies observed 200 stroke patients recovering over a period of several weeks. The first study collected the data in <a class="autogenerated-content" href="#fs-idp19719008" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#fs-idp19719008">(Figure)</a>. 
The second study collected the data in <a class="autogenerated-content" href="#fs-idp79550096" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#fs-idp79550096">(Figure)</a>.</p> <table id="fs-idp19719008" summary=""><thead><tr><th>Group</th> <th>Showed improvement</th> <th>No improvement</th> <th>Deterioration</th> </tr> </thead> <tbody><tr><td>Used program</td> <td>142</td> <td>43</td> <td>15</td> </tr> <tr><td>Did not use program</td> <td>72</td> <td>110</td> <td>18</td> </tr> </tbody> </table> <table id="fs-idp79550096" summary=""><thead><tr><th>Group</th> <th>Showed improvement</th> <th>No improvement</th> <th>Deterioration</th> </tr> </thead> <tbody><tr><td>Used program</td> <td>105</td> <td>74</td> <td>19</td> </tr> <tr><td>Did not use program</td> <td>89</td> <td>99</td> <td>12</td> </tr> </tbody> </table> <div id="eip-370" data-type="exercise"><div id="fs-idm3080944" data-type="problem"><p id="fs-idp24352912">Given what you know, which study is correct?</p> </div> <div id="fs-idm63151328" data-type="solution"><p id="fs-idp21500080">There is not enough information given to judge if either one is correct or incorrect.</p> </div> </div> <div id="eip-78" data-type="exercise"><div id="fs-idp20475520" data-type="problem"><p id="fs-idp18565968">The first study was performed by the company that designed the software program. The second study was performed by the American Medical Association. Which study is more reliable?</p> </div> <p>The second study is more reliable, because the company would be interested in showing results that favored a higher rate of improvement from patients using their software. 
The data may be skewed; however, the American Medical Association is not concerned with the success of the software and so should be objective.</p> </div> <div id="eip-297" data-type="exercise"><div id="fs-idm1829168" data-type="problem"><p id="fs-idm49249536">Both groups that performed the study concluded that the software works. Is this accurate?</p> </div> <div id="fs-idp15790288" data-type="solution"><p id="fs-idm13661152">The software program seems to work because the second study shows that more patients improve while using the software than not. Even though the difference is not as large as that in the first study, the results from the second study are likely more reliable and still show improvement.</p> </div> </div> <div data-type="exercise"><div id="fs-idp16634960" data-type="problem"><p id="fs-idp1933072">The company takes the two studies as proof that their software causes mental improvement in stroke patients. Is this a fair statement?</p> </div> <p>No, the data suggest the two are correlated, but more studies need to be done to prove that using the software causes improvement in stroke patients.</p> </div> <div id="eip-668" data-type="exercise"><div id="fs-idm7567104" data-type="problem"><p id="fs-idp50471984">Patients who used the software were also a part of an exercise program whereas patients who did not use the software were not. Does this change the validity of the conclusions from <a class="autogenerated-content" href="#eip-297" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#eip-297">(Figure)</a>?</p> </div> <div id="fs-idp2741616" data-type="solution"><p id="fs-idp20270432">Yes, because we cannot tell if the improvement was due to the software or the exercise; the data is confounded, and a reliable conclusion cannot be drawn. 
New studies should be performed.</p> </div> </div> <div id="eip-908" data-type="exercise"><div id="fs-idp16509920" data-type="problem"><p id="fs-idp24413168">Is a sample size of 1,000 a reliable measure for a population of 5,000?</p> </div> <p>Yes, 1,000 represents 20% of the population and should be representative if the sample is chosen at random.</p> </div> <div id="eip-254" data-type="exercise"><div id="fs-idm2561856" data-type="problem"><p id="fs-idm9636784">Is a sample of 500 volunteers a reliable measure for a population of 2,500?</p> </div> <div id="fs-idm14441984" data-type="solution"><p id="fs-idp401392">No, even though the sample is large enough, the fact that the sample consists of volunteers makes it a self-selected sample, which is not reliable.</p> </div> </div> <div id="eip-831" data-type="exercise"><div id="fs-idp18528704" data-type="problem"><p id="fs-idp23008144">A question on a survey reads: “Do you prefer the delicious taste of Brand X or the taste of Brand Y?” Is this a fair question?</p> </div> <p>No, the question is creating undue influence by adding the word “delicious” to describe Brand X. The wording may influence responses.</p> </div> <div id="eip-121" data-type="exercise"><div id="fs-idp12191632" data-type="problem"><p id="fs-idm68825152">Is a sample size of two representative of a population of five?</p> </div> <div id="fs-idp11584704" data-type="solution"><p id="fs-idp66162848">No, even though the sample is a large portion of the population, two responses are not enough to justify any conclusions. 
Because the population is so small, it would be better to include everyone in the population to get the most accurate data.</p> </div> </div> <div id="eip-17" data-type="exercise"><div id="fs-idm10416592" data-type="problem"><p id="fs-idp8901520">Is it possible for two experiments to be well run with similar sample sizes to get different data?</p> </div> <p>Yes, there will most likely be a degree of variation between any two studies, even if they are set up and run the same way. Each study may be affected differently by unknown factors such as location, mood of the subjects, or time of year.</p> </div> </div> <div id="fs-idm36629824" class="free-response" data-depth="1"><h3 data-type="title">HOMEWORK</h3> <p id="eip-idp129446736"><em data-effect="italics">For the following exercises, identify the type of data that would be used to describe a response (quantitative discrete, quantitative continuous, or qualitative), and give an example of the data.</em></p> <div id="element-842" data-type="exercise"><div id="id30102618" data-type="problem"><p id="eip-idp111230112">1) number of tickets sold to a concert</p> <p>&nbsp;</p> </div> </div> <div id="eip-869" data-type="exercise"><div id="eip-207" data-type="problem"><p id="eip-idp24757632">2) percent of body fat</p> <p>&nbsp;</p> </div> </div> <div id="eip-146" data-type="exercise"><div id="eip-867" data-type="problem"><p id="eip-idp8956160">3) favorite baseball team</p> <p>&nbsp;</p> </div> </div> <div id="eip-572" data-type="exercise"><div id="eip-677" data-type="problem"><p id="eip-idm111997008">4) time in line to buy groceries</p> <p>&nbsp;</p> </div> </div> <div id="eip-601" data-type="exercise"><div id="eip-715" data-type="problem"><p id="eip-idp122567424">5) number of students enrolled at Evergreen Valley College</p> <p>&nbsp;</p> </div> </div> <div id="eip-22" data-type="exercise"><div id="eip-994" data-type="problem"><p id="eip-idm129992992">6) most-watched television show</p> <p>&nbsp;</p> </div> </div> <div 
id="eip-279" data-type="exercise"><div id="eip-825" data-type="problem"><p id="eip-idm58234496">7) brand of toothpaste</p> <p>&nbsp;</p> </div> <div id="eip-327" data-type="solution"><p id="eip-idm127187888"><span style="font-size: 1em">8) distance to the closest movie theatre</span></p> <p>&nbsp;</p> </div> </div> <div id="eip-418" data-type="exercise"><div id="eip-513" data-type="problem"><p id="eip-idm46593472">9) age of executives in Fortune 500 companies</p> <p>&nbsp;</p> </div> <div id="eip-86" data-type="solution"><p id="eip-idp15293808"><span style="font-size: 1em">10) number of competing computer spreadsheet software packages</span></p> <p>&nbsp;</p> </div> </div> <p id="fs-idm105632384"><em data-effect="italics">Use the following information to answer the next two exercises:</em> A study was done to determine the age, number of times per week, and the duration (amount of time) of resident use of a local park in San Jose. The first house in the neighborhood around the park was selected randomly and then every 8th house in the neighborhood around the park was interviewed.</p> <div id="element-636" data-type="exercise"><div id="id30104871" data-type="problem"><p>11) “Number of times per week” is what type of data?</p> <ol id="element-661" type="a" data-mark-suffix="."><li>qualitative</li> <li>quantitative discrete</li> <li>quantitative continuous</li> </ol> <p>&nbsp;</p> </div> </div> <div id="fs-idm58778016" data-type="exercise"><div id="fs-idm116555680" data-type="problem"><p id="fs-idm22393792">12) “Duration (amount of time)” is what type of data?</p> <ol id="fs-idm18544672" type="a" data-mark-suffix="."><li>qualitative</li> <li>quantitative discrete</li> <li>quantitative continuous</li> </ol> <p>&nbsp;</p> </div> </div> <div id="eip-772" data-type="exercise"><div id="id30201732" data-type="problem"><p id="element-563">13) Airline companies are interested in the consistency of the number of babies on each flight, so that they have adequate safety 
equipment. Suppose an airline conducts a survey. Over Thanksgiving weekend, it surveys six flights from Boston to Salt Lake City to determine the number of babies on the flights. It determines the amount of safety equipment needed by the result of that study.</p> <ol id="element-797" type="a" data-mark-suffix="."><li>Using complete sentences, list three things wrong with the way the survey was conducted.</li> <li>Using complete sentences, list three ways that you would improve the survey if it were to be repeated.</li> </ol> <p>&nbsp;</p> </div> <div id="eip-746" data-type="solution"></div> </div> <div id="eip-700" data-type="exercise"><div id="id30201811" data-type="problem"><p id="element-881">14) Suppose you want to determine the mean number of students per statistics class in your state. Describe a possible sampling method in three to five complete sentences. Make the description detailed.</p> <p>&nbsp;</p> </div> </div> <div id="eip-155" data-type="exercise"><div id="id30201846" data-type="problem"><p id="element-176">15) Suppose you want to determine the mean number of cans of soda drunk each month by students in their twenties at your school. Describe a possible sampling method in three to five complete sentences. 
Make the description detailed.</p> </div> <div id="eip-359" data-type="solution"><p id="eip-611"></p></div> </div> <div id="eip-255" data-type="exercise"><div id="id30103600" data-type="problem"><p id="eip-idp43814448">16) List some practical difficulties involved in getting accurate results from a telephone survey.</p> <p>&nbsp;</p> </div> </div> <div id="eip-738" data-type="exercise"><div id="eip-104" data-type="problem"><p id="eip-idp1979008">17) List some practical difficulties involved in getting accurate results from a mailed survey.</p> <p>&nbsp;</p> </div> </div> <div id="eip-608" data-type="exercise"><div id="eip-433" data-type="problem"><p id="eip-idp194071008">18) With your classmates, brainstorm some ways you could overcome these problems if you needed to conduct a phone or mail survey.</p> <p>&nbsp;</p> </div> </div> <div id="eip-818" data-type="exercise"><div id="id30103983" data-type="problem"><p id="element-27">19) The instructor takes her sample by gathering data on five randomly selected students from each Lake Tahoe Community College math class. The type of sampling she used is</p> <ol type="a" data-mark-suffix="."><li>cluster sampling</li> <li>stratified sampling</li> <li>simple random sampling</li> <li>convenience sampling</li> </ol> <p>&nbsp;</p> </div> </div> <div id="eip-40" data-type="exercise"><div id="eip-idp39664016" data-type="problem"><p id="eip-idp39664272">20) A study was done to determine the age, number of times per week, and the duration (amount of time) of residents using a local park in San Jose. The first house in the neighborhood around the park was selected randomly and then every eighth house in the neighborhood around the park was interviewed. 
The sampling method was:</p> <ol id="element-503" type="a"><li>simple random</li> <li>systematic</li> <li>stratified</li> <li>cluster</li> </ol> <p>&nbsp;</p> </div> </div> <div id="eip-720" data-type="exercise"><div id="eip-630a" data-type="problem"><p id="eip-312a">21) Name the sampling method used in each of the following situations:</p> <ol id="eip-id1172345503325" type="a" data-mark-suffix="."><li>A woman in the airport is handing out questionnaires to travelers asking them to evaluate the airport’s service. She does not ask travelers who are hurrying through the airport with their hands full of luggage, but instead asks all travelers who are sitting near gates and not taking naps while they wait.</li> <li>A teacher wants to know if her students are doing homework, so she randomly selects rows two and five and then calls on all students in row two and all students in row five to present the solutions to homework problems to the class.</li> <li>The marketing manager for an electronics chain store wants information about the ages of its customers. Over the next two weeks, at each store location, 100 randomly selected customers are given questionnaires to fill out asking for information about age, as well as about other variables of interest.</li> <li>The librarian at a public library wants to determine what proportion of the library users are children. The librarian has a tally sheet on which she marks whether books are checked out by an adult or a child. She records this data for every fourth patron who checks out books.</li> <li>A political party wants to know the reaction of voters to a debate between the candidates. The day after the debate, the party’s polling staff calls 1,200 randomly selected phone numbers. 
If a registered voter answers the phone or is available to come to the phone, that registered voter is asked whom he or she intends to vote for and whether the debate changed his or her opinion of the candidates.</li> </ol> </div> <div data-type="solution"></div> </div> <div id="eip-937" data-type="exercise"><div id="id30103442" data-type="problem"><p id="element-143">22) A “random survey” was conducted of 3,274 people of the “microprocessor generation” (people born since 1971, the year the microprocessor was invented). It was reported that 48% of those individuals surveyed stated that if they had $2,000 to spend, they would use it for computer equipment. Also, 66% of those surveyed considered themselves relatively savvy computer users.</p> <ol id="element-411" type="a" data-mark-suffix="."><li>Do you consider the sample size large enough for a study of this type? Why or why not?</li> <li>Based on your “gut feeling,” do you believe the percents accurately reflect the U.S. population for those individuals born since 1971? If not, do you think the percents of the population are actually higher or lower than the sample statistics? Why? <span data-type="newline"><br /> </span>Additional information: The survey, reported by Intel Corporation, was filled out by individuals who visited the Los Angeles Convention Center to see the Smithsonian Institution’s road show called “America’s Smithsonian.”</li> <li>With this additional information, do you feel that all demographic and ethnic groups were equally represented at the event? Why or why not?</li> <li>With the additional information, comment on how accurately you think the sample statistics reflect the population parameters.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="eip-667" data-type="exercise"><div id="fs-idm42832576" data-type="problem"><p id="fs-idm18026656">23) The Well-Being Index is a survey that follows trends of U.S. residents on a regular basis. 
There are six areas of health and wellness covered in the survey: Life Evaluation, Emotional Health, Physical Health, Healthy Behavior, Work Environment, and Basic Access. Some of the questions used to measure the Index are listed below.</p> <p id="fs-idm15478400">Identify the type of data obtained from each question used in this survey: qualitative, quantitative discrete, or quantitative continuous.</p> <ol id="fs-idm65780000" type="a" data-mark-suffix="."><li>Do you have any health problems that prevent you from doing any of the things people your age can normally do?</li> <li>During the past 30 days, for about how many days did poor health keep you from doing your usual activities?</li> <li>In the last seven days, on how many days did you exercise for 30 minutes or more?</li> <li>Do you have health insurance coverage?</li> </ol> </div> <div id="eip-idp50852800" data-type="solution"></div> </div> <div id="eip-428" data-type="exercise"><div id="eip-54" data-type="problem"><p id="eip-603">24) In advance of the 1936 Presidential Election, a magazine titled Literary Digest released the results of an opinion poll predicting that the Republican candidate Alf Landon would win by a large margin. The magazine sent postcards to approximately 10,000,000 prospective voters. These prospective voters were selected from the subscription list of the magazine, from automobile registration lists, from phone lists, and from club membership lists. Approximately 2,300,000 people returned the postcards.</p> <ol id="fs-idp4880096" type="a"><li>Think about the state of the United States in 1936. 
Explain why a sample chosen from magazine subscription lists, automobile registration lists, phone books, and club membership lists was not representative of the population of the United States at that time.</li> <li>What effect does the low response rate have on the reliability of the sample?</li> <li>Are these problems examples of sampling error or nonsampling error?</li> <li>During the same year, George Gallup conducted his own poll of 30,000 prospective voters. These researchers used a method they called “quota sampling” to obtain survey answers from specific subsets of the population. Quota sampling is an example of which sampling method described in this module?</li> </ol> </div> </div> <div id="eip-706" data-type="exercise"><div id="fs-idm48468192" data-type="problem"></div> </div> <div id="eip-943" data-type="exercise"><div id="eip-11" data-type="solution"></div> </div> </div> <div id="eip-166" class="bring-together-homework" data-depth="1"><h3 data-type="title">Bringing It Together</h3> <div id="eip-219" data-type="exercise"><div id="id30201877" data-type="problem"><p id="element-740">25) Seven hundred and seventy-one distance learning students at Long Beach City College responded to surveys in the 2010-11 academic year. 
Highlights of the summary report are listed in <a class="autogenerated-content" href="#fs-idp27381216" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/data-sampling-and-variation-in-data-and-sampling/#fs-idp27381216">(Figure)</a>.</p> <table id="fs-idp27381216" summary="This table presents a list of survey responses by Long Beach City College students in the first column, with percentages in the second column."><caption><span data-type="title">LBCC Distance Learning Survey Results</span></caption> <tbody><tr><td>Have computer at home</td> <td>96%</td> </tr> <tr><td>Unable to come to campus for classes</td> <td>65%</td> </tr> <tr><td>Age 41 or over</td> <td>24%</td> </tr> <tr><td>Would like LBCC to offer more DL courses</td> <td>95%</td> </tr> <tr><td>Took DL classes due to a disability</td> <td>17%</td> </tr> <tr><td>Live at least 16 miles from campus</td> <td>13%</td> </tr> <tr><td>Took DL courses to fulfill transfer requirements</td> <td>71%</td> </tr> </tbody> </table> <ol id="element-431" type="a"><li>What percent of the students surveyed do not have a computer at home?</li> <li>About how many students in the survey live at least 16 miles from campus?</li> <li>If the same survey were done at Great Basin College in Elko, Nevada, do you think the percentages would be the same? Why?</li> </ol> </div> </div> <div id="eip-731" data-type="exercise"><div id="eip-708" data-type="problem"></div> <div id="eip-635" data-type="solution"><ul id="eip-id1169691864822"></ul> <p><strong>Answers to odd questions</strong></p> <p>1) quantitative discrete, 150</p> <p>&nbsp;</p> <p>3) qualitative, Oakland A’s</p> <p>&nbsp;</p> <p>5) quantitative discrete, 11,234 students</p> <p>&nbsp;</p> <p>7) qualitative, Crest</p> <p>&nbsp;</p> <p>9) quantitative continuous, 47.3 years</p> <p>&nbsp;</p> <p>11) b</p> <p>&nbsp;</p> <p>13)</p> <ol id="eip-idm59103408" type="a"><li>The survey was conducted using six similar flights. 
<span data-type="newline"><br /> </span>The survey would not be a true representation of the entire population of air travelers. <span data-type="newline"><br /> </span>Conducting the survey on a holiday weekend will not produce representative results.</li> <li>Conduct the survey during different times of the year. <span data-type="newline"><br /> </span>Conduct the survey using flights to and from various locations. <span data-type="newline"><br /> </span>Conduct the survey on different days of the week.</li> </ol> <p>&nbsp;</p> <p>15) Answers will vary. Sample Answer: You could use a systematic sampling method. Stop the tenth person as they leave one of the buildings on campus at 9:50 in the morning. Then stop the tenth person as they leave a different building on campus at 1:50 in the afternoon.</p> <p>&nbsp;</p> <p>17) Answers will vary. Sample Answer: Many people will not respond to mail surveys. If they do respond to the surveys, you can’t be sure who is responding. In addition, mailing lists can be incomplete.</p> <p>&nbsp;</p> <p>19) b</p> <p>&nbsp;</p> <p>21) a) <span id="eip-id1170191145155" data-type="list" data-list-type="enumerated" data-mark-suffix="." data-display="inline"><span data-type="item">convenience&nbsp; b) </span><span data-type="item">cluster&nbsp; c) </span><span data-type="item">stratified d) </span><span data-type="item">systematic&nbsp; e) </span><span data-type="item">simple random</span></span></p> <p>&nbsp;</p> <p>23)</p> <ol id="eip-idp50853056" type="a" data-mark-suffix="."><li>qualitative</li> <li>quantitative discrete</li> <li>quantitative discrete</li> <li>qualitative</li> </ol> <p>&nbsp;</p> <p>25) a) 4% b) About 100 students (13% of 771) c) Not necessarily. Long Beach is the seventh-largest city in California, and the college has an enrollment of approximately 27,000 students. 
On the other hand, Great Basin College has its campuses in rural northeastern Nevada and has an enrollment of about 3,500 students.</p> <p>&nbsp;</p> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="fs-idp32623728"><dt>Cluster Sampling</dt> <dd id="fs-idp5132208">a method for selecting a random sample; divide the population into groups (clusters), then use simple random sampling to select a set of clusters. Every individual in the chosen clusters is included in the sample.</dd> </dl> <dl id="continrv"><dt>Continuous Random Variable</dt> <dd id="id20531650">a random variable (RV) whose outcomes are measured; the height of trees in the forest is a continuous RV.</dd> </dl> <dl id="fs-idp3703840"><dt>Convenience Sampling</dt> <dd id="fs-idp25034160">a nonrandom method of selecting a sample; this method selects individuals that are easily accessible and may result in biased data.</dd> </dl> <dl id="discrrv"><dt>Discrete Random Variable</dt> <dd id="id18678019">a random variable (RV) whose outcomes are counted</dd> </dl> <dl id="fs-idp2765392"><dt>Nonsampling Error</dt> <dd id="fs-idp73290592">an issue that affects the reliability of sampling data other than natural variation; it includes a variety of human errors including poor study design, biased sampling methods, inaccurate information provided by study participants, data entry errors, and poor analysis.</dd> </dl> <dl id="qual"><dt>Qualitative Data</dt> <dd id="id14429967">See <a href="#id15539900" data-url="/contents/cb418599-f69b-46c1-b0ef-60d9e36e677f#id15539900">Data</a>.</dd> </dl> <dl id="quant"><dt>Quantitative Data</dt> <dd id="id14429999">See <a href="#id15539900" data-url="/contents/cb418599-f69b-46c1-b0ef-60d9e36e677f#id15539900">Data</a>.</dd> </dl> <dl id="fs-idm32759648"><dt>Random Sampling</dt> <dd id="fs-idm21672384">a method of selecting a sample that gives every member of the population an equal chance of being selected.</dd> 
</dl> <dl id="fs-idp46371840"><dt>Sampling Bias</dt> <dd id="fs-idp56438976">not all members of the population are equally likely to be selected</dd> </dl> <dl id="fs-idp25387472"><dt>Sampling Error</dt> <dd id="fs-idp40044672">the natural variation that results from selecting a sample to represent a larger population; this variation decreases as the sample size increases, so selecting larger samples reduces sampling error.</dd> </dl> <dl id="fs-idp25704576"><dt>Sampling with Replacement</dt> <dd id="fs-idp54367760">Once a member of the population is selected for inclusion in a sample, that member is returned to the population for the selection of the next individual.</dd> </dl> <dl id="fs-idp101272432"><dt>Sampling without Replacement</dt> <dd id="fs-idm15157568">A member of the population may be chosen for inclusion in a sample only once. If chosen, the member is not returned to the population before the next selection.</dd> </dl> <dl id="fs-idp52118064"><dt>Simple Random Sampling</dt> <dd id="fs-idm19437232">a straightforward method for selecting a random sample; give each member of the population a number. Use a random number generator to select a set of labels. These randomly selected labels identify the members of your sample.</dd> </dl> <dl id="fs-idp7467888"><dt>Stratified Sampling</dt> <dd id="fs-idp27614512">a method for selecting a random sample used to ensure that subgroups of the population are represented adequately; divide the population into groups (strata). Use simple random sampling to identify a proportionate number of individuals from each stratum.</dd> </dl> <dl id="fs-idp51718080"><dt>Systematic Sampling</dt> <dd id="fs-idp4126240">a method for selecting a random sample; list the members of the population. Use simple random sampling to select a starting point in the population. Let k = (number of individuals in the population)/(number of individuals needed in the sample). 
Choose every kth individual in the list starting with the one that was randomly selected. If necessary, return to the beginning of the population list to complete your sample.</dd> </dl> </div> </div></div>
<div class="chapter standard" id="chapter-experimental-design-and-ethics" title="Chapter 1.4: Experimental Design and Ethics"><div class="chapter-title-wrap"><h3 class="chapter-number">4</h3><h2 class="chapter-title"><span class="display-none">Chapter 1.4: Experimental Design and Ethics</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="fs-idm26748320">Does aspirin reduce the risk of heart attacks? Is one brand of fertilizer more effective at growing roses than another? Is fatigue as dangerous to a driver as the influence of alcohol? Questions like these are answered using randomized experiments. In this module, you will learn important aspects of experimental design. Proper study design ensures the production of reliable, accurate data.</p> <p id="fs-idm23516688">The purpose of an experiment is to investigate the relationship between two variables. When one variable causes change in another, we call the first variable the <span data-type="term">explanatory variable</span>. The affected variable is called the <span data-type="term">response variable</span>. In a randomized experiment, the researcher manipulates values of the explanatory variable and measures the resulting changes in the response variable. The different values of the explanatory variable are called <span data-type="term">treatments</span>. An <span data-type="term">experimental unit</span> is a single object or individual to be measured.</p> <p id="fs-idp20958864">You want to investigate the effectiveness of vitamin E in preventing disease. You recruit a group of subjects and ask them if they regularly take vitamin E. You notice that the subjects who take vitamin E exhibit better health on average than those who do not. Does this prove that vitamin E is effective in disease prevention? It does not. There are many differences between the two groups compared in addition to vitamin E consumption. 
People who take vitamin E regularly often take other steps to improve their health: exercise, diet, other vitamin supplements, choosing not to smoke. Any one of these factors could be influencing health. As described, this study does not prove that vitamin E is the key to disease prevention.</p> <p id="fs-idm26816896">Additional variables that can cloud a study are called <span data-type="term">lurking variables</span>. In order to prove that the explanatory variable is causing a change in the response variable, it is necessary to isolate the explanatory variable. The researcher must design her experiment in such a way that there is only one difference between groups being compared: the planned treatments. This is accomplished by the <span data-type="term">random assignment</span> of experimental units to treatment groups. When subjects are assigned treatments randomly, all of the potential lurking variables are spread equally among the groups. At this point the only difference between groups is the one imposed by the researcher. Different outcomes measured in the response variable, therefore, must be a direct result of the different treatments. In this way, an experiment can prove a cause-and-effect connection between the explanatory and response variables.</p> <p id="fs-idm40990544">The power of suggestion can have an important influence on the outcome of an experiment. Studies have shown that the expectation of the study participant can be as important as the actual medication. In one study of performance-enhancing drugs, researchers noted:</p> <p><em data-effect="italics">Results showed that believing one had taken the substance resulted in [</em>performance<em data-effect="italics">] times almost as fast as those associated with consuming the drug itself. 
In contrast, taking the drug without knowledge yielded no significant performance increment.</em><sup id="footnote-ref1" data-type="footnote-number"><a href="#footnote1" data-type="footnote-link">1</a></sup></p> <p id="fs-idp21694240">When participation in a study prompts a physical response from a participant, it is difficult to isolate the effects of the explanatory variable. To counter the power of suggestion, researchers set aside one treatment group as a <span data-type="term">control group</span>. This group is given a <span data-type="term">placebo</span> treatment–a treatment that cannot influence the response variable. The control group helps researchers balance the effects of being in an experiment with the effects of the active treatments. Of course, if you are participating in a study and you know that you are receiving a pill which contains no actual medication, then the power of suggestion is no longer a factor. <span data-type="term">Blinding</span> in a randomized experiment preserves the power of suggestion. When a person involved in a research study is blinded, he does not know who is receiving the active treatment(s) and who is receiving the placebo treatment. A <span data-type="term">double-blind experiment</span> is one in which both the subjects and the researchers involved with the subjects are blinded.</p> <div id="fs-idm46263440" class="textbox textbox--examples" data-type="example"><div id="fs-idm61332720" data-type="exercise"><div id="fs-idm36631952" data-type="problem"><p id="fs-idm36907744">Researchers want to investigate whether taking aspirin regularly reduces the risk of heart attack. Four hundred men between the ages of 50 and 84 are recruited as participants. The men are divided randomly into two groups: one group will take aspirin, and the other group will take a placebo. Each man takes one pill each day for three years, but he does not know whether he is taking aspirin or the placebo. 
At the end of the study, researchers count the number of men in each group who have had heart attacks.</p> <p id="fs-idm23449264">Identify the following values for this study: population, sample, experimental units, explanatory variable, response variable, treatments.</p> </div> <div id="fs-idm43262048" data-type="solution"><p id="fs-idm50725904">The <em data-effect="italics">population</em> is men aged 50 to 84. <span data-type="newline" data-count="1"><br /> </span>The <em data-effect="italics">sample</em> is the 400 men who participated. <span data-type="newline" data-count="1"><br /> </span>The <em data-effect="italics">experimental units</em> are the individual men in the study. <span data-type="newline" data-count="1"><br /> </span>The <em data-effect="italics">explanatory variable</em> is oral medication. <span data-type="newline" data-count="1"><br /> </span>The <em data-effect="italics">treatments</em> are aspirin and a placebo. <span data-type="newline" data-count="1"><br /> </span>The <em data-effect="italics">response variable</em> is whether a subject had a heart attack.</p> </div> </div> </div> <div id="fs-idp16610496" class="textbox textbox--examples" data-type="example"><div id="fs-idp23051024" data-type="exercise"><div id="fs-idp28537136" data-type="problem"><p id="fs-idp39070944">The Smell &amp; Taste Treatment and Research Foundation conducted a study to investigate whether smell can affect learning. Subjects completed mazes multiple times while wearing masks. They completed the pencil and paper mazes three times wearing floral-scented masks, and three times with unscented masks. Participants were assigned at random to wear the floral mask during the first three trials or during the last three trials. 
For each trial, researchers recorded the time it took to complete the maze and the subject’s impression of the mask’s scent: positive, negative, or neutral.</p> <ol id="fs-idp29645920" type="a"><li>Describe the explanatory and response variables in this study.</li> <li>What are the treatments?</li> <li>Identify any lurking variables that could interfere with this study.</li> <li>Is it possible to use blinding in this study?</li> </ol> </div> <div id="fs-idm28972352" data-type="solution"><ol id="fs-idm25552784" type="a"><li>The explanatory variable is scent, and the response variable is the time it takes to complete the maze.</li> <li>There are two treatments: a floral-scented mask and an unscented mask.</li> <li>All subjects experienced both treatments. The order of treatments was randomly assigned so there were no differences between the treatment groups. Random assignment eliminates the problem of lurking variables.</li> <li>Subjects will clearly know whether they can smell flowers or not, so subjects cannot be blinded in this study. Researchers timing the mazes can be blinded, though. The researcher who is observing a subject will not know which mask is being worn.</li> </ol> </div> </div> </div> <div id="fs-idm25280016" class="textbox textbox--examples" data-type="example"><div id="fs-idp11044144" data-type="exercise"><div id="fs-idp12574176" data-type="problem"><p id="fs-idm24852048">A researcher wants to study the effects of birth order on personality. Explain why this study could not be conducted as a randomized experiment. What is the main problem in a study that cannot be designed as a randomized experiment?</p> </div> <div id="fs-idm25555568" data-type="solution"><p id="fs-idm27252064">The explanatory variable is birth order. You cannot randomly assign a person’s birth order. Random assignment eliminates the impact of lurking variables. 
When you cannot assign subjects to treatment groups at random, there will be differences between the groups other than the explanatory variable.</p> </div> </div> </div> <div id="fs-idm24188176" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm48215888" data-type="exercise"><div id="fs-idm74165760" data-type="problem"><p id="fs-idm27042912">You are concerned about the effects of texting on driving performance. Design a study to test the response time of drivers while texting and while driving only. How many seconds does it take for a driver to respond when a leading car hits the brakes?</p> <ol id="fs-idp17285584" type="a"><li>Describe the explanatory and response variables in the study.</li> <li>What are the treatments?</li> <li>What should you consider when selecting participants?</li> <li>Your research partner wants to divide participants randomly into two groups: one to drive without distraction and one to text and drive simultaneously. Is this a good idea? Why or why not?</li> <li>Identify any lurking variables that could interfere with this study.</li> <li>How can blinding be used in this study?</li> </ol> </div> </div> </div> <div id="eip-165" class="bc-section section" data-depth="1"><h3 data-type="title">Ethics</h3> <p id="fs-idp69338592">The widespread misuse and misrepresentation of statistical information often gives the field a bad name. Some say that “numbers don’t lie,” but the people who use numbers to support their claims often do.</p> <p id="fs-idp63161552">A recent investigation of the famous social psychologist Diederik Stapel has led to the retraction of his articles from some of the world’s top journals, including <em data-effect="italics">Journal of Experimental Social Psychology, Social Psychology, Basic and Applied Social Psychology, British Journal of Social Psychology,</em> and the magazine <em data-effect="italics">Science</em>. 
Diederik Stapel is a former professor at Tilburg University in the Netherlands. Over the past two years, an extensive investigation involving three universities where Stapel has worked concluded that the psychologist is guilty of fraud on a colossal scale. Falsified data taints over 55 papers he authored and 10 Ph.D. dissertations that he supervised.</p> <p id="eip-idm40688480"><em data-effect="italics">Stapel did not deny that his deceit was driven by ambition. But it was more complicated than that, he told me. He insisted that he loved social psychology but had been frustrated by the messiness of experimental data, which rarely led to clear conclusions. His lifelong obsession with elegance and order, he said, led him to concoct sexy results that journals found attractive. “It was a quest for aesthetics, for beauty—instead of the truth,” he said. He described his behavior as an addiction that drove him to carry out acts of increasingly daring fraud, like a junkie seeking a bigger and better high.<sup id="footnote-ref2" data-type="footnote-number"><a href="#footnote2" data-type="footnote-link">2</a></sup></em></p> <p id="fs-idp11297920">The committee investigating Stapel concluded that he is guilty of several practices including:</p> <ul id="fs-idp81387808"><li>creating datasets, which largely confirmed the prior expectations,</li> <li>altering data in existing datasets,</li> <li>changing measuring instruments without reporting the change, and</li> <li>misrepresenting the number of experimental subjects.</li> </ul> <p id="fs-idp31763600">Clearly, it is never acceptable to falsify data the way this researcher did. Sometimes, however, violations of ethics are not as easy to spot.</p> <p id="fs-idp40590784">Researchers have a responsibility to verify that proper methods are being followed. 
The report describing the investigation of Stapel’s fraud states that, “statistical flaws frequently revealed a lack of familiarity with elementary statistics.”<sup id="footnote-ref3" data-type="footnote-number"><a href="#footnote3" data-type="footnote-link">3</a></sup> Many of Stapel’s co-authors should have spotted irregularities in his data. Unfortunately, they did not know very much about statistical analysis, and they simply trusted that he was collecting and reporting data properly.</p> <p id="fs-idp112502080">Many types of statistical fraud are difficult to spot. Some researchers simply stop collecting data once they have just enough to prove what they had hoped to prove. They don’t want to take the chance that a more extensive study would complicate their lives by producing data contradicting their hypothesis.</p> <p id="fs-idp75599696">Professional organizations, like the American Statistical Association, clearly define expectations for researchers. There are even laws in the federal code about the use of research data.</p> <p id="fs-idp35202800">When a statistical study uses human participants, as in medical studies, both ethics and the law dictate that researchers should be mindful of the safety of their research subjects. The U.S. Department of Health and Human Services oversees federal regulations of research studies with the aim of protecting participants. When a university or other research institution engages in research, it must ensure the safety of all human subjects. For this reason, research institutions establish oversight committees known as <span data-type="term">Institutional Review Boards (IRB)</span>. All planned studies must be approved in advance by the IRB. Key protections that are mandated by law include the following:</p> <ul id="fs-idp105688032"><li>Risks to participants must be minimized and reasonable with respect to projected benefits.</li> <li>Participants must give <span data-type="term">informed consent</span>. 
This means that the risks of participation must be clearly explained to the subjects of the study. Subjects must consent in writing, and researchers are required to keep documentation of their consent.</li> <li>Data collected from individuals must be guarded carefully to protect their privacy.</li> </ul> <p id="fs-idm5212816">These ideas may seem fundamental, but they can be very difficult to verify in practice. Is removing a participant’s name from the data record sufficient to protect privacy? Perhaps the person’s identity could be discovered from the data that remains. What happens if the study does not proceed as planned and risks arise that were not anticipated? When is informed consent really necessary? Suppose your doctor wants a blood sample to check your cholesterol level. Once the sample has been tested, you expect the lab to dispose of the remaining blood. At that point the blood becomes biological waste. Does a researcher have the right to take it for use in a study?</p> <p id="fs-idp36402832">It is important that students of statistics take time to consider the ethical questions that arise in statistical studies. How prevalent is fraud in statistical studies? You might be surprised—and disappointed. There is a <a href="http://www.retractionwatch.com">website</a> dedicated to cataloging retractions of study articles that have been proven fraudulent. A quick glance will show that the misuse of statistics is a bigger problem than most people realize.</p> <p id="fs-idp26995216">Vigilance against fraud requires knowledge. Learning the basic theory of statistics will empower you to analyze statistical studies critically.</p> <div id="fs-idp23504416" class="textbox textbox--examples" data-type="example"><div id="fs-idp16842720" data-type="exercise"><div id="fs-idp59046816" data-type="problem"><p id="fs-idp39089936">Describe the unethical behavior in each example and describe how it could impact the reliability of the resulting data. 
Explain how the problem should be corrected.</p> <p id="fs-idp132934528">A researcher is collecting data in a community.</p> <ol id="fs-idp10407040" type="a"><li>She selects a block where she is comfortable walking because she knows many of the people living on the street.</li> <li>No one seems to be home at four houses on her route. She does not record the addresses and does not return at a later time to try to find residents at home.</li> <li>She skips four houses on her route because she is running late for an appointment. When she gets home, she fills in the forms by selecting random answers from other residents in the neighborhood.</li> </ol> </div> <div id="fs-idp23306832" data-type="solution"><ol id="fs-idp82395040" type="a"><li>By selecting a convenient sample, the researcher is intentionally selecting a sample that could be biased. Claiming that this sample represents the community is misleading. The researcher needs to select areas in the community at random.</li> <li>Intentionally omitting relevant data will create bias in the sample. Suppose the researcher is gathering information about jobs and child care. By ignoring people who are not home, she may be missing data from working families that are relevant to her study. She needs to make every effort to interview all members of the target sample.</li> <li>It is never acceptable to fake data. Even though the responses she uses are “real” responses provided by other participants, the duplication is fraudulent and can create bias in the data. She needs to work diligently to interview everyone on her route.</li> </ol> </div> </div> </div> <div id="eip-89" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp63335072" data-type="exercise"><div id="fs-idp64206416" data-type="problem"><p id="fs-idp19040992">Describe the unethical behavior, if any, in each example and describe how it could impact the reliability of the resulting data. 
Explain how the problem should be corrected.</p> <p id="fs-idp62918448">A study is commissioned to determine the favorite brand of fruit juice among teens in California.</p> <ol id="fs-idm4025376" type="a"><li>The survey is commissioned by the seller of a popular brand of apple juice.</li> <li>There are only two types of juice included in the study: apple juice and cranberry juice.</li> <li>Researchers allow participants to see the brand of juice as samples are poured for a taste test.</li> <li>Twenty-five percent of participants prefer Brand X, 33% prefer Brand Y, and 42% have no preference between the two brands. Brand X references the study in a commercial saying “Most teens like Brand X as much as or more than Brand Y.”</li> </ol> </div> </div> </div> </div> <div id="eip-568" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="eip-idp74543296">“Vitamin E and Health,” Nutrition Source, Harvard School of Public Health, http://www.hsph.harvard.edu/nutritionsource/vitamin-e/ (accessed May 1, 2013).</p> <p id="eip-idp104876880">Stan Reents. “Don’t Underestimate the Power of Suggestion,” athleteinme.com, http://www.athleteinme.com/ArticleView.aspx?id=1053 (accessed May 1, 2013).</p> <p id="eip-idp150905520">Ankita Mehta. “Daily Dose of Aspirin Helps Reduce Heart Attacks: Study,” International Business Times, July 21, 2011. Also available online at http://www.ibtimes.com/daily-dose-aspirin-helps-reduce-heart-attacks-study-300443 (accessed May 1, 2013).</p> <p id="eip-idp150905904">The Data and Story Library, http://lib.stat.cmu.edu/DASL/Stories/ScentsandLearning.html (accessed May 1, 2013).</p> <p id="eip-idp75842000">M.L. Jackson et al., “Cognitive Components of Simulated Driving Performance: Sleep Loss Effects and Predictors,” Accident Analysis and Prevention 50 (January 2013), http://www.ncbi.nlm.nih.gov/pubmed/22721550 (accessed May 1, 2013).</p> <p id="eip-idm5379200">“Earthquake Information by Year,” U.S. Geological Survey. 
http://earthquake.usgs.gov/earthquakes/eqarchives/year/ (accessed May 1, 2013).</p> <p id="eip-idp68246336">“Fatality Analysis Reporting System (FARS) Encyclopedia,” National Highway Traffic Safety Administration. http://www-fars.nhtsa.dot.gov/Main/index.aspx (accessed May 1, 2013).</p> <p id="eip-idp61237888">Data from www.businessweek.com (accessed May 1, 2013).</p> <p id="eip-idp68246848">Data from www.forbes.com (accessed May 1, 2013).</p> <p id="eip-idp64966144">“America’s Best Small Companies,” http://www.forbes.com/best-small-companies/list/ (accessed May 1, 2013).</p> <p id="eip-idp108174160">U.S. Department of Health and Human Services, Code of Federal Regulations Title 45 Public Welfare Department of Health and Human Services Part 46 Protection of Human Subjects revised January 15, 2009. Section 46.111: Criteria for IRB Approval of Research.</p> <p id="eip-idp30063584">“April 2013 Air Travel Consumer Report,” U.S. Department of Transportation, April 11 (2013), http://www.dot.gov/airconsumer/april-2013-air-travel-consumer-report (accessed May 1, 2013).</p> <p id="eip-idp43227344">Lori Alden, “Statistics can be Misleading,” econoclass.com, http://www.econoclass.com/misleadingstats.html (accessed May 1, 2013).</p> <p id="eip-idm19679280">Maria de los A. Medina, “Ethics in Statistics,” Based on “Building an Ethics Module for Business, Science, and Engineering Students” by Jose A. Cruz-Cruz and William Frey, Connexions, http://cnx.org/content/m15555/latest/ (accessed May 1, 2013).</p> </div> <div id="fs-idp15055024" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm59870208">A poorly designed study will not produce reliable data. There are certain key components that must be included in every experiment. To eliminate lurking variables, subjects must be assigned randomly to different treatment groups. One of the groups must act as a control group, demonstrating what happens when the active treatment is not applied. 
Participants in the control group receive a placebo treatment that looks exactly like the active treatments but cannot influence the response variable. To preserve the integrity of the placebo, both researchers and subjects may be blinded. When a study is designed properly, the only difference between treatment groups is the one imposed by the researcher. Therefore, when groups respond differently to different treatments, the difference must be due to the influence of the explanatory variable.</p> <p>“An ethics problem arises when you are considering an action that benefits you or some cause you support, hurts or reduces benefits to others, and violates some rule.” (Andrew Gelman, “Open Data and Open Methods,” Ethics and Statistics, http://www.stat.columbia.edu/~gelman/research/published/ChanceEthics1.pdf (accessed May 1, 2013).) Ethical violations in statistics are not always easy to spot. Professional associations and federal agencies post guidelines for proper conduct. It is important that you learn basic statistical procedures so that you can recognize proper data analysis.</p> </div> <div id="fs-idp14752656" class="practice" data-depth="1"><div id="fs-idm44521824" data-type="exercise"><div id="fs-idp14350784" data-type="problem"><p id="fs-idm28684128">Design an experiment. Identify the explanatory and response variables. Describe the population being studied and the experimental units. Explain the treatments that will be used and how they will be assigned to the experimental units. 
Describe how blinding and placebos may be used to counter the power of suggestion.</p> </div> </div> <div data-type="exercise"><div id="fs-idp93593616" data-type="problem"><p id="fs-idp69204048">Discuss potential violations of the rule requiring informed consent.</p> <ol id="fs-idp31112064" type="a"><li>Inmates in a correctional facility are offered good behavior credit in return for participation in a study.</li> <li>A research study is designed to investigate a new children’s allergy medication.</li> <li>Participants in a study are told that the new medication being tested is highly promising, but they are not told that only a small portion of participants will receive the new medication. Others will receive placebo treatments and traditional treatments.</li> </ol> </div> <div id="eip-idm82051504" data-type="solution"><ol id="eip-idm79742704" type="a"><li>Inmates may not feel comfortable refusing participation, or may feel obligated to take advantage of the promised benefits. They may not feel truly free to refuse participation.</li> <li>Parents can provide consent on behalf of their children, but children are not competent to provide consent for themselves.</li> <li>All risks and benefits must be clearly outlined. Study participants must be informed of relevant aspects of the study in order to give appropriate consent.</li> </ol> </div> </div> </div> <div id="fs-idm36883600" class="free-response" data-depth="1"><h3 data-type="title">HOMEWORK</h3> <div id="fs-idm22501296" data-type="exercise"><div id="fs-idm50710912" data-type="problem"><ol><li id="fs-idp28034880">How does sleep deprivation affect your ability to drive? A recent study measured the effects on 19 professional drivers. Each driver participated in two experimental sessions: one after normal sleep and one after 27 hours of total sleep deprivation. The treatments were assigned in random order. 
In each session, performance was measured on a variety of tasks including a driving simulation.</li> </ol> <p id="fs-idp12620912">Use key terms from this module to describe the design of this experiment.</p> <p>&nbsp;</p> </div> </div> </div> <p>&nbsp;</p> <div class="free-response" data-depth="1"><div data-type="exercise"><div id="fs-idp75882352" data-type="problem"><p id="fs-idp66601184">2) An advertisement for Acme Investments displays the two graphs in <a class="autogenerated-content" href="#fs-idp81996208">(Figure)</a> to show the value of Acme’s product in comparison with the Other Guy’s product. Describe the potentially misleading visual effect of these comparison graphs. How can this be corrected?</p> <div id="fs-idp81996208" class="bc-figure figure" data-orient="horizontal"><div class="bc-figcaption figcaption">As the graphs show, Acme consistently outperforms the Other Guys!</div> <div id="fs-idp51320352" class="bc-figure figure"><span id="fs-idp79724992" data-type="media" data-alt="This is a line graph titled Acme Investments. The line graph shows a dramatic increase; neither the x-axis nor y-axis are labeled."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C01_M11_001-1.jpg" alt="This is a line graph titled Acme Investments. The line graph shows a dramatic increase; neither the x-axis nor y-axis are labeled." width="300" data-media-type="image/jpg" /></span></div> <div id="fs-idp16753840" class="bc-figure figure"><span id="fs-idp64067856" data-type="media" data-alt="This is a line graph titled Other Guy's Investments. The line graph shows a modest increase; neither the x-axis nor y-axis are labeled."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C01_M11_002-1.jpg" alt="This is a line graph titled Other Guy's Investments. The line graph shows a modest increase; neither the x-axis nor y-axis are labeled." 
width="300" data-media-type="image/jpg" /></span></div> </div> </div> </div> <div id="eip-476" data-type="exercise"><div id="fs-idp62000608" data-type="problem"><p id="fs-idp31778016">3) The graph in <a class="autogenerated-content" href="#fs-idm43124592">(Figure)</a> shows the number of complaints for six different airlines as reported to the US Department of Transportation in February 2013. Alaska, Pinnacle, and Airtran Airlines have far fewer complaints reported than American, Delta, and United. Can we conclude that American, Delta, and United are the worst airline carriers since they have the most complaints?</p> <div id="fs-idm43124592" class="bc-figure figure"><span id="fs-idp17849040" data-type="media" data-alt="This is a bar graph with 6 different airlines on the x-axis, and number of complaints on y-axis. The graph is titled Total Passenger Complaints. Data is from an April 2013 DOT report."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C01_M11_003s-1.jpg" alt="This is a bar graph with 6 different airlines on the x-axis, and number of complaints on y-axis. The graph is titled Total Passenger Complaints. Data is from an April 2013 DOT report." 
width="420" data-media-type="image/jpg" /></span></div> </div> <div id="eip-idp17300464" data-type="solution"><p><strong>Answers to odd questions</strong></p> <p>1) Explanatory variable: amount of sleep <span data-type="newline" data-count="1"><br /> </span>Response variable: performance measured in assigned tasks <span data-type="newline" data-count="1"><br /> </span>Treatments: normal sleep and 27 hours of total sleep deprivation <span data-type="newline" data-count="1"><br /> </span>Experimental Units: 19 professional drivers <span data-type="newline" data-count="1"><br /> </span>Lurking variables: none – all drivers participated in both treatments <span data-type="newline" data-count="1"><br /> </span>Random assignment: treatments were assigned in random order; this eliminated the effect of any “learning” that may take place during the first experimental session <span data-type="newline" data-count="1"><br /> </span>Control/Placebo: completing the experimental session under normal sleep conditions <span data-type="newline" data-count="1"><br /> </span>Blinding: researchers evaluating subjects’ performance must not know which treatment is being applied at the time</p> </div> </div> <p>3) You cannot assume that the numbers of complaints reflect the quality of the airlines. The airlines shown with the greatest number of complaints are the ones with the most passengers. You must consider the appropriateness of methods for presenting data; in this case displaying totals is misleading.</p> </div> <div data-type="footnote-refs"><h3 data-type="footnote-refs-title">Footnotes</h3> <ul data-list-type="bulleted" data-bullet-style="none"><li id="footnote1" data-type="footnote-ref"><a href="#footnote-ref1" data-type="footnote-ref-link">1</a> <span data-type="footnote-ref-content">McClung, M. Collins, D. “Because I know it will!”: placebo effects of an ergogenic aid on athletic performance. Journal of Sport &amp; Exercise Psychology. 2007 Jun. 29(3):382-94. Web. 
April 30, 2013.</span></li> <li id="footnote2" data-type="footnote-ref"><a href="#footnote-ref2" data-type="footnote-ref-link">2</a> <span data-type="footnote-ref-content">Yudhijit Bhattacharjee, “The Mind of a Con Man,” Magazine, New York Times, April 26, 2013. Available online at: http://www.nytimes.com/2013/04/28/magazine/diederik-stapels-audacious-academic-fraud.html?src=dayp&amp;_r=2&amp; (accessed May 1, 2013).</span></li> <li id="footnote3" data-type="footnote-ref"><a href="#footnote-ref3" data-type="footnote-ref-link">3</a> <span data-type="footnote-ref-content">“Flawed Science: The Fraudulent Research Practices of Social Psychologist Diederik Stapel,” Tilburg University, November 28, 2012, http://www.tilburguniversity.edu/upload/064a10cd-bce5-4385-b9ff-05b840caeae6_120695_Rapp_nov_2012_UK_web.pdf (accessed May 1, 2013).</span></li> </ul> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="fs-idm109278432"><dt>Explanatory Variable</dt> <dd id="fs-idm31399376">the independent variable in an experiment; the value controlled by researchers</dd> </dl> <dl id="fs-idm58845216"><dt>Treatments</dt> <dd id="fs-idm22575600">different values or components of the explanatory variable applied in an experiment</dd> </dl> <dl id="fs-idm22025168"><dt>Response Variable</dt> <dd id="fs-idm21518528">the dependent variable in an experiment; the value that is measured for change at the end of an experiment</dd> </dl> <dl id="fs-idm37719200"><dt>Experimental Unit</dt> <dd id="fs-idm24214112">any individual or object to be measured</dd> </dl> <dl id="fs-idp2339632"><dt>Lurking Variable</dt> <dd id="fs-idm53118336">a variable that has an effect on a study even though it is neither an explanatory variable nor a response variable</dd> </dl> <dl id="fs-idm60242624"><dt>Random Assignment</dt> <dd id="fs-idm36285312">the act of organizing experimental units into treatment groups using random methods</dd> </dl> <dl 
id="fs-idm44014688"><dt>Control Group</dt> <dd id="fs-idp2890064">a group in a randomized experiment that receives an inactive treatment but is otherwise managed exactly as the other groups</dd> </dl> <dl id="fs-idm53331472"><dt>Informed Consent</dt> <dd id="fs-idp6209616">Any human subject in a research study must be cognizant of any risks or costs associated with the study. The subject has the right to know the nature of the treatments included in the study, their potential risks, and their potential benefits. Consent must be given freely by an informed, fit participant.</dd> </dl> <dl id="fs-idm1693648"><dt>Institutional Review Board</dt> <dd id="fs-idp35054208">a committee tasked with oversight of research programs that involve human subjects</dd> </dl> <dl id="fs-idm46313024"><dt>Placebo</dt> <dd id="fs-idm28719952">an inactive treatment that has no real effect on the response variable</dd> </dl> <dl id="fs-idm26631824"><dt>Blinding</dt> <dd id="fs-idm40749296">not telling participants which treatment a subject is receiving</dd> </dl> <dl id="fs-idm21639776"><dt>Double-blinding</dt> <dd id="fs-idm25589376">the act of blinding both the subjects of an experiment and the researchers who work with the subjects</dd> </dl> </div> </div></div>
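<p>Random assignment, as defined in the glossary, can be illustrated with a short Python sketch. The function name <code>randomly_assign</code>, the subject labels, and the two treatment names are assumptions made for illustration; they are not taken from any study discussed in this chapter.</p>

```python
import random


def randomly_assign(units, treatments, rng):
    """Shuffle the experimental units, then deal them out to the treatments in turn.

    Dealing in turn keeps the group sizes within one unit of each other.
    """
    shuffled = list(units)          # copy so the caller's list is left untouched
    rng.shuffle(shuffled)           # the random step: every ordering is equally likely
    groups = {treatment: [] for treatment in treatments}
    for position, unit in enumerate(shuffled):
        treatment = treatments[position % len(treatments)]
        groups[treatment].append(unit)
    return groups


# Hypothetical example: 19 subjects and two treatments (one of them a placebo).
subjects = [f"Subject {i}" for i in range(1, 20)]
rng = random.Random(2013)           # fixed seed only so the sketch is repeatable
groups = randomly_assign(subjects, ["active treatment", "placebo"], rng)
for treatment, members in groups.items():
    print(treatment, len(members), members)
```

<p>Because chance alone decides who lands in each group, any lurking variable should be spread roughly evenly across the groups rather than concentrated in one of them.</p>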
<div class="chapter standard" id="chapter-data-collection-experiment" title="Activity 1.5: Data Collection Experiment"><div class="chapter-title-wrap"><h3 class="chapter-number">5</h3><h2 class="chapter-title"><span class="display-none">Activity 1.5: Data Collection Experiment</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1170942442486" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Data Collection Experiment</div> <p id="id11029318">Class Time:</p> <p id="element-194">Names:</p> <div data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will demonstrate the systematic sampling technique.</li> <li>The student will construct relative frequency tables.</li> <li>The student will interpret results and their differences from different data groupings.</li> </ul> </div> <p id="element-972"><span data-type="title">Movie Survey</span>Ask five classmates from a different class how many movies they saw at the theater last month. Do not include rented movies.</p> <ol id="list-92836987265"><li>Record the data.</li> <li>In class, randomly pick one person. On the class list, mark that person’s name. Move down four names on the class list. Mark that person’s name. Continue doing this until you have marked 12 names. You may need to go back to the start of the list. For each marked name record the five data values. 
You now have a total of 60 data values.</li> <li>For each name marked, record the data.<br /> <table id="table-234986" summary="empty 5x12 table for recording data"><tbody><tr><td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> </tr> <tr><td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> </tr> <tr><td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> </tr> <tr><td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> </tr> <tr><td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> <td>___</td> </tr> </tbody> </table> </li> </ol> <p id="element-6"><span data-type="title">Order the Data</span>Complete the two relative frequency tables below using your class data.</p> <table id="id9610791234" summary="This table provides a blank template for recording the results of the previously conducted survey. The first column contains the exact number of movies watched, the second column contains the frequency, the third column contains the relative frequency, and the fourth column contains the cumulative relative frequency. 
Only the first column is completed."><caption><span data-type="title">Frequency of Number of Movies Viewed</span></caption> <thead><tr><th>Number of Movies</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td><span class="normal" data-type="emphasis" data-effect="normal">0</span></td> <td></td> <td></td> <td></td> </tr> <tr><td>1</td> <td></td> <td></td> <td></td> </tr> <tr><td>2</td> <td></td> <td></td> <td></td> </tr> <tr><td>3</td> <td></td> <td></td> <td></td> </tr> <tr><td>4</td> <td></td> <td></td> <td></td> </tr> <tr><td>5</td> <td></td> <td></td> <td></td> </tr> <tr><td>6</td> <td></td> <td></td> <td></td> </tr> <tr><td>7+</td> <td></td> <td></td> <td></td> </tr> </tbody> </table> <table id="id10778923248" summary="Similar to the previous table, this is a blank template for recording the results of the previously conducted survey. The first column presents a range of number of movies watched, the second column contains the frequency, the third column contains the relative frequency, and the fourth column contains the cumulative relative frequency. Only the first column is completed."><caption><span data-type="title">Frequency of Number of Movies Viewed</span></caption> <thead><tr><th>Number of Movies</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td>0–1</td> <td></td> <td></td> <td></td> </tr> <tr><td>2–3</td> <td></td> <td></td> <td></td> </tr> <tr><td>4–5</td> <td></td> <td></td> <td></td> </tr> <tr><td>6–7+</td> <td></td> <td></td> <td></td> </tr> </tbody> </table> <ol id="list-2349875"><li>Using the tables, find the percent of data that is at most two. Which table did you use and why?</li> <li>Using the tables, find the percent of data that is at most three. Which table did you use and why?</li> <li>Using the tables, find the percent of data that is more than two. 
Which table did you use and why?</li> <li>Using the tables, find the percent of data that is more than three. Which table did you use and why?</li> </ol> <div id="list-23497695" data-type="list"><div data-type="title">Discussion Questions</div> <ol><li>Is one of the tables “more correct” than the other? Why or why not?</li> <li>In general, how could you group the data differently? Are there any advantages to either way of grouping the data?</li> <li>Why did you switch between tables, if you did, when answering the question above?</li> </ol> </div> </div> </div></div>
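<p>The marking procedure from the Movie Survey and the columns of the relative frequency tables can be sketched in Python. This is only a sketch under stated assumptions: the function names, the 30-name class list, and the movie counts below are made up for illustration and are not data from the activity.</p>

```python
from collections import Counter


def systematic_sample(names, start, step, size):
    """Mark every `step`-th name beginning at index `start`, wrapping to the top of the list."""
    return [names[(start + i * step) % len(names)] for i in range(size)]


def relative_frequency_table(values):
    """Return rows of (value, frequency, relative frequency, cumulative relative frequency)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running_count = 0
    for value in sorted(counts):
        running_count += counts[value]
        rows.append((value, counts[value], counts[value] / total, running_count / total))
    return rows


# Hypothetical class list; the randomly picked starting person is assumed to be index 2.
class_list = [f"Student {i}" for i in range(1, 31)]
marked = systematic_sample(class_list, start=2, step=4, size=12)
print(marked)

# Made-up movie counts, used only to show the shape of the table.
movie_counts = [0, 1, 1, 2, 2, 2, 3, 5]
for row in relative_frequency_table(movie_counts):
    print(row)
```

<p>Depending on the length of the class list, wrapping back to the start can land on a name that is already marked; in that case a different step or starting point would be needed. The cumulative relative frequency column is what answers an “at most” question directly.</p>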
<div class="chapter standard" id="chapter-sampling-experiment" title="Activity 1.6: Sampling Experiment"><div class="chapter-title-wrap"><h3 class="chapter-number">6</h3><h2 class="chapter-title"><span class="display-none">Activity 1.6: Sampling Experiment</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1167914447982" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Sampling Experiment</div> <p id="id9893246">Class Time:</p> <p id="element-745">Names:</p> <div id="element-806" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will demonstrate the simple random, systematic, stratified, and cluster sampling techniques.</li> <li>The student will explain the details of each procedure used.</li> </ul> </div> <p id="element-153">In this lab, you will be asked to pick several random samples of restaurants. In each case, describe your procedure briefly, including how you might have used the random number generator, and then list the restaurants in the sample you obtained.</p> <div id="id11521709" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idp40181216">The following section contains restaurants stratified by city into rows and grouped into columns by entree cost (clusters).</p> </div> <p id="fs-idp15735184"><span data-type="title">Restaurants Stratified by City and Entree Cost </span></p> <table id="id9894hh324636" summary="This table provides a sample of restaurants. Each cell contains a list of restaurants that correspond to the given location and price range. 
The first column lists the location, the second column contains restaurants with entrees under $10, the third column contains restaurants with entrees between $10 and $15, the fourth column between $15 and $20, and the fifth column restaurants with entrees over $20."><caption><span data-type="title">Restaurants Used in Sample</span></caption> <thead><tr><th>Entree Cost</th> <th>Under \$10</th> <th>\$10 to under \$15</th> <th>\$15 to under \$20</th> <th>Over \$20</th> </tr> </thead> <tbody><tr><td>San Jose</td> <td>El Abuelo Taq, Pasta Mia, Emma’s Express, Bamboo Hut</td> <td>Emperor’s Guard, Creekside Inn</td> <td>Agenda, Gervais, Miro’s</td> <td>Blake’s, Eulipia, Hayes Mansion, Germania</td> </tr> <tr><td>Palo Alto</td> <td>Senor Taco, Olive Garden, Taxi’s</td> <td>Ming’s, P.A. Joe’s, Stickney’s</td> <td>Scott’s Seafood, Poolside Grill, Fish Market</td> <td>Sundance Mine, Maddalena’s, Spago’s</td> </tr> <tr><td>Los Gatos</td> <td>Mary’s Patio, Mount Everest, Sweet Pea’s, Andele Taqueria</td> <td>Lindsey’s, Willow Street</td> <td>Toll House</td> <td>Charter House, La Maison Du Cafe</td> </tr> <tr><td>Mountain View</td> <td>Maharaja, New Ma’s, Thai-Rific, Garden Fresh</td> <td>Amber Indian, La Fiesta, Fiesta del Mar, Dawit</td> <td>Austin’s, Shiva’s, Mazeh</td> <td>Le Petit Bistro</td> </tr> <tr><td>Cupertino</td> <td>Hobees, Hung Fu, Samrat, Panda Express</td> <td>Santa Barb. Grill, Mand. 
Gourmet, Bombay Oven, Kathmandu West</td> <td>Fontana’s, Blue Pheasant</td> <td>Hamasushi, Helios</td> </tr> <tr><td>Sunnyvale</td> <td>Chekijababi, Taj India, Full Throttle, Tia Juana, Lemon Grass</td> <td>Pacific Fresh, Charley Brown’s, Cafe Cameroon, Faz, Aruba’s</td> <td>Lion &amp; Compass, The Palace, Beau Sejour</td> <td></td> </tr> <tr><td>Santa Clara</td> <td>Rangoli, Armadillo Willy’s, Thai Pepper, Pasand</td> <td>Arthur’s, Katie’s Cafe, Pedro’s, La Galleria</td> <td>Birk’s, Truya Sushi, Valley Plaza</td> <td>Lakeside, Mariani’s</td> </tr> </tbody> </table> <p id="element-23513616999"><span data-type="title">A Simple Random Sample</span>Pick a <strong>simple random sample</strong> of 15 restaurants.</p> <ol id="list-23562"><li>Describe your procedure.</li> <li>Complete the table with your sample.<br /> <table id="tableone" summary="This table is presented as a template for collecting data and contains a total of 15 blank cells."><tbody><tr><td>1. __________</td> <td>6. __________</td> <td>11. __________</td> </tr> <tr><td>2. __________</td> <td>7. __________</td> <td>12. __________</td> </tr> <tr><td>3. __________</td> <td>8. __________</td> <td>13. __________</td> </tr> <tr><td>4. __________</td> <td>9. __________</td> <td>14. __________</td> </tr> <tr><td>5. __________</td> <td>10. __________</td> <td>15. __________</td> </tr> </tbody> </table> </li> </ol> <p id="element-235975691"><span data-type="title">A Systematic Sample</span>Pick a <strong>systematic sample</strong> of 15 restaurants.</p> <ol id="list-235562"><li>Describe your procedure.</li> <li>Complete the table with your sample.<br /> <table id="tabletwo" summary="This table is presented as a template for collecting data and contains a total of 15 blank cells."><tbody><tr><td>1. __________</td> <td>6. __________</td> <td>11. __________</td> </tr> <tr><td>2. __________</td> <td>7. __________</td> <td>12. __________</td> </tr> <tr><td>3. __________</td> <td>8. __________</td> <td>13. 
__________</td> </tr> <tr><td>4. __________</td> <td>9. __________</td> <td>14. __________</td> </tr> <tr><td>5. __________</td> <td>10. __________</td> <td>15. __________</td> </tr> </tbody> </table> </li> </ol> <p id="element-2466992"><span data-type="title">A Stratified Sample</span>Pick a <strong>stratified sample</strong>, by city, of 20 restaurants. Use 25% of the restaurants from each stratum. Round to the nearest whole number.</p> <ol id="list-23566372"><li>Describe your procedure.</li> <li>Complete the table with your sample.<br /> <table id="tablethree" summary="This table is presented as a template for collecting data and contains a total of 20 blank cells."><tbody><tr><td>1. __________</td> <td>6. __________</td> <td>11. __________</td> <td>16. __________</td> </tr> <tr><td>2. __________</td> <td>7. __________</td> <td>12. __________</td> <td>17. __________</td> </tr> <tr><td>3. __________</td> <td>8. __________</td> <td>13. __________</td> <td>18. __________</td> </tr> <tr><td>4. __________</td> <td>9. __________</td> <td>14. __________</td> <td>19. __________</td> </tr> <tr><td>5. __________</td> <td>10. __________</td> <td>15. __________</td> <td>20. __________</td> </tr> </tbody> </table> </li> </ol> <p id="element-2359869125"><span data-type="title">A Stratified Sample</span>Pick a <strong>stratified sample</strong>, by entree cost, of 21 restaurants. Use 25% of the restaurants from each stratum. Round to the nearest whole number.</p> <ol id="list-235966372"><li>Describe your procedure.</li> <li>Complete the table with your sample.<br /> <table id="tablefour" summary="This table is presented as a template for collecting data and contains a total of 21 blank cells."><tbody><tr><td>1. __________</td> <td>6. __________</td> <td>11. __________</td> <td>16. __________</td> </tr> <tr><td>2. __________</td> <td>7. __________</td> <td>12. __________</td> <td>17. __________</td> </tr> <tr><td>3. __________</td> <td>8. __________</td> <td>13. 
__________</td> <td>18. __________</td> </tr> <tr><td>4. __________</td> <td>9. __________</td> <td>14. __________</td> <td>19. __________</td> </tr> <tr><td>5. __________</td> <td>10. __________</td> <td>15. __________</td> <td>20. __________</td> </tr> <tr><td></td> <td></td> <td></td> <td>21. __________</td> </tr> </tbody> </table> </li> </ol> <p id="element-252976925"><span data-type="title">A Cluster Sample</span>Pick a <strong>cluster sample</strong> of restaurants from two cities. The number of restaurants will vary.</p> <ol id="list-2359669372"><li>Describe your procedure.</li> <li>Complete the table with your sample.<br /> <table id="tablefive" summary="This table is presented as a template for collecting data and contains a total of 25 blank cells."><tbody><tr><td>1. ________</td> <td>6. ________</td> <td>11. ________</td> <td>16. ________</td> <td>21. ________</td> </tr> <tr><td>2. ________</td> <td>7. ________</td> <td>12. ________</td> <td>17. ________</td> <td>22. ________</td> </tr> <tr><td>3. ________</td> <td>8. ________</td> <td>13. ________</td> <td>18. ________</td> <td>23. ________</td> </tr> <tr><td>4. ________</td> <td>9. ________</td> <td>14. ________</td> <td>19. ________</td> <td>24. ________</td> </tr> <tr><td>5. ________</td> <td>10. ________</td> <td>15. ________</td> <td>20. ________</td> <td>25. ________</td> </tr> </tbody> </table> </li> </ol> </div> </div></div>
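<p>The four sampling techniques practiced in this lab can be sketched in Python. The sketch below uses only three of the seven cities from the restaurant table to stay short, and the function names, the seed, and the sample sizes are assumptions chosen for illustration rather than the required procedure for the lab.</p>

```python
import random

# Excerpt of the restaurant table: three of the seven cities, all price ranges combined.
RESTAURANTS = {
    "Los Gatos": ["Mary’s Patio", "Mount Everest", "Sweet Pea’s", "Andele Taqueria",
                  "Lindsey’s", "Willow Street", "Toll House", "Charter House",
                  "La Maison Du Cafe"],
    "Palo Alto": ["Senor Taco", "Olive Garden", "Taxi’s", "Ming’s", "P.A. Joe’s",
                  "Stickney’s", "Scott’s Seafood", "Poolside Grill", "Fish Market",
                  "Sundance Mine", "Maddalena’s", "Spago’s"],
    "Mountain View": ["Maharaja", "New Ma’s", "Thai-Rific", "Garden Fresh",
                      "Amber Indian", "La Fiesta", "Fiesta del Mar", "Dawit",
                      "Austin’s", "Shiva’s", "Mazeh", "Le Petit Bistro"],
}


def simple_random_sample(population, n, rng):
    """Every group of n restaurants is equally likely to be chosen."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Pick a random starting point, then take every step-th restaurant."""
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]


def stratified_sample(strata, fraction, rng):
    """Take a simple random sample of the given fraction from every stratum."""
    sample = []
    for members in strata.values():
        # Note: Python's round() sends exact halves to the nearest even integer.
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Choose whole clusters at random and keep every restaurant in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [restaurant for name in chosen for restaurant in clusters[name]]


rng = random.Random(1)  # fixed seed only so the sketch is repeatable
everyone = [r for members in RESTAURANTS.values() for r in members]
print(simple_random_sample(everyone, 5, rng))
print(systematic_sample(everyone, 5, rng))
print(stratified_sample(RESTAURANTS, 0.25, rng))
print(cluster_sample(RESTAURANTS, 2, rng))
```

<p>The stratified sample draws from every city, while the cluster sample keeps every restaurant from only the chosen cities, which is why its size varies with the clusters selected.</p>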
<div class="part " id="part-descriptive-statistics"><div class="part-title-wrap"><h3 class="part-number">II</h3><h1 class="part-title">Chapter 2: Descriptive Statistics</h1></div><div class="ugc part-ugc"></div></div>
<div class="chapter standard" id="chapter-introduction-14" title="Chapter 2.1: Introduction"><div class="chapter-title-wrap"><h3 class="chapter-number">7</h3><h2 class="chapter-title"><span class="display-none">Chapter 2.1: Introduction</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-idm37970816" class="splash"><div class="bc-figcaption figcaption">When you have large amounts of data, you will need to organize it in a way that makes sense. These ballots from an election are rolled together with similar ballots to keep them organized. (credit: William Greeson)</div> <p><span id="fs-idm99957184" data-type="media" data-alt="This photo shows about 26 rolls of paper piled together. The rolls are different sizes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C02_CO20Photo-1.jpg" alt="This photo shows about 26 rolls of paper piled together. The rolls are different sizes." width="500" data-media-type="image/png" /></span></p> </div> <div id="fs-idp14137120" class="chapter-objectives" data-type="note" data-has-label="true" data-label=""><div data-type="title">Chapter Objectives</div> <p id="element-910">By the end of this chapter, the student should be able to:</p> <ul id="list123523"><li>Display data graphically and interpret graphs: stemplots, histograms, and box plots.</li> <li>Recognize, describe, and calculate the measures of location of data: quartiles and percentiles.</li> <li>Recognize, describe, and calculate the measures of the center of data: mean, median, and mode.</li> <li>Recognize, describe, and calculate the measures of the spread of data: variance, standard deviation, and range.</li> </ul> </div> <p id="fs-idp57139568">Once you have collected data, what will you do with it? Data can be described and presented in many different formats. For example, suppose you are interested in buying a house in a particular area. 
You may have no clue about the house prices, so you might ask your real estate agent to give you a sample data set of prices. Looking at all the prices in the sample often is overwhelming. A better way might be to look at the median price and the variation of prices. The median and variation are just two ways that you will learn to describe data. Your agent might also provide you with a graph of the data.</p> <p id="element-908">In this chapter, you will study numerical and graphical ways to describe and display your data. This area of statistics is called <strong>&#8220;Descriptive Statistics.&#8221;</strong> You will learn how to calculate, and even more importantly, how to interpret these measurements and graphs.</p> <p id="id6322560">A statistical graph is a tool that helps you learn about the shape or distribution of a sample or a population. A graph can be a more effective way of presenting data than a mass of numbers because we can see where data clusters and where there are only a few data values. Newspapers and the Internet use graphs to show trends and to enable readers to compare facts and figures quickly. Statisticians often graph data first to get a picture of the data. Then, more formal tools may be applied.</p> <p id="id5413204">Some of the types of graphs that are used to summarize and organize data are the dot plot, the bar graph, the histogram, the stem-and-leaf plot, the frequency polygon (a type of broken line graph), the pie chart, and the box plot. In this chapter, we will briefly look at stem-and-leaf plots, line graphs, and bar graphs, as well as frequency polygons, and time series graphs. Our emphasis will be on histograms and box plots.</p> <div id="fs-idp5542896" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idm1055600">This book contains instructions for constructing a histogram and a box plot for the TI-83+ and TI-84 calculators. 
The <a href="http://education.ti.com/educationportal/sites/US/sectionHome/support.html">Texas Instruments (TI) website</a> provides additional instructions for using these calculators.</p> </div> </div></div>
<div class="chapter standard" id="chapter-frequency-frequency-tables-and-levels-of-measurement" title="Chapter 2.2: Frequency, Frequency Tables, and Levels of Measurement"><div class="chapter-title-wrap"><h3 class="chapter-number">8</h3><h2 class="chapter-title"><span class="display-none">Chapter 2.2: Frequency, Frequency Tables, and Levels of Measurement</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="fs-idp5986864">Once you have a set of data, you will need to organize it so that you can analyze how frequently each datum occurs in the set. However, when calculating the frequency, you may need to round your answers so that they are as precise as possible.</p> <div id="fs-idm2552448" class="bc-section section" data-depth="1"><h3 data-type="title">Answers and Rounding Off</h3> <p id="id5602419">A simple way to round off answers is to carry your final answer one more decimal place than was present in the original data. Round off only the final answer. Do not round off any intermediate results, if possible. If it becomes necessary to round off intermediate results, carry them to at least twice as many decimal places as the final answer. For example, the average of the three quiz scores four, six, and nine is 6.3, rounded off to the nearest tenth, because the data are whole numbers. Most answers will be rounded off in this manner.</p> <p id="id5164237">It is not necessary to reduce most fractions in this course. Especially in <a href="/contents/326ee2e0-0ccd-46ae-a776-f8857a5dad4c">Probability Topics</a>, the chapter on probability, it is more helpful to leave an answer as an unreduced fraction.</p> </div> <div class="bc-section section" data-depth="1"><h3 data-type="title">Levels of Measurement</h3> <p id="eip-192">The way a set of data is measured is called its <span data-type="term">level of measurement</span>. Correct statistical procedures depend on a researcher being familiar with levels of measurement. 
Not every statistical operation can be used with every set of data. Data can be classified into four levels of measurement. They are (from lowest to highest level):</p> <ul><li><strong>Nominal scale level</strong></li> <li><strong>Ordinal scale level</strong></li> <li><strong>Interval scale level</strong></li> <li><strong>Ratio scale level</strong></li> </ul> <p>Data that is measured using a <span data-type="term">nominal scale</span> is <strong>qualitative (categorical)</strong>. Categories, colors, names, labels and favorite foods along with yes or no responses are examples of nominal level data. Nominal scale data are not ordered. For example, trying to rank people according to their favorite food does not make any sense. Putting pizza first and sushi second is not meaningful.</p> <p id="eip-817">Smartphone companies are another example of nominal scale data. The data are the names of the companies that make smartphones, but there is no agreed-upon order of these brands, even though people may have personal preferences. Nominal scale data cannot be used in calculations.</p> <p id="eip-369">Data that is measured using an <span data-type="term">ordinal scale</span> is similar to nominal scale data but there is a big difference. The ordinal scale data can be ordered. An example of ordinal scale data is a list of the top five national parks in the United States. The top five national parks in the United States can be ranked from one to five but we cannot measure differences between the data.</p> <p id="eip-289">Another example of using the ordinal scale is a cruise survey where the responses to questions about the cruise are “excellent,” “good,” “satisfactory,” and “unsatisfactory.” These responses are ordered from the most desired response to the least desired. But the differences between two pieces of data cannot be measured. 
Like the nominal scale data, ordinal scale data cannot be used in calculations.</p> <p id="eip-158">Data that is measured using the <span data-type="term">interval scale</span> is similar to ordinal level data because it has a definite ordering, but the differences between interval scale data can also be measured. However, the data does not have a starting point (a true zero).</p> <p id="eip-134">Temperature scales like Celsius (C) and Fahrenheit (F) are measured by using the interval scale. In both temperature measurements, 40° is equal to 100° minus 60°. Differences make sense. But 0 degrees is not a true starting point because, in both scales, 0 is not the absolute lowest temperature. Temperatures like -10° F and -15° C exist and are colder than 0.</p> <p id="eip-435">Interval level data can be used in calculations, but one type of comparison cannot be done. 80° C is not four times as hot as 20° C (nor is 80° F four times as hot as 20° F). There is no meaning to the ratio of 80 to 20 (or four to one).</p> <p>Data that is measured using the <span data-type="term">ratio scale</span> takes care of the ratio problem and gives you the most information. Ratio scale data is like interval scale data, but it has a 0 point and ratios can be calculated. For example, four multiple-choice statistics final exam scores are 80, 68, 20, and 92 (out of a possible 100 points). The exams are machine-graded.</p> <p id="eip-53">The data can be put in order from lowest to highest: 20, 68, 80, 92.</p> <p id="eip-750">The differences between the data have meaning. The score 92 is more than the score 68 by 24 points. Ratios can be calculated. The smallest possible score is 0, so 80 is four times 20: the score of 80 is four times better than the score of 20.</p> </div> <div id="eip-959" class="bc-section section" data-depth="1"><h3 data-type="title">Frequency</h3> <p id="id7489802">Twenty students were asked how many hours they worked per day. 
Their responses, in hours, are as follows: <span id="set-element-244" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">5  </span><span data-type="item">6  </span><span data-type="item">3  </span><span data-type="item">3  </span><span data-type="item">2  </span><span data-type="item">4  </span><span data-type="item">7  </span><span data-type="item">5  </span><span data-type="item">2  </span><span data-type="item">3  </span><span data-type="item">5  </span><span data-type="item">6  </span><span data-type="item">5  </span><span data-type="item">4  </span><span data-type="item">4  </span><span data-type="item">3  </span><span data-type="item">5  </span><span data-type="item">2  </span><span data-type="item">5  </span><span data-type="item">3</span></span>.</p> <p id="id9267444"><a class="autogenerated-content" href="#id10383738">(Figure)</a> lists the different data values in ascending order and their frequencies.</p> <table id="id10383738" summary="This table presents the values provided in the previously given data set in the first column, and the frequency of each value in the second column."><caption><span data-type="title">Frequency Table of Student Work Hours</span></caption> <thead><tr><th>DATA VALUE</th> <th>FREQUENCY</th> </tr> </thead> <tbody><tr><td><span class="normal" data-type="emphasis" data-effect="normal">2</span></td> <td>3</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">3</span></td> <td>5</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">4</span></td> <td>3</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">5</span></td> <td>6</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">6</span></td> <td>2</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">7</span></td> <td>1</td> </tr> </tbody> </table> <p id="element-118">A <span 
data-type="term">frequency</span> is the number of times a value of the data occurs. According to <a class="autogenerated-content" href="#id10383738">(Figure)</a>, there are three students who work two hours, five students who work three hours, and so on. The sum of the values in the frequency column, 20, represents the total number of students included in the sample.</p> <p id="id8007492">A <span data-type="term">relative frequency</span> is the ratio (fraction or proportion) of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes. To find the relative frequencies, divide each frequency by the total number of students in the sample (in this case, 20). Relative frequencies can be written as fractions, percents, or decimals.</p> <table id="id11177380" summary="Frequency Table of Student Work Hours with Relative Frequencies"><caption><span data-type="title">Frequency Table of Student Work Hours with Relative Frequencies</span></caption> <thead><tr><th>DATA VALUE</th> <th>FREQUENCY</th> <th>RELATIVE FREQUENCY</th> </tr> </thead> <tbody><tr><td><span class="normal" data-type="emphasis" data-effect="normal">2</span></td> <td>3</td> <td>\(\frac{3}{20}\) or 0.15</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">3</span></td> <td>5</td> <td>\(\frac{5}{20}\) or 0.25</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">4</span></td> <td>3</td> <td>\(\frac{3}{20}\) or 0.15</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">5</span></td> <td>6</td> <td>\(\frac{6}{20}\) or 0.30</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">6</span></td> <td>2</td> <td>\(\frac{2}{20}\) or 0.10</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">7</span></td> <td>1</td> <td>\(\frac{1}{20}\) or 0.05</td> </tr> </tbody> </table> <p id="id7087521">The sum of the values in the relative frequency 
column of <a class="autogenerated-content" href="#id11177380">(Figure)</a> is \(\frac{20}{20}\), or 1.</p> <p id="id7575466"><span data-type="term">Cumulative relative frequency</span> is the accumulation of the previous relative frequencies. To find the cumulative relative frequencies, add all the previous relative frequencies to the relative frequency for the current row, as shown in <a class="autogenerated-content" href="#id10564302">(Figure)</a>.</p> <table id="id10564302" summary="Table shows data, frequency, relative frequency and cumulative relative frequency."><caption><span data-type="title">Frequency Table of Student Work Hours with Relative and Cumulative Relative Frequencies</span></caption> <thead><tr><th>DATA VALUE</th> <th>FREQUENCY</th> <th>RELATIVE <span data-type="newline"><br /> </span>FREQUENCY</th> <th>CUMULATIVE RELATIVE <span data-type="newline"><br /> </span>FREQUENCY</th> </tr> </thead> <tbody><tr><td><span class="normal" data-type="emphasis" data-effect="normal">2</span></td> <td>3</td> <td>\(\frac{3}{20}\) or 0.15</td> <td>0.15</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">3</span></td> <td>5</td> <td>\(\frac{5}{20}\) or 0.25</td> <td>0.15 + 0.25 = 0.40</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">4</span></td> <td>3</td> <td>\(\frac{3}{20}\) or 0.15</td> <td>0.40 + 0.15 = 0.55</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">5</span></td> <td>6</td> <td>\(\frac{6}{20}\) or 0.30</td> <td>0.55 + 0.30 = 0.85</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">6</span></td> <td>2</td> <td>\(\frac{2}{20}\) or 0.10</td> <td>0.85 + 0.10 = 0.95</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">7</span></td> <td>1</td> <td>\(\frac{1}{20}\) or 0.05</td> <td>0.95 + 0.05 = 1.00</td> </tr> </tbody> </table> <p id="id3561407">The last entry of the cumulative relative frequency column is 
one, indicating that one hundred percent of the data has been accumulated.</p> <div id="id16479556" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idp27462880">Because of rounding, the relative frequency column may not always sum to one, and the last entry in the cumulative relative frequency column may not be one. However, they each should be close to one.</p> </div> <p id="element-305"><a class="autogenerated-content" href="#id9703284">(Figure)</a> represents the heights, in inches, of a sample of 100 male semiprofessional soccer players.</p> <table id="id9703284" summary="This table presents a range of heights in inches in the first column, the number of students whose height falls within that range in the second column, the relative frequency of students in this range (expressed as both a fraction and a decimal) in the third column, and the cumulative relative frequency (expressed as a sum of current and previous relative frequency values) in the fourth column."><caption><span data-type="title">Frequency Table of Soccer Player Height</span></caption> <thead><tr><th>HEIGHTS <span data-type="newline"><br /> </span>(INCHES)</th> <th>FREQUENCY</th> <th>RELATIVE <span data-type="newline"><br /> </span>FREQUENCY</th> <th>CUMULATIVE <span data-type="newline"><br /> </span>RELATIVE <span data-type="newline"><br /> </span>FREQUENCY</th> </tr> </thead> <tbody><tr><td><span class="normal" data-type="emphasis" data-effect="normal">59.95–61.95</span></td> <td>5</td> <td>\(\frac{5}{100}\) = 0.05</td> <td>0.05</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">61.95–63.95</span></td> <td>3</td> <td>\(\frac{3}{100}\) = 0.03</td> <td>0.05 + 0.03 = 0.08</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">63.95–65.95</span></td> <td>15</td> <td>\(\frac{15}{100}\) = 0.15</td> <td>0.08 + 0.15 = 0.23</td> </tr> <tr><td><span class="normal" data-type="emphasis" 
data-effect="normal">65.95–67.95</span></td> <td>40</td> <td>\(\frac{40}{100}\) = 0.40</td> <td>0.23 + 0.40 = 0.63</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">67.95–69.95</span></td> <td>17</td> <td>\(\frac{17}{100}\) = 0.17</td> <td>0.63 + 0.17 = 0.80</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">69.95–71.95</span></td> <td>12</td> <td>\(\frac{12}{100}\) = 0.12</td> <td>0.80 + 0.12 = 0.92</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">71.95–73.95</span></td> <td>7</td> <td>\(\frac{7}{100}\) = 0.07</td> <td>0.92 + 0.07 = 0.99</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">73.95–75.95</span></td> <td>1</td> <td>\(\frac{1}{100}\) = 0.01</td> <td>0.99 + 0.01 = 1.00</td> </tr> <tr><td></td> <td><strong>Total = 100</strong></td> <td><strong>Total = 1.00</strong></td> <td></td> </tr> </tbody> </table> <p>The data in this table have been <strong>grouped</strong> into the following intervals:</p> <ul id="element-634"><li>59.95 to 61.95 inches</li> <li>61.95 to 63.95 inches</li> <li>63.95 to 65.95 inches</li> <li>65.95 to 67.95 inches</li> <li>67.95 to 69.95 inches</li> <li>69.95 to 71.95 inches</li> <li>71.95 to 73.95 inches</li> <li>73.95 to 75.95 inches</li> </ul> <div id="id16927214" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idp9286128">This example is used again in <a href="/contents/67ff0f10-8867-4852-85f6-0f0be2257ed4">Descriptive Statistics</a>, where the method used to compute the intervals will be explained.</p> </div> <p id="element-689">In this sample, there are <strong>five</strong> players whose heights fall within the interval 59.95–61.95 inches, <strong>three</strong> players whose heights fall within the interval 61.95–63.95 inches, <strong>15</strong> players whose heights fall within the interval 63.95–65.95 inches, <strong>40</strong> players whose heights 
fall within the interval 65.95–67.95 inches, <strong>17</strong> players whose heights fall within the interval 67.95–69.95 inches, <strong>12</strong> players whose heights fall within the interval 69.95–71.95 inches, <strong>seven</strong> players whose heights fall within the interval 71.95–73.95 inches, and <strong>one</strong> player whose height falls within the interval 73.95–75.95 inches. All heights fall between the endpoints of an interval and not at the endpoints.</p> <div id="element-23523" class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div id="id16927329" data-type="problem"><p id="element-455">From <a class="autogenerated-content" href="#id9703284">(Figure)</a>, find the percentage of heights that are less than 65.95 inches.</p> </div> <div id="id16927349" data-type="solution"><p id="element-963">If you look at the first, second, and third rows, the heights are all less than 65.95 inches. There are 5 + 3 + 15 = 23 players whose heights are less than 65.95 inches. The percentage of heights less than 65.95 inches is then \(\frac{23}{100}\) or 23%. 
This percentage is the cumulative relative frequency entry in the third row.</p> </div> </div> </div> <div id="fs-idm40908240" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp57790448" data-type="exercise"><div id="fs-idm58055472" data-type="problem"><p id="fs-idm72824256"><a class="autogenerated-content" href="#fs-idm82680048">(Figure)</a> shows the amount, in inches, of annual rainfall in a sample of towns.</p> <table id="fs-idm82680048" summary="Table shows the amount, in inches, of annual rainfall in a sample of towns"><thead><tr><th>Rainfall (Inches)</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td>2.95–4.97</td> <td>6</td> <td>\(\frac{6}{50}\) = 0.12</td> <td>0.12</td> </tr> <tr><td>4.97–6.99</td> <td>7</td> <td>\(\frac{7}{50}\) = 0.14</td> <td>0.12 + 0.14 = 0.26</td> </tr> <tr><td>6.99–9.01</td> <td>15</td> <td>\(\frac{15}{50}\) = 0.30</td> <td>0.26 + 0.30 = 0.56</td> </tr> <tr><td>9.01–11.03</td> <td>8</td> <td>\(\frac{8}{50}\) = 0.16</td> <td>0.56 + 0.16 = 0.72</td> </tr> <tr><td>11.03–13.05</td> <td>9</td> <td>\(\frac{9}{50}\) = 0.18</td> <td>0.72 + 0.18 = 0.90</td> </tr> <tr><td>13.05–15.07</td> <td>5</td> <td>\(\frac{5}{50}\) = 0.10</td> <td>0.90 + 0.10 = 1.00</td> </tr> <tr><td></td> <td>Total = 50</td> <td>Total = 1.00</td> <td></td> </tr> </tbody> </table> <p id="fs-idm81784864">From <a class="autogenerated-content" href="#fs-idm82680048">(Figure)</a>, find the percentage of rainfall that is less than 9.01 inches.</p> </div> </div> </div> <div id="element-2398" class="textbox textbox--examples" data-type="example"><div id="element-987" data-type="exercise"><div id="id16927407" data-type="problem"><p id="element-266">From <a class="autogenerated-content" href="#id9703284">(Figure)</a>, find the percentage of heights that fall between 61.95 and 65.95 inches.</p> </div> <div id="id16927427" 
data-type="solution"><p>Add the relative frequencies in the second and third rows: 0.03 + 0.15 = 0.18 or 18%.</p> </div> </div> </div> <div id="fs-idp124591904" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm61239616" data-type="exercise"><div id="fs-idm61142976" data-type="problem"><p id="fs-idm71703488">From <a class="autogenerated-content" href="#fs-idm82680048">(Figure)</a>, find the percentage of rainfall that is between 6.99 and 13.05 inches.</p> </div> </div> </div> <div id="element-235235" class="textbox textbox--examples" data-type="example"><div id="element-994" data-type="exercise"><div id="id16927469" data-type="problem"><p id="element-952">Use the heights of the 100 male semiprofessional soccer players in <a class="autogenerated-content" href="#id9703284">(Figure)</a>. Fill in the blanks and check your answers.</p> <ol id="element-162" type="a"><li>The percentage of heights that are from 67.95 to 71.95 inches is: ____.</li> <li>The percentage of heights that are from 67.95 to 73.95 inches is: ____.</li> <li>The percentage of heights that are more than 65.95 inches is: ____.</li> <li>The number of players in the sample who are between 61.95 and 71.95 inches tall is: ____.</li> <li>What kind of data are the heights?</li> <li>Describe how you could gather this data (the heights) so that the data are characteristic of all male semiprofessional soccer players.</li> </ol> <p id="element-683">Remember, you <strong>count frequencies</strong>. To find the relative frequency, divide the frequency by the total number of data values. 
To find the cumulative relative frequency, add all of the previous relative frequencies to the relative frequency for the current row.</p> </div> <div id="id16927576" data-type="solution"><ol id="solution-list-1" type="a"><li>29%</li> <li>36%</li> <li>77%</li> <li>87</li> <li>quantitative continuous</li> <li>get rosters from each team and choose a simple random sample from each</li> </ol> </div> </div> </div> <div id="fs-idp31175024" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm58580624" data-type="exercise"><div id="fs-idm39695376" data-type="problem"><p id="fs-idm98844192">From <a class="autogenerated-content" href="#fs-idm82680048">(Figure)</a>, find the number of towns that have rainfall between 2.95 and 9.01 inches.</p> </div> </div> </div> <div id="res" class="statistics collab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Exercise</div> <p id="id8661133">In your class, have someone conduct a survey of the number of siblings (brothers and sisters) each student has. Create a frequency table. Add to it a relative frequency column and a cumulative relative frequency column. Answer the following questions:</p> <ol><li>What percentage of the students in your class have no siblings?</li> <li>What percentage of the students have from one to three siblings?</li> <li>What percentage of the students have fewer than three siblings?</li> </ol> </div> <div id="element-569" class="textbox textbox--examples" data-type="example"><p id="element-755">Nineteen people were asked how many miles, to the nearest mile, they commute to work each day. 
The data are as follows: <span id="set-element-392" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">2 </span><span data-type="item">5 </span><span data-type="item">7 </span><span data-type="item">3 </span><span data-type="item">2 </span><span data-type="item">10 </span><span data-type="item">18 </span><span data-type="item">15 </span><span data-type="item">20 </span><span data-type="item">7 </span><span data-type="item">10 </span><span data-type="item">18 </span><span data-type="item">5 </span><span data-type="item">12 </span><span data-type="item">13 </span><span data-type="item">12 </span><span data-type="item">4 </span><span data-type="item">5 </span><span data-type="item">10</span></span>. <a class="autogenerated-content" href="#id9833287">(Figure)</a> was produced:</p> <table id="id9833287" summary="This table presents the number of miles driven by survey respondents in the first column, the frequency of each response in the second column, the relative frequency (expressed as a fraction) in the third column, and the cumulative relative frequency (expressed as a decimal) in the fourth column."><caption><span data-type="title">Frequency of Commuting Distances</span></caption> <thead><tr><th>DATA</th> <th>FREQUENCY</th> <th>RELATIVE <span data-type="newline"><br /> </span>FREQUENCY</th> <th>CUMULATIVE <span data-type="newline"><br /> </span>RELATIVE <span data-type="newline"><br /> </span>FREQUENCY</th> </tr> </thead> <tbody><tr><td><span class="normal" data-type="emphasis" data-effect="normal">3</span></td> <td>3</td> <td>\(\frac{3}{19}\)</td> <td>0.1579</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">4</span></td> <td>1</td> <td>\(\frac{1}{19}\)</td> <td>0.2105</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">5</span></td> <td>3</td> <td>\(\frac{3}{19}\)</td> <td>0.1579</td> </tr> <tr><td><span class="normal" data-type="emphasis" 
data-effect="normal">7</span></td> <td>2</td> <td>\(\frac{2}{19}\)</td> <td>0.2632</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">10</span></td> <td>3</td> <td>\(\frac{4}{19}\)</td> <td>0.4737</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">12</span></td> <td>2</td> <td>\(\frac{2}{19}\)</td> <td>0.7895</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">13</span></td> <td>1</td> <td>\(\frac{1}{19}\)</td> <td>0.8421</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">15</span></td> <td>1</td> <td>\(\frac{1}{19}\)</td> <td>0.8948</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">18</span></td> <td>1</td> <td>\(\frac{1}{19}\)</td> <td>0.9474</td> </tr> <tr><td><span class="normal" data-type="emphasis" data-effect="normal">20</span></td> <td>1</td> <td>\(\frac{1}{19}\)</td> <td>1.0000</td> </tr> </tbody> </table> <div id="element-268" data-type="exercise"><div id="id16928563" data-type="problem"><ol id="element-582" type="a"><li>Is the table correct? If it is not correct, what is wrong?</li> <li>True or False: Three percent of the people surveyed commute three miles. If the statement is not correct, what should it be? If the table is incorrect, make the corrections.</li> <li>What fraction of the people surveyed commute five or seven miles?</li> <li>What fraction of the people surveyed commute 12 miles or more? Less than 12 miles? Between five and 13 miles (not including five and 13 miles)?</li> </ol> </div> <div id="id16928619" data-type="solution"><ol id="solution-list-2" type="a"><li>No. The frequency column sums to 18, not 19. Not all cumulative relative frequencies are correct.</li> <li>False. The frequency for three miles should be one; for two miles (left out), two. 
The cumulative relative frequency column should read: 0.1053, 0.1579, 0.2105, 0.3684, 0.4737, 0.6316, 0.7368, 0.7895, 0.8421, 0.9474, 1.0000.</li> <li>\(\frac{5}{19}\)</li> <li>\(\frac{7}{19}\), \(\frac{12}{19}\), \(\frac{7}{19}\)</li> </ol> </div> </div> </div> <div id="fs-idp46953216" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm57469440" data-type="exercise"><div id="fs-idm86905392" data-type="problem"><p id="fs-idm12777216"><a class="autogenerated-content" href="#fs-idm82680048">(Figure)</a> represents the amount, in inches, of annual rainfall in a sample of towns. What fraction of towns surveyed get between 11.03 and 13.05 inches of rainfall each year?</p> </div> </div> </div> <div id="fs-idm64477536" class="textbox textbox--examples" data-type="example"><p id="fs-idm39829664"><a class="autogenerated-content" href="#fs-idm52810848">(Figure)</a> contains the total number of deaths worldwide as a result of earthquakes for the period from 2000 to 2012.</p> <table id="fs-idm52810848" summary="This table represents the total number of deaths worldwide as a result of earthquakes for the period from 2000 to 2012."><thead><tr><th>Year</th> <th>Total Number of Deaths</th> </tr> </thead> <tbody><tr><td>2000</td> <td>231</td> </tr> <tr><td>2001</td> <td>21,357</td> </tr> <tr><td>2002</td> <td>11,685</td> </tr> <tr><td>2003</td> <td>33,819</td> </tr> <tr><td>2004</td> <td>228,802</td> </tr> <tr><td>2005</td> <td>88,003</td> </tr> <tr><td>2006</td> <td>6,605</td> </tr> <tr><td>2007</td> <td>712</td> </tr> <tr><td>2008</td> <td>88,011</td> </tr> <tr><td>2009</td> <td>1,790</td> </tr> <tr><td>2010</td> <td>320,120</td> </tr> <tr><td>2011</td> <td>21,953</td> </tr> <tr><td>2012</td> <td>768</td> </tr> <tr><td>Total</td> <td>823,856</td> </tr> </tbody> </table> <div id="fs-idm15443744" data-type="exercise"><div id="fs-idm56505488" data-type="problem"><p id="fs-idm38990240">Answer the following 
questions.</p> <ol id="fs-idm21722256" type="a"><li>What is the frequency of deaths measured from 2006 through 2009?</li> <li>What percentage of deaths occurred after 2009?</li> <li>What is the relative frequency of deaths that occurred in 2003 or earlier?</li> <li>What is the percentage of deaths that occurred in 2004?</li> <li>What kind of data are the numbers of deaths?</li> <li>The Richter scale is used to quantify the energy produced by an earthquake. Examples of Richter scale numbers are 2.3, 4.0, 6.1, and 7.0. What kind of data are these numbers?</li> </ol> </div> <div id="fs-idm33435392" data-type="solution"><ol id="fs-idm15442992" type="a"><li>97,118 (11.8%)</li> <li>41.6%</li> <li>67,092/823,856 or 0.081 or 8.1%</li> <li>27.8%</li> <li>Quantitative discrete</li> <li>Quantitative continuous</li> </ol> </div> </div> </div> <div id="fs-idm44002848" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm62775664" data-type="exercise"><div id="fs-idm9924272" data-type="problem"><p id="fs-idm65128336"><a class="autogenerated-content" href="#fs-idm74902768">(Figure)</a> contains the total number of fatal motor vehicle traffic crashes in the United States for the period from 1994 to 2011.</p> <table id="fs-idm74902768" summary=""><thead><tr><th>Year</th> <th>Total Number of Crashes</th> <th>Year</th> <th>Total Number of Crashes</th> </tr> </thead> <tbody><tr><td>1994</td> <td>36,254</td> <td>2004</td> <td>38,444</td> </tr> <tr><td>1995</td> <td>37,241</td> <td>2005</td> <td>39,252</td> </tr> <tr><td>1996</td> <td>37,494</td> <td>2006</td> <td>38,648</td> </tr> <tr><td>1997</td> <td>37,324</td> <td>2007</td> <td>37,435</td> </tr> <tr><td>1998</td> <td>37,107</td> <td>2008</td> <td>34,172</td> </tr> <tr><td>1999</td> <td>37,140</td> <td>2009</td> <td>30,862</td> </tr> <tr><td>2000</td> <td>37,526</td> <td>2010</td> <td>30,296</td> </tr> <tr><td>2001</td> <td>37,862</td> <td>2011</td> 
<td>29,757</td> </tr> <tr><td>2002</td> <td>38,491</td> <td>Total</td> <td>653,782</td> </tr> <tr><td>2003</td> <td>38,477</td> <td></td> <td></td> </tr> </tbody> </table> <p id="fs-idm72473616">Answer the following questions.</p> <ol id="fs-idm94591440" type="a"><li>What is the frequency of fatal crashes measured from 2000 through 2004?</li> <li>What percentage of fatal crashes occurred after 2006?</li> <li>What is the relative frequency of fatal crashes that occurred in 2000 or before?</li> <li>What is the percentage of fatal crashes that occurred in 2011?</li> <li>What is the cumulative relative frequency for 2006? Explain what this number tells you about the data.</li> </ol> </div> </div> </div> </div> <div class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="eip-idm114490208">“State &amp; County QuickFacts,” U.S. Census Bureau. http://quickfacts.census.gov/qfd/download_data.html (accessed May 1, 2013).</p> <p id="eip-idm217605552">“State &amp; County QuickFacts: Quick, easy access to facts about people, business, and geography,” U.S. Census Bureau. http://quickfacts.census.gov/qfd/index.html (accessed May 1, 2013).</p> <p id="eip-idm85470464">“Table 5: Direct hits by mainland United States Hurricanes (1851-2004),” National Hurricane Center, http://www.nhc.noaa.gov/gifs/table5.gif (accessed May 1, 2013).</p> <p id="eip-idm126552704">“Levels of Measurement,” http://infinity.cos.edu/faculty/woodbury/stats/tutorial/Data_Levels.htm (accessed May 1, 2013).</p> <p id="eip-idm96097840">Courtney Taylor, “Levels of Measurement,” about.com, http://statistics.about.com/od/HelpandTutorials/a/Levels-Of-Measurement.htm (accessed May 1, 2013).</p> <p id="eip-idm150901088">David Lane. “Levels of Measurement,” Connexions, http://cnx.org/content/m10809/latest/ (accessed May 1, 2013).</p> </div> <div id="fs-idm13164064" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm13630672">Some calculations generate numbers that are artificially precise. 
It is not necessary to report a value to eight decimal places when the measures that generated that value were only accurate to the nearest tenth. Round off your final answer to one more decimal place than was present in the original data. This means that if you have data measured to the nearest tenth of a unit, report the final statistic to the nearest hundredth.</p> <p id="eip-50">Data can be classified into the following four levels of measurement.</p> <ul id="eip-349"><li><strong>Nominal scale level:</strong> data that cannot be ordered and cannot be used in calculations</li> <li><strong>Ordinal scale level:</strong> data that can be ordered; the differences cannot be measured</li> <li><strong>Interval scale level:</strong> data with a definite ordering but no starting point; the differences can be measured, but ratios are not meaningful</li> <li><strong>Ratio scale level:</strong> data with a starting point that can be ordered; the differences have meaning and ratios can be calculated</li> </ul> <p id="fs-idm83721584">When organizing data, it is important to know how many times a value appears. How many statistics students study five hours or more for an exam? What percent of families on our block own two pets? Frequency, relative frequency, and cumulative relative frequency are measures that answer questions like these.</p> </div> <div id="eip-206" class="practice" data-depth="1"><div id="eip-167" data-type="exercise"><div id="eip-429" data-type="problem"><p id="eip-idm3326416">What type of measurement scale is being used? 
Nominal, ordinal, interval, or ratio?</p> <ol id="eip-idm4124352" type="a" data-element-type="enumerated"><li>High school soccer players classified by their athletic ability: Superior, Average, Above average</li> <li>Baking temperatures for various main dishes: 350, 400, 325, 250, 300</li> <li>The colors of crayons in a 24-crayon box</li> <li>Social security numbers</li> <li>Incomes measured in dollars</li> <li>A satisfaction survey of a social website by number: 1 = very satisfied, 2 = somewhat satisfied, 3 = not satisfied</li> <li>Political outlook: extreme left, left-of-center, right-of-center, extreme right</li> <li>Time of day on an analog watch</li> <li>The distance in miles to the closest grocery store</li> <li>The dates 1066, 1492, 1644, 1947, and 1944</li> <li>The heights of 21–65-year-old women</li> <li>Common letter grades: A, B, C, D, and F</li> </ol> </div> <div id="eip-171" data-type="solution"><ol id="eip-idm34826016" type="a" data-element-type="enumerated"><li>ordinal</li> <li>interval</li> <li>nominal</li> <li>nominal</li> <li>ratio</li> <li>ordinal</li> <li>nominal</li> <li>interval</li> <li>ratio</li> <li>interval</li> <li>ratio</li> <li>ordinal</li> </ol> </div> </div> </div> <div id="fs-idm8238960" class="free-response" data-depth="1"><h3 data-type="title">HOMEWORK</h3> <div id="eip-idp18600336" data-type="exercise"><div id="id29870590" data-type="problem"><p>1) Fifty part-time students were asked how many courses they were taking this term. The (incomplete) results are shown below:</p> <table id="element-795" summary="This incomplete table will be completed by the student as an exercise using the values provided. 
Fifty students were asked how many courses they were taking; for every value in the first column, students are given or must calculate the frequency of responses in the second column, the relative frequency in the third column, and the cumulative relative frequency in the fourth column."><caption><span data-type="title">Part-time Student Course Loads</span></caption> <thead><tr><th># of Courses</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td>1</td> <td>30</td> <td>0.6</td> <td></td> </tr> <tr><td>2</td> <td>15</td> <td></td> <td></td> </tr> <tr><td>3</td> <td></td> <td></td> <td></td> </tr> </tbody> </table> <ol id="element-190" type="a" data-mark-suffix="."><li>Fill in the blanks in <a class="autogenerated-content" href="#element-795">(Figure)</a>.</li> <li>What percent of students take exactly two courses?</li> <li>What percent of students take one or two courses?</li> </ol> </div> <table id="eip-idm19957776" summary="..."><thead></thead> </table> </div> <div id="eip-idp1177440" data-type="exercise"><div id="id29870923" data-type="problem"><p id="element-426">2) Sixty adults with gum disease were asked the number of times per week they used to floss before their diagnosis. 
The (incomplete) results are shown in <a class="autogenerated-content" href="#element-619">(Figure)</a>.</p> <table id="element-619" summary="Flossing Frequency for Adults with Gum Disease"><caption><span data-type="title">Flossing Frequency for Adults with Gum Disease</span></caption> <thead><tr><th># Flossing per Week</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Freq.</th> </tr> </thead> <tbody><tr><td>0</td> <td>27</td> <td>0.4500</td> <td></td> </tr> <tr><td>1</td> <td>18</td> <td></td> <td></td> </tr> <tr><td>3</td> <td></td> <td></td> <td>0.9333</td> </tr> <tr><td>6</td> <td>3</td> <td>0.0500</td> <td></td> </tr> <tr><td>7</td> <td>1</td> <td>0.0167</td> <td></td> </tr> </tbody> </table> <ol id="element-502" type="a" data-mark-suffix="."><li>Fill in the blanks in <a class="autogenerated-content" href="#element-619">(Figure)</a>.</li> <li>What percent of adults flossed six times per week?</li> <li>What percent flossed at most three times per week?</li> </ol> </div> <div id="id30125528" data-type="solution"></div> </div> <div id="eip-idp3710944" data-type="exercise"><div id="id30202348" data-type="problem"><p id="element-314">3) Nineteen immigrants to the U.S. were asked how many years, to the nearest year, they have lived in the U.S. 
The data are as follows: <span id="set-element-773" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">2</span><span data-type="item">5</span><span data-type="item">7</span><span data-type="item">2</span><span data-type="item">2</span><span data-type="item">10</span><span data-type="item">20</span><span data-type="item">15</span><span data-type="item">0</span><span data-type="item">7</span><span data-type="item">0</span><span data-type="item">20</span><span data-type="item">5</span><span data-type="item">12</span><span data-type="item">15</span><span data-type="item">12</span><span data-type="item">4</span><span data-type="item">5</span><span data-type="item">10</span></span>.</p> <p id="element-20"><a class="autogenerated-content" href="#element-9">(Figure)</a> was produced.</p> <table id="element-9" summary="This table presents the frequencies associated with the given data set, with the first column containing the data value, the second column containing the frequency, the third column containing the relative frequency (expressed as a fraction), and the fourth column expressing the cumulative relative frequency (expressed as a decimal)."><caption><span data-type="title">Frequency of Immigrant Survey Responses</span></caption> <thead><tr><th>Data</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td>0</td> <td>2</td> <td>\(\frac{2}{19}\)</td> <td>0.1053</td> </tr> <tr><td>2</td> <td>3</td> <td>\(\frac{3}{19}\)</td> <td>0.2632</td> </tr> <tr><td>4</td> <td>1</td> <td>\(\frac{1}{19}\)</td> <td>0.3158</td> </tr> <tr><td>5</td> <td>3</td> <td>\(\frac{3}{19}\)</td> <td>0.4737</td> </tr> <tr><td>7</td> <td>2</td> <td>\(\frac{2}{19}\)</td> <td>0.5789</td> </tr> <tr><td>10</td> <td>2</td> <td>\(\frac{2}{19}\)</td> <td>0.6842</td> </tr> <tr><td>12</td> <td>2</td> <td>\(\frac{2}{19}\)</td> <td>0.7895</td> </tr> <tr><td>15</td> <td>1</td> <td>\(\frac{1}{19}\)</td> 
<td>0.8421</td> </tr> <tr><td>20</td> <td>1</td> <td>\(\frac{1}{19}\)</td> <td>1.0000</td> </tr> </tbody> </table> <ol id="element-650" type="a" data-mark-suffix="."><li>Fix the errors in <a class="autogenerated-content" href="#element-9">(Figure)</a>. Also, explain how someone might have arrived at the incorrect number(s).</li> <li>Explain what is wrong with this statement: “47 percent of the people surveyed have lived in the U.S. for 5 years.”</li> <li>Fix the statement in <strong>b</strong> to make it correct.</li> <li>What fraction of the people surveyed have lived in the U.S. five or seven years?</li> <li>What fraction of the people surveyed have lived in the U.S. at most 12 years?</li> <li>What fraction of the people surveyed have lived in the U.S. fewer than 12 years?</li> <li>What fraction of the people surveyed have lived in the U.S. from five to 20 years, inclusive?</li> </ol> </div> </div> <div id="eip-idp24902672" data-type="exercise"><div id="eip-idp50833472" data-type="problem"><p id="fs-idp1582784">4) How much time does it take to travel to work? <a class="autogenerated-content" href="#fs-idm68779488">(Figure)</a> shows the mean commute time by state for workers at least 16 years old who are not working at home. 
Find the mean travel time, and round off the answer properly.</p> <table id="fs-idm68779488" summary=""><tbody><tr><td>24.0</td> <td>24.3</td> <td>25.9</td> <td>18.9</td> <td>27.5</td> <td>17.9</td> <td>21.8</td> <td>20.9</td> <td>16.7</td> <td>27.3</td> </tr> <tr><td>18.2</td> <td>24.7</td> <td>20.0</td> <td>22.6</td> <td>23.9</td> <td>18.0</td> <td>31.4</td> <td>22.3</td> <td>24.0</td> <td>25.5</td> </tr> <tr><td>24.7</td> <td>24.6</td> <td>28.1</td> <td>24.9</td> <td>22.6</td> <td>23.6</td> <td>23.4</td> <td>25.7</td> <td>24.8</td> <td>25.5</td> </tr> <tr><td>21.2</td> <td>25.7</td> <td>23.1</td> <td>23.0</td> <td>23.9</td> <td>26.0</td> <td>16.3</td> <td>23.1</td> <td>21.4</td> <td>21.5</td> </tr> <tr><td>27.0</td> <td>27.0</td> <td>18.6</td> <td>31.7</td> <td>23.3</td> <td>30.1</td> <td>22.9</td> <td>23.3</td> <td>21.7</td> <td>18.6</td> </tr> </tbody> </table> </div> <div id="eip-idp10180912" data-type="solution"><p id="eip-idm56114560"></p></div> </div> <div id="eip-idm71401488" data-type="exercise"><div id="eip-idm71401232" data-type="problem"><p id="fs-idm72637568">5) <em data-effect="italics">Forbes</em> magazine published data on the best small firms in 2012. These were firms that had been publicly traded for at least a year, had a stock price of at least $5 per share, and had reported annual revenue between $5 million and $1 billion. 
<a class="autogenerated-content" href="#fs-idm84109040">(Figure)</a> shows the ages of the chief executive officers for the first 60 ranked firms.</p> <table id="fs-idm84109040" summary="Table shows the ages of the chief executive officers for the first 60 ranked firms"><colgroup><col /> <col data-align="right" /> <col /> <col /></colgroup> <thead><tr><th>Age</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td>40–44</td> <td>3</td> <td></td> <td></td> </tr> <tr><td>45–49</td> <td>11</td> <td></td> <td></td> </tr> <tr><td>50–54</td> <td>13</td> <td></td> <td></td> </tr> <tr><td>55–59</td> <td>16</td> <td></td> <td></td> </tr> <tr><td>60–64</td> <td>10</td> <td></td> <td></td> </tr> <tr><td>65–69</td> <td>6</td> <td></td> <td></td> </tr> <tr><td>70–74</td> <td>1</td> <td></td> <td></td> </tr> </tbody> </table> <ol id="fs-idm95668240" type="a"><li>What is the frequency for CEO ages between 54 and 65?</li> <li>What percentage of CEOs are 65 years or older?</li> <li>What is the relative frequency of ages under 50?</li> <li>What is the cumulative relative frequency for CEOs younger than 55?</li> <li>Which graph shows the relative frequency and which shows the cumulative relative frequency?</li> </ol> <div id="eip-idp25576880" class="bc-figure figure" data-orient="horizontal"><span id="fs-idm20141232" data-type="media" data-alt="Graph A is a bar graph with 7 bars. The x-axis shows CEO's ages in intervals of 5 years starting with 40 - 44. The y-axis shows the relative frequency in intervals of 0.2 from 0 - 1. The highest relative frequency shown is 0.27. Graph B is a bar graph with 7 bars. The x-axis shows CEO's ages in intervals of 5 years starting with 40 - 44. The y-axis shows relative frequency in intervals of 0.2 from 0 - 1. 
The highest relative frequency shown is 1."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C01_M10_003-1.jpg" alt="Graph A is a bar graph with 7 bars. The x-axis shows CEO's ages in intervals of 5 years starting with 40 - 44. The y-axis shows the relative frequency in intervals of 0.2 from 0 - 1. The highest relative frequency shown is 0.27. Graph B is a bar graph with 7 bars. The x-axis shows CEO's ages in intervals of 5 years starting with 40 - 44. The y-axis shows relative frequency in intervals of 0.2 from 0 - 1. The highest relative frequency shown is 1." data-media-type="image/jpg" /></span></div> </div> </div> <p id="id5096156"><em data-effect="italics">Use the following information to answer the next two exercises:</em> <a class="autogenerated-content" href="#id9247047">(Figure)</a> contains data on hurricanes that have made direct hits on the U.S. between 1851 and 2004. A hurricane is given a strength category rating based on the minimum wind speed generated by the storm.</p> <table id="id9247047" summary="This incomplete table presents frequency data related to hurricane activity. The first column displays the hurricane category, the second column displays the frequency of direct hits, the third column presents the relative frequency (expressed as a decimal), and the fourth column presents the cumulative relative frequency (expressed as a decimal). 
The two missing values are to be calculated by the students in the upcoming exercises."><caption><span data-type="title">Frequency of Hurricane Direct Hits</span></caption> <thead><tr><th>Category</th> <th>Number of Direct Hits</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tfoot><tr><td></td> <td>Total = 273</td> <td></td> <td></td> </tr> </tfoot> <tbody><tr><td>1</td> <td>109</td> <td>0.3993</td> <td>0.3993</td> </tr> <tr><td>2</td> <td>72</td> <td>0.2637</td> <td>0.6630</td> </tr> <tr><td>3</td> <td>71</td> <td>0.2601</td> <td></td> </tr> <tr><td>4</td> <td>18</td> <td></td> <td>0.9890</td> </tr> <tr><td>5</td> <td>3</td> <td>0.0110</td> <td>1.0000</td> </tr> </tbody> </table> <div id="element-87" data-type="exercise"><div id="id30104614" data-type="problem"><p id="element-933">6) What is the relative frequency of direct hits that were category 4 hurricanes?</p> <ol id="element-662" type="a" data-mark-suffix="."><li>0.0768</li> <li>0.0659</li> <li>0.2601</li> <li>Not enough information to calculate</li> </ol> </div> </div> <div id="element-627" data-type="exercise"><div id="id30104731" data-type="problem"><p id="element-544">7) What is the relative frequency of direct hits that were AT MOST a category 3 storm?</p> <ol type="a" data-mark-suffix="."><li>0.3480</li> <li>0.9231</li> <li>0.2601</li> <li>0.3370</li> </ol> </div> <p>Answers to Odd-Numbered Questions</p> <p>1) b. 30% c. 90%</p> <p>3) a. The Frequencies for 15 and 20 should both be two, and the Relative Frequencies should both be \(\frac{2}{19}\). The mistake could be due to copying the data down wrong. The Cumulative Relative Frequency for five years should be 0.4737. The mistake is due to calculating the Relative Frequency instead of the Cumulative Relative Frequency. The Cumulative Relative Frequency for 15 years should be 0.8947. b. The 47% is the Cumulative Relative Frequency, not the Relative Frequency. c. 47% of the people surveyed have lived in the U.S. for five years or less. 
d. \(\frac{5}{19}\) e. \(\frac{15}{19}\) f. \(\frac{13}{19}\) g. \(\frac{13}{19}\)</p> <p>5) a. 26 (This is the count of CEOs in the 55 to 59 and 60 to 64 categories.) b. 12% (number of CEOs age 65 or older ÷ total number of CEOs) c. 14/60; 0.23; 23% d. 0.45 e. Graph A represents the cumulative relative frequency, and Graph B shows the relative frequency.</p> <p>7) b</p> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="cumrelfreq"><dt>Cumulative Relative Frequency</dt> <dd id="id19883082">The term applies to an ordered set of observations from smallest to largest. The cumulative relative frequency is the sum of the relative frequencies for all values that are less than or equal to the given value.</dd> </dl> <dl id="freq"><dt>Frequency</dt> <dd id="id19849552">the number of times a value of the data occurs</dd> </dl> <dl id="relfreq"><dt>Relative Frequency</dt> <dd id="id5747717">the ratio of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes</dd> </dl> </div> </div></div>
<div class="chapter standard" id="chapter-stem-and-leaf-graphs-stemplots-line-graphs-and-bar-graphs" title="Chapter 2.3: Bar Graphs, Histograms, and Stem-and-Leaf Graphs (Stemplots)"><div class="chapter-title-wrap"><h3 class="chapter-number">9</h3><h2 class="chapter-title"><span class="display-none">Chapter 2.3: Bar Graphs, Histograms, and Stem-and-Leaf Graphs (Stemplots)</span></h2></div><div class="ugc chapter-ugc"><h2>Bar Graphs</h2> <p id="eip-136"><strong>Bar graphs</strong> consist of bars that are separated from each other. The bars can be rectangles or they can be rectangular boxes (used in three-dimensional plots), and they can be vertical or horizontal. The <strong>bar graph</strong> shown in <a class="autogenerated-content" href="#example5">(Figure)</a> has age groups represented on the <strong><em data-effect="italics">x</em>-axis</strong> and proportions on the <strong><em data-effect="italics">y</em>-axis</strong>.</p> <div id="example5" class="textbox textbox--examples" data-type="example"><div id="fs-idm7260336" data-type="exercise"><div id="fs-idp99169968" data-type="problem"><p id="eip-666">By the end of 2011, Facebook had over 146 million users in the United States. <a class="autogenerated-content" href="#M01_Ch02_tbl010">(Figure)</a> shows three age groups, the number of users in each age group, and the proportion (%) of users in each age group. Construct a bar graph using this data.</p> <table id="M01_Ch02_tbl010" summary="The information is from Facebook. 
The first column of the table displays age groups, the second column displays number of Facebook users and the third column displays percentages."><thead><tr><th>Age groups</th> <th>Number of Facebook users</th> <th>Proportion (%) of Facebook users</th> </tr> </thead> <tbody><tr><td>13–25</td> <td>65,082,280</td> <td>45%</td> </tr> <tr><td>26–44</td> <td>53,300,200</td> <td>36%</td> </tr> <tr><td>45–64</td> <td>27,885,100</td> <td>19%</td> </tr> </tbody> </table> </div> <div id="fs-idp32850832" data-type="solution"><div id="fs-idm25485824" class="bc-figure figure"><span id="bar_graph_Facebook" data-type="media" data-display="block" data-alt="This is a bar graph that matches the supplied data. The x-axis shows age groups, and the y-axis shows the percentages of Facebook users."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_03_02-1.jpg" alt="This is a bar graph that matches the supplied data. The x-axis shows age groups, and the y-axis shows the percentages of Facebook users." width="380" data-media-type="image/jpg" /></span></div> </div> </div> </div> <div id="fs-idp2492880" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm47906688" data-type="exercise"><div id="fs-idm7708128" data-type="problem"><p id="fs-idp6746384">The population in Park City is made up of children, working-age adults, and retirees. <a class="autogenerated-content" href="#M01_Ch02_tbl011">(Figure)</a> shows the three age groups, the number of people in the town from each age group, and the proportion (%) of people in each age group. 
Construct a bar graph showing the proportions.</p> <table id="M01_Ch02_tbl011" summary=""><thead><tr><th>Age groups</th> <th>Number of people</th> <th>Proportion of population</th> </tr> </thead> <tbody><tr><td>Children</td> <td>67,059</td> <td>19%</td> </tr> <tr><td>Working-age adults</td> <td>152,198</td> <td>43%</td> </tr> <tr><td>Retirees</td> <td>131,662</td> <td>38%</td> </tr> </tbody> </table> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><div id="fs-idm25526448" data-type="exercise"><div id="fs-idm19535184" data-type="problem"><p id="eip-655">The columns in <a class="autogenerated-content" href="#M01_Ch02_tbl012">(Figure)</a> contain: the race or ethnicity of students in U.S. public schools for the class of 2011, percentages for the Advanced Placement examinee population for that class, and percentages for the overall student population. Create a bar graph with the student race or ethnicity (qualitative data) on the <em data-effect="italics">x</em>-axis, and the Advanced Placement examinee population percentages on the <em data-effect="italics">y</em>-axis.</p> <table id="M01_Ch02_tbl012" summary="The table shows Race and Ethnicity in the first column, Advanced Placement Examinee Population in the second column and Overall Student Population in the third column."><thead><tr><th>Race/Ethnicity</th> <th>AP Examinee Population</th> <th>Overall Student Population</th> </tr> </thead> <tbody><tr><td>1 = Asian, Asian American or Pacific Islander</td> <td>10.3%</td> <td>5.7%</td> </tr> <tr><td>2 = Black or African American</td> <td>9.0%</td> <td>14.7%</td> </tr> <tr><td>3 = Hispanic or Latino</td> <td>17.0%</td> <td>17.6%</td> </tr> <tr><td>4 = American Indian or Alaska Native</td> <td>0.6%</td> <td>1.1%</td> </tr> <tr><td>5 = White</td> <td>57.1%</td> <td>59.2%</td> </tr> <tr><td>6 = Not reported/other</td> <td>6.0%</td> <td>1.7%</td> </tr> </tbody> </table> </div> <div id="fs-idp30402432" data-type="solution"><div 
id="M01_Ch02_fig003" class="bc-figure figure"><span id="bar_graph_Ap_Examinee" data-type="media" data-display="block" data-alt="This is a bar graph that matches the supplied data. The x-axis shows race and ethnicity, and the y-axis shows the percentages of AP examinees."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_03_03-1.jpg" alt="This is a bar graph that matches the supplied data. The x-axis shows race and ethnicity, and the y-axis shows the percentages of AP examinees." width="380" data-media-type="image/jpg" /></span></div> </div> </div> </div> <div id="fs-idp41522672" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm43065520" data-type="exercise"><div id="fs-idp3652336" data-type="problem"><p id="fs-idp31379616">Park City is broken down into six voting districts. The table shows the percent of the total registered voter population that lives in each district as well as the percent of the overall city population that lives in each district. Construct a bar graph that shows the registered voter population by district.</p> <table id="M01_Ch02_tbl013" summary=""><colgroup><col data-align="center" /> <col /> <col /></colgroup> <thead><tr><th>District</th> <th>Registered voter population</th> <th>Overall city population</th> </tr> </thead> <tbody><tr><td>1</td> <td>15.5%</td> <td>19.4%</td> </tr> <tr><td>2</td> <td>12.2%</td> <td>15.6%</td> </tr> <tr><td>3</td> <td>9.8%</td> <td>9.0%</td> </tr> <tr><td>4</td> <td>17.4%</td> <td>18.5%</td> </tr> <tr><td>5</td> <td>22.8%</td> <td>20.7%</td> </tr> <tr><td>6</td> <td>22.3%</td> <td>16.8%</td> </tr> </tbody> </table> </div> </div> </div> <div id="fs-idm21451296" class="footnotes" data-depth="1"><h2>Histograms</h2> <p id="element-657">For most of the work you do in this book, you will use a histogram to display the data. 
One advantage of a histogram is that it can readily display large data sets. A rule of thumb is to use a histogram when the data set consists of 100 values or more.</p> <p id="element-446">A <span data-type="term">histogram</span> consists of contiguous (adjoining) boxes. It has both a horizontal axis and a vertical axis. The horizontal axis is labeled with what the data represents (for instance, distance from your home to school). The vertical axis is labeled either <span data-type="term">frequency</span> or <span data-type="term">relative frequency</span> (or percent frequency or probability). The graph will have the same shape with either label. The histogram (like the stemplot) can give you the shape of the data, the center, and the spread of the data.</p> <p id="element-123">The relative frequency is equal to the frequency for an observed value of the data divided by the total number of data values in the sample. (Remember, frequency is defined as the number of times an answer occurs.) If:</p> <ul id="element-614"><li><em data-effect="italics">f</em> = frequency</li> <li><em data-effect="italics">n</em> = total number of data values (or the sum of the individual frequencies), and</li> <li><em data-effect="italics">RF</em> = relative frequency,</li> </ul> <p id="element-700">then:</p> <div id="element-1000" data-type="equation">\(\text{RF}=\frac{f}{n}\)</div> <p id="element-323">For example, if three students in Mr. Ahab&#8217;s English class of 40 students received from 90% to 100%, then,</p> <p><em data-effect="italics">f</em> = 3, <em data-effect="italics">n</em> = 40, and <em data-effect="italics">RF</em> = \(\frac{f}{n}\) = \(\frac{3}{40}\) = 0.075. 7.5% of the students received 90–100%. 90–100% are quantitative measures.</p> <p id="element-237"><strong>To construct a histogram</strong>, first decide how many <strong>bars</strong> or <strong>intervals</strong>, also called classes, represent the data. 
Many histograms consist of five to 15 bars or classes for clarity. Choose a starting point for the first interval that is less than the smallest data value. A <strong>convenient starting point</strong> is a lower value carried out to one more decimal place than the value with the most decimal places. For example, if the value with the most decimal places is 6.1 and this is the smallest value, a convenient starting point is 6.05 (6.1 – 0.05 = 6.05). We say that 6.05 has more precision. If the value with the most decimal places is 2.23 and the lowest value is 1.5, a convenient starting point is 1.495 (1.5 – 0.005 = 1.495). If the value with the most decimal places is 3.234 and the lowest value is 1.0, a convenient starting point is 0.9995 (1.0 – 0.0005 = 0.9995). If all the data happen to be integers and the smallest value is two, then a convenient starting point is 1.5 (2 – 0.5 = 1.5). When the starting point and the other boundaries are carried to one additional decimal place, no data value will fall on a boundary. The next two examples go into detail about how to construct a histogram using continuous data and how to create a histogram using discrete data.</p> <div id="exampid1" class="textbox textbox--examples" data-type="example"><p id="element-743">The following data are the heights (in inches to the nearest half inch) of 100 male semiprofessional soccer players. The heights are <strong>continuous</strong> data, since height is measured. 
<span data-type="newline"><br /> </span>60;  60.5;  61;  61;  61.5 <span data-type="newline"><br /> </span>63.5;  63.5;  63.5 <span data-type="newline"><br /> </span>64;  64;  64;  64;  64;  64;  64;  64.5;  64.5;  64.5;  64.5;  64.5;  64.5;  64.5;  64.5 <span data-type="newline"><br /> </span>66;  66;  66;  66;  66;  66;  66;  66;  66;  66;  66.5;  66.5;  66.5;  66.5;  66.5;  66.5;  66.5;  66.5;  66.5;  66.5;  66.5;  67;  67;  67;  67;  67;  67;  67;  67;  67;  67;  67;  67;  67.5;  67.5;  67.5;  67.5;  67.5;  67.5;  67.5 <span data-type="newline"><br /> </span>68;  68;  69;  69;  69;  69;  69;  69;  69;  69;  69;  69;  69.5;  69.5;  69.5;  69.5;  69.5 <span data-type="newline"><br /> </span>70;  70;  70;  70;  70;  70;  70.5;  70.5;  70.5;  71;  71;  71 <span data-type="newline"><br /> </span>72;  72;  72;  72.5;  72.5;  73;  73.5 <span data-type="newline"><br /> </span>74</p> <p id="element-364">The smallest data value is 60. Since the data with the most decimal places has one decimal (for instance, 61.5), we want our starting point to have two decimal places. Since the numbers 0.5, 0.05, 0.005, etc. are convenient numbers, use 0.05 and subtract it from 60, the smallest value, for the convenient starting point.</p> <p id="element-906">60 – 0.05 = 59.95 which is more precise than, say, 61.5 by one decimal place. The starting point is, then, 59.95.</p> <p id="element-291">The largest value is 74, so 74 + 0.05 = 74.05 is the ending value.</p> <p>Next, calculate the width of each bar or class interval. To calculate this width, subtract the starting point from the ending value and divide by the number of bars (you must choose the number of bars you desire). 
Suppose you choose eight bars.</p> <div id="element-2133" data-type="equation">\(\frac{74.05-59.95}{8}=1.76\)</div> <div id="id7476385" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idm11623536">We will round up to two and make each bar or class interval two units wide. Rounding up to two is one way to prevent a value from falling on a boundary. Rounding to the next number is often necessary even if it goes against the standard rules of rounding. For this example, using 1.76 as the width would also work. A guideline that is followed by some for the number of bars or class intervals is to take the square root of the number of data values and then round to the nearest whole number, if necessary. For example, if there are 150 values of data, take the square root of 150 and round to 12 bars or intervals.</p> </div> <p id="element-209">The boundaries are:</p> <ul id="element-790"><li>59.95</li> <li>59.95 + 2 = 61.95</li> <li>61.95 + 2 = 63.95</li> <li>63.95 + 2 = 65.95</li> <li>65.95 + 2 = 67.95</li> <li>67.95 + 2 = 69.95</li> <li>69.95 + 2 = 71.95</li> <li>71.95 + 2 = 73.95</li> <li>73.95 + 2 = 75.95</li> </ul> <p id="element-159">The heights 60 through 61.5 inches are in the interval 59.95–61.95. The heights that are 63.5 are in the interval 61.95–63.95. The heights that are 64 through 64.5 are in the interval 63.95–65.95. The heights 66 through 67.5 are in the interval 65.95–67.95. The heights 68 through 69.5 are in the interval 67.95–69.95. The heights 70 through 71 are in the interval 69.95–71.95. The heights 72 through 73.5 are in the interval 71.95–73.95. 
The height 74 is in the interval 73.95–75.95.</p> <p id="element-451">The following histogram displays the heights on the <em data-effect="italics">x</em>-axis and relative frequency on the <em data-effect="italics">y</em>-axis.</p> <div id="eip-idm88475792" class="bc-figure figure"><span id="id7474144" data-type="media" data-alt="Histogram consists of 8 bars with the y-axis in increments of 0.05 from 0-0.4 and the x-axis in intervals of 2 from 59.95-75.95." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch_02_04_01-1.jpg" alt="Histogram consists of 8 bars with the y-axis in increments of 0.05 from 0-0.4 and the x-axis in intervals of 2 from 59.95-75.95." width="350" data-media-type="image/png" /></span></div> </div> <div id="fs-idp304592" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm28791888" data-type="exercise"><div id="fs-idp77099632" data-type="problem"><p id="fs-idp43536752">The following data are the shoe sizes of 50 male students. The sizes are discrete data since shoe size is measured in whole and half units only. Construct a histogram and calculate the width of each bar or class interval. Suppose you choose six bars. <span data-type="newline"><br /> </span>9;  9;  9.5;  9.5;  10;  10;  10;  10;  10;  10;  10.5;  10.5;  10.5;  10.5;  10.5;  10.5;  10.5;  10.5 <span data-type="newline"><br /> </span>11;  11;  11;  11;  11;  11;  11;  11;  11;  11;  11;  11;  11;  11.5;  11.5;  11.5;  11.5;  11.5;  11.5;  11.5 <span data-type="newline"><br /> </span>12;  12;  12;  12;  12;  12;  12;  12.5;  12.5;  12.5;  12.5;  14</p> </div> </div> </div> <div id="exampid2" class="textbox textbox--examples" data-type="example"><p>Create a histogram for the following data: the number of books bought by 50 part-time college students at ABC College. 
The number of books is <strong>discrete data</strong>, since books are counted. <span data-type="newline"><br /> </span>1;  1;  1;  1;  1;  1;  1;  1;  1;  1;  1 <span data-type="newline"><br /> </span>2;  2;  2;  2;  2;  2;  2;  2;  2;  2 <span data-type="newline"><br /> </span>3;  3;  3;  3;  3;  3;  3;  3;  3;  3;  3;  3;  3;  3;  3;  3 <span data-type="newline"><br /> </span>4;  4;  4;  4;  4;  4 <span data-type="newline"><br /> </span>5;  5;  5;  5;  5 <span data-type="newline"><br /> </span>6;  6</p> <p id="element-760">Eleven students buy one book. Ten students buy two books. Sixteen students buy three books. Six students buy four books. Five students buy five books. Two students buy six books.</p> <p id="element-728">Because the data are integers, subtract 0.5 from 1, the smallest data value and add 0.5 to 6, the largest data value. Then the starting point is 0.5 and the ending value is 6.5.</p> <div id="element-545" data-type="exercise"><div id="id8093949" data-type="problem"><p id="element-818">Next, calculate the width of each bar or class interval. If the data are discrete and there are not too many different values, a width that places the data values in the middle of the bar or class interval is the most convenient. 
Since the data consist of the numbers 1, 2, 3, 4, 5, 6, and the starting point is 0.5, a width of one places the 1 in the middle of the interval from 0.5 to 1.5, the 2 in the middle of the interval from 1.5 to 2.5, the 3 in the middle of the interval from 2.5 to 3.5, the 4 in the middle of the interval from _______ to _______, the 5 in the middle of the interval from _______ to _______, and the _______ in the middle of the interval from _______ to _______ .</p> </div> <div id="id12377723" data-type="solution" data-print-placement="end"><ul><li>3.5 to 4.5</li> <li>4.5 to 5.5</li> <li>6</li> <li>5.5 to 6.5</li> </ul> </div> </div> <p>Calculate the number of bars as follows:</p> <div id="element-48" data-type="equation">\(\frac{6.5-0.5}{\text{number of bars}}=1\)</div> <p id="element-600">where 1 is the width of a bar. Therefore, the number of bars is 6.</p> <p id="element-756">The following histogram displays the number of books on the <em data-effect="italics">x</em>-axis and the frequency on the <em data-effect="italics">y</em>-axis.</p> <div id="eip-idp35221648" class="bc-figure figure"><span id="id5693638" data-type="media" data-alt="Histogram consists of 6 bars with the y-axis in increments of 2 from 0-16 and the x-axis in intervals of 1 from 0.5-6.5." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch_02_04_02-1.jpg" alt="Histogram consists of 6 bars with the y-axis in increments of 2 from 0-16 and the x-axis in intervals of 1 from 0.5-6.5." width="380" data-media-type="image/png" /></span></div> </div> <div id="fs-idm25803056" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm23601024">Go to <a class="autogenerated-content" href="/contents/d0ba1833-f0d2-4195-8765-3c436745f0fb">(Figure)</a>. There are calculator instructions for entering data and for creating a customized histogram. 
Create the histogram for <a class="autogenerated-content" href="#exampid2">(Figure)</a>.</p> <ul id="fs-idp2836464"><li>Press Y=. Press CLEAR to delete any equations.</li> <li>Press STAT 1:EDIT. If L1 has data in it, arrow up into the name L1, press CLEAR and then arrow down. If necessary, do the same for L2.</li> <li>Into L1, enter 1, 2, 3, 4, 5, 6.</li> <li>Into L2, enter 11, 10, 16, 6, 5, 2.</li> <li>Press WINDOW. Set Xmin = .5, Xmax = 6.5, Xscl = (6.5 – .5)/6, Ymin = –1, Ymax = 20, Yscl = 1, Xres = 1.</li> <li>Press 2<sup>nd</sup> Y=. Start by pressing 4:Plotsoff ENTER.</li> <li>Press 2<sup>nd</sup> Y=. Press 1:Plot1. Press ENTER. Arrow down to TYPE. Arrow to the 3<sup>rd</sup> picture (histogram). Press ENTER.</li> <li>Arrow down to Xlist: Enter L1 (2<sup>nd</sup> 1). Arrow down to Freq. Enter L2 (2<sup>nd</sup> 2).</li> <li>Press GRAPH.</li> <li>Use the TRACE key and the arrow keys to examine the histogram.</li> </ul> </div> <div id="fs-idm93603984" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp68031136" data-type="exercise"><div id="fs-idp68031392" data-type="problem"><p id="fs-idp62894704">The following data are the number of sports played by 50 student athletes. The number of sports is discrete data since sports are counted.</p> <p id="fs-idp50986496">1;  1;  1;  1;  1;  1;  1;  1;  1;  1;  1;  1;  1;  1;  1;  1;  1;  1;  1;  1 <span data-type="newline"><br /> </span>2;  2;  2;  2;  2;  2;  2;  2;  2;  2;  2;  2;  2;  2;  2;  2;  2;  2;  2;  2;  2;  2 <span data-type="newline"><br /> </span>3;  3;  3;  3;  3;  3;  3;  3 <span data-type="newline"><br /> </span>20 student athletes play one sport. 22 student athletes play two sports. 
Eight student athletes play three sports.</p> <p id="fs-idp11189216"><em data-effect="italics">Fill in the blanks for the following sentence.</em> Since the data consist of the numbers 1, 2, 3, and the starting point is 0.5, a width of one places the 1 in the middle of the interval 0.5 to _____, the 2 in the middle of the interval from _____ to _____, and the 3 in the middle of the interval from _____ to _____.</p> </div> </div> </div> <div id="fs-idp46234048" class="textbox textbox--examples" data-type="example"><div id="fs-idm61675872" data-type="exercise"><div id="fs-idp66653792" data-type="problem"><p id="fs-idm21354960">Using this data set, construct a histogram.</p> <table id="fs-idp46234304" summary=""><thead><tr><th colspan="5">Number of Hours My Classmates Spent Playing Video Games on Weekends</th> </tr> </thead> <tbody><tr><td>9.95</td> <td>10</td> <td>2.25</td> <td>16.75</td> <td>0</td> </tr> <tr><td>19.5</td> <td>22.5</td> <td>7.5</td> <td>15</td> <td>12.75</td> </tr> <tr><td>5.5</td> <td>11</td> <td>10</td> <td>20.75</td> <td>17.5</td> </tr> <tr><td>23</td> <td>21.9</td> <td>24</td> <td>23.75</td> <td>18</td> </tr> <tr><td>20</td> <td>15</td> <td>22.9</td> <td>18.8</td> <td>20.5</td> </tr> </tbody> </table> </div> <div id="fs-idm51477920" data-type="solution"><div id="fs-idp72238816" class="bc-figure figure"><span id="fs-idm21354576" data-type="media" data-alt="This is a histogram that matches the supplied data. The x-axis consists of 5 bars in intervals of 5 from 0 to 25. The y-axis is marked in increments of 1 from 0 to 10. The x-axis shows the number of hours spent playing video games on the weekends, and the y-axis shows the number of students." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M04_020-1.jpg" alt="This is a histogram that matches the supplied data. The x-axis consists of 5 bars in intervals of 5 from 0 to 25. 
The y-axis is marked in increments of 1 from 0 to 10. The x-axis shows the number of hours spent playing video games on the weekends, and the y-axis shows the number of students." width="400" data-media-type="image/png" /></span></div> <p id="fs-idm106252848">Some values in this data set fall on boundaries for the class intervals. A value is counted in a class interval if it falls on the left boundary, but not if it falls on the right boundary. Different researchers may set up histograms for the same data in different ways. There is more than one correct way to set up a histogram.</p> </div> </div> </div> <div id="fs-idm161572864" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm32934816" data-type="exercise"><div id="fs-idm4304416" data-type="problem"><p id="fs-idp57423888">The following data represent the number of employees at various restaurants in New York City. Using this data, create a histogram.</p> <p id="fs-idp57424272"><span data-type="list" data-list-type="labeled-item" data-display="inline">22  35  15  26  40  28  18  20  25  34  39  42  24  22  19  27  22  34  40  20  38  and  28</span><span data-type="newline"><br /> </span>Use 10–19 as the first interval.</p> </div> </div> </div> <div id="fs-idm107220592" class="statistics collab" data-type="note" data-has-label="true" data-label=""><p>Count the money (bills and change) in your pocket or purse. Your instructor will record the amounts. As a class, construct a histogram displaying the data. Discuss how many intervals you think are appropriate. You may want to experiment with the number of intervals.</p> </div> <div id="fs-idm4800336" class="bc-section section" data-depth="1"><h2 data-type="title">Stem and Leaf</h2> </div> </div> <p id="id6999853">One simple graph, the <strong>stem-and-leaf graph</strong> or <strong>stemplot</strong>, comes from the field of exploratory data analysis. 
It is a good choice when the data sets are small. To create the plot, divide each observation of data into a stem and a leaf. The leaf consists of a <strong>final significant digit</strong>. For example, 23 has stem two and leaf three. The number 432 has stem 43 and leaf two. Likewise, the number 5,432 has stem 543 and leaf two. The decimal 9.3 has stem nine and leaf three. Write the stems in a vertical line from smallest to largest. Draw a vertical line to the right of the stems. Then write the leaves in increasing order next to their corresponding stem.</p> <div id="element-696" class="textbox textbox--examples" data-type="example"><p id="element-948">For Susan Dean&#8217;s spring pre-calculus class, scores for the first exam were as follows (smallest to largest): <span data-type="newline"><br /> </span> 33;  42;  49;  49;  53;  55;  55;  61;  63;  67;  68;  68;  69;  69;  72;  73;  74;  78;  80;  83;  88;  88;  88;  90;  92;  94;  94;  94;  94;  96;  100</p> <table id="element-185" summary="Table displaying stem in first column and leaf in second column for the values listed above."><caption><span data-type="title">Stem-and-Leaf Graph</span></caption> <thead><tr><th>Stem</th> <th>Leaf</th> </tr> </thead> <tbody><tr><td>3</td> <td>3</td> </tr> <tr><td>4</td> <td>2  9  9</td> </tr> <tr><td>5</td> <td>3  5  5</td> </tr> <tr><td>6</td> <td>1  3  7  8  8  9  9</td> </tr> <tr><td>7</td> <td>2  3  4  8</td> </tr> <tr><td>8</td> <td>0  3  8  8  8</td> </tr> <tr><td>9</td> <td>0  2  4  4  4  4  6</td> </tr> <tr><td>10</td> <td>0</td> </tr> </tbody> </table> <p id="element-541">The stemplot shows that most scores fell in the 60s, 70s, 80s, and 90s. 
Eight out of the 31 scores, or approximately 26% \(\left(\frac{8}{31}\right)\), were in the 90s or 100, a fairly high number of As.</p> </div> <div id="fs-idp6114880" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp3142256" data-type="exercise"><div id="fs-idm28250320" data-type="problem"><p id="fs-idp29192512">For the Park City basketball team, scores for the last 30 games were as follows (smallest to largest): <span data-type="newline"><br /> </span> 32; 32; 33; 34; 38; 40; 42; 42; 43; 44; 46; 47; 47; 48; 48; 48; 49; 50; 50; 51; 52; 52; 52; 53; 54; 56; 57; 57; 60; 61 <span data-type="newline"><br /> </span>Construct a stem plot for the data.</p> </div> </div> </div> <p id="eip-522">The stemplot is a quick way to graph data and gives an exact picture of the data. You want to look for an overall pattern and any outliers. An <span data-type="term">outlier</span> is an observation of data that does not fit the rest of the data. It is sometimes called an <strong>extreme value.</strong> When you graph an outlier, it will appear not to fit the pattern of the graph. Some outliers are due to mistakes (for example, writing down 50 instead of 500) while others may indicate that something unusual is happening. It takes some background information to explain outliers, so we will cover them in more detail later.</p> <div id="element-798" class="textbox textbox--examples" data-type="example"><p id="element-534">The data are the distances (in kilometers) from a home to local supermarkets. 
Create a stemplot using the data: <span data-type="newline"><br /> </span>1.1;  1.5;  2.3;  2.5;  2.7;  3.2;  3.3;  3.3;  3.5;  3.8;  4.0;  4.2;  4.5;  4.5;  4.7;  4.8;  5.5;  5.6;  6.5;  6.7;  12.3</p> <div id="element-6923" data-type="exercise"><div id="id8567884" data-type="problem"><p id="fs-idp11807920">Do the data seem to have any concentration of values?</p> <div id="id8559724" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idp145995424">The leaves are to the right of the decimal.</p> </div> </div> <div id="id8559734" data-type="solution" data-print-placement="end"><p>The value 12.3 may be an outlier. Values appear to concentrate at three and four kilometers.</p> <table id="element-533" summary="This is a Stem-Leaf graph with stems 1, 2, 3, 4, 5, 6, 12 and leaves to the right of the decimal point."><thead><tr><th>Stem</th> <th>Leaf</th> </tr> </thead> <tbody><tr><td>1</td> <td>1  5</td> </tr> <tr><td>2</td> <td>3  5  7</td> </tr> <tr><td>3</td> <td>2  3  3  5  8</td> </tr> <tr><td>4</td> <td>0  2  5  5  7  8</td> </tr> <tr><td>5</td> <td>5  6</td> </tr> <tr><td>6</td> <td>5  7</td> </tr> <tr><td>7</td> <td></td> </tr> <tr><td>8</td> <td></td> </tr> <tr><td>9</td> <td></td> </tr> <tr><td>10</td> <td></td> </tr> <tr><td>11</td> <td></td> </tr> <tr><td>12</td> <td>3</td> </tr> </tbody> </table> </div> </div> </div> <div id="fs-idp4001472" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm40603312" data-type="exercise"><div id="fs-idm28790944" data-type="problem"><p id="fs-idp73770272">The following data show the distances (in miles) from the homes of off-campus statistics students to the college. 
Create a stem plot using the data and identify any outliers:</p> <p id="fs-idp20460656">0.5;  0.7;  1.1;  1.2;  1.2;  1.3;  1.3;  1.5;  1.5;  1.7;  1.7;  1.8;  1.9;  2.0;  2.2;  2.5;  2.6;  2.8;  2.8;  2.8;  3.5;  3.8;  4.4;  4.8;  4.9;  5.2;  5.5;  5.7;  5.8;  8.0</p> </div> </div> </div> <div id="fs-idp11879648" class="textbox textbox--examples" data-type="example"><div id="fs-idp161531440" data-type="exercise"><div id="fs-idp99219856" data-type="problem"><p id="fs-idp93832256">A <strong>side-by-side stem-and-leaf plot</strong> allows a comparison of the two data sets in two columns. In a side-by-side stem-and-leaf plot, two sets of leaves share the same stem. The leaves are to the left and the right of the stems. <a class="autogenerated-content" href="#M01_Ch02_tbl005">(Figure)</a> and <a class="autogenerated-content" href="#M01_Ch02_tbl006">(Figure)</a> show the ages of presidents at their inauguration and at their death. Construct a side-by-side stem-and-leaf plot using this data.</p> </div> <div id="fs-idp107817952" data-type="solution"><table id="M01_Ch02_tbl007" summary="---"><colgroup><col data-align="right" /> <col data-align="center" /> <col data-align="left" /></colgroup> <thead><tr><th>Ages at Inauguration</th> <th></th> <th>Ages at Death</th> </tr> </thead> <tbody><tr><td>9  9  8  7  7  7  6  3  2</td> <td>4</td> <td>6  9</td> </tr> <tr><td>8  7  7  7  7  6  6  6  5  5  5  5  4  4  4  4  4  2  2  1  1  1  1  1  0</td> <td>5</td> <td>3  6  6  7  7  8</td> </tr> <tr><td>9  8  5  4  4  2  1  1  1  0</td> <td>6</td> <td>0  0  3  3  4  4  5  6  7  7  7  8</td> </tr> <tr><td></td> <td>7</td> <td>0  0  1  1  1  4  7  8  8  9</td> </tr> <tr><td></td> <td>8</td> <td>0  1  3  5  8</td> </tr> <tr><td></td> <td>9</td> <td>0  0  3  3</td> </tr> </tbody> </table> </div> </div> <table id="M01_Ch02_tbl005" summary=""><caption><span data-type="title">Presidential Ages at Inauguration</span></caption> <thead><tr><th>President</th> <th>Age</th> <th>President</th> 
<th>Age</th> <th>President</th> <th>Age</th> </tr> </thead> <tbody><tr><td>Washington</td> <td>57</td> <td>Lincoln</td> <td>52</td> <td>Hoover</td> <td>54</td> </tr> <tr><td>J. Adams</td> <td>61</td> <td>A. Johnson</td> <td>56</td> <td>F. Roosevelt</td> <td>51</td> </tr> <tr><td>Jefferson</td> <td>57</td> <td>Grant</td> <td>46</td> <td>Truman</td> <td>60</td> </tr> <tr><td>Madison</td> <td>57</td> <td>Hayes</td> <td>54</td> <td>Eisenhower</td> <td>62</td> </tr> <tr><td>Monroe</td> <td>58</td> <td>Garfield</td> <td>49</td> <td>Kennedy</td> <td>43</td> </tr> <tr><td>J. Q. Adams</td> <td>57</td> <td>Arthur</td> <td>51</td> <td>L. Johnson</td> <td>55</td> </tr> <tr><td>Jackson</td> <td>61</td> <td>Cleveland</td> <td>47</td> <td>Nixon</td> <td>56</td> </tr> <tr><td>Van Buren</td> <td>54</td> <td>B. Harrison</td> <td>55</td> <td>Ford</td> <td>61</td> </tr> <tr><td>W. H. Harrison</td> <td>68</td> <td>Cleveland</td> <td>55</td> <td>Carter</td> <td>52</td> </tr> <tr><td>Tyler</td> <td>51</td> <td>McKinley</td> <td>54</td> <td>Reagan</td> <td>69</td> </tr> <tr><td>Polk</td> <td>49</td> <td>T. Roosevelt</td> <td>42</td> <td>G.H.W. Bush</td> <td>64</td> </tr> <tr><td>Taylor</td> <td>64</td> <td>Taft</td> <td>51</td> <td>Clinton</td> <td>47</td> </tr> <tr><td>Fillmore</td> <td>50</td> <td>Wilson</td> <td>56</td> <td>G. W. Bush</td> <td>54</td> </tr> <tr><td>Pierce</td> <td>48</td> <td>Harding</td> <td>55</td> <td>Obama</td> <td>47</td> </tr> <tr><td>Buchanan</td> <td>65</td> <td>Coolidge</td> <td>51</td> <td></td> <td></td> </tr> </tbody> </table> <table id="M01_Ch02_tbl006" summary=""><caption><span data-type="title">Presidential Age at Death</span></caption> <thead><tr><th>President</th> <th>Age</th> <th>President</th> <th>Age</th> <th>President</th> <th>Age</th> </tr> </thead> <tbody><tr><td>Washington</td> <td>67</td> <td>Lincoln</td> <td>56</td> <td>Hoover</td> <td>90</td> </tr> <tr><td>J. Adams</td> <td>90</td> <td>A. Johnson</td> <td>66</td> <td>F. 
Roosevelt</td> <td>63</td> </tr> <tr><td>Jefferson</td> <td>83</td> <td>Grant</td> <td>63</td> <td>Truman</td> <td>88</td> </tr> <tr><td>Madison</td> <td>85</td> <td>Hayes</td> <td>70</td> <td>Eisenhower</td> <td>78</td> </tr> <tr><td>Monroe</td> <td>73</td> <td>Garfield</td> <td>49</td> <td>Kennedy</td> <td>46</td> </tr> <tr><td>J. Q. Adams</td> <td>80</td> <td>Arthur</td> <td>56</td> <td>L. Johnson</td> <td>64</td> </tr> <tr><td>Jackson</td> <td>78</td> <td>Cleveland</td> <td>71</td> <td>Nixon</td> <td>81</td> </tr> <tr><td>Van Buren</td> <td>79</td> <td>B. Harrison</td> <td>67</td> <td>Ford</td> <td>93</td> </tr> <tr><td>W. H. Harrison</td> <td>68</td> <td>Cleveland</td> <td>71</td> <td>Reagan</td> <td>93</td> </tr> <tr><td>Tyler</td> <td>71</td> <td>McKinley</td> <td>58</td> <td></td> <td></td> </tr> <tr><td>Polk</td> <td>53</td> <td>T. Roosevelt</td> <td>60</td> <td></td> <td></td> </tr> <tr><td>Taylor</td> <td>65</td> <td>Taft</td> <td>72</td> <td></td> <td></td> </tr> <tr><td>Fillmore</td> <td>74</td> <td>Wilson</td> <td>67</td> <td></td> <td></td> </tr> <tr><td>Pierce</td> <td>64</td> <td>Harding</td> <td>57</td> <td></td> <td></td> </tr> <tr><td>Buchanan</td> <td>77</td> <td>Coolidge</td> <td>60</td> <td></td> <td></td> </tr> </tbody> </table> </div> <div id="fs-idp9770768" class="statistics try" data-type="note" data-has-label="true" data-label=""><div id="fs-idp35107968" data-type="exercise"><div id="fs-idm2063072" data-type="problem"><p id="fs-idm4586448">The table shows the number of wins and losses the Atlanta Hawks have had in 42 seasons. 
Create a side-by-side stem-and-leaf plot of these wins and losses.</p> <table id="fs-idm25873440" summary=".."><caption> </caption> <thead><tr><th>Losses</th> <th>Wins</th> <th>Year</th> <th>Losses</th> <th>Wins</th> <th>Year</th> </tr> </thead> <tbody><tr><td>34</td> <td>48</td> <td>1968–1969</td> <td>41</td> <td>41</td> <td>1989–1990</td> </tr> <tr><td>34</td> <td>48</td> <td>1969–1970</td> <td>39</td> <td>43</td> <td>1990–1991</td> </tr> <tr><td>46</td> <td>36</td> <td>1970–1971</td> <td>44</td> <td>38</td> <td>1991–1992</td> </tr> <tr><td>46</td> <td>36</td> <td>1971–1972</td> <td>39</td> <td>43</td> <td>1992–1993</td> </tr> <tr><td>36</td> <td>46</td> <td>1972–1973</td> <td>25</td> <td>57</td> <td>1993–1994</td> </tr> <tr><td>47</td> <td>35</td> <td>1973–1974</td> <td>40</td> <td>42</td> <td>1994–1995</td> </tr> <tr><td>51</td> <td>31</td> <td>1974–1975</td> <td>36</td> <td>46</td> <td>1995–1996</td> </tr> <tr><td>53</td> <td>29</td> <td>1975–1976</td> <td>26</td> <td>56</td> <td>1996–1997</td> </tr> <tr><td>51</td> <td>31</td> <td>1976–1977</td> <td>32</td> <td>50</td> <td>1997–1998</td> </tr> <tr><td>41</td> <td>41</td> <td>1977–1978</td> <td>19</td> <td>31</td> <td>1998–1999</td> </tr> <tr><td>36</td> <td>46</td> <td>1978–1979</td> <td>54</td> <td>28</td> <td>1999–2000</td> </tr> <tr><td>32</td> <td>50</td> <td>1979–1980</td> <td>57</td> <td>25</td> <td>2000–2001</td> </tr> <tr><td>51</td> <td>31</td> <td>1980–1981</td> <td>49</td> <td>33</td> <td>2001–2002</td> </tr> <tr><td>40</td> <td>42</td> <td>1981–1982</td> <td>47</td> <td>35</td> <td>2002–2003</td> </tr> <tr><td>39</td> <td>43</td> <td>1982–1983</td> <td>54</td> <td>28</td> <td>2003–2004</td> </tr> <tr><td>42</td> <td>40</td> <td>1983–1984</td> <td>69</td> <td>13</td> <td>2004–2005</td> </tr> <tr><td>48</td> <td>34</td> <td>1984–1985</td> <td>56</td> <td>26</td> <td>2005–2006</td> </tr> <tr><td>32</td> <td>50</td> <td>1985–1986</td> <td>52</td> <td>30</td> <td>2006–2007</td> </tr> <tr><td>25</td> 
<td>57</td> <td>1986–1987</td> <td>45</td> <td>37</td> <td>2007–2008</td> </tr> <tr><td>32</td> <td>50</td> <td>1987–1988</td> <td>35</td> <td>47</td> <td>2008–2009</td> </tr> <tr><td>30</td> <td>52</td> <td>1988–1989</td> <td>29</td> <td>53</td> <td>2009–2010</td> </tr> </tbody> </table> </div> </div> </div> <div class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idm29098416">Burbary, Ken. <em data-effect="italics">Facebook Demographics Revisited – 2011 Statistics,</em> 2011. Available online at http://www.kenburbary.com/2011/03/facebook-demographics-revisited-2011-statistics-2/ (accessed August 21, 2013).</p> <p id="fs-idm29098160">“9th Annual AP Report to the Nation.” CollegeBoard, 2013. Available online at http://apreport.collegeboard.org/goals-and-findings/promoting-equity (accessed September 13, 2013).</p> <p id="fs-idp130237744">“Overweight and Obesity: Adult Obesity Facts.” Centers for Disease Control and Prevention. Available online at http://www.cdc.gov/obesity/data/adult.html (accessed September 13, 2013).</p> </div> <div id="fs-idp18706816" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp5119792">A <strong>bar graph</strong> is a chart that uses either horizontal or vertical bars to show comparisons among categories. One axis of the chart shows the specific categories being compared, and the other axis represents a discrete value. Some bar graphs present bars clustered in groups of more than one (grouped bar graphs), and others show the bars divided into subparts to show cumulative effect (stacked bar graphs). Bar graphs are especially useful when categorical data is being used. A <strong>histogram</strong> is a graphic version of a frequency distribution. The graph consists of bars of equal width drawn adjacent to each other. The horizontal scale represents classes of quantitative data values and the vertical scale represents frequencies. 
The heights of the bars correspond to frequency values. Histograms are typically used for large, continuous, quantitative data sets. A <strong>stem-and-leaf plot</strong> is a way to plot data and look at the distribution. In a stem-and-leaf plot, all data values within a class are visible. The advantage of a stem-and-leaf plot is that all values are listed, unlike a histogram, which gives classes of data values.</p> </div> <div id="fs-idp51353072" class="practice" data-depth="1"><h3 data-type="title"><em data-effect="italics">For each of the following data sets, create a stem plot and identify any outliers.</em> The miles per gallon ratings for 30 cars are shown below (lowest to highest). <span data-type="newline"><br /> </span>19,  19,  19,  20,  21,  21,  25,  25,  25,  26,  26,  28,  29,  31,  31,  32,  32,  33,  34,  35,  36,  37,  37,  38,  38,  38,  38,  41,  43,  43</h3> </div> <div id="fs-idm37847504" data-type="solution"><table id="fs-idp96212464" summary="The miles per gallon rating for 30 cars"><colgroup><col data-align="center" /> <col data-align="left" /></colgroup> <thead><tr><th>Stem</th> <th data-align="center">Leaf</th> </tr> </thead> <tbody><tr><td>1</td> <td>9  9  9</td> </tr> <tr><td>2</td> <td>0  1  1  5  5  5  6  6  8  9</td> </tr> <tr><td>3</td> <td>1  1  2  2  3  4  5  6  7  7  8  8  8  8</td> </tr> <tr><td>4</td> <td>1  3  3</td> </tr> </tbody> </table> </div> <div id="fs-idp69585968" data-type="exercise"><div id="fs-idp5781504" data-type="problem"><p id="fs-idp7036560">The height in feet of 25 trees is shown below (lowest to highest). <span data-type="newline"><br /> </span>25,  27,  33,  34,  34,  34,  35,  37,  37,  38,  39,  39,  39,  40,  41,  45,  46,  47,  49,  50,  50,  53,  53,  54,  54</p> </div> </div> <div id="exercise8" data-type="exercise"><div id="fs-idp105863648" data-type="problem"><p id="fs-idp8584400">The data are the prices of different laptops at an electronics store. Round each value to the nearest ten. 
<span data-type="newline"><br /> </span>249,  249,  260,  265,  265,  280,  299,  299,  309,  319,  325,  326,  350,  350,  350,  365,  369,  389,  409,  459,  489,  559,  569,  570,  610</p> </div> <div id="fs-idp114380816" data-type="solution"><table id="M01_Ch02_tbl016" summary="The data are the prices of different laptops at an electronics store."><colgroup><col data-align="center" /> <col data-align="left" /></colgroup> <thead><tr><th>Stem</th> <th data-align="center">Leaf</th> </tr> </thead> <tbody><tr><td>2</td> <td>5  5  6  7  7  8</td> </tr> <tr><td>3</td> <td>0  0  1  2  3  3  5  5  5  7  7  9</td> </tr> <tr><td>4</td> <td>1  6  9</td> </tr> <tr><td>5</td> <td>6  7  7</td> </tr> <tr><td>6</td> <td>1</td> </tr> </tbody> </table> </div> </div> <div id="fs-idp48193248" data-type="exercise"><div id="fs-idp71330208" data-type="problem"><p id="fs-idp71330464">The data are daily high temperatures in a town for one month. <span data-type="newline"><br /> </span>61,  61,  62,  64,  66,  67,  67,  67,  68,  69,  70,  70,  70,  71,  71,  72,  74,  74,  74,  75,  75,  75,  76,  76,  77,  78,  78,  79,  79,  95</p> </div> </div> <div id="fs-idp113295424" data-type="exercise"><div id="fs-idp47635040" data-type="solution"><div id="fs-idp29124208" class="bc-figure figure"></div> </div> </div> <div id="fs-idp11360192" data-type="exercise"><div id="fs-idp6234528" data-type="problem"></div> </div> <div id="fs-idp113509472" data-type="exercise"><div id="fs-idp49973120" data-type="solution"><div id="fs-idm11163840" class="bc-figure figure"><span style="text-align: initial;font-size: 1em">The students in Ms. Ramirez’s math class have birthdays in each of the four seasons. </span><a class="autogenerated-content" style="text-align: initial;font-size: 1em" href="#M01_Ch02_tbl021">(Figure)</a> <span style="text-align: initial;font-size: 1em">shows the four seasons, the number of students who have birthdays in each season, and the percentage (%) of students in each group. 
Construct a bar graph showing the number of students.</span></div> </div> </div> <div id="exercise13" data-type="exercise"><div id="fs-idp84353344" data-type="problem"><table id="M01_Ch02_tbl021" summary=""><colgroup><col data-align="center" /> <col /> <col /></colgroup> <thead><tr><th>Seasons</th> <th data-align="center">Number of students</th> <th data-align="center">Proportion of population</th> </tr> </thead> <tbody><tr><td>Spring</td> <td>8</td> <td>24%</td> </tr> <tr><td>Summer</td> <td>9</td> <td>26%</td> </tr> <tr><td>Autumn</td> <td>11</td> <td>32%</td> </tr> <tr><td>Winter</td> <td>6</td> <td>18%</td> </tr> </tbody> </table> </div> </div> <div id="fs-idp114173136" data-type="exercise"><div id="fs-idp114173392" data-type="problem"><p id="fs-idp114173520">Using the data from Ms. Ramirez’s math class supplied in <a class="autogenerated-content" href="#exercise13">(Figure)</a>, construct a bar graph showing the percentages.</p> </div> <div id="fs-idp107574416" data-type="solution"><div id="fs-idp48344576" class="bc-figure figure"><span id="fs-idp74133360" data-type="media" data-display="block" data-alt="This is a bar graph that matches the supplied data. The x-axis shows the seasons of the year, and the y-axis shows the proportion of birthdays."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M03_009-1.jpg" alt="This is a bar graph that matches the supplied data. The x-axis shows the seasons of the year, and the y-axis shows the proportion of birthdays." width="350" data-media-type="image/jpg" /></span></div> </div> </div> <div id="exercise10" data-type="exercise"><div id="fs-idm2655872" data-type="problem"><p id="fs-idp32116368">David County has six high schools. Each school sent students to participate in a county-wide science competition. 
<a class="autogenerated-content" href="#M01_Ch02_tbl022">(Figure)</a> shows the percentage breakdown of competitors from each school, and the percentage of the entire student population of the county that goes to each school. Construct a bar graph that shows the population percentage of competitors from each school.</p> <table id="M01_Ch02_tbl022" summary=""><colgroup><col data-align="center" /> <col data-align="left" /> <col data-align="left" /></colgroup> <thead><tr><th>High School</th> <th data-align="center">Science competition population</th> <th data-align="center">Overall student population</th> </tr> </thead> <tbody><tr><td>Alabaster</td> <td>28.9%</td> <td>8.6%</td> </tr> <tr><td>Concordia</td> <td>7.6%</td> <td>23.2%</td> </tr> <tr><td>Genoa</td> <td>12.1%</td> <td>15.0%</td> </tr> <tr><td>Mocksville</td> <td>18.5%</td> <td>14.3%</td> </tr> <tr><td>Tynneson</td> <td>24.2%</td> <td>10.1%</td> </tr> <tr><td>West End</td> <td>8.7%</td> <td>28.8%</td> </tr> </tbody> </table> </div> </div> <div id="fs-idp107146752" data-type="exercise"><div id="fs-idp107568048" data-type="problem"><p id="fs-idp107568176">Use the data from the David County science competition supplied in <a class="autogenerated-content" href="#exercise10">(Figure)</a>. Construct a bar graph that shows the county-wide population percentage of students at each school.</p> </div> <div id="fs-idp4118704" data-type="solution"><div id="fs-idp86744448" class="bc-figure figure"><span id="fs-idp112012640" data-type="media" data-display="block" data-alt="This is a bar graph that matches the supplied data. The x-axis shows the county high schools, and the y-axis shows the proportion of county students."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M03_011-1.jpg" alt="This is a bar graph that matches the supplied data. The x-axis shows the county high schools, and the y-axis shows the proportion of county students." 
width="420" data-media-type="image/jpg" /></span></div> </div> </div> <div id="fs-idp103158432" class="free-response" data-depth="1"><div id="fs-idm3046592" class="practice" data-depth="1"><div id="eip-341" data-type="exercise"><div id="fs-idp82639856" data-type="problem"><p id="fs-idp82640112">Sixty-five randomly selected car salespersons were asked the number of cars they generally sell in one week. Fourteen people answered that they generally sell three cars; nineteen generally sell four cars; twelve generally sell five cars; nine generally sell six cars; eleven generally sell seven cars. Complete the table.</p> <table id="table001" summary="Blank table where data can be reported with the first column designated for the data value, or number of cars, the second column for frequency, the third column for relative frequency, and the fourth column for cumulative frequency."><thead><tr><th>Data Value (# cars)</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> </tr> </tbody> </table> </div> </div> <div id="element-625" data-type="exercise"><div id="id9456620" data-type="problem"><p>What does the frequency column in <a class="autogenerated-content" href="#table001">(Figure)</a> sum to? Why?</p> </div> <div id="id6523440" data-type="solution"><p id="element-943">65</p> </div> </div> <div data-type="exercise"><div id="id6442302" data-type="problem"><p>What does the relative frequency column in <a class="autogenerated-content" href="#table001">(Figure)</a> sum to? 
Why?</p> </div> </div> <div data-type="exercise"><div id="id17920864" data-type="problem"><p>What is the difference between relative frequency and frequency for each data value in <a class="autogenerated-content" href="#table001">(Figure)</a>?</p> </div> <div id="eip-idm45288784" data-type="solution"><p id="eip-idm45288528">The relative frequency shows the <em data-effect="italics">proportion</em> of data points that have each value. The frequency tells the <em data-effect="italics">number</em> of data points that have each value.</p> </div> </div> <div data-type="exercise"><div id="id11564922" data-type="problem"><p>What is the difference between cumulative relative frequency and relative frequency for each data value?</p> </div> </div> <div id="fs-idp48667136" data-type="exercise"><div id="fs-idp71409056" data-type="problem"><p id="element-936">To construct the histogram for the data in <a class="autogenerated-content" href="#table001">(Figure)</a>, determine the appropriate minimum and maximum <em data-effect="italics">x</em> and <em data-effect="italics">y</em> values and the scaling. Sketch the histogram. Label the horizontal and vertical axes with words. Include numerical scaling.</p> <div id="eip-idp57691072" class="bc-figure figure"><span id="id9045474" data-type="media" data-alt="An empty graph template for use with this question." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_11_01-1.jpg" alt="An empty graph template for use with this question." width="350" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idm9409296" data-type="solution"><p id="fs-idp4677968">Answers will vary. 
One possible histogram is shown:</p> <div id="eip-idp746160" class="bc-figure figure"><span id="eip-idp746416" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M03_101-1.jpg" alt="" width="380" data-media-type="image/png" /></span></div> </div> </div> </div> <div id="fs-idp52790224" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div id="eip-457" data-type="exercise"><div id="fs-idp16459264" data-type="problem"><p id="fs-idp16459520">1) Often, cruise ships conduct all on-board transactions, with the exception of gambling, on a cashless basis. At the end of the cruise, guests pay one bill that covers all onboard transactions. Suppose that 60 single travelers and 70 couples were surveyed as to their on-board bills for a seven-day cruise from Los Angeles to the Mexican Riviera. Following is a summary of the bills for each group.</p> <table id="fs-idp40233520" summary="This table presents the amount of cruise bills by guest type. The first table is for singles with the first column listing the bill amount, the second column listing the frequency, and the third column labeled for relative frequency which is blank."><caption><span data-type="title">Singles</span></caption> <thead><tr><th>Amount(\$)</th> <th>Frequency</th> <th>Rel. 
Frequency</th> </tr> </thead> <tbody><tr><td>51–100</td> <td>5</td> <td></td> </tr> <tr><td>101–150</td> <td>10</td> <td></td> </tr> <tr><td>151–200</td> <td>15</td> <td></td> </tr> <tr><td>201–250</td> <td>15</td> <td></td> </tr> <tr><td>251–300</td> <td>10</td> <td></td> </tr> <tr><td>301–350</td> <td>5</td> <td></td> </tr> </tbody> </table> <table id="fs-idp2601840" summary="The second table is for couples with the first column listing the bill amount, the second column listing the frequency, and the third column labeled for relative frequency which is blank."><caption><span data-type="title">Couples</span></caption> <thead><tr><th>Amount(\$)</th> <th>Frequency</th> <th>Rel. Frequency</th> </tr> </thead> <tbody><tr><td>100–150</td> <td>5</td> <td></td> </tr> <tr><td>201–250</td> <td>5</td> <td></td> </tr> <tr><td>251–300</td> <td>5</td> <td></td> </tr> <tr><td>301–350</td> <td>5</td> <td></td> </tr> <tr><td>351–400</td> <td>10</td> <td></td> </tr> <tr><td>401–450</td> <td>10</td> <td></td> </tr> <tr><td>451–500</td> <td>10</td> <td></td> </tr> <tr><td>501–550</td> <td>10</td> <td></td> </tr> <tr><td>551–600</td> <td>5</td> <td></td> </tr> <tr><td>601–650</td> <td>5</td> <td></td> </tr> </tbody> </table> <ol id="fs-idm19738640" type="a"><li>Fill in the relative frequency for each group.</li> <li>Construct a histogram for the singles group. Scale the <em data-effect="italics">x</em>-axis by \$50 widths. Use relative frequency on the <em data-effect="italics">y</em>-axis.</li> <li>Construct a histogram for the couples group. Scale the <em data-effect="italics">x</em>-axis by \$50 widths. Use relative frequency on the <em data-effect="italics">y</em>-axis.</li> <li>Compare the two graphs: <ol id="nestlist8" type="i" data-mark-suffix="."><li>List two similarities between the graphs.</li> <li>List two differences between the graphs.</li> <li>Overall, are the graphs more similar or different?</li> </ol> </li> <li>Construct a new graph for the couples by hand. 
Since each couple is paying for two individuals, instead of scaling the <em data-effect="italics">x</em>-axis by \$50, scale it by \$100. Use relative frequency on the <em data-effect="italics">y</em>-axis.</li> <li>Compare the graph for the singles with the new graph for the couples: <ol id="nestlist9" type="i" data-mark-suffix="."><li>List two similarities between the graphs.</li> <li>Overall, are the graphs more similar or more different?</li> </ol> </li> <li>How did scaling the couples graph differently change the way you compared it to the singles graph?</li> <li>Based on the graphs, do you think that individuals traveling alone spend the same amount as, more than, or less than individuals traveling as part of a couple, person by person? Explain why in one or two complete sentences.</li> </ol> </div> <div id="fs-idp34240" data-type="solution"><p id="fs-idp71468496">2) Suppose that three book publishers were interested in the number of fiction paperbacks adult consumers purchase per month. Each publisher conducted a survey. In the survey, adult consumers were asked the number of fiction paperbacks they had purchased the previous month. The results are as follows:</p> <table id="fs-idp71816400" summary="The tables present the number of books purchased by adults, as surveyed by three different publishers. Publisher A is the first table with number of books in the first column, from 0-8, frequency in the second column, and relative frequency in the third column which is blank."><caption><span data-type="title">Publisher A</span></caption> <thead><tr><th># of books</th> <th>Freq.</th> <th>Rel. 
Freq.</th> </tr> </thead> <tbody><tr><td>0</td> <td>10</td> <td></td> </tr> <tr><td>1</td> <td>12</td> <td></td> </tr> <tr><td>2</td> <td>16</td> <td></td> </tr> <tr><td>3</td> <td>12</td> <td></td> </tr> <tr><td>4</td> <td>8</td> <td></td> </tr> <tr><td>5</td> <td>6</td> <td></td> </tr> <tr><td>6</td> <td>2</td> <td></td> </tr> <tr><td>8</td> <td>2</td> <td></td> </tr> </tbody> </table> <table id="fs-idp43657824" summary="Publisher B is the second table with number of books in the first column, from 0-5, 7, 9, frequency in the second column, and relative frequency in the third column which is blank."><caption><span data-type="title">Publisher B</span></caption> <thead><tr><th># of books</th> <th>Freq.</th> <th>Rel. Freq.</th> </tr> </thead> <tbody><tr><td>0</td> <td>18</td> <td></td> </tr> <tr><td>1</td> <td>24</td> <td></td> </tr> <tr><td>2</td> <td>24</td> <td></td> </tr> <tr><td>3</td> <td>22</td> <td></td> </tr> <tr><td>4</td> <td>15</td> <td></td> </tr> <tr><td>5</td> <td>10</td> <td></td> </tr> <tr><td>7</td> <td>5</td> <td></td> </tr> <tr><td>9</td> <td>1</td> <td></td> </tr> </tbody> </table> <table id="fs-idm93843856" summary="Publisher C is the first table with number of books in the first column, 0-1, 2-3, 4-5, 6-7, 8-9, frequency in the second column, and relative frequency in the third column which is blank."><caption><span data-type="title">Publisher C</span></caption> <thead><tr><th># of books</th> <th>Freq.</th> <th>Rel. Freq.</th> </tr> </thead> <tbody><tr><td>0–1</td> <td>20</td> <td></td> </tr> <tr><td>2–3</td> <td>35</td> <td></td> </tr> <tr><td>4–5</td> <td>12</td> <td></td> </tr> <tr><td>6–7</td> <td>2</td> <td></td> </tr> <tr><td>8–9</td> <td>1</td> <td></td> </tr> </tbody> </table> <ol type="a" data-mark-suffix="."><li>Find the relative frequencies for each survey. Write them in the charts.</li> <li>Using either a graphing calculator, computer, or by hand, use the frequency column to construct a histogram for each publisher&#8217;s survey. 
For Publishers A and B, make bar widths of one. For Publisher C, make bar widths of two.</li> <li>In complete sentences, give two reasons why the graphs for Publishers A and B are not identical.</li> <li>Would you have expected the graph for Publisher C to look like the other two graphs? Why or why not?</li> <li>Make new histograms for Publisher A and Publisher B. This time, make bar widths of two.</li> <li>Now, compare the graph for Publisher C to the new graphs for Publishers A and B. Are the graphs more similar or more different? Explain your answer.</li> </ol> </div> </div> <div id="fs-idp73179824" data-type="exercise"><div id="fs-idp73180080" data-type="problem"></div> <div data-type="problem"></div> </div> <p id="fs-idp72793344"><em data-effect="italics">3) Use the following information to answer the next two exercises:</em> Suppose one hundred eleven people who shopped in a special t-shirt store were asked the number of t-shirts they own costing more than \$19 each.</p> <p><span id="fs-idp52038080" data-type="media" data-alt="A histogram showing the results of a survey. Of 111 respondents, 5 own 1 t-shirt costing more than 💲19, 17 own 2, 23 own 3, 39 own 4, 25 own 5, 2 own 6, and no respondents own 7." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_13_11-1.jpg" alt="A histogram showing the results of a survey. Of 111 respondents, 5 own 1 t-shirt costing more than 💲19, 17 own 2, 23 own 3, 39 own 4, 25 own 5, 2 own 6, and no respondents own 7." 
width="350" data-media-type="image/JPG" /></span></p> <div id="fs-idm24292288" data-type="exercise"><div id="fs-idm24292032" data-type="problem"><p id="fs-idp64025632">The percentage of people who own at most three t-shirts costing more than \$19 each is approximately:</p> <ol id="ni6" type="a"><li>21</li> <li>59</li> <li>41</li> <li>Cannot be determined</li> </ol> <p>4)  If the data were collected by asking the first 111 people who entered the store, then the type of sampling is:</p> <ol id="ni7" type="a"><li>cluster</li> <li>simple random</li> <li>stratified</li> <li>convenience</li> </ol> </div> <div id="id6146639" data-type="solution"></div> </div> <div id="element-195" data-type="exercise"><div id="id3671758" data-type="problem"></div> </div> <div id="fs-idp13768528" data-type="exercise"><div id="fs-idm31496800" data-type="problem"><p id="fs-idm31496544">5) Following are the 2010 obesity rates by U.S. states and Washington, DC.</p> <table id="Obesity_Rates_2008" summary="The tables represents United States states and Obesity Rates. 
The first column and third columns list the names of the states and the second and fourth columns list Obesity Rates."><thead><tr><th>State</th> <th>Percent (%)</th> <th>State</th> <th>Percent (%)</th> <th>State</th> <th>Percent (%)</th> </tr> </thead> <tbody><tr><td>Alabama</td> <td>32.2</td> <td>Kentucky</td> <td>31.3</td> <td>North Dakota</td> <td>27.2</td> </tr> <tr><td>Alaska</td> <td>24.5</td> <td>Louisiana</td> <td>31.0</td> <td>Ohio</td> <td>29.2</td> </tr> <tr><td>Arizona</td> <td>24.3</td> <td>Maine</td> <td>26.8</td> <td>Oklahoma</td> <td>30.4</td> </tr> <tr><td>Arkansas</td> <td>30.1</td> <td>Maryland</td> <td>27.1</td> <td>Oregon</td> <td>26.8</td> </tr> <tr><td>California</td> <td>24.0</td> <td>Massachusetts</td> <td>23.0</td> <td>Pennsylvania</td> <td>28.6</td> </tr> <tr><td>Colorado</td> <td>21.0</td> <td>Michigan</td> <td>30.9</td> <td>Rhode Island</td> <td>25.5</td> </tr> <tr><td>Connecticut</td> <td>22.5</td> <td>Minnesota</td> <td>24.8</td> <td>South Carolina</td> <td>31.5</td> </tr> <tr><td>Delaware</td> <td>28.0</td> <td>Mississippi</td> <td>34.0</td> <td>South Dakota</td> <td>27.3</td> </tr> <tr><td>Washington, DC</td> <td>22.2</td> <td>Missouri</td> <td>30.5</td> <td>Tennessee</td> <td>30.8</td> </tr> <tr><td>Florida</td> <td>26.6</td> <td>Montana</td> <td>23.0</td> <td>Texas</td> <td>31.0</td> </tr> <tr><td>Georgia</td> <td>29.6</td> <td>Nebraska</td> <td>26.9</td> <td>Utah</td> <td>22.5</td> </tr> <tr><td>Hawaii</td> <td>22.7</td> <td>Nevada</td> <td>22.4</td> <td>Vermont</td> <td>23.2</td> </tr> <tr><td>Idaho</td> <td>26.5</td> <td>New Hampshire</td> <td>25.0</td> <td>Virginia</td> <td>26.0</td> </tr> <tr><td>Illinois</td> <td>28.2</td> <td>New Jersey</td> <td>23.8</td> <td>Washington</td> <td>25.5</td> </tr> <tr><td>Indiana</td> <td>29.6</td> <td>New Mexico</td> <td>25.1</td> <td>West Virginia</td> <td>32.5</td> </tr> <tr><td>Iowa</td> <td>28.4</td> <td>New York</td> <td>23.9</td> <td>Wisconsin</td> <td>26.3</td> </tr> 
<tr><td>Kansas</td> <td>29.4</td> <td>North Carolina</td> <td>27.8</td> <td>Wyoming</td> <td>25.1</td> </tr> </tbody> </table> <p id="fs-idp22466816">Construct a bar graph of obesity rates of your state and the four states closest to your state. Hint: Label the <em data-effect="italics">x</em>-axis with the states.</p> <p>6) Student grades on a chemistry exam were: 77,  78,  76,  81,  86,  51,  79,  82,  84,  99</p> <ol id="fs-idp96417840" type="a"><li>Construct a stem-and-leaf plot of the data.</li> <li>Are there any potential outliers? If so, which scores are they? Why do you consider them outliers?</li> </ol> </div> <div id="fs-idp3779952" data-type="solution"><p><strong>Answers to odd questions</strong></p> <p>1)</p> <div data-type="solution"><table id="Singles" summary=""><caption><span data-type="title">Singles</span></caption> <thead><tr><th>Amount(\$)</th> <th>Frequency</th> <th>Relative Frequency</th> </tr> </thead> <tbody><tr><td>51–100</td> <td>5</td> <td>0.08</td> </tr> <tr><td>101–150</td> <td>10</td> <td>0.17</td> </tr> <tr><td>151–200</td> <td>15</td> <td>0.25</td> </tr> <tr><td>201–250</td> <td>15</td> <td>0.25</td> </tr> <tr><td>251–300</td> <td>10</td> <td>0.17</td> </tr> <tr><td>301–350</td> <td>5</td> <td>0.08</td> </tr> </tbody> </table> <table id="Couples" summary=""><caption><span data-type="title">Couples</span></caption> <thead><tr><th>Amount(\$)</th> <th>Frequency</th> <th>Relative Frequency</th> </tr> </thead> <tbody><tr><td>100–150</td> <td>5</td> <td>0.07</td> </tr> <tr><td>201–250</td> <td>5</td> <td>0.07</td> </tr> <tr><td>251–300</td> <td>5</td> <td>0.07</td> </tr> <tr><td>301–350</td> <td>5</td> <td>0.07</td> </tr> <tr><td>351–400</td> <td>10</td> <td>0.14</td> </tr> <tr><td>401–450</td> <td>10</td> <td>0.14</td> </tr> <tr><td>451–500</td> <td>10</td> <td>0.14</td> </tr> <tr><td>501–550</td> <td>10</td> <td>0.14</td> </tr> <tr><td>551–600</td> <td>5</td> <td>0.07</td> </tr> <tr><td>601–650</td> <td>5</td> <td>0.07</td> </tr> 
</tbody> </table> <ol id="fs-idp1651760" type="a" data-mark-suffix="."><li>See the <a class="autogenerated-content" href="#Singles">Singles</a> and <a class="autogenerated-content" href="#Couples">Couples</a> tables.</li> <li>In the following histogram, the data values that fall on the right boundary are counted in the class interval, while values that fall on the left boundary are not counted (with the exception of the first interval, where values on both boundaries are included). <div id="eip-idp10303088" class="bc-figure figure"><span id="fs-idp32280736" data-type="media" data-display="block" data-alt="This is a histogram that matches the supplied data for singles. The x-axis shows the total charges in intervals of 50 from 50 to 350, and the y-axis shows the relative frequency in increments of 0.05 from 0 to 0.3."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M03_106-1.jpg" alt="This is a histogram that matches the supplied data for singles. The x-axis shows the total charges in intervals of 50 from 50 to 350, and the y-axis shows the relative frequency in increments of 0.05 from 0 to 0.3." width="350" data-media-type="image/jpg" /></span></div> </li> <li>In the following histogram, the data values that fall on the right boundary are counted in the class interval, while values that fall on the left boundary are not counted (with the exception of the first interval, where values on both boundaries are included). <div id="eip-idp116473376" class="bc-figure figure"><span id="fs-idm4934000" data-type="media" data-display="block" data-alt="This is a histogram that matches the supplied data for couples. 
The x-axis shows the total charges in intervals of 50 from 100 to 650, and the y-axis shows the relative frequency in increments of 0.02 from 0 to 0.16."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M03_107-1.jpg" alt="This is a histogram that matches the supplied data for couples. The x-axis shows the total charges in intervals of 50 from 100 to 650, and the y-axis shows the relative frequency in increments of 0.02 from 0 to 0.16." width="350" data-media-type="image/jpg" /></span></div> </li> <li>Compare the two graphs: <ol id="fs-idm1796080" type="i" data-mark-suffix="."><li>Answers may vary. Possible answers include: <ul id="fs-idp12795120"><li>Both graphs have a single peak.</li> <li>Both graphs use class intervals with width equal to \$50.</li> </ul> </li> <li>Answers may vary. Possible answers include: <ul id="fs-idm10655584"><li>The couples graph has a class interval with no values.</li> <li>It takes almost twice as many class intervals to display the data for couples.</li> </ul> </li> <li>Answers may vary. Possible answers include: The graphs are more similar than different because the overall patterns for the graphs are the same.</li> </ol> </li> <li>Check student&#8217;s solution.</li> <li>Compare the graph for the Singles with the new graph for the Couples: <ol id="fs-idp17546528" type="i" data-mark-suffix="."><li style="list-style-type: none"><ul id="fs-idp13357024"><li>Both graphs have a single peak.</li> <li>Both graphs display 6 class intervals.</li> <li>Both graphs show the same general pattern.</li> </ul> </li> <li>Answers may vary. Possible answers include: Although the width of the class intervals for couples is double that of the class intervals for singles, the graphs are more similar than they are different.</li> </ol> </li> <li>Answers may vary. Possible answers include: You are able to compare the graphs interval by interval. 
It is easier to compare the overall patterns with the new scale on the Couples graph. Because a couple represents two individuals, the new scale leads to a more accurate comparison.</li> <li>Answers may vary. Possible answers include: Based on the histograms, it seems that spending does not vary much from singles to individuals who are part of a couple. The overall patterns are the same. The range of spending for couples is approximately double the range for individuals.</li> </ol> <p>3) c</p> <p>5) Answers will vary.</p> </div> </div> </div> </div> <div id="fs-idp49136992" data-type="exercise"><div id="fs-idp49137248" data-type="problem"><p id="fs-idp5295728"></p></div> </div> <div id="eip-440" data-type="exercise"><div id="eip-27" data-type="solution"></div> </div> </div> </div></div>
<div class="chapter standard" id="chapter-measures-of-the-center-of-the-data" title="Chapter 2.4: Measures of the Center of the Data"><div class="chapter-title-wrap"><h3 class="chapter-number">10</h3><h2 class="chapter-title"><span class="display-none">Chapter 2.4: Measures of the Center of the Data</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="element-848">The &#8220;center&#8221; of a data set is also a way of describing location. The two most widely used measures of the &#8220;center&#8221; of the data are the <span data-type="term">mean</span> (average) and the <span data-type="term">median</span>. To calculate the <strong>mean weight</strong> of 50 people, add the 50 weights together and divide by 50. To find the <strong>median weight</strong> of the 50 people, order the data and find the number that splits the data into two equal parts. The median is generally a better measure of the center when there are extreme values or outliers because it is not affected by the precise numerical values of the outliers. The mean is the most common measure of the center.</p> <div id="eip-13" data-type="note" data-has-label="true" data-label=""><div data-type="title"></div> <div data-type="title">NOTE</div> <p id="fs-idp62684352">The words “mean” and “average” are often used interchangeably. The substitution of one word for the other is common practice. The technical term is “arithmetic mean,” whereas “average” technically refers to any measure of center location. However, in practice among non-statisticians, “average” is commonly accepted for “arithmetic mean.”</p> </div> <p id="element-44">When some values in the data set repeat, the mean can be calculated by multiplying each distinct value by its frequency, adding the products, and then dividing the sum by the total number of data values. 
The letter used to represent the <strong>sample mean</strong> is an <em data-effect="italics">x</em> with a bar over it (pronounced “<em data-effect="italics">x</em> bar”): \(\overline{x}\).</p> <p id="element-705">The Greek letter <em data-effect="italics">μ</em> (pronounced &#8220;mew&#8221;) represents the <strong>population mean</strong>. One of the requirements for the <strong>sample mean</strong> to be a good estimate of the <strong>population mean</strong> is for the sample taken to be truly random.</p> <p id="element-228">To see that both ways of calculating the mean are the same, consider the sample: <span data-type="newline"><br /> </span>1; 1; 1; 2; 2; 3; 4; 4; 4; 4; 4</p> <div id="element-46" data-type="equation">\(\overline{x}=\frac{1+1+1+2+2+3+4+4+4+4+4}{11}=\frac{30}{11}\approx 2.7\)</div> <div data-type="equation">\(\overline{x}=\frac{3\left(1\right)+2\left(2\right)+1\left(3\right)+5\left(4\right)}{11}=\frac{30}{11}\approx 2.7\)</div> <p id="element-180">In the second calculation, the frequencies are 3, 2, 1, and 5.</p> <p>You can quickly find the location of the median by using the expression \(\frac{n+1}{2}\).</p> <p id="element-860">The letter <em data-effect="italics">n</em> is the total number of data values in the sample. If <em data-effect="italics">n</em> is an odd number, the median is the middle value of the ordered data (ordered smallest to largest). If <em data-effect="italics">n</em> is an even number, the median is the average of the two middle values of the ordered data: add them together and divide by two. For example, if the total number of data values is 97, then \(\frac{n+1}{2}=\frac{97+1}{2}=49\). The median is the 49<sup>th</sup> value in the ordered data. If the total number of data values is 100, then \(\frac{n+1}{2}=\frac{100+1}{2}=50.5\). The median occurs midway between the 50<sup>th</sup> and 51<sup>st</sup> values. The location of the median and the value of the median are <strong>not</strong> the same. 
The upper case letter <em data-effect="italics">M</em> is often used to represent the median. The next example illustrates the location of the median and the value of the median.</p> <div id="element-3" class="textbox textbox--examples" data-type="example"><div id="exer4" data-type="exercise"><div id="id45306962" data-type="problem"><p id="element-226">AIDS data indicating the number of months a patient with AIDS lives after taking a new antibody drug are as follows (smallest to largest): <span data-type="newline"><br /> </span>3;  4;  8;  8;  10;  11;  12;  13;  14;  15;  15;  16;  16;  17;  17;  18;  21;  22;  22;  24;  24;  25;  26;  26;  27;  27;  29;  29;  31;  32;  33;  33;  34;  34;  35;  37;  40;  44;  44;  47; <span data-type="newline"><br /> </span>Calculate the mean and the median.</p> </div> <div id="id45386042" data-type="solution"><p id="element-471">The calculation for the mean is:</p> <p id="element-197">\(\overline{x}=\frac{\left[3+4+\left(8\right)\left(2\right)+10+11+12+13+14+\left(15\right)\left(2\right)+\left(16\right)\left(2\right)+\text{&#8230;}+35+37+40+\left(44\right)\left(2\right)+47\right]}{40}=\mathrm{23.6}\)<span data-type="newline"><br /> </span>To find the median, <em data-effect="italics">M</em>, first use the formula for the location. 
The location is: <span data-type="newline"><br /> </span>\(\frac{n+1}{2}=\frac{40+1}{2}=20.5\)<span data-type="newline"><br /> </span>Starting at the smallest value, the median is located between the 20<sup>th</sup> and 21<sup>st</sup> values (the two 24s): <span data-type="newline"><br /> </span>3;  4;  8;  8;  10;  11;  12;  13;  14;  15;  15;  16;  16;  17;  17;  18;  21;  22;  22;  24;  24;  25;  26;  26;  27;  27;  29;  29;  31;  32;  33;  33;  34;  34;  35;  37;  40;  44;  44;  47;</p> <p id="element-904">\(M=\frac{24+24}{2}=24\)</p> <p>&nbsp;</p> </div> </div> </div> <div id="fs-idp50763088" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idp54507328">To find the mean and the median:</p> <p id="fs-idp87775312">Clear list L1. Press STAT 4:ClrList. Enter 2nd 1 for list L1. Press ENTER.</p> <p id="fs-idm47103392">Enter data into the list editor. Press STAT 1:EDIT.</p> <p id="fs-idp45529696">Put the data values into list L1.</p> <p id="fs-idp33349200">Press STAT and arrow to CALC. Press 1:1-VarStats. Press 2nd 1 for L1 and then ENTER.</p> <p id="fs-idp46064432">Press the down and up arrow keys to scroll.</p> <p id="fs-idp50820000">\(\overline{x}\) = 23.6, <em data-effect="italics">M</em> = 24</p> </div> <div id="fs-idp48953680" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm28353888" data-type="exercise"><div id="fs-idm72704208" data-type="problem"><p id="fs-idm46796800">The following data show the number of months patients typically wait on a transplant list before getting surgery. The data are ordered from smallest to largest. 
Calculate the mean and median.</p> <p id="fs-idm114142640"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">3  </span><span data-type="item">4  </span><span data-type="item">5  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">8  </span><span data-type="item">8  </span><span data-type="item">9  </span><span data-type="item">9  </span><span data-type="item">10  </span><span data-type="item">10  </span><span data-type="item">10  </span><span data-type="item">10  </span><span data-type="item">10  </span><span data-type="item">11  </span><span data-type="item">12  </span><span data-type="item">12  </span><span data-type="item">13  </span><span data-type="item">14  </span><span data-type="item">14  </span><span data-type="item">15  </span><span data-type="item">15  </span><span data-type="item">17  </span><span data-type="item">17  </span><span data-type="item">18  </span><span data-type="item">19  </span><span data-type="item">19  </span><span data-type="item">19  </span><span data-type="item">21  </span><span data-type="item">21  </span><span data-type="item">22  </span><span data-type="item">22  </span><span data-type="item">23  </span><span data-type="item">24  </span><span data-type="item">24  </span><span data-type="item">24  </span><span data-type="item">24</span></span></p> </div> </div> </div> <div id="element-231" class="textbox textbox--examples" data-type="example"><div id="exer6" data-type="exercise"><div id="id45393377" data-type="problem"><p id="element-213">Suppose that in a small town of 50 people, one person earns \$5,000,000 per year and the other 49 each earn \$30,000. 
Which is the better measure of the &#8220;center&#8221;: the mean or the median?</p> </div> <div id="id45393396" data-type="solution"><p id="element-444">\(\overline{x}=\frac{5,000,000+49\left(30,000\right)}{50}=129,400\)</p> <p><em data-effect="italics">M</em> = 30,000</p> <p id="element-831">(There are 49 people who earn \$30,000 and one person who earns \$5,000,000.)</p> <p>The median is a better measure of the &#8220;center&#8221; than the mean because 49 of the values are 30,000 and one is 5,000,000. The 5,000,000 is an outlier. The 30,000 gives us a better sense of the middle of the data.</p> </div> </div> </div> <div id="fs-idp18783360" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm79009152" data-type="exercise"><div id="fs-idm79009024" data-type="problem"><p id="fs-idm10019504">In a sample of 60 households, one house is worth \$2,500,000. Half of the rest are worth \$280,000, and all the others are worth \$315,000. Which is the better measure of the “center”: the mean or the median?</p> </div> </div> </div> <p id="element-584">Another measure of the center is the mode. The <span data-type="term">mode</span> is the most frequent value. There can be more than one mode in a data set as long as those values have the same frequency and that frequency is the highest. 
A data set with two modes is called bimodal.</p> <div id="element-114" class="textbox textbox--examples" data-type="example"><p id="element-639">Statistics exam scores for 20 students are as follows:</p> <p id="element-104"><span id="set-536" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">50  </span><span data-type="item">53  </span><span data-type="item">59  </span><span data-type="item">59  </span><span data-type="item">63  </span><span data-type="item">63  </span><span data-type="item">72  </span><span data-type="item">72  </span><span data-type="item">72  </span><span data-type="item">72  </span><span data-type="item">72  </span><span data-type="item">76  </span><span data-type="item">78  </span><span data-type="item">81  </span><span data-type="item">83  </span><span data-type="item">84  </span><span data-type="item">84  </span><span data-type="item">84  </span><span data-type="item">90  </span><span data-type="item">93</span></span></p> <div id="exer3" data-type="exercise"><div id="id44835721" data-type="problem"><p id="element-32535">Find the mode.</p> </div> <div id="id44835735" data-type="solution"><p id="element-76">The most frequent score is 72, which occurs five times. 
Mode = 72.</p> </div> </div> </div> <div id="fs-idp45793968" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm24402032" data-type="exercise"><div id="fs-idm24643648" data-type="problem"><p id="fs-idp74555792">The numbers of books checked out from the library by 25 students are as follows:</p> <p id="fs-idm901328"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">0  </span><span data-type="item">0  </span><span data-type="item">0  </span><span data-type="item">1  </span><span data-type="item">2  </span><span data-type="item">3  </span><span data-type="item">3  </span><span data-type="item">4  </span><span data-type="item">4  </span><span data-type="item">5  </span><span data-type="item">5  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">8  </span><span data-type="item">8  </span><span data-type="item">8  </span><span data-type="item">9  </span><span data-type="item">10  </span><span data-type="item">10  </span><span data-type="item">11  </span><span data-type="item">11  </span><span data-type="item">12  </span><span data-type="item">12</span></span><span data-type="newline"><br /> </span>Find the mode.</p> </div> </div> </div> <div id="element-725" class="textbox textbox--examples" data-type="example"><p id="element-622">Five real estate exam scores are 430, 430, 480, 480, 495. The data set is bimodal because the scores 430 and 480 each occur twice.</p> <p id="element-353">When is the mode the best measure of the &#8220;center&#8221;? Consider a weight loss program that advertises a mean weight loss of six pounds the first week of the program. 
The mode might indicate that most people lose two pounds the first week, making the program less appealing.</p> <div data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idm78257648">The mode can be calculated for qualitative data as well as for quantitative data. For example, if the data set is: red, red, red, green, green, yellow, purple, black, blue, the mode is red.</p> </div> <p id="element-660" class="finger">Statistical software will easily calculate the mean, the median, and the mode. Some graphing calculators can also make these calculations. In the real world, people make these calculations using software.</p> </div> <div id="fs-idp55881696" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm2098096" data-type="exercise"><div id="fs-idm2097968" data-type="problem"><p id="fs-idm5923360">Five credit scores are 680, 680, 700, 720, 720. The data set is bimodal because the scores 680 and 720 each occur twice. Consider the annual earnings of workers at a factory. The mode is \$25,000 and occurs 150 times out of 301. The median is \$50,000 and the mean is \$47,500. What would be the best measure of the “center”?</p> </div> </div> </div> <div id="element-282" class="bc-section section" data-depth="1"></div> <div id="eip-529" class="bc-section section" data-depth="1"><p>&nbsp;</p> <p id="eip-309">A <strong>statistic</strong> is a number calculated from a sample. Statistic examples include the mean, the median and the mode as well as others. 
The sample mean \(\overline{x}\) is an example of a statistic which estimates the population mean <em data-effect="italics">μ</em>.</p> </div> <div id="fs-idp28527728" class="bc-section section" data-depth="1"><h3 data-type="title">Calculating the Mean of Grouped Frequency Tables</h3> <p id="fs-idp16870784">When only grouped data is available, we do not know the individual data values (we know only the intervals and the interval frequencies); therefore, we cannot compute an exact mean for the data set. Instead, we estimate the actual mean by calculating the mean of a frequency table. A frequency table is a data representation in which grouped data is displayed along with the corresponding frequencies. To calculate the mean from a grouped frequency table, we can apply the basic definition of mean: <em data-effect="italics">mean</em> = \(\frac{\text{data sum}}{\text{number of data values}}\). We simply need to modify the definition to fit within the restrictions of a frequency table.</p> <p id="fs-idp28261664">Since we do not know the individual data values, we can instead find the midpoint of each interval and use it to stand in for every value in that interval. The midpoint is \(\frac{\text{lower boundary}+\text{upper boundary}}{2}\). We can now modify the mean definition to be \(\text{Mean of Frequency Table}=\frac{\sum fm}{\sum f}\), where <em data-effect="italics">f</em> = the frequency of the interval and <em data-effect="italics">m</em> = the midpoint of the interval.</p> <div id="fs-idp59127680" class="textbox textbox--examples" data-type="example"><div id="fs-idp61839840" data-type="exercise"><div id="fs-idp47622112" data-type="problem"><p id="fs-idm6896112">A frequency table of the results of Professor Blount’s last statistics test is shown. 
Find the best estimate of the class mean.</p> <table id="fs-idp32456976" summary=""><thead><tr><th>Grade Interval</th> <th>Number of Students</th> </tr> </thead> <tbody><tr><td>50–56.5</td> <td>1</td> </tr> <tr><td>56.5–62.5</td> <td>0</td> </tr> <tr><td>62.5–68.5</td> <td>4</td> </tr> <tr><td>68.5–74.5</td> <td>4</td> </tr> <tr><td>74.5–80.5</td> <td>2</td> </tr> <tr><td>80.5–86.5</td> <td>3</td> </tr> <tr><td>86.5–92.5</td> <td>4</td> </tr> <tr><td>92.5–98.5</td> <td>1</td> </tr> </tbody> </table> </div> <div id="fs-idp56820112" data-type="solution"><ul id="fs-idp26414448"><li>Find the midpoints for all intervals.</li> </ul> <table id="fs-idp15194576" summary=""><thead><tr><th>Grade Interval</th> <th>Midpoint</th> </tr> </thead> <tbody><tr><td>50–56.5</td> <td>53.25</td> </tr> <tr><td>56.5–62.5</td> <td>59.5</td> </tr> <tr><td>62.5–68.5</td> <td>65.5</td> </tr> <tr><td>68.5–74.5</td> <td>71.5</td> </tr> <tr><td>74.5–80.5</td> <td>77.5</td> </tr> <tr><td>80.5–86.5</td> <td>83.5</td> </tr> <tr><td>86.5–92.5</td> <td>89.5</td> </tr> <tr><td>92.5–98.5</td> <td>95.5</td> </tr> </tbody> </table> <ul id="fs-idp72598816"><li>Calculate the sum of the products of each interval frequency and midpoint: \(\sum fm=53.25\left(1\right)+59.5\left(0\right)+65.5\left(4\right)+71.5\left(4\right)+77.5\left(2\right)+83.5\left(3\right)+89.5\left(4\right)+95.5\left(1\right)=1460.25\)</li> <li>Divide by the total number of students, \(\sum f=19\): \(\mu =\frac{\sum fm}{\sum f}=\frac{1460.25}{19}\approx 76.86\)</li> </ul> </div> </div> </div> <div id="fs-idm52755520" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp22476928" data-type="exercise"><div id="fs-idp50951968" data-type="problem"><p id="fs-idm11216000">Maris conducted a study on the effect that playing video games has on memory recall. 
As part of her study, she compiled the following data:</p> <table id="fs-idm6918832" summary=""><thead><tr><th>Hours Teenagers Spend on Video Games</th> <th>Number of Teenagers</th> </tr> </thead> <tbody><tr><td>0–3.5</td> <td>3</td> </tr> <tr><td>3.5–7.5</td> <td>7</td> </tr> <tr><td>7.5–11.5</td> <td>12</td> </tr> <tr><td>11.5–15.5</td> <td>7</td> </tr> <tr><td>15.5–19.5</td> <td>9</td> </tr> </tbody> </table> <p id="fs-idp10596832">What is the best estimate for the mean number of hours spent playing video games?</p> </div> </div> </div> </div> <div id="fs-idp2891248" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idp41166864">Data from The World Bank, available online at http://www.worldbank.org (accessed April 3, 2013).</p> <p id="fs-idp18149184">“Demographics: Obesity – adult prevalence rate.” Indexmundi. Available online at http://www.indexmundi.com/g/r.aspx?t=50&amp;v=2228&amp;l=en (accessed April 3, 2013).</p> </div> <div id="fs-idm7033248" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm18283152">The mean and the median can be calculated to help you find the &#8220;center&#8221; of a data set. The mean is the most commonly used measure of the center, but the median is the better measure when a data set contains several outliers or extreme values. The mode tells you the most frequently occurring datum (or data) in your data set. The mean, median, and mode are extremely helpful when you need to analyze your data, but if your data set consists of ranges that lack specific values, the mean may seem impossible to calculate. However, the mean can be approximated: add the lower boundary to the upper boundary and divide by two to find the midpoint of each interval. Multiply each midpoint by the number of values found in the corresponding range. 
Divide the sum of these values by the total number of data values in the set.</p> </div> <div id="fs-idm345328" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idp28375264">\(\mu =\frac{\sum fm}{\sum f}\) Where <em data-effect="italics">f</em> = interval frequencies and <em data-effect="italics">m</em> = interval midpoints.</p> </div> <div id="fs-idp67676496" class="practice" data-depth="1"><div id="fs-idp11191056" data-type="exercise"><div id="fs-idp72176256" data-type="problem"><p id="fs-idp29942336">Find the mean for the following frequency tables.</p> <ol id="fs-idp39027888" type="a"><li><table id="fs-idp4665840" summary=""><thead><tr><th>Grade</th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>49.5–59.5</td> <td>2</td> </tr> <tr><td>59.5–69.5</td> <td>3</td> </tr> <tr><td>69.5–79.5</td> <td>8</td> </tr> <tr><td>79.5–89.5</td> <td>12</td> </tr> <tr><td>89.5–99.5</td> <td>5</td> </tr> </tbody> </table> </li> <li><table id="fs-idp21318848" summary=""><thead><tr><th>Daily Low Temperature</th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>49.5–59.5</td> <td>53</td> </tr> <tr><td>59.5–69.5</td> <td>32</td> </tr> <tr><td>69.5–79.5</td> <td>15</td> </tr> <tr><td>79.5–89.5</td> <td>1</td> </tr> <tr><td>89.5–99.5</td> <td>0</td> </tr> </tbody> </table> </li> <li><table id="fs-idp49142800" summary=""><thead><tr><th>Points per Game</th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>49.5–59.5</td> <td>14</td> </tr> <tr><td>59.5–69.5</td> <td>32</td> </tr> <tr><td>69.5–79.5</td> <td>15</td> </tr> <tr><td>79.5–89.5</td> <td>23</td> </tr> <tr><td>89.5–99.5</td> <td>2</td> </tr> </tbody> </table> </li> </ol> </div> </div> <p id="eip-193"><em data-effect="italics">Use the following information to answer the next three exercises:</em> The following data show the lengths of boats moored in a marina. 
The data are ordered from smallest to largest: <span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">16  </span><span data-type="item">17  </span><span data-type="item">19  </span><span data-type="item">20  </span><span data-type="item">20  </span><span data-type="item">21  </span><span data-type="item">23  </span><span data-type="item">24  </span><span data-type="item">25  </span><span data-type="item">25  </span><span data-type="item">25  </span><span data-type="item">26  </span><span data-type="item">26  </span><span data-type="item">27  </span><span data-type="item">27  </span><span data-type="item">27  </span><span data-type="item">28  </span><span data-type="item">29  </span><span data-type="item">30  </span><span data-type="item">32  </span><span data-type="item">33  </span><span data-type="item">33  </span><span data-type="item">34  </span><span data-type="item">35  </span><span data-type="item">37  </span><span data-type="item">39  </span><span data-type="item">40</span></span></p> <div id="fs-idp3713376" data-type="exercise"><div id="fs-idp45051248" data-type="problem"><p id="fs-idp72372672">Calculate the mean.</p> </div> <div id="fs-idp54737296" data-type="solution"><p id="fs-idp31759600">Mean: 16 + 17 + 19 + 20 + 20 + 21 + 23 + 24 + 25 + 25 + 25 + 26 + 26 + 27 + 27 + 27 + 28 + 29 + 30 + 32 + 33 + 33 + 34 + 35 + 37 + 39 + 40 = 738;</p> <p id="fs-idp62762816">\(\frac{738}{27}\) = 27.33</p> </div> </div> <div id="fs-idp23937584" data-type="exercise"><div id="fs-idm6016496" data-type="problem"><p id="fs-idp65368416">Identify the median.</p> </div> </div> <div id="fs-idm35941040" data-type="exercise"><div id="fs-idp155088" data-type="problem"><p id="fs-idp27706128">Identify the mode.</p> </div> <div id="fs-idm53012848" data-type="solution"><p id="fs-idp8367280">The most frequent lengths are 25 and 27, which occur three times. 
Mode = 25, 27</p> </div> </div> <p id="fs-idp43142960"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next three exercises:</em> Sixty-five randomly selected car salespersons were asked the number of cars they generally sell in one week. Fourteen people answered that they generally sell three cars; nineteen generally sell four cars; twelve generally sell five cars; nine generally sell six cars; eleven generally sell seven cars. Calculate the following:</p> <div data-type="exercise"><div id="id5461618" data-type="problem"><p>sample mean = \(\overline{x}\) = _______</p> </div> </div> <div id="eip-83" data-type="exercise"><div id="eip-718" data-type="problem"><p id="eip-423">median = _______</p> </div> <div id="eip-288" data-type="solution"><p id="eip-669">4</p> </div> </div> <div id="eip-630" data-type="exercise"><div id="eip-353" data-type="problem"><p id="eip-212">mode = _______</p> </div> </div> </div> <div id="fs-idm59277808" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div data-type="exercise"><div data-type="problem"><p id="eip-474">1)  <a class="autogenerated-content" href="#eip-456s">(Figure)</a> gives the percent of children under five considered to be underweight. What is the best estimate for the mean percentage of underweight children?</p> <table id="eip-456s" summary="Table...."><thead><tr><th>Percent of Underweight Children</th> <th>Number of Countries</th> </tr> </thead> <tbody><tr><td>16–21.45</td> <td>23</td> </tr> <tr><td>21.45–26.9</td> <td>4</td> </tr> <tr><td>26.9–32.35</td> <td>9</td> </tr> <tr><td>32.35–37.8</td> <td>7</td> </tr> <tr><td>37.8–43.25</td> <td>6</td> </tr> <tr><td>43.25–48.7</td> <td>1</td> </tr> </tbody> </table> </div> <div id="eip-492" data-type="solution"><p>&nbsp;</p> <p id="eip-506">2)  The most obese countries in the world have obesity rates that range from 11.4% to 74.6%. 
This data is summarized in the following table.</p> <table id="eip-456" summary="Table...."><thead><tr><th>Percent of Population Obese</th> <th>Number of Countries</th> </tr> </thead> <tbody><tr><td>11.4–20.45</td> <td>29</td> </tr> <tr><td>20.45–29.45</td> <td>13</td> </tr> <tr><td>29.45–38.45</td> <td>4</td> </tr> <tr><td>38.45–47.45</td> <td>0</td> </tr> <tr><td>47.45–56.45</td> <td>2</td> </tr> <tr><td>56.45–65.45</td> <td>1</td> </tr> <tr><td>65.45–74.45</td> <td>0</td> </tr> <tr><td>74.45–83.45</td> <td>1</td> </tr> </tbody> </table> <ol id="eip-idm42348400" type="a"><li>What is the best estimate of the average obesity percentage for these countries?</li> <li>The United States has an average obesity rate of 33.9%. Is this rate above average or below?</li> <li>How does the United States compare to other countries?</li> </ol> <p>&nbsp;</p> </div> </div> </div> <div id="fs-idm1725088" class="bring-together-homework" data-depth="1"><h3 data-type="title">Bringing It Together</h3> <div id="element-832" data-type="exercise"><div id="id6092616" data-type="problem"><div id="eip-idm179364784" class="bc-figure figure"></div> </div> </div> <p id="fs-idm4887840"><em data-effect="italics">Use the following information to answer the next three exercises</em>: We are interested in the number of years students in a particular elementary statistics class have lived in California. The information in the following table is from the entire section.</p> <table id="element-368" summary="This table presents the number of years students in a statistics class have lived in California. 
The first column lists the number of years and the second column lists the frequency."><thead><tr><th>Number of years</th> <th>Frequency</th> <th>Number of years</th> <th>Frequency</th> </tr> </thead> <tfoot><tr><td></td> <td></td> <td></td> <td>Total = 20</td> </tr> </tfoot> <tbody><tr><td>7</td> <td>1</td> <td>22</td> <td>1</td> </tr> <tr><td>14</td> <td>3</td> <td>23</td> <td>1</td> </tr> <tr><td>15</td> <td>1</td> <td>26</td> <td>1</td> </tr> <tr><td>18</td> <td>1</td> <td>40</td> <td>2</td> </tr> <tr><td>19</td> <td>4</td> <td>42</td> <td>2</td> </tr> <tr><td>20</td> <td>3</td> <td></td> <td></td> </tr> </tbody> </table> <div data-type="exercise"><div id="id4699277" data-type="problem"><p>3)  What is the <em data-effect="italics">IQR</em>?</p> <ol id="ni1" type="a" data-mark-suffix="."><li>8</li> <li>11</li> <li>15</li> <li>35</li> </ol> </div> <div id="id6016838" data-type="solution"><p id="element-991"></p></div> </div> <div data-type="exercise"><div id="id4942656" data-type="problem"><p id="element-780">4)  What is the mode?</p> <ol id="ni2" type="a" data-mark-suffix="."><li>19</li> <li>19.5</li> <li>14 and 20</li> <li>22.65</li> </ol> </div> </div> <div id="element-346" data-type="exercise"><div id="id5971682" data-type="problem"><p>&nbsp;</p> <p>5)  Is this a sample or the entire population?</p> <ol id="ni3" type="a" data-mark-suffix="."><li>sample</li> <li>entire population</li> <li>neither</li> </ol> </div> <div id="id4081124" data-type="solution"><p id="element-366"></p></div> </div> <p id="JavErc">6) Javier and Ercilia are supervisors at a shopping mall. Each was given the task of estimating the mean distance that shoppers live from the mall. They each randomly surveyed 100 shoppers. The samples yielded the following information.</p> <table summary="This table presents two shopping mall supervisors and their estimations of the mean distance shoppers live from the mall. Javier's data is in the second column and Ercilia is in the third column. 
The first row is for sample means and the second row is for standard deviations."><thead><tr><th></th> <th>Javier</th> <th>Ercilia</th> </tr> </thead> <tbody><tr><td>\(\overline{x}\)</td> <td>6.0 miles</td> <td>6.0 miles</td> </tr> <tr><td>\(s\)</td> <td>4.0 miles</td> <td>7.0 miles</td> </tr> </tbody> </table> <ol type="a"><li>How can you determine which survey was correct?</li> <li>Explain what the difference in the results of the surveys implies about the data.</li> <li>If the two histograms depict the distribution of values for each supervisor, which one depicts Ercilia&#8217;s sample? How do you know?<span data-type="newline"><br /> </span> <div id="eip-idm218452576" class="bc-figure figure"><span id="id8387446" data-type="media" data-display="block" data-alt="This shows two histograms. The first histogram shows a fairly symmetrical distribution with a mode of 6. The second histogram shows a uniform distribution."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch02_13_09-1.jpg" alt="This shows two histograms. The first histogram shows a fairly symmetrical distribution with a mode of 6. The second histogram shows a uniform distribution." width="450" data-media-type="image/jpg" /></span></div> </li> <li>If the two box plots depict the distribution of values for each supervisor, which one depicts Ercilia’s sample? How do you know?<span data-type="newline"><br /> </span> <div class="bc-figure figure"><span id="id8629835" data-type="media" data-display="block" data-alt="This shows two horizontal boxplots. The first boxplot is graphed over a number line from 0 to 21. The first whisker extends from 0 to 1. The box begins at the first quartile, 1, and ends at the third quartile, 14. A vertical, dashed line marks the median at 6. The second whisker extends from the third quartile to the largest value, 21. The second boxplot is graphed over a number line from 0 to 12. The first whisker extends from 0 to 4. 
The box begins at the first quartile, 4, and ends at the third quartile, 9. A vertical, dashed line marks the median at 6. The second whisker extends from the third quartile to the largest value, 12."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_13_10-1.jpg" alt="This shows two horizontal boxplots. The first boxplot is graphed over a number line from 0 to 21. The first whisker extends from 0 to 1. The box begins at the first quartile, 1, and ends at the third quartile, 14. A vertical, dashed line marks the median at 6. The second whisker extends from the third quartile to the largest value, 21. The second boxplot is graphed over a number line from 0 to 12. The first whisker extends from 0 to 4. The box begins at the first quartile, 4, and ends at the third quartile, 9. A vertical, dashed line marks the median at 6. The second whisker extends from the third quartile to the largest value, 12." width="450" data-media-type="image/jpg" /></span></div> </li> </ol> <div data-type="exercise"><div data-type="solution"><p><strong>Answers to Odd-Numbered Questions</strong></p> <p>1)  The mean percentage, \(\overline{x}=\frac{1328.65}{50}=26.57\)</p> <p>3) a</p> <p>5) b</p> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="fs-idm7751856"><dt>Frequency Table</dt> <dd id="fs-idp28775440">a data representation in which grouped data is displayed along with the corresponding frequencies</dd> </dl> <dl id="mean"><dt>Mean</dt> <dd id="id10578364">a number that measures the central tendency of the data; a common name for mean is &#8216;average.&#8217; The term &#8216;mean&#8217; is a shortened form of &#8216;arithmetic mean.&#8217; By definition, the mean for a sample (denoted by \(\overline{x}\)) is \(\overline{x}\text{ }=\text{ }\frac{\text{Sum of all values in the sample}}{\text{Number of values in the sample}}\), and the mean for a population (denoted by <em 
data-effect="italics">μ</em>) is \(\mu =\frac{\text{Sum of all values in the population}}{\text{Number of values in the population}}\).</dd> </dl> <dl id="median"><dt>Median</dt> <dd id="id44836016">a number that separates ordered data into halves; half the values are the same number or smaller than the median and half the values are the same number or larger than the median. The median may or may not be part of the data.</dd> </dl> <dl id="fs-idp5033152"><dt>Midpoint</dt> <dd id="fs-idp75012256">the mean of an interval in a frequency table</dd> </dl> <dl id="mode"><dt>Mode</dt> <dd id="id44836043">the value that appears most frequently in a set of data</dd> </dl> </div> </div></div>
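<p>The grouped-mean estimate from the chapter above, \(\frac{\sum fm}{\sum f}\), can also be checked with a few lines of code. What follows is a minimal Python sketch and is not part of the original text: the function name <code>grouped_mean</code> and the layout of the input (a list of lower boundary, upper boundary, frequency triples) are illustrative assumptions. It uses the grade intervals and frequencies from the Professor Blount example.</p>

```python
def grouped_mean(table):
    """Estimate the mean of grouped data as sum(f * m) / sum(f).

    `table` is a list of (lower_boundary, upper_boundary, frequency)
    tuples; this layout is an assumption made for illustration.
    """
    total_fm = 0.0  # running sum of frequency * midpoint
    total_f = 0     # running sum of frequencies
    for lower, upper, freq in table:
        midpoint = (lower + upper) / 2  # m = (lower + upper) / 2
        total_fm += freq * midpoint
        total_f += freq
    if total_f == 0:
        raise ValueError("frequencies sum to zero; the mean is undefined")
    return total_fm / total_f


# Grade intervals and frequencies from the Professor Blount example.
blount = [
    (50, 56.5, 1), (56.5, 62.5, 0), (62.5, 68.5, 4), (68.5, 74.5, 4),
    (74.5, 80.5, 2), (80.5, 86.5, 3), (86.5, 92.5, 4), (92.5, 98.5, 1),
]
print(round(grouped_mean(blount), 2))  # prints 76.86
```

<p>The result agrees with the worked solution, 1460.25 divided by 19. Because every value in an interval is replaced by the interval midpoint, the output is an estimate of the mean, not the exact mean of the underlying data.</p>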
<div class="chapter standard" id="chapter-skewness-and-the-mean-median-and-mode" title="Chapter 2.5: Skewness and the Mean, Median, and Mode"><div class="chapter-title-wrap"><h3 class="chapter-number">11</h3><h2 class="chapter-title"><span class="display-none">Chapter 2.5: Skewness and the Mean, Median, and Mode</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="element-97">Consider the following data set. <span data-type="newline"><br /> </span>4;  5;  6;  6;  6;  7;  7;  7;  7;  7;  7;  8;  8;  8;  9;  10</p> <p id="element-35965">This data set can be represented by the following histogram. Each interval has width one, and each value is located in the middle of an interval.</p> <div id="M06_Ch02_fig001" class="bc-figure figure"><span id="id16811614" data-type="media" data-alt="This histogram matches the supplied data. It consists of 7 adjacent bars with the x-axis split into intervals of 1 from 4 to 10. The heights of the bars peak in the middle and taper symmetrically to the right and left." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch02_08_01-1.jpg" alt="This histogram matches the supplied data. It consists of 7 adjacent bars with the x-axis split into intervals of 1 from 4 to 10. The heights of the bars peak in the middle and taper symmetrically to the right and left." width="350" data-media-type="image/jpg" /></span></div> <p>The histogram displays a <strong>symmetrical</strong> distribution of data. A distribution is symmetrical if a vertical line can be drawn at some point in the histogram such that the shape to the left and the right of the vertical line are mirror images of each other. The mean, the median, and the mode are each seven for these data. <strong>In a perfectly symmetrical distribution, the mean and the median are the same.</strong> This example has one mode (unimodal), and the mode is the same as the mean and median. 
In a symmetrical distribution that has two modes (bimodal), the two modes would be different from the mean and median.</p> <p id="element-687">The histogram for the data: <span id="set-00016s" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">4  </span><span data-type="item">5  </span><span data-type="item">6  </span><span data-type="item">6  </span><span data-type="item">6  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">8</span></span> is not symmetrical. The right-hand side seems &#8220;chopped off&#8221; compared to the left side. A distribution of this type is called <strong>skewed to the left</strong> because it is pulled out to the left.</p> <div id="M06_Ch02_fig002" class="bc-figure figure"><span id="id17014514" data-type="media" data-alt="This histogram matches the supplied data. It consists of 5 adjacent bars with the x-axis split into intervals of 1 from 4 to 8. The peak is to the right, and the heights of the bars taper down to the left." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_08_02-1.jpg" alt="This histogram matches the supplied data. It consists of 5 adjacent bars with the x-axis split into intervals of 1 from 4 to 8. The peak is to the right, and the heights of the bars taper down to the left." width="350" data-media-type="image/jpg" /></span></div> <p>The mean is 6.3, the median is 6.5, and the mode is seven. 
<strong>Notice that the mean is less than the median, and they are both less than the mode.</strong> The mean and the median both reflect the skewing, but the mean reflects it more so.</p> <p id="element-391">The histogram for the data: <span id="set-00017" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">6  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">8  </span><span data-type="item">8  </span><span data-type="item">8  </span><span data-type="item">9  </span><span data-type="item">10</span></span>, is also not symmetrical. It is <strong>skewed to the right</strong>.</p> <div id="M06_Ch02_fig003" class="bc-figure figure"><span id="id17014699" data-type="media" data-alt="This histogram matches the supplied data. It consists of 5 adjacent bars with the x-axis split into intervals of 1 from 6 to 10. The peak is to the left, and the heights of the bars taper down to the right." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_08_03-1.jpg" alt="This histogram matches the supplied data. It consists of 5 adjacent bars with the x-axis split into intervals of 1 from 6 to 10. The peak is to the left, and the heights of the bars taper down to the right." width="350" data-media-type="image/jpg" /></span></div> <p id="element-434">The mean is 7.7, the median is 7.5, and the mode is seven. Of the three statistics, <strong>the mean is the largest, while the mode is the smallest</strong>. Again, the mean reflects the skewing the most.</p> <p id="element-524">To summarize, generally if the distribution of data is skewed to the left, the mean is less than the median, which is often less than the mode. 
If the distribution of data is skewed to the right, the mode is often less than the median, which is less than the mean.</p> <p>Skewness and symmetry become important when we discuss probability distributions in later chapters.</p> <div id="fs-idp17640608" class="textbox textbox--examples" data-type="example"><div id="eip-idp68426064" data-type="exercise"><div id="eip-idm58649808" data-type="problem"><p id="fs-idm39854688">Statistics are used to compare and sometimes identify authors. The following lists show a simple random sample that compares the letter counts for three authors.</p> <p id="fs-idm80542976">Terry:  7;  9;  3;  3;  3;  4;  1;  3;  2;  2</p> <p id="fs-idm41545440">Davis:  3;  3;  3;  4;  1;  4;  3;  2;  3;  1</p> <p id="fs-idp1740640">Maris:  2;  3;  4;  4;  4;  6;  6;  6;  8;  3</p> <ol id="fs-idm63711472" type="a"><li>Make a dot plot for the three authors and compare the shapes.</li> <li>Calculate the mean for each.</li> <li>Calculate the median for each.</li> <li>Describe any pattern you notice between the shape and the measures of center.</li> </ol> </div> <div id="eip-idm44069408" data-type="solution"><ol id="eip-idm54630576" type="a"><li><div id="fs-idm17492640" class="bc-figure figure"><div class="bc-figcaption figcaption">Terry’s distribution has a right (positive) skew.</div> <p><span id="fs-idm78584944" data-type="media" data-alt="This dot plot matches the supplied data for Terry. The plot uses a number line from 1 to 10. It shows one x over 1, two x's over 2, four x's over 3, one x over 4, one x over 7, and one x over 9. There are no x's over the numbers 5, 6, 8, and 10."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_030-1.jpg" alt="This dot plot matches the supplied data for Terry. The plot uses a number line from 1 to 10. It shows one x over 1, two x's over 2, four x's over 3, one x over 4, one x over 7, and one x over 9. 
There are no x's over the numbers 5, 6, 8, and 10." width="450" data-media-type="image/png" /></span></p> </div> <div id="fs-idm19521120" class="bc-figure figure"><div class="bc-figcaption figcaption">Davis’ distribution has a left (negative) skew.</div> <p><span id="fs-idm131679008" data-type="media" data-alt="This dot plot matches the supplied data for Davis. The plot uses a number line from 1 to 10. It shows two x's over 1, one x over 2, five x's over 3, and two x's over 4. There are no x's over the numbers 5, 6, 7, 8, 9, and 10."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_031-1.jpg" alt="This dot plot matches the supplied data for Davis. The plot uses a number line from 1 to 10. It shows two x's over 1, one x over 2, five x's over 3, and two x's over 4. There are no x's over the numbers 5, 6, 7, 8, 9, and 10." width="450" data-media-type="image/png" /></span></p> </div> <div id="fs-idm18855792" class="bc-figure figure"><div class="bc-figcaption figcaption">Maris’ distribution is symmetrically shaped.</div> <p><span id="fs-idm56353744" data-type="media" data-alt="This dot plot matches the supplied data for Maris. The plot uses a number line from 1 to 10. It shows one x over 2, two x's over 3, three x's over 4, three x's over 6, and one x over 8. There are no x's over the numbers 1, 5, 7, 9, and 10."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_032-1.jpg" alt="This dot plot matches the supplied data for Maris. The plot uses a number line from 1 to 10. It shows one x over 2, two x's over 3, three x's over 4, three x's over 6, and one x over 8. There are no x's over the numbers 1, 5, 7, 9, and 10." width="450" data-media-type="image/png" /></span></p> </div> </li> <li>Terry’s mean is 3.7, Davis’ mean is 2.7, Maris’ mean is 4.6.</li> <li>Terry’s median is three, Davis’ median is three. 
Maris’ median is four.</li> <li>It appears that the median is always closest to the high point (the mode), while the mean tends to be farther out on the tail. In a symmetrical distribution, the mean and the median are both centrally located close to the high point of the distribution.</li> </ol> </div> </div> </div> <div id="fs-idm10131056" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm4859600" data-type="exercise"><div id="fs-idm31997616" data-type="problem"><p id="fs-idm76707328">Discuss the mean, median, and mode for each of the following problems. Is there a pattern between the shape and measure of the center?</p> <p id="eip-idp44372864">a.</p> <div id="fs-idp12578240" class="bc-figure figure"><span id="fs-idp12578368" data-type="media" data-alt="This dot plot matches the supplied data. The plot uses a number line from 0 to 14. It shows two x's over 0, four x's over 1, three x's over 2, one x over 3, two x's over the number 4, 5, 6, and 9, and 1 x each over 10 and 14. There are no x's over the numbers 7, 8, 11, 12, and 13."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_033-1.png" alt="This dot plot matches the supplied data. The plot uses a number line from 0 to 14. It shows two x's over 0, four x's over 1, three x's over 2, one x over 3, two x's over the number 4, 5, 6, and 9, and 1 x each over 10 and 14. There are no x's over the numbers 7, 8, 11, 12, and 13." width="400" data-media-type="image/png" /></span></div> <p id="eip-idp140444967539280">b.</p> <table id="eip-idp39390048" summary="The ages former U.S. 
presidents died"><thead><tr><th colspan="2">The Ages Former U.S. Presidents Died</th> </tr> </thead> <tbody><tr><td>4</td> <td>6 9</td> </tr> <tr><td>5</td> <td>3 6 7 7 7 8</td> </tr> <tr><td>6</td> <td>0 0 3 3 4 4 5 6 7 7 7 8</td> </tr> <tr><td>7</td> <td>0 1 1 2 3 4 7 8 8 9</td> </tr> <tr><td>8</td> <td>0 1 3 5 8</td> </tr> <tr><td>9</td> <td>0 0 3 3</td> </tr> <tr><td colspan="2">Key: 8|0 means 80.</td> </tr> </tbody> </table> <p id="eip-idm119989936">c.</p> <div id="fs-idp18736080" class="bc-figure figure"><span id="fs-idp18736208" data-type="media" data-alt="This is a histogram titled Hours Spent Playing Video Games on Weekends. The x-axis shows the number of hours spent playing video games with bars showing values at intervals of 5. The y-axis shows the number of students. The first bar for 0 - 4.99 hours has a height of 2. The second bar from 5 - 9.99 has a height of 3. The third bar from 10 - 14.99 has a height of 4. The fourth bar from 15 - 19.99 has a height of 7. The fifth bar from 20 - 24.99 has a height of 9."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_034-1.png" alt="This is a histogram titled Hours Spent Playing Video Games on Weekends. The x-axis shows the number of hours spent playing video games with bars showing values at intervals of 5. The y-axis shows the number of students. The first bar for 0 - 4.99 hours has a height of 2. The second bar from 5 - 9.99 has a height of 3. The third bar from 10 - 14.99 has a height of 4. The fourth bar from 15 - 19.99 has a height of 7. The fifth bar from 20 - 24.99 has a height of 9." width="400" data-media-type="image/png" /></span></div> </div> </div> </div> <div id="fs-idm5546880" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm70567904">Looking at the distribution of data can reveal a lot about the relationship between the mean, the median, and the mode. 
There are <u data-effect="underline">three types of distributions</u>. A <strong data-effect="bold">right (or positive) skewed</strong> distribution has a shape like <a class="autogenerated-content" href="#M06_Ch02_fig003">(Figure)</a>. A <strong data-effect="bold">left (or negative) skewed</strong> distribution has a shape like <a class="autogenerated-content" href="#M06_Ch02_fig002">(Figure)</a>. A <strong data-effect="bold">symmetrical</strong> distribution looks like <a class="autogenerated-content" href="#M06_Ch02_fig001">(Figure)</a>.</p> </div> <div id="fs-idp2369408" class="practice" data-depth="1"><p id="eip-45"><em data-effect="italics">Use the following information to answer the next three exercises:</em> State whether the data are symmetrical, skewed to the left, or skewed to the right.</p> <div id="fs-idm89697712" data-type="exercise"><div id="fs-idm18359408" data-type="problem"><p id="fs-idm47149136"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">1  </span><span data-type="item">1  </span><span data-type="item">1  </span><span data-type="item">2  </span><span data-type="item">2  </span><span data-type="item">2  </span><span data-type="item">2  </span><span data-type="item">3  </span><span data-type="item">3  </span><span data-type="item">3  </span><span data-type="item">3  </span><span data-type="item">3  </span><span data-type="item">3  </span><span data-type="item">3  </span><span data-type="item">3  </span><span data-type="item">4  </span><span data-type="item">4  </span><span data-type="item">4  </span><span data-type="item">5  </span><span data-type="item">5</span></span></p> </div> <div id="fs-idm39691376" data-type="solution"><p id="fs-idp6587296">The data are symmetrical. The median is 3 and the mean is 2.85. 
They are close, and the mode lies close to the middle of the data, so the data are symmetrical.</p> </div> </div> <div id="fs-idm125692000" data-type="exercise"><div id="fs-idm70278192" data-type="problem"><p id="fs-idm75453168"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">16  </span><span data-type="item">17  </span><span data-type="item">19  </span><span data-type="item">22  </span><span data-type="item">22  </span><span data-type="item">22  </span><span data-type="item">22  </span><span data-type="item">22  </span><span data-type="item">23</span></span></p> </div> </div> <div id="fs-idm65822528" data-type="exercise"><div id="fs-idm112226560" data-type="problem"><p id="fs-idm14558304"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">87  </span><span data-type="item">87  </span><span data-type="item">87  </span><span data-type="item">87  </span><span data-type="item">87  </span><span data-type="item">88  </span><span data-type="item">89  </span><span data-type="item">89  </span><span data-type="item">90  </span><span data-type="item">91</span></span></p> </div> <div id="fs-idm22884848" data-type="solution"><p id="fs-idm22884720">The data are skewed right. The median is 87.5 and the mean is 88.2. 
Even though they are close, the mode lies to the left of the middle of the data, and there are many more instances of 87 than any other number, so the data are skewed right.</p> </div> </div> <div id="fs-idm8768736" data-type="exercise"><div id="fs-idm98233488" data-type="problem"><p id="fs-idm53300704">When the data are skewed left, what is the typical relationship between the mean and median?</p> </div> </div> <div id="fs-idm8813456" data-type="exercise"><div id="fs-idm42223856" data-type="problem"><p id="fs-idm42223728">When the data are symmetrical, what is the typical relationship between the mean and median?</p> </div> <div id="fs-idm52740320" data-type="solution"><p id="fs-idm99900912">When the data are symmetrical, the mean and median are close or the same.</p> </div> </div> <div id="fs-idm89404448" data-type="exercise"><div id="fs-idm52575536" data-type="problem"><p id="fs-idm52575408">What word describes a distribution that has two modes?</p> </div> </div> <div id="fs-idp18513440" data-type="exercise"><div id="fs-idm38702384" data-type="problem"><p id="fs-idm34119056">Describe the shape of this distribution.</p> <div id="fs-idm14476592" class="bc-figure figure"><span id="fs-idm61579104" data-type="media" data-alt="This is a histogram which consists of 5 adjacent bars with the x-axis split into intervals of 1 from 3 to 7. The bar heights peak at the first bar and taper lower to the right."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_007-1.jpg" alt="This is a histogram which consists of 5 adjacent bars with the x-axis split into intervals of 1 from 3 to 7. The bar heights peak at the first bar and taper lower to the right." 
width="380" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idp2236688" data-type="solution"><p id="fs-idm40053200">The distribution is skewed right because it looks pulled out to the right.</p> </div> </div> <div id="fs-idp1656096" data-type="exercise"><div id="fs-idm52428016" data-type="problem"><p id="fs-idm79965872">Describe the relationship between the mode and the median of this distribution.</p> <div id="fs-idm155444400" class="bc-figure figure"><span id="fs-idm155444272" data-type="media" data-alt="This is a histogram which consists of 5 adjacent bars with the x-axis split into intervals of 1 from 3 to 7. The bar heights peak at the first bar and taper lower to the right. The bar heights from left to right are: 8, 4, 2, 2, 1."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_007-1.jpg" alt="This is a histogram which consists of 5 adjacent bars with the x-axis split into intervals of 1 from 3 to 7. The bar heights peak at the first bar and taper lower to the right. The bar heights from left to right are: 8, 4, 2, 2, 1." width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div id="fs-idm17924688" data-type="exercise"><div id="fs-idm83887312" data-type="problem"><p id="fs-idm94221840">Describe the relationship between the mean and the median of this distribution.</p> <div id="fs-idm57592704" class="bc-figure figure"><span id="fs-idm107167152" data-type="media" data-alt="This is a histogram which consists of 5 adjacent bars with the x-axis split into intervals of 1 from 3 to 7. The bar heights peak at the first bar and taper lower to the right. The bar heights from left to right are: 8, 4, 2, 2, 1."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_007-1.jpg" alt="This is a histogram which consists of 5 adjacent bars with the x-axis split into intervals of 1 from 3 to 7. 
The bar heights peak at the first bar and taper lower to the right. The bar heights from left to right are: 8, 4, 2, 2, 1." width="380" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idp21487456" data-type="solution"><p id="fs-idm77372464">The mean is about 4.1, slightly greater than the median, which is four.</p> </div> </div> <div id="fs-idm17745152" data-type="exercise"><div id="fs-idm77742544" data-type="problem"><p id="fs-idm16253520">Describe the shape of this distribution.</p> <div id="fs-idp20306352" class="bc-figure figure"><span id="fs-idp20306480" data-type="media" data-alt="This is a histogram which consists of 5 adjacent bars with the x-axis split into intervals of 1 from 3 to 7. The bar heights peak in the middle and taper down to the right and left."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_010-1.jpg" alt="This is a histogram which consists of 5 adjacent bars with the x-axis split into intervals of 1 from 3 to 7. The bar heights peak in the middle and taper down to the right and left." width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div id="fs-idp21637424" data-type="exercise"><div id="fs-idm36198464" data-type="problem"><p id="fs-idm56134784">Describe the relationship between the mode and the median of this distribution.</p> <div id="fs-idm16308016" class="bc-figure figure"><span id="fs-idm20125072" data-type="media" data-alt="This is a histogram which consists of 5 adjacent bars with the x-axis split into intervals of 1 from 3 to 7. The bar heights peak in the middle and taper down to the right and left."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_010-1.jpg" alt="This is a histogram which consists of 5 adjacent bars with the x-axis split into intervals of 1 from 3 to 7. The bar heights peak in the middle and taper down to the right and left." 
width="380" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idp18637696" data-type="solution"><p id="fs-idm157182480">The mode and the median are the same. In this case, they are both five.</p> </div> </div> <div id="fs-idm42189536" data-type="exercise"><div id="fs-idm1742112" data-type="problem"><p id="fs-idm2423344">Are the mean and the median exactly the same in this distribution? Why or why not?</p> <div id="fs-idm34380208" class="bc-figure figure"><span id="fs-idm44543696" data-type="media" data-alt="This is a histogram which consists of 5 adjacent bars with the x-axis split into intervals of 1 from 3 to 7. The bar heights from left to right are: 2, 4, 8, 5, 2."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_010-1.jpg" alt="This is a histogram which consists of 5 adjacent bars with the x-axis split into intervals of 1 from 3 to 7. The bar heights from left to right are: 2, 4, 8, 5, 2." width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div id="fs-idm44509184" data-type="exercise"><div id="fs-idm57744704" data-type="problem"><p id="fs-idm13088944">Describe the shape of this distribution.</p> <div id="fs-idp2366192" class="bc-figure figure"><span id="fs-idm41242224" data-type="media" data-alt="This is a histogram which consists of 5 adjacent bars over an x-axis split into intervals of 1 from 3 to 7. The bar heights from left to right are: 1, 1, 2, 4, 7."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_013-1.jpg" alt="This is a histogram which consists of 5 adjacent bars over an x-axis split into intervals of 1 from 3 to 7. The bar heights from left to right are: 1, 1, 2, 4, 7." 
width="380" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idm34197072" data-type="solution"><p id="fs-idm64532304">The distribution is skewed left because it looks pulled out to the left.</p> </div> </div> <div id="fs-idm24266752" data-type="exercise"><div id="fs-idm39469840" data-type="problem"><p id="fs-idm31823936">Describe the relationship between the mode and the median of this distribution.</p> <div id="fs-idp1844976" class="bc-figure figure"><span id="fs-idp1845104" data-type="media" data-alt="This is a histogram which consists of 5 adjacent bars over an x-axis split into intervals of 1 from 3 to 7. The bar heights from left to right are: 1, 1, 2, 4, 7."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_013-1.jpg" alt="This is a histogram which consists of 5 adjacent bars over an x-axis split into intervals of 1 from 3 to 7. The bar heights from left to right are: 1, 1, 2, 4, 7." width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div id="fs-idm56587152" data-type="exercise"><div id="fs-idm80448512" data-type="problem"><p id="fs-idm6830688">Describe the relationship between the mean and the median of this distribution.</p> <div id="fs-idm85728080" class="bc-figure figure"><span id="fs-idm80631312" data-type="media" data-alt="This is a histogram which consists of 5 adjacent bars over an x-axis split into intervals of 1 from 3 to 7. The bar heights from left to right are: 1, 1, 2, 4, 7."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M08_013-1.jpg" alt="This is a histogram which consists of 5 adjacent bars over an x-axis split into intervals of 1 from 3 to 7. The bar heights from left to right are: 1, 1, 2, 4, 7." 
width="380" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idm40221120" data-type="solution"><p id="fs-idm48927584">The mean and the median are both six.</p> </div> </div> <div id="fs-idm53345552" data-type="exercise"><div id="fs-idm50126800" data-type="problem"><p id="fs-idm41982048">The mean and median for the data are the same.</p> <p id="fs-idm89655728"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">3  </span><span data-type="item">4  </span><span data-type="item">5  </span><span data-type="item">5  </span><span data-type="item">6  </span><span data-type="item">6  </span><span data-type="item">6  </span><span data-type="item">6  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7  </span><span data-type="item">7</span></span></p> <p id="fs-idm81347168">Is the data perfectly symmetrical? Why or why not?</p> </div> </div> <div id="fs-idm106262320" data-type="exercise"><div id="fs-idp4116944" data-type="problem"><p id="fs-idp12626464">Which is the greatest, the mean, the mode, or the median of the data set?</p> <p id="fs-idm22688432"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">11  </span><span data-type="item">11  </span><span data-type="item">12  </span><span data-type="item">12  </span><span data-type="item">12  </span><span data-type="item">12  </span><span data-type="item">13  </span><span data-type="item">15  </span><span data-type="item">17  </span><span data-type="item">22  </span><span data-type="item">22  </span><span data-type="item">22</span></span></p> </div> <div id="fs-idm67998720" data-type="solution"><p id="fs-idm56243664">The mode is 12, the median is 12.5, and the mean is 15.1. 
The mean is the largest.</p> </div> </div> <div id="fs-idm77708816" data-type="exercise"><div id="fs-idm18613792" data-type="problem"><p id="fs-idm81638640">Which is the least, the mean, the mode, or the median of the data set?</p> <p id="fs-idm78212064"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">56  </span><span data-type="item">56  </span><span data-type="item">56  </span><span data-type="item">58  </span><span data-type="item">59  </span><span data-type="item">60  </span><span data-type="item">62  </span><span data-type="item">64  </span><span data-type="item">64  </span><span data-type="item">65  </span><span data-type="item">67</span></span></p> </div> </div> <div id="fs-idm1767264" data-type="exercise"><div id="fs-idm7229600" data-type="problem"><p id="fs-idm79130096">Of the three measures, which tends to reflect skewing the most, the mean, the mode, or the median? Why?</p> </div> <div id="fs-idm75354176" data-type="solution"><p id="fs-idm76907792">The mean tends to reflect skewing the most because it is affected the most by outliers.</p> </div> </div> <div id="fs-idp5807168" data-type="exercise"><div id="fs-idm13048928" data-type="problem"><p id="fs-idm70492944">In a perfectly symmetrical distribution, when would the mode be different from the mean and median?</p> </div> </div> </div> <div id="fs-idm100553376" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div data-type="exercise"><div id="id7358516" data-type="problem"><p id="element-702">1)   The median age of the U.S. population in 1980 was 30.0 years. In 1991, the median age was 33.1 years.</p> <ol id="id12488392" type="a"><li>What does it mean for the median age to rise?</li> <li>Give two reasons why the median age could rise.</li> <li>For the median age to rise, is the actual number of children less in 1991 than it was in 1980? Why or why not?</li> </ol> </div> </div> </div> </div></div>
<div class="chapter standard" id="chapter-measures-of-the-spread-of-the-data" title="Chapter 2.6: Measures of the Spread of the Data"><div class="chapter-title-wrap"><h3 class="chapter-number">12</h3><h2 class="chapter-title"><span class="display-none">Chapter 2.6: Measures of the Spread of the Data</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>An important characteristic of any set of data is the variation in the data. In some data sets, the data values are concentrated closely near the mean; in other data sets, the data values are more widely spread out from the mean. The most common measure of variation, or spread, is the standard deviation. The <span data-type="term">standard deviation</span> is a number that measures how far data values are from their mean.</p> <div id="fs-idm94849104" class="bc-section section" data-depth="1"><h3 data-type="title">The standard deviation</h3> <ul id="eip-id1169998044652"><li>provides a numerical measure of the overall amount of variation in a data set, and</li> <li>can be used to determine whether a particular data value is close to or far from the mean.</li> </ul> <div id="fs-idm34509216" class="bc-section section" data-depth="2"><h4 data-type="title">The standard deviation provides a measure of the overall variation in a data set</h4> <p id="fs-idm111390400">The standard deviation is always positive or zero. The standard deviation is small when the data are all concentrated close to the mean, exhibiting little variation or spread. The standard deviation is larger when the data values are more spread out from the mean, exhibiting more variation.</p> <p id="eip-419">Suppose that we are studying the amount of time customers wait in line at the checkout at supermarket <em data-effect="italics">A</em> and supermarket <em data-effect="italics">B</em>. The average wait time at both supermarkets is five minutes. 
At supermarket <em data-effect="italics">A</em>, the standard deviation for the wait time is two minutes; at supermarket <em data-effect="italics">B</em> the standard deviation for the wait time is four minutes.</p> <p id="fs-idm8407856">Because supermarket <em data-effect="italics">B</em> has a higher standard deviation, we know that there is more variation in the wait times at supermarket <em data-effect="italics">B</em>. Overall, wait times at supermarket <em data-effect="italics">B</em> are more spread out from the average; wait times at supermarket <em data-effect="italics">A</em> are more concentrated near the average.</p> </div> <div id="fs-idm58314928" class="bc-section section" data-depth="2"><h4 data-type="title">The standard deviation can be used to determine whether a data value is close to or far from the mean.</h4> <p id="fs-idm19508768">Suppose that Rosa and Binh both shop at supermarket <em data-effect="italics">A</em>. Rosa waits at the checkout counter for seven minutes and Binh waits for one minute. At supermarket <em data-effect="italics">A</em>, the mean waiting time is five minutes and the standard deviation is two minutes. 
The standard deviation can be used to determine whether a data value is close to or far from the mean.</p> <p><strong>Rosa waits for seven minutes:</strong></p> <ul id="eip-id1172354240261"><li>Seven is two minutes longer than the average of five; two minutes is equal to one standard deviation.</li> <li>Rosa&#8217;s wait time of seven minutes is <strong>two minutes longer than the average</strong> of five minutes.</li> <li>Rosa&#8217;s wait time of seven minutes is <strong>one standard deviation above the average</strong> of five minutes.</li> </ul> <p id="eip-973"><strong>Binh waits for one minute.</strong></p> <ul id="eip-id1164884926039"><li>One is four minutes less than the average of five; four minutes is equal to two standard deviations.</li> <li>Binh&#8217;s wait time of one minute is <strong>four minutes less than the average</strong> of five minutes.</li> <li>Binh&#8217;s wait time of one minute is <strong>two standard deviations below the average</strong> of five minutes.</li> <li>A data value that is two standard deviations from the average is just on the borderline for what many statisticians would consider to be far from the average. Considering data to be far from the mean if it is more than two standard deviations away is more of an approximate &#8220;rule of thumb&#8221; than a rigid rule. In general, the shape of the distribution of the data affects how much of the data is further away than two standard deviations. (You will learn more about this in later chapters.)</li> </ul> <p>The number line may help you understand standard deviation. If we were to put five and seven on a number line, seven is to the right of five. 
We say, then, that seven is <strong>one</strong> standard deviation to the <strong>right</strong> of five because 5 + (1)(2) = 7.</p> <p id="fs-idm83232288">If one were also part of the data set, then one is <strong>two</strong> standard deviations to the <strong>left</strong> of five because 5 + (–2)(2) = 1.</p> <div id="fs-idm12176304" class="bc-figure figure"><span id="id7156187" data-type="media" data-alt="This shows a number line in intervals of 1 from 0 to 7."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch02_09_01-1.jpg" alt="This shows a number line in intervals of 1 from 0 to 7." data-media-type="image/jpg" data-print-width="2.5in" /></span></div> <ul id="eip-id1164316275630"><li>In general, a <strong>value = mean + (#ofSTDEVs)(standard deviation)</strong></li> <li>where #ofSTDEVs = the number of standard deviations</li> <li>#ofSTDEVs does not need to be an integer</li> <li>One is <strong>two standard deviations less than the mean</strong> of five because: 1 = 5 + (–2)(2).</li> </ul> <p id="eip-922">The equation <strong>value = mean + (#ofSTDEVs)(standard deviation)</strong> can be expressed for a sample and for a population.</p> <ul id="eip-id1164892626854"><li><strong>Sample:  </strong>\(x=\overline{x}+\left(\#\text{ofSTDEVs}\right)\left(s\right)\)</li> <li><strong>Population:  </strong>\(x=\mu +\left(\#\text{ofSTDEVs}\right)\left(\sigma \right)\)</li> </ul> <p>The lower case letter <em data-effect="italics">s</em> represents the sample standard deviation and the Greek letter <em data-effect="italics">σ</em> (sigma, lower case) represents the population standard deviation. 
The symbol \(\overline{x}\) is the sample mean and the Greek symbol \(\mu \) is the population mean.</p> </div> <div id="fs-idm36185184" class="bc-section section" data-depth="2"><h4 data-type="title">Calculating the Standard Deviation</h4> <p id="fs-idm109226656">If <em data-effect="italics">x</em> is a number, then the difference &#8220;<em data-effect="italics">x</em> – mean&#8221; is called its <strong>deviation</strong>. In a data set, there are as many deviations as there are items in the data set. The deviations are used to calculate the standard deviation. If the numbers belong to a population, in symbols a deviation is <em data-effect="italics">x</em> – <em data-effect="italics">μ</em>. For sample data, in symbols a deviation is <em data-effect="italics">x</em> – \(\overline{x}\).</p> <p>The procedure to calculate the standard deviation depends on whether the numbers are the entire population or are data from a sample. The calculations are similar, but not identical. Therefore, the symbol used to represent the standard deviation depends on whether it is calculated from a population or a sample. The lower case letter <em data-effect="italics">s</em> represents the sample standard deviation and the Greek letter <em data-effect="italics">σ</em> (sigma, lower case) represents the population standard deviation. If the sample has the same characteristics as the population, then <em data-effect="italics">s</em> should be a good estimate of <em data-effect="italics">σ</em>.</p> <p id="eip-689">To calculate the standard deviation, we need to calculate the variance first. The <span data-type="term">variance</span> is the <strong>average of the squares of the deviations</strong> (the <em data-effect="italics">x</em> – \(\overline{x}\) values for a sample, or the <em data-effect="italics">x</em> – <em data-effect="italics">μ</em> values for a population). 
The symbol <em data-effect="italics">σ</em><sup>2</sup> represents the population variance; the population standard deviation <em data-effect="italics">σ</em> is the square root of the population variance. The symbol <em data-effect="italics">s</em><sup>2</sup> represents the sample variance; the sample standard deviation <em data-effect="italics">s</em> is the square root of the sample variance. You can think of the standard deviation as a special average of the deviations.</p> <p id="eip-790">If the numbers come from a census of the entire <strong>population</strong> and not a sample, when we calculate the average of the squared deviations to find the variance, we divide by <em data-effect="italics">N</em>, the number of items in the population. If the data are from a <strong>sample</strong> rather than a population, when we calculate the average of the squared deviations, we divide by <strong><em data-effect="italics">n</em> – 1</strong>, one less than the number of items in the sample.</p> </div> <div id="fs-idm36483152" class="bc-section section" data-depth="2"><h4 data-type="title">Formulas for the Sample Standard Deviation</h4> <ul id="eip-id1172772662272"><li>\(s=\sqrt{\frac{\Sigma {\left(x-\overline{x}\right)}^{2}}{n-1}}\) or \(s=\sqrt{\frac{\Sigma f{\left(x-\overline{x}\right)}^{2}}{n-1}}\)</li> <li>For the sample standard deviation, the denominator is <strong><em data-effect="italics">n</em> &#8211; 1</strong>, that is, the sample size minus one.</li> </ul> </div> <div id="fs-idm102220704" class="bc-section section" data-depth="2"><h4 data-type="title">Formulas for the Population Standard Deviation</h4> <ul id="eip-id1171740197716"><li>\(\sigma =\sqrt{\frac{\Sigma {\left(x-\mu \right)}^{2}}{N}}\) or \(\sigma =\sqrt{\frac{\Sigma f{\left(x-\mu \right)}^{2}}{N}}\)</li> <li>For the population standard deviation, the denominator is <em data-effect="italics">N</em>, the number of items in the population.</li> </ul> <p id="eip-465">In these formulas, <em 
data-effect="italics">f</em> represents the frequency with which a value appears. For example, if a value appears once, <em data-effect="italics">f</em> is one. If a value appears three times in the data set or population, <em data-effect="italics">f</em> is three.</p> </div> </div> <div id="fs-idm60688208" class="bc-section section" data-depth="1"><h3 data-type="title">Sampling Variability of a Statistic</h3> <p id="fs-idm41862832">The statistic of a sampling distribution was discussed in <a href="/contents/d9656d2a-5f02-4142-8bc1-8b5cb8feedf9">Descriptive Statistics: Measuring the Center of the Data</a>. How much the statistic varies from one sample to another is known as the <span data-type="term">sampling variability of a statistic</span>. You typically measure the sampling variability of a statistic by its standard error. The <strong>standard error of the mean</strong> is an example of a standard error. It is a special standard deviation and is known as the standard deviation of the sampling distribution of the mean. You will cover the standard error of the mean later, in the chapter <a href="/contents/3156cbd2-f14e-4bac-bb76-07d69213dfb8">The Central Limit Theorem</a>. The notation for the standard error of the mean is \(\frac{\sigma }{\sqrt{n}}\) where <em data-effect="italics">σ</em> is the standard deviation of the population and <em data-effect="italics">n</em> is the size of the sample.</p> <div id="id7166567" class="finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idp582096"><strong>In practice, USE A CALCULATOR OR COMPUTER SOFTWARE TO CALCULATE THE STANDARD DEVIATION. If you are using a TI-83, 83+, or 84+ calculator, you need to select the appropriate standard deviation <em data-effect="italics">σ<sub>x</sub></em> or <em data-effect="italics">s<sub>x</sub></em> from the summary statistics.</strong> We will concentrate on using and interpreting the information that the standard deviation gives us. 
However, you should study the following step-by-step example to help you understand how the standard deviation measures variation from the mean. (The calculator instructions appear at the end of this example.)</p> </div> <div id="element-655" class="textbox textbox--examples" data-type="example"><p>In a fifth grade class, the teacher was interested in the average age and the sample standard deviation of the ages of her students. The following data are the ages for a SAMPLE of <em data-effect="italics">n</em> = 20 fifth grade students. The ages are rounded to the nearest half year:</p> <p id="element-433">9;  9.5;  9.5;  10;  10;  10;  10;  10.5;  10.5;  10.5;  10.5;  11;  11;  11;  11;  11;  11;  11.5;  11.5;  11.5</p> <div id="element-320" data-type="equation">\(\overline{x}=\frac{9+9.5(2)+10(4)+10.5(4)+11(6)+11.5(3)}{20}=10.525\)</div> <p id="element-642">The average age is 10.53 years, rounded to two decimal places.</p> <p>The variance may be calculated by using a table. Then the standard deviation is calculated by taking the square root of the variance. We will explain the parts of the table after calculating <em data-effect="italics">s</em>.</p> <table summary="This table presents the formulas and calculations of various values. The first column has the data, second column has frequency, third column has deviations, fourth column has deviations squared, fifth column has frequency times deviations squared. 
There are 6 rows of values."><thead><tr><th>Data</th> <th>Freq.</th> <th>Deviations</th> <th><em data-effect="italics">Deviations</em><sup>2</sup></th> <th>(Freq.)(<em data-effect="italics">Deviations</em><sup>2</sup>)</th> </tr> </thead> <tbody><tr><td><em data-effect="italics">x</em></td> <td><em data-effect="italics">f</em></td> <td>(<em data-effect="italics">x</em> – \(\overline{x}\))</td> <td>(<em data-effect="italics">x</em> – \(\overline{x}\))<sup>2</sup></td> <td>(<em data-effect="italics">f</em>)(<em data-effect="italics">x</em> – \(\overline{x}\))<sup>2</sup></td> </tr> <tr><td>9</td> <td>1</td> <td>9 – 10.525 = –1.525</td> <td>(–1.525)<sup>2</sup> = 2.325625</td> <td>1 × 2.325625 = 2.325625</td> </tr> <tr><td>9.5</td> <td>2</td> <td>9.5 – 10.525 = –1.025</td> <td>(–1.025)<sup>2</sup> = 1.050625</td> <td>2 × 1.050625 = 2.101250</td> </tr> <tr><td>10</td> <td>4</td> <td>10 – 10.525 = –0.525</td> <td>(–0.525)<sup>2</sup> = 0.275625</td> <td>4 × 0.275625 = 1.1025</td> </tr> <tr><td>10.5</td> <td>4</td> <td>10.5 – 10.525 = –0.025</td> <td>(–0.025)<sup>2</sup> = 0.000625</td> <td>4 × 0.000625 = 0.0025</td> </tr> <tr><td>11</td> <td>6</td> <td>11 – 10.525 = 0.475</td> <td>(0.475)<sup>2</sup> = 0.225625</td> <td>6 × 0.225625 = 1.35375</td> </tr> <tr><td>11.5</td> <td>3</td> <td>11.5 – 10.525 = 0.975</td> <td>(0.975)<sup>2</sup> = 0.950625</td> <td>3 × 0.950625 = 2.851875</td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td>The total is 9.7375</td> </tr> </tbody> </table> <p id="element-367">The sample variance, <em data-effect="italics">s</em><sup>2</sup>, is equal to the sum of the last column (9.7375) divided by the total number of data values minus one (20 – 1):</p> <p id="element-631">\({s}^{2}=\frac{9.7375}{20-1}=0.5125\)</p> <p>The <strong>sample standard deviation</strong> <em data-effect="italics">s</em> is equal to the square root of the sample variance:</p> <p>\(s=\sqrt{0.5125}=0.715891,\) which is rounded to two decimal places, <em 
data-effect="italics">s</em> = 0.72.</p> <p id="eip-563"><strong>Typically, you do the calculation for the standard deviation on your calculator or computer</strong>. The intermediate results are not rounded. This is done for accuracy.</p> <div id="element-397" data-type="exercise"><div id="id3262370" data-type="problem"><ul id="eip-id1171734426444"><li>For the following problems, recall that <strong>value = mean + (#ofSTDEVs)(standard deviation)</strong>. Verify the mean and standard deviation on a calculator or computer.</li> <li>For a sample: <em data-effect="italics">x</em> = \(\overline{x}\) + (#ofSTDEVs)(<em data-effect="italics">s</em>)</li> <li>For a population: <em data-effect="italics">x</em> = <em data-effect="italics">μ</em> + (#ofSTDEVs)(<em data-effect="italics">σ</em>)</li> <li>For this example, use <em data-effect="italics">x</em> = \(\overline{x}\) + (#ofSTDEVs)(<em data-effect="italics">s</em>) because the data is from a sample.</li> </ul> <ol id="eip-idm60068992" type="a"><li>Verify the mean and standard deviation on your calculator or computer.</li> <li>Find the value that is one standard deviation above the mean. Find (\(\overline{x}\) + 1<em data-effect="italics">s</em>).</li> <li>Find the value that is two standard deviations below the mean. Find (\(\overline{x}\) – 2<em data-effect="italics">s</em>).</li> <li>Find the values that are 1.5 standard deviations <strong>from</strong> (below and above) the mean.</li> </ol> </div> <div id="id1167261579717" data-type="solution"><ol id="eip-idm6421744" type="a"><li><div id="fs-idm22381712" class="statistics calculator finger" data-type="note" data-has-label="true" data-label=""><ul id="fs-idm96652896"><li>Clear lists L1 and L2. Press STAT 4:ClrList. Enter 2nd 1 for L1, the comma (,), and 2nd 2 for L2.</li> <li>Enter data into the list editor. Press STAT 1:EDIT. If necessary, clear the lists by arrowing up into the name. 
Press CLEAR and arrow down.</li> <li>Put the data values (9, 9.5, 10, 10.5, 11, 11.5) into list L1 and the frequencies (1, 2, 4, 4, 6, 3) into list L2. Use the arrow keys to move around.</li> <li>Press STAT and arrow to CALC. Press 1:1-VarStats and enter L1 (2nd 1), L2 (2nd 2). Do not forget the comma. Press ENTER.</li> <li>\(\overline{x}\) = 10.525</li> <li>Use Sx because this is sample data (not a population): Sx=0.715891</li> </ul> </div> </li> <li>(\(\overline{x}\) + 1<em data-effect="italics">s</em>) = 10.53 + (1)(0.72) = 11.25</li> <li>(\(\overline{x}\) – 2<em data-effect="italics">s</em>) = 10.53 – (2)(0.72) = 9.09</li> <li><ul><li>(\(\overline{x}\) – 1.5<em data-effect="italics">s</em>) = 10.53 – (1.5)(0.72) = 9.45</li> <li>(\(\overline{x}\) + 1.5<em data-effect="italics">s</em>) = 10.53 + (1.5)(0.72) = 11.61</li> </ul> </li> </ol> </div> </div> </div> <div id="fs-idp2767424" class="statistics try finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm106330688" data-type="exercise"><div id="fs-idm106330432" data-type="problem"><p id="fs-idm116204608">On a baseball team, the ages of the players are as follows:</p> <p id="fs-idm29758800">21;  21;  22;  23;  24;  24;  25;  25;  28;  29;  29;  31;  32;  33;  33;  34;  35;  36;  36;  36;  36;  38;  38;  38;  40</p> <p id="fs-idm46692272"><span data-type="newline"><br /> </span>Use your calculator or computer to find the mean and standard deviation. Then find the value that is two standard deviations above the mean.</p> </div> </div> </div> <div id="fs-idm66792288" class="bc-section section" data-depth="2"><h4 data-type="title">Explanation of the standard deviation calculation shown in the table</h4> <p id="element-877">The deviations show how spread out the data are about the mean. The data value 11.5 is farther from the mean than the data value 11, as indicated by the deviations 0.975 and 0.475. 
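</p> <p>As a compact restatement of the table, offered as a sketch rather than a new method, the frequency form of the sample formula \(s=\sqrt{\frac{\Sigma f{\left(x-\overline{x}\right)}^{2}}{n-1}}\) can be written out in one line using the deviations and frequencies already computed (\(\overline{x}\) = 10.525, <em data-effect="italics">n</em> = 20):</p> <div data-type="equation">\({s}^{2}=\frac{1{\left(-1.525\right)}^{2}+2{\left(-1.025\right)}^{2}+4{\left(-0.525\right)}^{2}+4{\left(-0.025\right)}^{2}+6{\left(0.475\right)}^{2}+3{\left(0.975\right)}^{2}}{20-1}=\frac{9.7375}{19}=0.5125\)</div> <p>Taking the square root gives \(s=\sqrt{0.5125}\approx 0.7159\), which agrees with the value Sx = 0.715891 reported by the calculator.</p> <p>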
A positive deviation occurs when the data value is greater than the mean, whereas a negative deviation occurs when the data value is less than the mean. The deviation is –1.525 for the data value nine. <strong>If you add the deviations, the sum is always zero</strong>. (For <a class="autogenerated-content" href="#element-655">(Figure)</a>, there are <em data-effect="italics">n</em> = 20 deviations.) So you cannot simply add the deviations to get the spread of the data. By squaring the deviations, you make them positive numbers, and the sum will also be positive. The variance, then, is the average squared deviation.</p> <p id="element-969">The variance is a squared measure and does not have the same units as the data. Taking the square root solves the problem. The standard deviation measures the spread in the same units as the data.</p> <p id="eip-880">Notice that instead of dividing by <em data-effect="italics">n</em> = 20, the calculation divided by <em data-effect="italics">n</em> – 1 = 20 – 1 = 19 because the data is a sample. For the <strong>sample</strong> variance, we divide by the sample size minus one (<em data-effect="italics">n</em> – 1). Why not divide by <em data-effect="italics">n</em>? The answer has to do with the population variance. <strong>The sample variance is an estimate of the population variance.</strong> Based on the theoretical mathematics that lies behind these calculations, dividing by (<em data-effect="italics">n</em> – 1) gives a better estimate of the population variance.</p> <div class="finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idp106491856">Your concentration should be on what the standard deviation tells us about the data. The standard deviation is a number which measures how far the data are spread from the mean. 
Let a calculator or computer do the arithmetic.</p> </div> <p id="eip-323">The standard deviation, <em data-effect="italics">s</em> or <em data-effect="italics">σ</em>, is either zero or larger than zero. Describing the data with reference to the spread is called &#8220;variability&#8221;. The variability in data depends upon the method by which the outcomes are obtained; for example, by measuring or by random sampling. When the standard deviation is zero, there is no spread; that is, all the data values are equal to each other. The standard deviation is small when the data are all concentrated close to the mean, and is larger when the data values show more variation from the mean. When the standard deviation is a lot larger than zero, the data values are very spread out about the mean; outliers can make <em data-effect="italics">s</em> or <em data-effect="italics">σ</em> very large.</p> <p id="element-789">The standard deviation, when first presented, can seem unclear. By graphing your data, you can get a better &#8220;feel&#8221; for the deviations and the standard deviation. You will find that in symmetrical distributions the standard deviation can be very helpful, but in skewed distributions it may not be much help. The reason is that the two sides of a skewed distribution have different spreads. In a skewed distribution, it is better to look at the first quartile, the median, the third quartile, the smallest value, and the largest value. Because numbers can be confusing, <strong>always graph your data</strong>. 
Display your data in a histogram or a box plot.</p> <div id="element-649" class="textbox textbox--examples" data-type="example"><div id="exer2" data-type="exercise"><div id="id6379302" data-type="problem"><p id="element-592">Use the following data (first exam scores) from Susan Dean&#8217;s spring pre-calculus class:</p> <p id="element-961">33;  42;  49;  49;  53;  55;  55;  61;  63;  67;  68;  68;  69;  69;  72;  73;  74;  78;  80;  83;  88;  88;  88;  90;  92;  94;  94;  94;  94;  96;  100</p> <ol id="element-744" type="a"><li>Create a chart containing the data, frequencies, relative frequencies, and cumulative relative frequencies to three decimal places.</li> <li class="finger">Calculate the following to one decimal place using a TI-83+ or TI-84 calculator: <ol id="element-9170" type="i"><li>The sample mean</li> <li>The sample standard deviation</li> <li>The median</li> <li>The first quartile</li> <li>The third quartile</li> <li><em data-effect="italics">IQR</em></li> </ol> </li> <li>Construct a box plot and a histogram on the same set of axes. Make comments about the box plot, the histogram, and the chart.</li> </ol> </div> <div id="id6379699" data-type="solution"><ol type="a"><li>See <a class="autogenerated-content" href="#id6947804">(Figure)</a></li> <li><ol id="element-904534" type="i"><li>The sample mean = 73.5</li> <li>The sample standard deviation = 17.9</li> <li>The median = 73</li> <li>The first quartile = 61</li> <li>The third quartile = 90</li> <li><em data-effect="italics">IQR</em> = 90 – 61 = 29</li> </ol> </li> <li>The <em data-effect="italics">x</em>-axis goes from 32.5 to 100.5; the <em data-effect="italics">y</em>-axis goes from –2.4 to 15 for the histogram. The number of intervals is five, so the width of each interval is (100.5 – 32.5) divided by five, which equals 13.6. 
Endpoints of the intervals are as follows: the starting point is 32.5, 32.5 + 13.6 = 46.1, 46.1 + 13.6 = 59.7, 59.7 + 13.6 = 73.3, 73.3 + 13.6 = 86.9, 86.9 + 13.6 = 100.5 = the ending value; No data values fall on an interval boundary.</li> </ol> <div id="id6380826" class="bc-figure figure"><span id="id6380831" data-type="media" data-alt="A hybrid image displaying both a histogram and box plot described in detail in the answer solution above."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_09_02-1.jpg" alt="A hybrid image displaying both a histogram and box plot described in detail in the answer solution above." width="350" data-media-type="image/jpg" /></span></div> <table id="id6947804" summary="This table presents the values listed above arranged with the data in the first column, frequency in the second column, relative frequency in the third column, and cumulative relative frequency in the fourth column."><thead><tr><th>Data</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td>33</td> <td>1</td> <td>0.032</td> <td>0.032</td> </tr> <tr><td>42</td> <td>1</td> <td>0.032</td> <td>0.064</td> </tr> <tr><td>49</td> <td>2</td> <td>0.065</td> <td>0.129</td> </tr> <tr><td>53</td> <td>1</td> <td>0.032</td> <td>0.161</td> </tr> <tr><td>55</td> <td>2</td> <td>0.065</td> <td>0.226</td> </tr> <tr><td>61</td> <td>1</td> <td>0.032</td> <td>0.258</td> </tr> <tr><td>63</td> <td>1</td> <td>0.032</td> <td>0.29</td> </tr> <tr><td>67</td> <td>1</td> <td>0.032</td> <td>0.322</td> </tr> <tr><td>68</td> <td>2</td> <td>0.065</td> <td>0.387</td> </tr> <tr><td>69</td> <td>2</td> <td>0.065</td> <td>0.452</td> </tr> <tr><td>72</td> <td>1</td> <td>0.032</td> <td>0.484</td> </tr> <tr><td>73</td> <td>1</td> <td>0.032</td> <td>0.516</td> </tr> <tr><td>74</td> <td>1</td> <td>0.032</td> <td>0.548</td> </tr> <tr><td>78</td> <td>1</td> <td>0.032</td> <td>0.580</td> </tr> 
<tr><td>80</td> <td>1</td> <td>0.032</td> <td>0.612</td> </tr> <tr><td>83</td> <td>1</td> <td>0.032</td> <td>0.644</td> </tr> <tr><td>88</td> <td>3</td> <td>0.097</td> <td>0.741</td> </tr> <tr><td>90</td> <td>1</td> <td>0.032</td> <td>0.773</td> </tr> <tr><td>92</td> <td>1</td> <td>0.032</td> <td>0.805</td> </tr> <tr><td>94</td> <td>4</td> <td>0.129</td> <td>0.934</td> </tr> <tr><td>96</td> <td>1</td> <td>0.032</td> <td>0.966</td> </tr> <tr><td>100</td> <td>1</td> <td>0.032</td> <td>0.998 (Why isn&#8217;t this value 1?)</td> </tr> </tbody> </table> </div> </div> <p id="element-242">The long left whisker in the box plot is reflected in the left side of the histogram. The spread of the exam scores in the lower 50% is greater (73 – 33 = 40) than the spread in the upper 50% (100 – 73 = 27). The histogram, box plot, and chart all reflect this. There are a substantial number of A and B grades (80s, 90s, and 100). The histogram clearly shows this. The box plot shows us that the middle 50% of the exam scores (<em data-effect="italics">IQR</em> = 29) are Ds, Cs, and Bs. The box plot also shows us that the lower 25% of the exam scores are Ds and Fs.</p> </div> <div id="fs-idp3274832" class="statistics try finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm22220656" data-type="exercise"><div id="fs-idm22220400" data-type="problem"><p id="fs-idm125056">The following data show the number of different types of pet food that stores in the area carry. 
<span data-type="newline"><br /> </span>6;  6;  6;  6;  7;  7;  7;  7;  7;  8;  9;  9;  9;  9;  10;  10;  10;  10;  10;  11;  11;  11;  11;  12;  12;  12;  12;  12;  12; <span data-type="newline"><br /> </span>Calculate the sample mean and the sample standard deviation to one decimal place using a TI-83+ or TI-84 calculator.</p> </div> </div> </div> </div> </div> <div id="fs-idm111076336" class="bc-section section" data-depth="1"><h3 data-type="title">Standard deviation of Grouped Frequency Tables</h3> <p id="fs-idm31040688">Recall that for grouped data we do not know individual data values, so we cannot describe the typical value of the data with precision. In other words, we cannot find the exact mean, median, or mode. We can, however, determine the best estimate of the measures of center by finding the mean of the grouped data with the formula: \(Mean\text{ }of\text{ }Frequency\text{ }Table=\frac{\sum fm}{\sum f}\) <span data-type="newline"><br /> </span>where \(f=\) interval frequencies and <em data-effect="italics">m</em> = interval midpoints.</p> <p id="fs-idm98694272">Just as we could not find the exact mean, neither can we find the exact standard deviation. Remember that standard deviation describes numerically the expected deviation a data value has from the mean. 
In simple English, the standard deviation lets us gauge how “unusual” an individual data value is relative to the mean.</p> <div id="fs-idm67331024" class="textbox textbox--examples" data-type="example"><p id="eip-897">Find the standard deviation for the data in <a class="autogenerated-content" href="#fs-idm67330768">(Figure)</a>.</p> <table id="fs-idm67330768" summary=""><thead><tr><th>Class</th> <th>Frequency, <em data-effect="italics">f</em></th> <th>Midpoint, <em data-effect="italics">m</em></th> <th><em data-effect="italics">m</em><sup>2</sup></th> <th>\(\overline{x}\)</th> <th><em data-effect="italics">fm</em><sup>2</sup></th> <th>Standard Deviation</th> </tr> </thead> <tbody><tr><td>0–2</td> <td>1</td> <td>1</td> <td>1</td> <td>7.58</td> <td>1</td> <td>3.5</td> </tr> <tr><td>3–5</td> <td>6</td> <td>4</td> <td>16</td> <td>7.58</td> <td>96</td> <td>3.5</td> </tr> <tr><td>6–8</td> <td>10</td> <td>7</td> <td>49</td> <td>7.58</td> <td>490</td> <td>3.5</td> </tr> <tr><td>9–11</td> <td>7</td> <td>10</td> <td>100</td> <td>7.58</td> <td>700</td> <td>3.5</td> </tr> <tr><td>12–14</td> <td>0</td> <td>13</td> <td>169</td> <td>7.58</td> <td>0</td> <td>3.5</td> </tr> <tr><td>15–17</td> <td>2</td> <td>16</td> <td>256</td> <td>7.58</td> <td>512</td> <td>3.5</td> </tr> </tbody> </table> <p id="fs-idm109620704">For this data set, we have the mean, \(\overline{x}\) = 7.58 and the standard deviation, <em data-effect="italics">s<sub>x</sub></em> = 3.5. This means that a randomly selected data value would be expected to be 3.5 units from the mean. If we look at the first class, we see that the class midpoint is equal to one. This is almost two full standard deviations from the mean since 7.58 – 3.5 – 3.5 = 0.58. 
The formula for calculating the standard deviation is not complicated: \({s}_{x}=\sqrt{\frac{\sum f{\left(m-\overline{x}\right)}^{2}}{n-1}}\), where <em data-effect="italics">s<sub>x</sub></em> = sample standard deviation, \(\overline{x}\) = sample mean, <em data-effect="italics">f</em> = class frequency, <em data-effect="italics">m</em> = class midpoint, and <em data-effect="italics">n</em> = \(\sum f\). The calculations, however, are tedious. It is usually best to use technology when performing the calculations.</p> </div> <div id="fs-idm15915776" class="statistics try finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <p id="fs-idm104856336">Find the standard deviation for the data from the previous example.</p> <table id="fs-idm8420544" summary=""><thead><tr><th>Class</th> <th>Frequency, <em data-effect="italics">f</em></th> </tr> </thead> <tbody><tr><td>0–2</td> <td>1</td> </tr> <tr><td>3–5</td> <td>6</td> </tr> <tr><td>6–8</td> <td>10</td> </tr> <tr><td>9–11</td> <td>7</td> </tr> <tr><td>12–14</td> <td>0</td> </tr> <tr><td>15–17</td> <td>2</td> </tr> </tbody> </table> <p id="fs-idm94998400">First, press the <strong data-effect="bold">STAT</strong> key and select <strong data-effect="bold">1:Edit</strong></p> <div id="fs-idm103802688" class="bc-figure figure"><span id="fs-idm105687776" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M09_016-1.jpg" alt="" width="250" data-media-type="image/jpg" /></span></div> <p id="fs-idm106690304">Input the midpoint values into <strong data-effect="bold">L1</strong> and the frequencies into <strong data-effect="bold">L2</strong></p> <div id="fs-idm104201392" class="bc-figure figure"><span id="fs-idm104201264" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M09_017-1.jpg" alt="" width="250" data-media-type="image/jpg" /></span></div> <p id="fs-idm48907888">Select <strong data-effect="bold">STAT</strong>, <strong data-effect="bold">CALC</strong>, and <strong 
data-effect="bold">1: 1-Var Stats</strong></p> <div id="fs-idm63814800" class="bc-figure figure"><span id="fs-idm63814672" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M09_018-1.jpg" alt="" width="250" data-media-type="image/jpg" /></span></div> <p id="fs-idm123128416">Select <strong data-effect="bold">2<sup>nd</sup></strong> then <strong data-effect="bold">1</strong> then , <strong data-effect="bold">2<sup>nd</sup></strong> then <strong data-effect="bold">2 Enter</strong></p> <div id="fs-idm105681296" class="bc-figure figure"><span id="fs-idm105681168" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M09_019-1.jpg" alt="" width="250" data-media-type="image/jpg" /></span></div> <p id="fs-idm108979264">You will see displayed both a population standard deviation, <em data-effect="italics">σ<sub>x</sub></em>, and the sample standard deviation, <em data-effect="italics">s<sub>x</sub></em>.</p> </div> </div> <div id="fs-idm77326096" class="bc-section section" data-depth="1"><h3 data-type="title">Comparing Values from Different Data Sets</h3> <p id="eip-176">The standard deviation is useful when comparing data values that come from different data sets. If the data sets have different means and standard deviations, then comparing the data values directly can be misleading.</p> <ul id="eip-id1170599203736"><li>For each data value, calculate how many standard deviations away from its mean the value is.</li> <li>Use the formula: value = mean + (#ofSTDEVs)(standard deviation); solve for #ofSTDEVs.</li> <li>\(#ofSTDEVs=\frac{\text{value – mean}}{\text{standard deviation}}\)</li> <li>Compare the results of this calculation.</li> </ul> <p id="eip-946">#ofSTDEVs is often called a &#8220;<em data-effect="italics">z</em>-score&#8221;; we can use the symbol <em data-effect="italics">z</em>. 
In symbols, the formulas become:</p> <table id="eip-329" summary="The table shows the z-score formula."><tbody><tr><td>Sample</td> <td>\(x\) = \(\overline{x}\) + <em data-effect="italics">zs</em></td> <td>\(z=\frac{x\text{ }-\text{ }\overline{x}}{s}\)</td> </tr> <tr><td>Population</td> <td>\(x\) = \(\mu \) + <em data-effect="italics">zσ</em></td> <td>\(z=\frac{x\text{ }-\text{ }\mu }{\sigma }\)</td> </tr> </tbody> </table> <div id="element-387" class="textbox textbox--examples" data-type="example"><div id="exer1" data-type="exercise"><div id="id6380897" data-type="problem"><p id="element-928">Two students, John and Ali, from different high schools, wanted to find out who had the highest GPA when compared to his school. Which student had the highest GPA when compared to his school?</p> <table summary="This table provides two students and their GPAs. The first row represents John and the second row represents Ali. The first column lists students, second column lists GPA, third column lists school mean GPA, and the fourth column list the school standard deviation."><thead><tr><th>Student</th> <th>GPA</th> <th>School Mean GPA</th> <th>School Standard Deviation</th> </tr> </thead> <tbody><tr><td>John</td> <td>2.85</td> <td>3.0</td> <td>0.7</td> </tr> <tr><td>Ali</td> <td>77</td> <td>80</td> <td>10</td> </tr> </tbody> </table> </div> <div id="id7180364" data-type="solution"><p id="element-71">For each student, determine how many standard deviations (#ofSTDEVs) his GPA is away from the average, for his school. 
Pay careful attention to signs when comparing and interpreting the answer.</p> <p>\(z=#ofSTDEVs=\frac{\text{value – mean}}{\text{standard deviation}}=\frac{x-\mu }{\sigma }\)</p> <p id="element-378">For John, \(z=#ofSTDEVs=\frac{2.85-3.0}{0.7}=-0.21\)</p> <p id="element-712">For Ali, \(z=#ofSTDEVs=\frac{77-80}{10}=-0.3\)</p> <p id="element-344">John has the better GPA when compared to his school because his GPA is 0.21 standard deviations <strong>below</strong> his school&#8217;s mean while Ali&#8217;s GPA is 0.3 standard deviations <strong>below</strong> his school&#8217;s mean.</p> <p id="fs-idp36263696">John&#8217;s <em data-effect="italics">z</em>-score of –0.21 is higher than Ali&#8217;s <em data-effect="italics">z</em>-score of –0.3. For GPA, higher values are better, so we conclude that John has the better GPA when compared to his school.</p> </div> </div> </div> <div id="fs-idm22007088" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm30975248" data-type="exercise"><div id="fs-idm30974992" data-type="problem"><p id="fs-idm30974736">Two swimmers, Angie and Beth, from different teams, wanted to find out who had the fastest time for the 50 meter freestyle when compared to her team. 
Which swimmer had the fastest time when compared to her team?</p> <table id="fs-idm44209040" summary=""><thead><tr><th>Swimmer</th> <th>Time (seconds)</th> <th>Team Mean Time</th> <th>Team Standard Deviation</th> </tr> </thead> <tbody><tr><td>Angie</td> <td>26.2</td> <td>27.2</td> <td>0.8</td> </tr> <tr><td>Beth</td> <td>27.3</td> <td>30.1</td> <td>1.4</td> </tr> </tbody> </table> </div> </div> </div> <p>The following lists give a few facts that provide a little more insight into what the standard deviation tells us about the distribution of the data.</p> <div data-type="list"><div data-type="title">For ANY data set, no matter what the distribution of the data is:</div> <ul><li>At least 75% of the data is within two standard deviations of the mean.</li> <li>At least 89% of the data is within three standard deviations of the mean.</li> <li>At least 95% of the data is within 4.5 standard deviations of the mean.</li> <li>This is known as Chebyshev&#8217;s Rule.</li> </ul> </div> <div id="eip-188" data-type="list"><div data-type="title">For data having a distribution that is BELL-SHAPED and SYMMETRIC:</div> <ul><li>Approximately 68% of the data is within one standard deviation of the mean.</li> <li>Approximately 95% of the data is within two standard deviations of the mean.</li> <li>More than 99% of the data is within three standard deviations of the mean.</li> <li>This is known as the Empirical Rule.</li> <li>It is important to note that this rule only applies when the shape of the distribution of the data is bell-shaped and symmetric. We will learn more about this when studying the &#8220;Normal&#8221; or &#8220;Gaussian&#8221; probability distribution in later chapters.</li> </ul> </div> </div> <div id="fs-idp5887184" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idm65006032">Data from Microsoft Bookshelf.</p> <p id="fs-idm65005648">King, Bill. “Graphically Speaking.” Institutional Research, Lake Tahoe Community College. 
Available online at http://www.ltcc.edu/web/about/institutional-research (accessed April 3, 2013).</p> </div> <div id="fs-idm30772592" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm30771952">The standard deviation can help you measure the spread of data. Different formulas apply depending on whether you are calculating the standard deviation of a sample or of a population.</p> <ul id="fs-idm41288176"><li>The standard deviation allows us to compare individual data values or classes to the data set mean numerically.</li> <li><em data-effect="italics">s</em> = \(\sqrt{\frac{\sum {\left(x-\overline{x}\right)}^{2}}{n-1}}\) or <em data-effect="italics">s</em> = \(\sqrt{\frac{\sum f{\left(x-\overline{x}\right)}^{2}}{n-1}}\) is the formula for calculating the standard deviation of a sample. To calculate the standard deviation of a population, we would use the population mean, <em data-effect="italics">μ</em>, and the formula <em data-effect="italics">σ</em> = \(\sqrt{\frac{\sum {\left(x-\mu \right)}^{2}}{N}}\) or <em data-effect="italics">σ</em> = \(\sqrt{\frac{\sum f{\left(x-\mu \right)}^{2}}{N}}\).</li> </ul> </div> <div id="fs-idm71113376" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idm727968">\({s}_{x}=\sqrt{\frac{\sum f{m}^{2}}{n}-{\overline{x}}^{2}}\) where \(\begin{array}{l}{s}_{x}=\text{ sample standard deviation}\\ \overline{x}\text{ = sample mean}\end{array}\)</p> </div> <div id="fs-idm102658016" class="practice" data-depth="1"><h3 data-type="title"><em data-effect="italics">Use the following information to answer the next two exercises</em>: The following data are the distances between 20 retail stores and a large distribution center. The distances are in miles. 
<span data-type="newline"><br /> </span>29;  37;  38;  40;  58;  67;  68;  69;  76;  86;  87;  95;  96;  96;  99;  106;  112;  127;  145;  150   Use a graphing calculator or computer to find the standard deviation and round to the nearest tenth.</h3> </div> <div id="fs-idm109784640" data-type="solution"><p id="fs-idm109784384"><em data-effect="italics">s</em> = 34.5</p> </div> <div id="fs-idm62691728" data-type="exercise"><div id="fs-idm16323776" data-type="problem"><p id="fs-idm48883840">Find the value that is one standard deviation below the mean.</p> </div> </div> <div id="fs-idm9402560" data-type="exercise"><div id="fs-idm41792944" data-type="problem"><p id="fs-idm41792688">Two baseball players, Fredo and Karl, on different teams wanted to find out who had the higher batting average when compared to his team. Which baseball player had the higher batting average when compared to his team?</p> <table id="fs-idm133764640" summary=""><thead><tr><th>Baseball Player</th> <th>Batting Average</th> <th>Team Batting Average</th> <th>Team Standard Deviation</th> </tr> </thead> <tbody><tr><td>Fredo</td> <td>0.158</td> <td>0.166</td> <td>0.012</td> </tr> <tr><td>Karl</td> <td>0.177</td> <td>0.189</td> <td>0.015</td> </tr> </tbody> </table> </div> <div id="fs-idm34270368" data-type="solution"><p id="fs-idm34270112">For Fredo: <em data-effect="italics">z</em> = \(\frac{0.158\text{ – }0.166}{0.012}\) = –0.67</p> <p id="fs-idm71183520">For Karl: <em data-effect="italics">z</em> = \(\frac{0.177\text{ – }0.189}{0.015}\) = –0.8</p> <p id="fs-idm79202000">Fredo’s <em data-effect="italics">z</em>-score of –0.67 is higher than Karl’s <em data-effect="italics">z</em>-score of –0.8. 
For batting average, higher values are better, so Fredo has a better batting average compared to his team.</p> </div> </div> <div id="exercisefifteen" data-type="exercise"><div id="id24167266" data-type="problem"><p id="prob_15">Use <a class="autogenerated-content" href="#fs-idm133764640">(Figure)</a> to find the value that is three standard deviations:</p> <ul id="element-012345" data-labeled-item="true" data-mark-suffix="."><li>above the mean</li> <li>below the mean</li> </ul> </div> </div> <p class="finger"><span data-type="newline"><br /> </span><em data-effect="italics">Find the standard deviation for the following frequency tables using the formula. Check the calculations with the TI 83/84</em>.</p> <div id="fs-idm109676912" data-type="exercise"><div id="fs-idm109676656" data-type="problem"><p id="eip-id1166677427293">Find the standard deviation for the following frequency tables using the formula. Check the calculations with the TI 83/84.</p> <ol id="fs-idp11094096" type="a"><li><table id="fs-idm103039104" summary=""><thead><tr><th>Grade</th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>49.5–59.5</td> <td>2</td> </tr> <tr><td>59.5–69.5</td> <td>3</td> </tr> <tr><td>69.5–79.5</td> <td>8</td> </tr> <tr><td>79.5–89.5</td> <td>12</td> </tr> <tr><td>89.5–99.5</td> <td>5</td> </tr> </tbody> </table> </li> <li><table id="fs-idm26532112" summary=""><thead><tr><th>Daily Low Temperature</th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>49.5–59.5</td> <td>53</td> </tr> <tr><td>59.5–69.5</td> <td>32</td> </tr> <tr><td>69.5–79.5</td> <td>15</td> </tr> <tr><td>79.5–89.5</td> <td>1</td> </tr> <tr><td>89.5–99.5</td> <td>0</td> </tr> </tbody> </table> </li> <li><table id="fs-idm21103120" summary=""><thead><tr><th>Points per Game</th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>49.5–59.5</td> <td>14</td> </tr> <tr><td>59.5–69.5</td> <td>32</td> </tr> <tr><td>69.5–79.5</td> <td>15</td> </tr> <tr><td>79.5–89.5</td> <td>23</td> </tr> <tr><td>89.5–99.5</td> 
<td>2</td> </tr> </tbody> </table> </li> </ol> </div> <div id="fs-idm18306912" data-type="solution"><ol id="fs-idm49462512" type="a"><li>\({s}_{x}=\sqrt{\frac{\sum f{m}^{2}}{n}-{\overline{x}}^{2}}=\sqrt{\frac{193157.45}{30}-{79.5}^{2}}=10.88\)</li> <li>\({s}_{x}=\sqrt{\frac{\sum f{m}^{2}}{n}-{\overline{x}}^{2}}=\sqrt{\frac{380945.3}{101}-{60.94}^{2}}=7.62\)</li> <li>\({s}_{x}=\sqrt{\frac{\sum f{m}^{2}}{n}-{\overline{x}}^{2}}=\sqrt{\frac{440051.5}{86}-{70.66}^{2}}=11.14\)</li> </ol> </div> </div> <div id="fs-idm25848496" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <p id="eip-446"><span data-type="newline"><br /> </span><em data-effect="italics">1) Use the following information to answer the next nine exercises:</em> The population parameters below describe the full-time equivalent number of students (FTES) each year at Lake Tahoe Community College from 1976–1977 through 2004–2005.</p> <ul id="element-479"><li><em data-effect="italics">μ</em> = 1000 FTES</li> <li>median = 1,014 FTES</li> <li><em data-effect="italics">σ</em> = 474 FTES</li> <li>first quartile = 528.5 FTES</li> <li>third quartile = 1,447.5 FTES</li> <li><em data-effect="italics">n</em> = 29 years</li> </ul> <div id="exercisetwenty" data-type="exercise"><div id="id8123889" data-type="problem"><p id="prob_20">A sample of 11 years is taken. About how many are expected to have a FTES of 1014 or above? Explain how you determined your answer.</p> </div> <div id="id8123906" data-type="solution"><p>The median value is the middle value in the ordered list of data values. The median value of a set of 11 will be the 6th number in order. 
So about six of the years will have totals at or above the median: the middle year and the five years above it.</p> </div> </div> <div id="exercisetwentyone" data-type="exercise"><div id="id8123930" data-type="problem"><p id="prob_21">75% of all years have an FTES:</p> <ol type="a" data-mark-suffix="."><li>at or below: _____</li> <li>at or above: _____</li> </ol> </div> </div> <div id="exercisetwentytwo" data-type="exercise"><div id="id8181793" data-type="problem"><p id="prob_22">The population standard deviation = _____</p> </div> </div> <div id="exercisetwentythree" data-type="exercise"><div id="id8181834" data-type="problem"><p id="prob_23">What percent of the FTES were from 528.5 to 1447.5? How do you know?</p> </div> </div> <div id="exercisetwentyfour" data-type="exercise"><div id="id8181876" data-type="problem"><p id="prob_24">What is the <em data-effect="italics">IQR</em>? What does the <em data-effect="italics">IQR</em> represent?</p> </div> </div> <div id="exercisetwentyfive" data-type="exercise"><div id="id8181927" data-type="problem"><p id="prob_25">How many standard deviations away from the mean is the median?</p> <p id="eip-4"><em data-effect="italics">Additional Information:</em> The population FTES for 2005–2006 through 2010–2011 was given in an updated report. The data are reported here.</p> <table id="eip-395" summary="This is a table of FTES for 2005-06 through 2010-2011."><tbody><tr><td><strong>Year</strong></td> <td>2005–06</td> <td>2006–07</td> <td>2007–08</td> <td>2008–09</td> <td>2009–10</td> <td>2010–11</td> </tr> <tr><td><strong>Total FTES</strong></td> <td>1,585</td> <td>1,690</td> <td>1,735</td> <td>1,935</td> <td>2,021</td> <td>1,890</td> </tr> </tbody> </table> </div> </div> <div data-type="exercise"><div id="eip-481" data-type="problem"><p id="eip-363">2) Calculate the mean, median, standard deviation, the first quartile, the third quartile and the <em data-effect="italics">IQR</em>. 
Round to one decimal place.</p> </div> <div id="eip-407" data-type="solution"><ul id="fs-idm80888224"><li>mean = 1,809.3</li> <li>median = 1,812.5</li> <li>standard deviation = 151.2</li> <li>first quartile = 1,690</li> <li>third quartile = 1,935</li> <li><em data-effect="italics">IQR</em> = 245</li> </ul> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-36">What additional information is needed to construct a box plot for the FTES for 2005-2006 through 2010-2011 and a box plot for the FTES for 1976-1977 through 2004-2005?</p> </div> </div> <div id="eip-975" data-type="exercise"><div id="eip-748" data-type="problem"><p>Compare the <em data-effect="italics">IQR</em> for the FTES for 1976–77 through 2004–2005 with the <em data-effect="italics">IQR</em> for the FTES for 2005-2006 through 2010–2011. Why do you suppose the <em data-effect="italics">IQR</em>s are so different?</p> </div> <div id="eip-602" data-type="solution"><p>Hint: Think about the number of years covered by each time period and what happened to higher education during those periods.</p> </div> </div> <div id="element-844" data-type="exercise"><div id="id3734385" data-type="problem"><p>Three students were applying to the same graduate school. They came from schools with different grading systems. Which student had the best GPA when compared to other students at his school? Explain how you determined your answer.</p> <table id="element-814" summary="This table presents three students and their GPAs. The first column lists the students, the second column lists the GPA, the third column lists the school average GPA, and the fourth column lists the school standard deviations. 
The first row represents Thuy, the second row represents Vichet, and the third row represents Kamala."><thead><tr><th>Student</th> <th>GPA</th> <th>School Average GPA</th> <th>School Standard Deviation</th> </tr> </thead> <tbody><tr><td>Thuy</td> <td>2.7</td> <td>3.2</td> <td>0.8</td> </tr> <tr><td>Vichet</td> <td>87</td> <td>75</td> <td>20</td> </tr> <tr><td>Kamala</td> <td>8.6</td> <td>8</td> <td>0.4</td> </tr> </tbody> </table> </div> </div> <div data-type="exercise"><div id="eip-290" data-type="problem"><p>3) A music school has budgeted to purchase three musical instruments. They plan to purchase a piano costing \$3,000, a guitar costing \$550, and a drum set costing \$600. The mean cost for a piano is \$4,000 with a standard deviation of \$2,500. The mean cost for a guitar is \$500 with a standard deviation of \$200. The mean cost for drums is \$700 with a standard deviation of \$100. Which cost is the lowest when compared to other instruments of the same type? Which cost is the highest when compared to other instruments of the same type? Justify your answer.</p> <p>&nbsp;</p> </div> </div> <div id="element-799" data-type="exercise"><div id="id6038456" data-type="problem"><p id="fs-idm25455232">4) An elementary school class ran one mile with a mean of 11 minutes and a standard deviation of three minutes. Rachel, a student in the class, ran one mile in eight minutes. A junior high school class ran one mile with a mean of nine minutes and a standard deviation of two minutes. Kenji, a student in the class, ran one mile in 8.5 minutes. A high school class ran one mile with a mean of seven minutes and a standard deviation of four minutes. Nedda, a student in the class, ran one mile in eight minutes.</p> <ol id="element-895" type="a" data-mark-suffix="."><li>Why is Kenji considered a better runner than Nedda, even though Nedda ran faster than he?</li> <li>Who is the fastest runner with respect to his or her class? 
Explain why.</li> </ol> </div> </div> <div id="fs-idm6474736" data-type="exercise"><div id="fs-idm6474480" data-type="problem"><p id="fs-idm6474224">The most obese countries in the world have obesity rates that range from 11.4% to 74.6%. This data is summarized in <a href="#fs-idm115378592">Table 14</a>.</p> <table id="fs-idm115378592" summary=""><thead><tr><th>Percent of Population Obese</th> <th>Number of Countries</th> </tr> </thead> <tbody><tr><td>11.4–20.45</td> <td>29</td> </tr> <tr><td>20.45–29.45</td> <td>13</td> </tr> <tr><td>29.45–38.45</td> <td>4</td> </tr> <tr><td>38.45–47.45</td> <td>0</td> </tr> <tr><td>47.45–56.45</td> <td>2</td> </tr> <tr><td>56.45–65.45</td> <td>1</td> </tr> <tr><td>65.45–74.45</td> <td>0</td> </tr> <tr><td>74.45–83.45</td> <td>1</td> </tr> </tbody> </table> <p>&nbsp;</p> <p id="fs-idm115073344">5) What is the best estimate of the average obesity percentage for these countries? What is the standard deviation for the listed obesity rates? The United States has an average obesity rate of 33.9%. Is this rate above average or below? How “unusual” is the United States’ obesity rate compared to the average rate? 
Explain.</p> </div> </div> </div> <p>&nbsp;</p> <p>&nbsp;</p> <div class="free-response" data-depth="1"><div id="fs-idm70000608" data-type="exercise"><div id="fs-idm70000352" data-type="problem"><p id="fs-idm116547040"><a class="autogenerated-content" href="#fs-idm116546656">(Figure)</a> gives the percent of children under five considered to be underweight.</p> <table id="fs-idm116546656" summary=""><thead><tr><th>Percent of Underweight Children</th> <th>Number of Countries</th> </tr> </thead> <tbody><tr><td>16–21.45</td> <td>23</td> </tr> <tr><td>21.45–26.9</td> <td>4</td> </tr> <tr><td>26.9–32.35</td> <td>9</td> </tr> <tr><td>32.35–37.8</td> <td>7</td> </tr> <tr><td>37.8–43.25</td> <td>6</td> </tr> <tr><td>43.25–48.7</td> <td>1</td> </tr> </tbody> </table> <p id="fs-idm46041824">6)  What is the best estimate for the mean percentage of underweight children? What is the standard deviation? Which interval(s) could be considered unusual? Explain.</p> </div> </div> </div> <div id="fs-idm9788176a" class="bring-together-homework" data-depth="1"><h3 data-type="title">Bringing It Together</h3> <div data-type="exercise"><div data-type="problem"><p>7)  Twenty-five randomly selected students were asked the number of movies they watched the previous week. The results are as follows:</p> <table summary="The table presents the number of movies 25 students watched in the previous week. 
The first column lists the number of movies from 0-4, the second column lists the frequency with the values of 5, 9, 6, 4, 1, the third column is for relative frequency and is blank, and the fourth column is for cumulative relative frequency and is blank."><thead><tr><th># of movies</th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>0</td> <td>5</td> </tr> <tr><td>1</td> <td>9</td> </tr> <tr><td>2</td> <td>6</td> </tr> <tr><td>3</td> <td>4</td> </tr> <tr><td>4</td> <td>1</td> </tr> </tbody> </table> <ol type="a"><li>Find the sample mean \(\overline{x}\).</li> <li>Find the approximate sample standard deviation, <em data-effect="italics">s</em>.</li> </ol> </div> <div id="id6101341" data-type="solution"></div> </div> <div id="element-976" data-type="exercise"><div id="id6006371" data-type="problem"><p>&nbsp;</p> <p id="element-567">8)  Forty randomly selected students were asked the number of pairs of sneakers they owned. Let <em data-effect="italics">X</em> = the number of pairs of sneakers owned. The results are as follows:</p> <table id="element-130" summary="The table presents the number of pairs of sneakers forty students own. 
The first column lists the number of pairs of sneakers owned from 0-7, the second column lists the frequency, the third column is relative frequency and is blank, and the fourth column is cumulative relative frequency and is blank."><thead><tr><th><em data-effect="italics">X</em></th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>1</td> <td>2</td> </tr> <tr><td>2</td> <td>5</td> </tr> <tr><td>3</td> <td>8</td> </tr> <tr><td>4</td> <td>12</td> </tr> <tr><td>5</td> <td>12</td> </tr> <tr><td>6</td> <td>0</td> </tr> <tr><td>7</td> <td>1</td> </tr> </tbody> </table> <ol id="element-277" type="a" data-mark-suffix="."><li>Find the sample mean \(\overline{x}\)</li> <li>Find the sample standard deviation, <em data-effect="italics">s</em></li> <li>Construct a histogram of the data.</li> <li>Complete the columns of the chart.</li> <li>Find the first quartile.</li> <li>Find the median.</li> <li>Find the third quartile.</li> <li>Construct a box plot of the data.</li> <li>What percent of the students owned at least five pairs?</li> <li>Find the 40<sup>th</sup> percentile.</li> <li>Find the 90<sup>th</sup> percentile.</li> <li>Construct a line graph of the data</li> <li>Construct a stemplot of the data</li> </ol> </div> </div> <div id="element-324s" data-type="exercise"><div id="id4231623" data-type="problem"><p>&nbsp;</p> <p id="fs-idm32869728">9)  Following are the published weights (in pounds) of all of the team members of the San Francisco 49ers from a previous year.</p> <p id="element-598">177;  205;  210;  210;  232;  205;  185;  185;  178;  210;  206;  212;  184;  174;  185;  242;  188;  212;  215;  247;  241;  223;  220;  260;  245;  259;  278;  270;  280;  295;  275;  285;  290;  272;  273;  280;  285;  286;  200;  215;  185;  230;  250;  241;  190;  260;  250;  302;  265;  290;  276;  228;  265</p> <ol id="fs-idm96948032" type="a"><li>Organize the data from smallest to largest value.</li> <li>Find the median.</li> <li>Find the first quartile.</li> <li>Find the third 
quartile.</li> <li>Construct a box plot of the data.</li> <li>The middle 50% of the weights are from _______ to _______.</li> <li>If our population were all professional football players, would the above data be a sample of weights or the population of weights? Why?</li> <li>If our population included every team member who ever played for the San Francisco 49ers, would the above data be a sample of weights or the population of weights? Why?</li> <li>Assume the population was the San Francisco 49ers. Find: <ol id="nestlist4" type="i" data-mark-suffix="."><li>the population mean, <em data-effect="italics">μ</em>.</li> <li>the population standard deviation, <em data-effect="italics">σ</em>.</li> <li>the weight that is two standard deviations below the mean.</li> <li>When Steve Young, quarterback, played football, he weighed 205 pounds. How many standard deviations above or below the mean was he?</li> </ol> </li> <li>That same year, the mean weight for the Dallas Cowboys was 240.08 pounds with a standard deviation of 44.38 pounds. Emmit Smith weighed in at 209 pounds. With respect to his team, who was lighter, Smith or Young? How did you determine your answer?</li> </ol> <p>&nbsp;</p> </div> <div id="id4782460" data-type="solution"></div> </div> <div data-type="exercise"><div id="id4976485" data-type="problem"><p>10) One hundred teachers attended a seminar on mathematical problem solving. The attitudes of a representative sample of 12 of the teachers were measured before and after the seminar. A positive number for change in attitude indicates that a teacher&#8217;s attitude toward math became more positive. 
The 12 change scores are as follows:</p> <p id="element-6236"><span id="set-linelist1" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">3</span>  <span data-type="item">8  </span><span data-type="item">1  </span><span data-type="item">2</span>  <span data-type="item">0  </span><span data-type="item">5  </span><span data-type="item">3  </span><span data-type="item">1  </span><span data-type="item">1  </span><span data-type="item">6</span>  <span data-type="item">5  </span><span data-type="item">2</span></span></p> <ol type="a" data-mark-suffix="."><li>What is the mean change score?</li> <li>What is the standard deviation for this population?</li> <li>What is the median change score?</li> <li>Find the change score that is 2.2 standard deviations below the mean.</li> </ol> </div> </div> <div data-type="exercise"><div id="id6158752" data-type="problem"><p id="element-223">11)  Refer to <a class="autogenerated-content" href="#fs-idm70725344">(Figure)</a> and determine which of the following are true and which are false. Explain your solution to each part in complete sentences.</p> <div id="fs-idm70725344" class="bc-figure figure"><span id="id4115299" data-type="media" data-alt="This shows three graphs. The first is a histogram with a mode of 3 and fairly symmetrical distribution between 1 (minimum value) and 5 (maximum value). The second graph is a histogram with peaks at 1 (minimum value) and 5 (maximum value) with 3 having the lowest frequency. The third graph is a box plot. The first whisker extends from 0 to 1. The box begins at the first quartile, 1, and ends at the third quartile, 6. A vertical, dashed line marks the median at 3. The second whisker extends from 6 on." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_13_05-1.jpg" alt="This shows three graphs. 
The first is a histogram with a mode of 3 and fairly symmetrical distribution between 1 (minimum value) and 5 (maximum value). The second graph is a histogram with peaks at 1 (minimum value) and 5 (maximum value) with 3 having the lowest frequency. The third graph is a box plot. The first whisker extends from 0 to 1. The box begins at the first quartile, 1, and ends at the third quartile, 6. A vertical, dashed line marks the median at 3. The second whisker extends from 6 on." width="550" data-media-type="image/jpg" /></span></div> <ol type="a"><li>The medians for all three graphs are the same.</li> <li>We cannot determine if any of the means for the three graphs is different.</li> <li>The standard deviation for graph b is larger than the standard deviation for graph a.</li> <li>We cannot determine if any of the third quartiles for the three graphs is different.</li> </ol> <p>&nbsp;</p> </div> <div id="id7741745" data-type="solution"></div> </div> <div id="fs-idm40658512" data-type="exercise"><div id="id4298561" data-type="problem"><p id="id12029548">12)  In a recent issue of the <cite><span data-type="cite-title">IEEE Spectrum</span></cite>, 84 engineering conferences were announced. Four conferences lasted two days. Thirty-six lasted three days. Eighteen lasted four days. Nineteen lasted five days. Four lasted six days. One lasted seven days. One lasted eight days. One lasted nine days. 
Let <em data-effect="italics">X</em> = the length (in days) of an engineering conference.</p> <ol id="id6952326" type="a" data-mark-suffix="."><li>Organize the data in a chart.</li> <li>Find the median, the first quartile, and the third quartile.</li> <li>Find the 65<sup>th</sup> percentile.</li> <li>Find the 10<sup>th</sup> percentile.</li> <li>Construct a box plot of the data.</li> <li>The middle 50% of the conferences last from _______ days to _______ days.</li> <li>Calculate the sample mean of days of engineering conferences.</li> <li>Calculate the sample standard deviation of days of engineering conferences.</li> <li>Find the mode.</li> <li>If you were planning an engineering conference, which would you choose as the length of the conference: mean; median; or mode? Explain why you made that choice.</li> <li>Give two reasons why you think that three to five days seem to be popular lengths of engineering conferences.</li> </ol> </div> </div> <div data-type="exercise"><div id="id3671813" data-type="problem"><p id="element-707">13)  A survey of enrollment at 35 community colleges across the United States yielded the following figures:</p> <p id="element-23455">6414;  1550;  2109;  9350;  21828;  4300;  5944;  5722;  2825;  2044;  5481;  5200;  5853;  2750;  10012;  6357;  27000;  9414;  7681;  3200;  17500;  9200;  7380;  18314;  6557;  13713;  17768;  7493;  2771;  2861;  1263;  7285;  28165;  5080;  11622</p> <ol id="id12992735" type="a" data-mark-suffix="."><li>Organize the data into a chart with five intervals of equal width. 
Label the two columns &#8220;Enrollment&#8221; and &#8220;Frequency.&#8221;</li> <li>Construct a histogram of the data.</li> <li>If you were to build a new community college, which piece of information would be more valuable: the mode or the mean?</li> <li>Calculate the sample mean.</li> <li>Calculate the sample standard deviation.</li> <li>A school with an enrollment of 8000 would be how many standard deviations away from the mean?</li> </ol> </div> <div id="eip-idm26749440" data-type="solution"></div> </div> <p>&nbsp;</p> <p id="element-225"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next two exercises.</em> <em data-effect="italics">X</em> = the number of days per week that 100 clients use a particular exercise facility.</p> <table id="element-813" summary="This table presents the number of days a week clients use a particular exercise facility. The first column lists the number of days from 0-6 and the second column lists the frequency."><thead><tr><th><em data-effect="italics">x</em></th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>0</td> <td>3</td> </tr> <tr><td>1</td> <td>12</td> </tr> <tr><td>2</td> <td>33</td> </tr> <tr><td>3</td> <td>28</td> </tr> <tr><td>4</td> <td>11</td> </tr> <tr><td>5</td> <td>9</td> </tr> <tr><td>6</td> <td>4</td> </tr> </tbody> </table> <div data-type="exercise"><div id="id4945198" data-type="problem"><p>&nbsp;</p> <p id="element-441">14)  The 80<sup>th</sup> percentile is _____</p> <ol id="ni4" type="a" data-mark-suffix="."><li>5</li> <li>80</li> <li>3</li> <li>4</li> </ol> </div> </div> <div id="element-867" data-type="exercise"><div id="id4861412" data-type="problem"><p>&nbsp;</p> <p id="element-793">15) The number that is 1.5 standard deviations BELOW the mean is approximately _____</p> <ol id="ni5" type="a" data-mark-suffix="."><li>0.7</li> <li>4.8</li> <li>–2.8</li> <li>Cannot be determined</li> </ol> </div> <div id="id7360802" 
data-type="solution"><p>&nbsp;</p> </div> </div> <div id="eip-961" data-type="exercise"><div id="eip-175" data-type="problem"><p id="eip-902">16) Suppose that a publisher conducted a survey asking adult consumers the number of fiction paperback books they had purchased in the previous month. The results are summarized in the <a class="autogenerated-content" href="#table23">(Figure)</a>.</p> <table id="table23" summary="Publisher B is the table with number of books in the first column, from 0-5, 7, 9, frequency in the second column, and relative frequency in the third column which is blank."><thead><tr><th># of books</th> <th>Freq.</th> <th>Rel. Freq.</th> </tr> </thead> <tbody><tr><td>0</td> <td>18</td> <td></td> </tr> <tr><td>1</td> <td>24</td> <td></td> </tr> <tr><td>2</td> <td>24</td> <td></td> </tr> <tr><td>3</td> <td>22</td> <td></td> </tr> <tr><td>4</td> <td>15</td> <td></td> </tr> <tr><td>5</td> <td>10</td> <td></td> </tr> <tr><td>7</td> <td>5</td> <td></td> </tr> <tr><td>9</td> <td>1</td> <td></td> </tr> </tbody> </table> <ol id="eip-id1170221803861" type="a" data-mark-suffix="."><li>Are there any outliers in the data? Use an appropriate numerical test involving the <em data-effect="italics">IQR</em> to identify outliers, if any, and clearly state your conclusion.</li> <li>If a data value is identified as an outlier, what should be done about it?</li> <li>Are any data values further than two standard deviations away from the mean? In some situations, statisticians may use this criterion to identify data values that are unusual, compared to the other data values. (Note that this criterion is most appropriate to use for data that is mound-shaped and symmetric, rather than for skewed data.)</li> <li>Do parts a and c of this problem give the same answer?</li> <li>Examine the shape of the data. 
Which part, a or c, of this question gives a more appropriate result for this data?</li> <li>Based on the shape of the data which is the most appropriate measure of center for this data: mean, median or mode?</li> </ol> <p><strong>Answers to odd questions</strong></p> <p>3) For pianos, the cost of the piano is 0.4 standard deviations BELOW the mean. For guitars, the cost of the guitar is 0.25 standard deviations ABOVE the mean. For drums, the cost of the drum set is 1.0 standard deviations BELOW the mean. Of the three, the drums cost the lowest in comparison to the cost of other instruments of the same type. The guitar costs the most in comparison to the cost of other instruments of the same type.</p> </div> <p>5)</p> <ul id="fs-idm65872400"><li>\(\overline{x}=23.32\)</li> <li class="finger">Using the TI 83/84, we obtain a standard deviation of: \({s}_{x}=12.95.\)</li> <li>The obesity rate of the United States is 10.58% higher than the average obesity rate.</li> <li>Since the standard deviation is 12.95, we see that 23.32 + 12.95 = 36.27 is the obesity percentage that is one standard deviation from the mean. The United States obesity rate is slightly less than one standard deviation from the mean. 
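As a rough check on the arithmetic, the same comparison can be written as a number of standard deviations: \(\frac{33.9-23.32}{12.95}=\frac{10.58}{12.95}\approx 0.82\), so the United States sits about 0.82 standard deviations above the mean.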
Therefore, we can assume that the United States, while 34% obese, does not have an unusually high percentage of obese people.</li> </ul> <p>7)</p> <ol id="element-934" type="a"><li>1.48</li> <li>1.12</li> </ol> <p>9)</p> <ol id="element-189" type="a"><li>174;  177;  178;  184;  185;  185;  185;  185;  188;  190;  200;  205;  205;  206;  210;  210;  210;  212;  212;  215;  215;  220;  223;  228;  230;  232;  241;  241;  242;  245;  247;  250;  250;  259;  260;  260;  265;  265;  270;  272;  273;  275;  276;  278;  280;  280;  285;  285;  286;  290;  290;  295;  302</li> <li>241</li> <li>205.5</li> <li>272.5</li> <li><span id="id8617734" data-type="media" data-alt="A box plot with a whisker between 174 and 205.5, a solid line at 205.5, a dashed line at 241, a solid line at 272.5, and a whisker between 272.5 and 302."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_sol_04-1.jpg" alt="A box plot with a whisker between 174 and 205.5, a solid line at 205.5, a dashed line at 241, a solid line at 272.5, and a whisker between 272.5 and 302." data-media-type="image/jpg" data-print-width="2.5in" /></span></li> <li>205.5, 272.5</li> <li>sample</li> <li>population</li> <li><ol id="element-409" type="i" data-mark-suffix="."><li>236.34</li> <li>37.50</li> <li>161.34</li> <li>0.84 std. dev. 
below the mean</li> </ol> </li> <li>Young</li> </ol> <p>11)</p> <ol id="element-340" type="a"><li>True</li> <li>True</li> <li>True</li> <li>False</li> </ol> <p>13)</p> <ol id="id12992735a" type="a"><li><table id="fs-idm103039104a" summary=""><thead><tr><th>Enrollment</th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>1000-5000</td> <td>10</td> </tr> <tr><td>5000-10000</td> <td>16</td> </tr> <tr><td>10000-15000</td> <td>3</td> </tr> <tr><td>15000-20000</td> <td>3</td> </tr> <tr><td>20000-25000</td> <td>1</td> </tr> <tr><td>25000-30000</td> <td>2</td> </tr> </tbody> </table> </li> <li>Check student’s solution.</li> <li>mode</li> <li>8628.74</li> <li>6943.88</li> <li>–0.09</li> </ol> </div> <p>15) a</p> <div data-type="exercise"><div data-type="problem"><p>&nbsp;</p> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="stddev"><dt>Standard Deviation</dt> <dd id="id20302532">a number that is equal to the square root of the variance and measures how far data values are from their mean; notation: <em data-effect="italics">s</em> for sample standard deviation and σ for population standard deviation.</dd> </dl> <dl id="variance"><dt>Variance</dt> <dd id="id3154337">mean of the squared deviations from the mean, or the square of the standard deviation; for a set of data, a deviation can be represented as <em data-effect="italics">x</em> – \(\overline{x}\) where <em data-effect="italics">x</em> is a value of the data and \(\overline{x}\) is the sample mean. The sample variance is equal to the sum of the squares of the deviations divided by the difference of the sample size and one.</dd> </dl> </div> </div></div>
<div class="chapter standard" id="chapter-measures-of-the-location-of-the-data" title="Chapter 2.7: Measures of Position"><div class="chapter-title-wrap"><h3 class="chapter-number">13</h3><h2 class="chapter-title"><span class="display-none">Chapter 2.7: Measures of Position</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="element-280">The common measures of position or location are <span data-type="term">quartiles</span> and <span data-type="term">percentiles</span>.</p> <p id="fs-idp16986528">Quartiles are special percentiles. The first quartile, <em data-effect="italics">Q</em><sub>1</sub>, is the same as the 25<sup>th</sup> percentile, and the third quartile, <em data-effect="italics">Q</em><sub>3</sub>, is the same as the 75<sup>th</sup> percentile. The median, <em data-effect="italics">M</em>, is called both the second quartile and the 50<sup>th</sup> percentile.</p> <p id="element-105">To calculate quartiles and percentiles, the data must be ordered from smallest to largest. Quartiles divide ordered data into quarters. Percentiles divide ordered data into hundredths. Scoring in the 90<sup>th</sup> percentile of an exam does not necessarily mean that you received 90% on the test. It means that 90% of test scores are the same as or less than your score and 10% of the test scores are the same as or greater than your test score.</p> <p id="fs-idm12500320">Percentiles are useful for comparing values. For this reason, universities and colleges use percentiles extensively. One instance in which colleges and universities use percentiles is when SAT results are used to determine a minimum testing score that will be used as an acceptance factor. For example, suppose Duke accepts SAT scores at or above the 75<sup>th</sup> percentile. That translates into a score of at least 1220.</p> <p id="fs-idp48110304">Percentiles are mostly used with very large populations. 
Therefore, if you were to say that 90% of the test scores are less (and not the same or less) than your score, it would be acceptable because removing one particular data value is not significant.</p> <p id="element-681">The <span data-type="term">median</span> is a number that measures the &#8220;center&#8221; of the data. You can think of the median as the &#8220;middle value,&#8221; but it does not actually have to be one of the observed values. It is a number that separates ordered data into halves. Half the values are the same number or smaller than the median, and half the values are the same number or larger. For example, consider the following data. <span data-type="newline"><br /> </span>1;  11.5;  6;  7.2;  4;  8;  9;  10;  6.8;  8.3;  2;  2;  10;  1 <span data-type="newline"><br /> </span>Ordered from smallest to largest: <span data-type="newline"><br /> </span>1;  1;  2;  2;  4;  6;  6.8;  7.2;  8;  8.3;  9;  10;  10;  11.5</p> <p id="element-546">Since there are 14 observations, the median is between the seventh value, 6.8, and the eighth value, 7.2. To find the median, add the two values together and divide by two.</p> <div data-type="equation">\(\frac{6.8+7.2}{2}=7\)</div> <p id="element-995">The median is seven. Half of the values are smaller than seven and half of the values are larger than seven.</p> <p id="element-308"><span data-type="term">Quartiles</span> are numbers that separate the data into quarters. Quartiles may or may not be part of the data. To find the quartiles, first find the median or second quartile. The first quartile, <em data-effect="italics">Q</em><sub>1</sub>, is the middle value of the lower half of the data, and the third quartile, <em data-effect="italics">Q</em><sub>3</sub>, is the middle value, or median, of the upper half of the data. 
To get the idea, consider the same data set: <span data-type="newline"><br /> </span>1;  1;  2;  2;  4;  6;  6.8;  7.2;  8;  8.3;  9;  10;  10;  11.5</p> <p id="element-805">The median or <strong>second quartile</strong> is seven. The lower half of the data are 1,  1,  2,  2,  4,  6,  6.8. The middle value of the lower half is two. <span data-type="newline"><br /> </span>1;  1;  2;  2;  4;  6;  6.8</p> <p id="element-227">The number two, which is part of the data, is the <span data-type="term">first quartile</span>. One-fourth of the values in the entire set are the same as or less than two, and three-fourths of the values are more than two.</p> <p>The upper half of the data is 7.2,  8,  8.3,  9,  10,  10,  11.5. The middle value of the upper half is nine.</p> <p id="element-386">The <span data-type="term">third quartile</span>, <em data-effect="italics">Q</em><sub>3</sub>, is nine. Three-fourths (75%) of the ordered data set are less than nine. One-fourth (25%) of the ordered data set are greater than nine. The third quartile is part of the data set in this example.</p> <p id="element-716">The <span data-type="term">interquartile range</span> is a number that indicates the spread of the middle half or the middle 50% of the data. It is the difference between the third quartile (<em data-effect="italics">Q</em><sub>3</sub>) and the first quartile (<em data-effect="italics">Q</em><sub>1</sub>).</p> <p id="delete_me"><em data-effect="italics">IQR</em> = <em data-effect="italics">Q</em><sub>3</sub> – <em data-effect="italics">Q</em><sub>1</sub></p> <p>The <em data-effect="italics">IQR</em> can help to determine potential <strong>outliers</strong>. <strong>A value is suspected to be a potential outlier if it is more than (1.5)(<em data-effect="italics">IQR</em>) below the first quartile or more than (1.5)(<em data-effect="italics">IQR</em>) above the third quartile</strong>. 
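Written as inequalities, with \(x\) standing for a data value (a symbol used here only for illustration), \(x\) is a potential outlier when \(x &lt; Q_1 - 1.5\,(\mathit{IQR})\) or \(x &gt; Q_3 + 1.5\,(\mathit{IQR})\). For the 14 values above, \(\mathit{IQR} = 9 - 2 = 7\) and \(1.5\,(\mathit{IQR}) = 10.5\), so the cutoffs are \(2 - 10.5 = -8.5\) and \(9 + 10.5 = 19.5\); every value lies between them, so none is flagged.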
Potential outliers always require further investigation.</p> <div id="fs-idm10803744" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idp4345696">A potential outlier is a data point that is significantly different from the other data points. These special data points may be errors or some kind of abnormality or they may be a key to understanding the data.</p> </div> <div id="element-826" class="textbox textbox--examples" data-type="example"><div id="exer5" data-type="exercise"><div id="id45036025" data-type="problem"><p id="element-720">For the following 13 real estate prices, calculate the <em data-effect="italics">IQR</em> and determine if any prices are potential outliers. Prices are in dollars. <span data-type="newline"><br /> </span>389,950;  230,500;  158,000;  479,000;  639,000;  114,950;  5,500,000;  387,000;  659,000;  529,000;  575,000;  488,800;  1,095,000</p> </div> <div id="id45746296" data-type="solution"><p id="element-939">Order the data from smallest to largest. 
<span data-type="newline"><br /> </span>114,950;  158,000;  230,500;  387,000;  389,950;  479,000;  488,800;  529,000;  575,000; 639,000; 659,000; 1,095,000; 5,500,000</p> <p id="element-170"><em data-effect="italics">M</em> = 488,800</p> <p><em data-effect="italics">Q</em><sub>1</sub> = \(\frac{\text{230,500 + 387,000}}{2}\) = 308,750</p> <p><em data-effect="italics">Q</em><sub>3</sub> = \(\frac{\text{639,000 + 659,000}}{2}\) = 649,000</p> <p id="element-290"><em data-effect="italics">IQR</em> = 649,000 – 308,750 = 340,250</p> <p id="element-166">(1.5)(<em data-effect="italics">IQR</em>) = (1.5)(340,250) = 510,375</p> <p id="element-348"><em data-effect="italics">Q</em><sub>1</sub> – (1.5)(<em data-effect="italics">IQR</em>) = 308,750 – 510,375 = –201,625</p> <p id="element-211"><em data-effect="italics">Q</em><sub>3</sub> + (1.5)(<em data-effect="italics">IQR</em>) = 649,000 + 510,375 = 1,159,375</p> <p id="element-109">No house price is less than –201,625. However, 5,500,000 is more than 1,159,375. Therefore, 5,500,000 is a potential <span data-type="term">outlier</span>.</p> </div> </div> </div> <div id="fs-idp16250528" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp63302352" data-type="exercise"><div id="fs-idm22548992" data-type="problem"><p id="fs-idp42507600">For the following 11 salaries, calculate the <em data-effect="italics">IQR</em> and determine if any salaries are outliers. 
The salaries are in dollars.</p> <p id="fs-idm25187088"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">\$33,000   \$</span><span data-type="item">64,500   \$</span><span data-type="item">28,000   \$</span><span data-type="item">54,000   \$</span><span data-type="item">72,000   \$</span><span data-type="item">68,500   \$</span><span data-type="item">69,000   \$</span><span data-type="item">42,000   \$</span><span data-type="item">54,000   \$</span><span data-type="item">120,000   \$</span><span data-type="item">40,500</span></span></p> </div> </div> </div> <div id="element-17" class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div id="id45587381" data-type="problem"><p id="element-880">For the two data sets in the <a href="#element-583">test scores example</a>, find the following:</p> <ol type="a" data-mark-suffix="."><li>The interquartile range. Compare the two interquartile ranges.</li> <li>Any outliers in either set.</li> </ol> </div> <div id="fs-idm13740032" data-type="solution"><p id="fs-idp37987952">The five number summary for the day and night classes is</p> <table id="fs-idp36487328" summary=""><thead><tr><th></th> <th>Minimum</th> <th><em data-effect="italics">Q</em><sub>1</sub></th> <th>Median</th> <th><em data-effect="italics">Q</em><sub>3</sub></th> <th>Maximum</th> </tr> </thead> <tbody><tr><td><strong data-effect="bold">Day</strong></td> <td>32</td> <td>56</td> <td>74.5</td> <td>82.5</td> <td>99</td> </tr> <tr><td><strong data-effect="bold">Night</strong></td> <td>25.5</td> <td>78</td> <td>81</td> <td>89</td> <td>98</td> </tr> </tbody> </table> <ol id="fs-idm23962720" type="a"><li>The IQR for the day group is <em data-effect="italics">Q</em><sub>3</sub> – <em data-effect="italics">Q</em><sub>1</sub> = 82.5 – 56 = 26.5 <p id="fs-idm7044352">The IQR for the night group is <em data-effect="italics">Q</em><sub>3</sub> – <em data-effect="italics">Q</em><sub>1</sub> = 89 – 78 = 
11</p> <p id="fs-idp42547504">The interquartile range (the spread or variability) for the day class is larger than the night class <em data-effect="italics">IQR</em>. This suggests more variation will be found in the day class’s test scores.</p> </li> <li>Day class outliers are found using the IQR times 1.5 rule. So, <ul id="fs-idm52257968" data-labeled-item="true"><li><em data-effect="italics">Q</em><sub>1</sub> &#8211; <em data-effect="italics">IQR</em>(1.5) = 56 – 26.5(1.5) = 16.25</li> <li><em data-effect="italics">Q</em><sub>3</sub> + <em data-effect="italics">IQR</em>(1.5) = 82.5 + 26.5(1.5) = 122.25</li> </ul> <p id="fs-idp38341744">Since the minimum and maximum values for the day class are greater than 16.25 and less than 122.25, there are no outliers.</p> <p id="fs-idm23940160">Night class outliers are calculated as:</p> <ul id="fs-idp29569184" data-labeled-item="true"><li><em data-effect="italics">Q</em><sub>1</sub> – <em data-effect="italics">IQR</em> (1.5) = 78 – 11(1.5) = 61.5</li> <li><em data-effect="italics">Q</em><sub>3</sub> + IQR(1.5) = 89 + 11(1.5) = 105.5</li> </ul> <p id="fs-idp5005056">For this class, any test score less than 61.5 is an outlier. Therefore, the scores of 45 and 25.5 are outliers. 
Since no test score is greater than 105.5, there is no upper end outlier.</p> </li> </ol> </div> </div> </div> <div id="fs-idp58037360" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp23368176" data-type="exercise"><div id="fs-idp23368304" data-type="problem"><p id="fs-idp5269648">Find the interquartile range for the following two data sets and compare them.</p> <p id="fs-idp4060048">Test Scores for Class <em data-effect="italics">A</em> <span data-type="newline"><br /> </span>69;  96;  81;  79;  65;  76;  83;  99;  89;  67;  90;  77;  85;  98;  66;  91;  77;  69;  80;  94 <span data-type="newline"><br /> </span>Test Scores for Class <em data-effect="italics">B</em> <span data-type="newline"><br /> </span>90;  72;  80;  92;  90;  97;  92;  75;  79;  68;  70;  80;  99;  95;  78;  73;  71;  68;  95;  100</p> </div> </div> </div> <div id="element-84" class="textbox textbox--examples" data-type="example"><p id="element-913">Fifty statistics students were asked how much sleep they get per school night (rounded to the nearest hour). 
The results were:</p> <table id="id4431204" summary="This table presents the amount of sleep per school night in hours in the first column, from 4-10 hours, frequency in the second column, relative frequency in the third column, and cumulative relative frequency in the fourth column."><thead><tr><th>AMOUNT OF SLEEP PER SCHOOL NIGHT (HOURS)</th> <th>FREQUENCY</th> <th>RELATIVE FREQUENCY</th> <th>CUMULATIVE RELATIVE FREQUENCY</th> </tr> </thead> <tbody><tr><td>4</td> <td>2</td> <td>0.04</td> <td>0.04</td> </tr> <tr><td>5</td> <td>5</td> <td>0.10</td> <td>0.14</td> </tr> <tr><td>6</td> <td>7</td> <td>0.14</td> <td>0.28</td> </tr> <tr><td>7</td> <td>12</td> <td>0.24</td> <td>0.52</td> </tr> <tr><td>8</td> <td>14</td> <td>0.28</td> <td>0.80</td> </tr> <tr><td>9</td> <td>7</td> <td>0.14</td> <td>0.94</td> </tr> <tr><td>10</td> <td>3</td> <td>0.06</td> <td>1.00</td> </tr> </tbody> </table> <p id="element-688"><strong>Find the 28<sup>th</sup> percentile</strong>. Notice the 0.28 in the &#8220;cumulative relative frequency&#8221; column. Twenty-eight percent of 50 data values is 14 values. There are 14 values less than the 28<sup>th</sup> percentile. They include the two 4s, the five 5s, and the seven 6s. The 28<sup>th</sup> percentile is between the last six and the first seven. <strong>The 28<sup>th</sup> percentile is 6.5.</strong></p> <p id="element-488"><strong>Find the median</strong>. Look again at the &#8220;cumulative relative frequency&#8221; column and find 0.52. The median is the 50<sup>th</sup> percentile or the second quartile. 50% of 50 is 25. There are 25 values less than the median. They include the two 4s, the five 5s, the seven 6s, and eleven of the 7s. The median or 50<sup>th</sup> percentile is between the 25<sup>th</sup>, or seven, and 26<sup>th</sup>, or seven, values. <strong>The median is seven.</strong></p> <p id="element-539"><strong>Find the third quartile</strong>. The third quartile is the same as the 75<sup>th</sup> percentile. 
You can &#8220;eyeball&#8221; this answer. If you look at the &#8220;cumulative relative frequency&#8221; column, you find 0.52 and 0.80. When you have all the fours, fives, sixes and sevens, you have 52% of the data. When you include all the 8s, you have 80% of the data. <strong>The 75<sup>th</sup> percentile, then, must be an eight</strong>. Another way to look at the problem is to find 75% of 50, which is 37.5, and round up to 38. The third quartile, <em data-effect="italics">Q</em><sub>3</sub>, is the 38<sup>th</sup> value, which is an eight. You can check this answer by counting the values. (There are 37 values below the third quartile and 12 values above.)</p> </div> <div id="fs-idm52647472" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try it</div> <div id="fs-idm18606176" data-type="exercise"><div id="fs-idm21314496" data-type="problem"><p id="fs-idm44305856">Forty bus drivers were asked how many hours they spend each day running their routes (rounded to the nearest hour). Find the 65<sup>th</sup> percentile.</p> <table id="fs-idm24649760" summary=""><thead><tr><th>Amount of time spent on route (hours)</th> <th>Frequency</th> <th>Relative Frequency</th> <th>Cumulative Relative Frequency</th> </tr> </thead> <tbody><tr><td>2</td> <td>12</td> <td>0.30</td> <td>0.30</td> </tr> <tr><td>3</td> <td>14</td> <td>0.35</td> <td>0.65</td> </tr> <tr><td>4</td> <td>10</td> <td>0.25</td> <td>0.90</td> </tr> <tr><td>5</td> <td>4</td> <td>0.10</td> <td>1.00</td> </tr> </tbody> </table> </div> </div> </div> <div id="element-572" class="textbox textbox--examples" data-type="example"><div id="element-2353" data-type="exercise"><div id="id45288379" data-type="problem"><p id="element-23532">Using <a class="autogenerated-content" href="#id4431204">(Figure)</a>:</p> <ol type="a"><li>Find the 80<sup>th</sup> percentile.</li> <li>Find the 90<sup>th</sup> percentile.</li> <li>Find the first quartile. 
What is another name for the first quartile?</li> </ol> </div> <div id="fs-idp60869984" data-type="solution"><p id="fs-idp15042704">Using the data from the frequency table, we have:</p> <ol id="fs-idm54301152" type="a"><li>The 80<sup>th</sup> percentile is between the last eight and the first nine in the table (between the 40<sup>th</sup> and 41<sup>st</sup> values). Therefore, we need to take the mean of the 40<sup>th</sup> and 41<sup>st</sup> values. The 80<sup>th</sup> percentile \(=\frac{8+9}{2}=8.5\).</li> <li>The 90<sup>th</sup> percentile will be the 45<sup>th</sup> data value (location is 0.90(50) = 45) and the 45<sup>th</sup> data value is nine.</li> <li><em data-effect="italics">Q</em><sub>1</sub> is also the 25<sup>th</sup> percentile. The 25<sup>th</sup> percentile location calculation: <em data-effect="italics">P</em><sub>25</sub> = 0.25(50) = 12.5 ≈ 13, so <em data-effect="italics">Q</em><sub>1</sub> is the 13<sup>th</sup> data value. Thus, the 25<sup>th</sup> percentile is six.</li> </ol> </div> </div> </div> <div id="fs-idm56651440" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp38065168" data-type="exercise"><div id="fs-idm27880528" data-type="problem"><p id="fs-idp54653312">Refer to the <a class="autogenerated-content" href="#fs-idm24649760">(Figure)</a>. Find the third quartile. What is another name for the third quartile?</p> </div> </div> </div> <div id="fs-idm13393536" class="statistics collab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Statistics</div> <p id="element-758">Your instructor or a member of the class will ask everyone in class how many sweaters they own. Answer the following questions:</p> <ol id="exlist"><li>How many students were surveyed?</li> <li>What kind of sampling did you do?</li> <li>Construct two different histograms. 
For each, starting value = _____ ending value = ____.</li> <li>Find the median, first quartile, and third quartile.</li> <li>Construct a table of the data to find the following: <ol id="exlist2" type="a"><li>the 10<sup>th</sup> percentile</li> <li>the 70<sup>th</sup> percentile</li> <li>the percentage of students who own fewer than four sweaters</li> </ol> </li> </ol> </div> <div id="fs-idm21580416" class="bc-section section" data-depth="1"><h3 data-type="title">A Formula for Finding the <em data-effect="italics">k</em><sup>th</sup> Percentile</h3> <p id="fs-idp1786064">If you were to do a little research, you would find several formulas for calculating the <em data-effect="italics">k</em><sup>th</sup> percentile. Here is one of them.</p> <p id="fs-idp3096416"><em data-effect="italics">k</em> = the <em data-effect="italics">k<sup>th</sup></em> percentile. It may or may not be part of the data.</p> <p id="fs-idp1947472"><em data-effect="italics">i</em> = the index (ranking or position of a data value)</p> <p id="fs-idm946480"><em data-effect="italics">n</em> = the total number of data</p> <ul id="fs-idm9831088"><li>Order the data from smallest to largest.</li> <li>Calculate \(i=\frac{k}{100}\left(n+1\right)\)</li> <li>If <em data-effect="italics">i</em> is an integer, then the <em data-effect="italics">k<sup>th</sup></em> percentile is the data value in the <em data-effect="italics">i<sup>th</sup></em> position in the ordered set of data.</li> <li>If <em data-effect="italics">i</em> is not an integer, then round <em data-effect="italics">i</em> up and round <em data-effect="italics">i</em> down to the nearest integers. Average the two data values in these two positions in the ordered data set. 
This is easier to understand in an example.</li> </ul> <div id="fs-idm4569232" class="textbox textbox--examples" data-type="example"><div id="fs-idm105708208" data-type="exercise"><div id="fs-idm3783968" data-type="problem"><p id="fs-idp1509664">Listed are 29 ages for Academy Award winning best actors <em data-effect="italics">in order from smallest to largest.</em> <span data-type="newline"><br /> </span>18;  21;  22;  25;  26;  27;  29;  30;  31;  33;  36;  37;  41;  42;  47;  52;  55;  57;  58;  62;  64;  67;  69;  71;  72;  73;  74;  76;  77</p> <ol id="fs-idm1901040" type="a"><li>Find the 70<sup>th</sup> percentile.</li> <li>Find the 83<sup>rd</sup> percentile.</li> </ol> </div> <div id="fs-idp40713040" data-type="solution"><ol id="fs-idm62647008" type="a"><li><ul id="fs-idp14170864" data-labeled-item="true"><li><em data-effect="italics">k</em> = 70</li> <li><em data-effect="italics">i</em> = the index</li> <li><em data-effect="italics">n</em> = 29</li> </ul> <p><em data-effect="italics">i</em> = \(\frac{k}{100}\) (<em data-effect="italics">n</em> + 1) = (\(\frac{70}{100}\))(29 + 1) = 21. Twenty-one is an integer, and the data value in the 21<sup>st</sup> position in the ordered data set is 64. The 70<sup>th</sup> percentile is 64 years.</p></li> <li><ul id="fs-idm21563168" data-labeled-item="true"><li><em data-effect="italics">k</em> = 83</li> <li><em data-effect="italics">i</em> = the index</li> <li><em data-effect="italics">n</em> = 29</li> </ul> <p><em data-effect="italics">i</em> = \(\frac{k}{100}\) (<em data-effect="italics">n</em> + 1) = (\(\frac{83}{100}\))(29 + 1) = 24.9, which is NOT an integer.</p> <p>Round it down to 24 and up to 25. The age in the 24<sup>th</sup> position is 71 and the age in the 25<sup>th</sup> position is 72. Average 71 and 72. 
The 83<sup>rd</sup> percentile is 71.5 years.</p></li> </ol> </div> </div> </div> <div id="fs-idm16529696" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm3894192" data-type="exercise"><div id="fs-idp25866864" data-type="problem"><p id="fs-idp25866992">Listed are 29 ages for Academy Award winning best actors <em data-effect="italics">in order from smallest to largest.</em></p> <p id="fs-idm19734064">18;  21;  22;  25;  26;  27;  29;  30;  31;  33;  36;  37;  41;  42;  47;  52;  55;  57;  58;  62;  64;  67;  69;  71;  72;  73;  74;  76;  77 <span data-type="newline"><br /> </span>Calculate the 20<sup>th</sup> percentile and the 55<sup>th</sup> percentile.</p> </div> </div> </div> <div id="eip-404" class="finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idp26669920">You can calculate percentiles using calculators and computers. There are a variety of online calculators.</p> </div> </div> <div id="fs-idp2972176" class="bc-section section" data-depth="1"><h3 data-type="title">A Formula for Finding the Percentile of a Value in a Data Set</h3> <ul id="fs-idm17756640"><li>Order the data from smallest to largest.</li> <li><em data-effect="italics">x</em> = the number of data values counting from the bottom of the data list up to but not including the data value for which you want to find the percentile.</li> <li><em data-effect="italics">y</em> = the number of data values equal to the data value for which you want to find the percentile.</li> <li><em data-effect="italics">n</em> = the total number of data.</li> <li>Calculate \(\frac{x+0.5y}{n}\)(100). 
Then round to the nearest integer.</li> </ul> <div id="fs-idm3849664" class="textbox textbox--examples" data-type="example"><div id="fs-idp28609648" data-type="exercise"><div id="fs-idp28609904" data-type="problem"><p id="fs-idp38890112">Listed are 29 ages for Academy Award winning best actors <em data-effect="italics">in order from smallest to largest.</em> <span data-type="newline"><br /> </span>18;  21;  22;  25;  26;  27;  29;  30;  31;  33;  36;  37;  41;  42;  47;  52;  55;  57;  58;  62;  64;  67;  69;  71;  72;  73;  74;  76;  77</p> <ol id="fs-idm21168272" type="a"><li>Find the percentile for 58.</li> <li>Find the percentile for 25.</li> </ol> </div> <div id="fs-idm170490752" data-type="solution"><ol id="fs-idm170490496" type="a"><li>Counting from the bottom of the list, there are 18 data values less than 58. There is one value of 58. <p id="fs-idm3871584"><em data-effect="italics">x</em> = 18 and <em data-effect="italics">y</em> = 1. \(\frac{x+0.5y}{n}\)(100) = \(\frac{18+0.5\left(1\right)}{29}\)(100) = 63.79. Fifty-eight is the 64<sup>th</sup> percentile.</p> </li> <li>Counting from the bottom of the list, there are three data values less than 25. There is one value of 25. <p id="fs-idm21523472"><em data-effect="italics">x</em> = 3 and <em data-effect="italics">y</em> = 1. \(\frac{x+0.5y}{n}\)(100) = \(\frac{3+0.5\left(1\right)}{29}\)(100) = 12.07. 
Twenty-five is the 12<sup>th</sup> percentile.</p> </li> </ol> </div> </div> </div> <div id="fs-idm170943360" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm170942864" data-type="exercise"><div id="fs-idp35294448" data-type="problem"><p id="fs-idp35294576">Listed are 30 ages for Academy Award winning best actors <u data-effect="underline">in order from smallest to largest.</u></p> <p id="fs-idp13252768">18;  21;  22;  25;  26;  27;  29;  30;  31;  31;  33;  36;  37;  41;  42;  47;  52;  55;  57;  58;  62;  64;  67;  69;  71;  72;  73;  74;  76;  77 <span data-type="newline"><br /> </span>Find the percentiles for 47 and 31.</p> </div> </div> </div> </div> <div id="fs-idp45793312" class="bc-section section" data-depth="1"><h3 data-type="title">Interpreting Percentiles, Quartiles, and Median</h3> <p id="eip-400">A percentile indicates the relative standing of a data value when data are sorted into numerical order from smallest to largest. In general, <em data-effect="italics">p</em> percent of data values are less than or equal to the <em data-effect="italics">p</em><sup>th</sup> percentile. For example, 15% of data values are less than or equal to the 15<sup>th</sup> percentile.</p> <ul id="eip-id1164310609380" data-bullet-style="bullet"><li>Low percentiles always correspond to lower data values.</li> <li>High percentiles always correspond to higher data values.</li> </ul> <p id="fs-idp44902944">A percentile may or may not correspond to a value judgment about whether it is &#8220;good&#8221; or &#8220;bad.&#8221; The interpretation of whether a certain percentile is &#8220;good&#8221; or &#8220;bad&#8221; depends on the context of the situation to which the data applies. In some situations, a low percentile would be considered &#8220;good&#8221;; in other contexts a high percentile might be considered &#8220;good.&#8221; 
In many situations, there is no value judgment that applies.</p> <p id="fs-idm23920480">Understanding how to interpret percentiles properly is important not only when describing data, but also when calculating probabilities in later chapters of this text.</p> </div> <div id="fs-idm106923680" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idm20251376">When writing the interpretation of a percentile in the context of the given data, the sentence should contain the following information.</p> <ul id="eip-id1168197264788"><li>information about the context of the situation being considered</li> <li>the data value (value of the variable) that represents the percentile</li> <li>the percent of individuals or items with data values below the percentile</li> <li>the percent of individuals or items with data values above the percentile.</li> </ul> </div> <div id="eip-id1170215995305" class="textbox textbox--examples" data-type="example"><div id="fs-idm91768592" data-type="exercise"><div id="fs-idm91768464" data-type="problem"><p id="eip-id1170184310084">On a timed math test, the first quartile for time it took to finish the exam was 35 minutes. Interpret the first quartile in the context of this situation.</p> </div> <div id="fs-idm53128368" data-type="solution"><ul id="eip-id1170179452695"><li>Twenty-five percent of students finished the exam in 35 minutes or less.</li> <li>Seventy-five percent of students finished the exam in 35 minutes or more.</li> <li>A low percentile could be considered good, as finishing more quickly on a timed exam is desirable. 
(If you take too long, you might not be able to finish.)</li> </ul> </div> </div> </div> <div id="fs-idp16945648" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp33388848" data-type="exercise"><div id="fs-idm41955616" data-type="problem"><p id="fs-idp20699248">For the 100-meter dash, the third quartile for times for finishing the race was 11.5 seconds. Interpret the third quartile in the context of the situation.</p> </div> </div> </div> <div id="eip-id1170441826663" class="textbox textbox--examples" data-type="example"><div id="fs-idm148596320" data-type="exercise"><div id="fs-idm170402432" data-type="problem"><p id="eip-id1170436117670">On a 20 question math test, the 70<sup>th</sup> percentile for number of correct answers was 16. Interpret the 70<sup>th</sup> percentile in the context of this situation.</p> </div> </div> </div> <div id="fs-idp77029680" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp34034288" data-type="exercise"><div id="fs-idp48692144" data-type="problem"><p id="fs-idp55037680">On a 60 point written assignment, the 80<sup>th</sup> percentile for the number of points earned was 49. Interpret the 80<sup>th</sup> percentile in the context of this situation.</p> </div> </div> </div> <div id="eip-id7060500" class="textbox textbox--examples" data-type="example"><div id="fs-idm205091056" data-type="exercise"><div id="fs-idm15124096" data-type="problem"><p id="eip-id1170610063171">At a community college, it was found that the 30<sup>th</sup> percentile of credit units that students are enrolled for is seven units. 
Interpret the 30<sup>th</sup> percentile in the context of this situation.</p> </div> </div> </div> <div id="fs-idp80590208" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp73731328" data-type="exercise"><div id="fs-idp42792528" data-type="problem"><p id="fs-idm23433888">During a season, the 40<sup>th</sup> percentile for points scored per player in a game is eight. Interpret the 40<sup>th</sup> percentile in the context of this situation.</p> </div> </div> </div> <div id="fs-idp9603904" class="textbox textbox--examples" data-type="example"><p id="fs-idp45664304">Sharpe Middle School is applying for a grant that will be used to add fitness equipment to the gym. The principal surveyed 15 anonymous students to determine how many minutes a day the students spend exercising. The results from the 15 anonymous students are shown.</p> <p id="fs-idp39768656">0 minutes; 40 minutes; 60 minutes; 30 minutes; 60 minutes</p> <p id="fs-idm13969776">10 minutes; 45 minutes; 30 minutes; 300 minutes; 90 minutes;</p> <p id="fs-idp22597008">30 minutes; 120 minutes; 60 minutes; 0 minutes; 20 minutes</p> <p id="fs-idp53167440">Determine the following five values.</p> <ul id="fs-idp70490496" data-labeled-item="true"><li>Min = 0</li> <li><em data-effect="italics">Q</em><sub>1</sub> = 20</li> <li>Med = 40</li> <li><em data-effect="italics">Q</em><sub>3</sub> = 60</li> <li>Max = 300</li> </ul> <p id="fs-idp83565376">If you were the principal, would you be justified in purchasing new fitness equipment? Since 75% of the students exercise for 60 minutes or less daily, and since the <em data-effect="italics">IQR</em> is 40 minutes (60 – 20 = 40), we know that half of the students surveyed exercise between 20 minutes and 60 minutes daily. 
This seems a reasonable amount of time spent exercising, so the principal would be justified in purchasing the new equipment.</p> <p id="fs-idm77236544">However, the principal needs to be careful. The value 300 appears to be a potential outlier.</p> <p id="fs-idm9669376"><em data-effect="italics">Q</em><sub>3</sub> + 1.5(<em data-effect="italics">IQR</em>) = 60 + (1.5)(40) = 120.</p> <p id="fs-idp13270336">The value 300 is greater than 120 so it is a potential outlier. If we delete it and recalculate the five values, we get the following values:</p> <ul id="fs-idm2894688" data-labeled-item="true"><li>Min = 0</li> <li><em data-effect="italics">Q</em><sub>1</sub> = 20</li> <li>Med = 35</li> <li><em data-effect="italics">Q</em><sub>3</sub> = 60</li> <li>Max = 120</li> </ul> <p id="fs-idm6660656">We still have 75% of the students exercising for 60 minutes or less daily and half of the students exercising between 20 and 60 minutes a day. However, 15 students is a small sample and the principal should survey more students to be sure of his survey results.</p> </div> <div id="fs-idm63224784" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idm63224288">Cauchon, Dennis, Paul Overberg. “Census data shows minorities now a majority of U.S. births.” USA Today, 2012. Available online at http://usatoday30.usatoday.com/news/nation/story/2012-05-17/minority-birthscensus/55029100/1 (accessed April 3, 2013).</p> <p id="fs-idm76887104">Data from the United States Department of Commerce: United States Census Bureau. Available online at http://www.census.gov/ (accessed April 3, 2013).</p> <p id="fs-idm76886560">“1990 Census.” United States Department of Commerce: United States Census Bureau. 
Available online at http://www.census.gov/main/www/cen1990.html (accessed April 3, 2013).</p> <p id="fs-idm76885984">Data from <em data-effect="italics">San Jose Mercury News</em>.</p> <p id="fs-idm76885600">Data from <em data-effect="italics">Time Magazine</em>; survey by Yankelovich Partners, Inc.</p> </div> <div id="fs-idm13790128" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp1397504">The values that divide a rank-ordered set of data into 100 equal parts are called percentiles. Percentiles are used to compare and interpret data. For example, an observation at the 50<sup>th</sup> percentile would be greater than 50 percent of the other observations in the set. Quartiles divide data into quarters. The first quartile (<em data-effect="italics">Q</em><sub>1</sub>) is the 25<sup>th</sup> percentile, the second quartile (<em data-effect="italics">Q</em><sub>2</sub> or median) is the 50<sup>th</sup> percentile, and the third quartile (<em data-effect="italics">Q</em><sub>3</sub>) is the 75<sup>th</sup> percentile. The interquartile range, or <em data-effect="italics">IQR</em>, is the range of the middle 50 percent of the data values. 
The <em data-effect="italics">IQR</em> is found by subtracting <em data-effect="italics">Q</em><sub>1</sub> from <em data-effect="italics">Q</em><sub>3</sub>, and can help determine outliers by using the following two expressions.</p> <ul id="fs-idp12766560"><li><em data-effect="italics">Q</em><sub>3</sub> + <em data-effect="italics">IQR</em>(1.5)</li> <li><em data-effect="italics">Q</em><sub>1</sub> – <em data-effect="italics">IQR</em>(1.5)</li> </ul> </div> <div id="fs-idm202752" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idp5126816">\(i=\left(\frac{k}{100}\right)\left(n+1\right)\)</p> <p id="fs-idp706784">where <em data-effect="italics">i</em> = the ranking or position of a data value,</p> <p id="fs-idm55046864"><em data-effect="italics">k</em> = the kth percentile,</p> <p id="fs-idm6916704"><em data-effect="italics">n</em> = total number of data.</p> <p id="fs-idp294352">Expression for finding the percentile of a data value: \(\left(\frac{x\text{ + }0.5y}{n}\right)\)(100)</p> <p id="fs-idp17176000">where <em data-effect="italics">x</em> = the number of values counting from the bottom of the data list up to but not including the data value for which you want to find the percentile,</p> <p id="fs-idm2884704"><em data-effect="italics">y</em> = the number of data values equal to the data value for which you want to find the percentile,</p> <p id="fs-idm5691392"><em data-effect="italics">n</em> = total number of data</p> </div> <div id="fs-idp40431760" class="practice" data-depth="1"><div id="fs-idm1110784" data-type="exercise"><div id="fs-idm38839376" data-type="problem"><p id="fs-idm38839120">Listed are 29 ages for Academy Award winning best actors <em data-effect="italics">in order from smallest to largest.</em></p> <p id="fs-idm6939584">18;  21;  22;  25;  26;  27;  29;  30;  31;  33;  36;  37;  41;  42;  47;  52;  55;  57;  58;  62;  64;  67;  69;  71;  72;  73;  74;  76;  77</p> <ol id="fs-idp12728784" 
type="a"><li>Find the 40<sup>th</sup> percentile.</li> <li>Find the 78<sup>th</sup> percentile.</li> </ol> </div> <div id="fs-idm60642032" data-type="solution"><ol id="fs-idp34749472" type="a"><li>The 40<sup>th</sup> percentile is 37 years.</li> <li>The 78<sup>th</sup> percentile is 70 years.</li> </ol> </div> </div> <div id="fs-idm4719584" data-type="exercise"><div id="fs-idm4719328" data-type="problem"><p id="fs-idp14289472">Listed are 32 ages for Academy Award winning best actors <em data-effect="italics">in order from smallest to largest.</em></p> <p id="fs-idm62636912">18;  18;  21;  22;  25;  26;  27;  29;  30;  31;  31;  33;  36;  37;  37;  41;  42;  47;  52;  55;  57;  58;  62;  64;  67;  69;  71;  72;  73;  74;  76;  77</p> <ol id="fs-idm82651968" type="a"><li>Find the percentile of 37.</li> <li>Find the percentile of 72.</li> </ol> </div> </div> <div id="fs-idp30887728" data-type="exercise"><div id="fs-idp30887984" data-type="problem"><p id="fs-idp30888240">Jesse was ranked 37<sup>th</sup> in his graduating class of 180 students. At what percentile is Jesse’s ranking?</p> </div> <div id="fs-idm44160976" data-type="solution"><p id="fs-idm44160720">Jesse graduated 37<sup>th</sup> out of a class of 180 students. There are 180 – 37 = 143 students ranked below Jesse. There is one rank of 37.</p> <p id="fs-idm80018544"><em data-effect="italics">x</em> = 143 and <em data-effect="italics">y</em> = 1. \(\frac{x+0.5y}{n}\)(100) = \(\frac{143+0.5\left(1\right)}{180}\)(100) = 79.72. Jesse’s rank of 37 puts him at the 80<sup>th</sup> percentile.</p> </div> </div> <div id="eip-id1168182229232" data-type="exercise"><div id="eip-id1168183555687" data-type="problem"><ol id="eip-id1168185190290" type="a" data-mark-suffix="."><li>For runners in a race, a low time means a faster run. The winners in a race have the shortest running times. 
Is it more desirable to have a finish time with a high or a low percentile when running a race?</li> <li>The 20<sup>th</sup> percentile of run times in a particular race is 5.2 minutes. Write a sentence interpreting the 20<sup>th</sup> percentile in the context of the situation.</li> <li>A bicyclist in the 90<sup>th</sup> percentile of a bicycle race completed the race in 1 hour and 12 minutes. Is he among the fastest or slowest cyclists in the race? Write a sentence interpreting the 90<sup>th</sup> percentile in the context of the situation.</li> </ol> </div> </div> <div id="eip-id1168182273864" data-type="exercise"><div id="eip-id1168191796049" data-type="problem"><ol id="eip-id5724192" type="a" data-mark-suffix="."><li>For runners in a race, a higher speed means a faster run. Is it more desirable to have a speed with a high or a low percentile when running a race?</li> <li>The 40<sup>th</sup> percentile of speeds in a particular race is 7.5 miles per hour. Write a sentence interpreting the 40<sup>th</sup> percentile in the context of the situation.</li> </ol> </div> <div id="eip-id1168199883378" data-type="solution"><ol id="eip-id1168196369910" type="a" data-mark-suffix="."><li>For runners in a race it is more desirable to have a high percentile for speed. A high percentile means a higher speed which is faster.</li> <li>40% of runners ran at speeds of 7.5 miles per hour or less (slower). 60% of runners ran at speeds of 7.5 miles per hour or more (faster).</li> </ol> </div> </div> <div id="eip-id1168217995987" data-type="exercise"><div id="eip-id1168183864592" data-type="problem"><p id="eip-id1168226425380">On an exam, would it be more desirable to earn a grade with a high or low percentile? Explain.</p> </div> </div> <div id="eip-id1168183173702" data-type="exercise"><div id="eip-id1168227404691" data-type="problem"><p id="eip-id1168230025239">Mina is waiting in line at the Department of Motor Vehicles (DMV). 
Her wait time of 32 minutes is the 85<sup>th</sup> percentile of wait times. Is that good or bad? Write a sentence interpreting the 85<sup>th</sup> percentile in the context of this situation.</p> </div> <div id="eip-id7704128" data-type="solution"><p id="eip-id1168214950316">When waiting in line at the DMV, the 85<sup>th</sup> percentile would be a long wait time compared to the other people waiting. 85% of people had shorter wait times than Mina. In this context, Mina would prefer a wait time corresponding to a lower percentile. 85% of people at the DMV waited 32 minutes or less. 15% of people at the DMV waited 32 minutes or longer.</p> </div> </div> <div id="eip-id1168213546999" data-type="exercise"><div id="eip-id1168188876815" data-type="problem"><p id="eip-id7349223">In a survey collecting data about the salaries earned by recent college graduates, Li found that her salary was in the 78<sup>th</sup> percentile. Should Li be pleased or upset by this result? Explain.</p> </div> </div> <div id="eip-id1168214876383" data-type="exercise"><div id="eip-id7327842" data-type="problem"><p id="eip-id1168230657040">In a study collecting data about the repair costs of damage to automobiles in a certain type of crash test, a certain model of car had \$1,700 in damage and was in the 90<sup>th</sup> percentile. Should the manufacturer and the consumer be pleased or upset by this result? Explain and write a sentence that interprets the 90<sup>th</sup> percentile in the context of this problem.</p> </div> <div id="eip-id1168214946038" data-type="solution"><p id="eip-id1168234799988">The manufacturer and the consumer would be upset. This is a large repair cost for the damages, compared to the other cars in the sample. 
INTERPRETATION: 90% of the crash-tested cars had damage repair costs of \$1,700 or less; only 10% had damage repair costs of \$1,700 or more.</p> </div> </div> <div id="eip-id1168195852900" data-type="exercise"><div id="eip-id1168225195383" data-type="problem"><p id="eip-idm9549040">The University of California has two criteria used to set admission standards for freshmen to be admitted to a college in the UC system:</p> <ol id="eip-id1168211096380" type="a" data-mark-suffix=""><li>Students&#8217; GPAs and scores on standardized tests (SATs and ACTs) are entered into a formula that calculates an &#8220;admissions index&#8221; score. The admissions index score is used to set eligibility standards intended to meet the goal of admitting the top 12% of high school students in the state. In this context, what percentile does the top 12% represent?</li> <li>Students whose GPAs are at or above the 96<sup>th</sup> percentile of all students at their high school are eligible (called eligible in the local context), even if they are not in the top 12% of all students in the state. What percentage of students from each high school are &#8220;eligible in the local context&#8221;?</li> </ol> </div> </div> <div id="eip-id1168223160542" data-type="exercise"><div id="eip-id7507305" data-type="problem"><p id="eip-id1168211272126">Suppose that you are buying a house. You and your realtor have determined that the most expensive house you can afford is the 34<sup>th</sup> percentile. The 34<sup>th</sup> percentile of housing prices is \$240,000 in the town you want to move to. In this town, can you afford 34% of the houses or 66% of the houses?</p> </div> <div id="eip-id1168213876148" data-type="solution"><p id="eip-id1168225765198">You can afford 34% of houses. 66% of the houses are too expensive for your budget. INTERPRETATION: 34% of houses cost \$240,000 or less. 
66% of houses cost \$240,000 or more.</p> </div> </div> <p id="element-726">Use the following information to answer the next six exercises. Sixty-five randomly selected car salespersons were asked the number of cars they generally sell in one week. Fourteen people answered that they generally sell three cars; nineteen generally sell four cars; twelve generally sell five cars; nine generally sell six cars; eleven generally sell seven cars.</p> <div id="exercisenine" data-type="exercise"><div id="id21439538" data-type="problem"><p>First quartile = _______</p> </div> </div> <div id="exerciseten" data-type="exercise"><div id="id4433542" data-type="problem"><p>Second quartile = median = 50<sup>th</sup> percentile = _______</p> </div> <div id="id12404344" data-type="solution"><p id="element-23635">4</p> </div> </div> <div id="exerciseeleven" data-type="exercise"><div id="id21413333" data-type="problem"><p>Third quartile = _______</p> </div> </div> <div id="exercisetwelve" data-type="exercise"><div id="id13392439" data-type="problem"><p>Interquartile range (<em data-effect="italics">IQR</em>) = _____ – _____ = _____</p> </div> <div id="id10710871" data-type="solution"><p id="element-23646">6 – 4 = 2</p> </div> </div> <div id="exercisethirteen" data-type="exercise"><div id="id14610610" data-type="problem"><p id="prob_13">10<sup>th</sup> percentile = _______</p> </div> </div> <div id="exercisefourteen" data-type="exercise"><div id="id21409553" data-type="problem"><p id="prob_14">70<sup>th</sup> percentile = _______</p> </div> <div id="id23430727" data-type="solution"><p id="element-234636">6</p> </div> </div> </div> <div id="fs-idm1839472" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div id="element-927" data-type="exercise"><div id="id3483376" data-type="problem"><p id="element-746">1)  Six hundred adult Americans were asked by telephone poll, &#8220;What do you think constitutes a middle-class income?&#8221; The results are in <a 
class="autogenerated-content" href="#element-588">(Figure)</a>. Each interval includes its left endpoint but not its right endpoint.</p> <table id="element-588" summary="This table presents the results from a poll on what Americans thought constituted middle class. The first column lists the salary and the second column lists the relative frequency. There are 8 rows."><thead><tr><th>Salary (\$)</th> <th>Relative Frequency</th> </tr> </thead> <tbody><tr><td>&lt; 20,000</td> <td>0.02</td> </tr> <tr><td>20,000–25,000</td> <td>0.09</td> </tr> <tr><td>25,000–30,000</td> <td>0.19</td> </tr> <tr><td>30,000–40,000</td> <td>0.26</td> </tr> <tr><td>40,000–50,000</td> <td>0.18</td> </tr> <tr><td>50,000–75,000</td> <td>0.17</td> </tr> <tr><td>75,000–99,999</td> <td>0.02</td> </tr> <tr><td>100,000+</td> <td>0.01</td> </tr> </tbody> </table> <ol id="element-295" type="a"><li>What percentage of the survey answered &#8220;not sure&#8221;?</li> <li>What percentage think that middle-class is from \$25,000 to \$50,000?</li> <li>Construct a histogram of the data. <ol id="nestlist3" type="i" data-mark-suffix="."><li>Should all bars have the same width, based on the data? Why or why not?</li> <li>How should the &lt;20,000 and the 100,000+ intervals be handled? Why?</li> </ol> </li> <li>Find the 40<sup>th</sup> and 80<sup>th</sup> percentiles.</li> <li>Construct a bar graph of the data.</li> </ol> </div> </div> </div> <p>&nbsp;</p> <p>&nbsp;</p> <div class="free-response" data-depth="1"><div id="fs-idp35930608" data-type="exercise"><div id="id3500598" data-type="problem"><p>2) Given the following box plot:</p> <div id="fs-idm476896" class="bc-figure figure"><span id="id4775287" data-type="media" data-alt="This is a horizontal boxplot graphed over a number line from 0 to 13. The first whisker extends from the smallest value, 0, to the first quartile, 2. The box begins at the first quartile and extends to third quartile, 12. A vertical, dashed line is drawn at median, 10. 
The second whisker extends from the third quartile to largest value, 13."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch02_13_02-1.jpg" alt="This is a horizontal boxplot graphed over a number line from 0 to 13. The first whisker extends from the smallest value, 0, to the first quartile, 2. The box begins at the first quartile and extends to third quartile, 12. A vertical, dashed line is drawn at median, 10. The second whisker extends from the third quartile to largest value, 13." width="400" data-media-type="image/jpg" /></span></div> <ol id="element-328" type="a"><li>which quarter has the smallest spread of data? What is that spread?</li> <li>which quarter has the largest spread of data? What is that spread?</li> <li>find the interquartile range (<em data-effect="italics">IQR</em>).</li> <li>are there more data in the interval 5–10 or in the interval 10–13? How do you know this?</li> <li>which interval has the fewest data in it? How do you know this? <ol id="nestlist7" type="i" data-mark-suffix="."><li>0–2</li> <li>2–4</li> <li>10–12</li> <li>12–13</li> <li>need more information</li> </ol> </li> </ol> </div> </div> <div id="element-284" data-type="exercise"><div id="id3912087" data-type="problem"><p>&nbsp;</p> <p id="element-874">3) The following box plot shows the U.S. population for 1990, the latest available year.</p> <div id="fs-idm132205520" class="bc-figure figure"><span id="id7587202" data-type="media" data-alt="A box plot with values from 0 to 105, with Q1 at 17, M at 33, and Q3 at 50."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_13_08-1.jpg" alt="A box plot with values from 0 to 105, with Q1 at 17, M at 33, and Q3 at 50." width="400" data-media-type="image/jpg" /></span></div> <ol type="a"><li>Are there fewer or more children (age 17 and under) than senior citizens (age 65 and over)? How do you know?</li> <li>12.6% are age 65 and over. 
Approximately what percentage of the population are working age adults (above age 17 to age 65)?</li> </ol> <p>&nbsp;</p> </div> <div id="id7597969" data-type="solution"><p id="fs-idm5042080">The median age for U.S. blacks currently is 30.9 years; for U.S. whites it is 42.3 years.</p> <ol id="fs-idm33581296" type="a"><li>Based upon this information, give two reasons why the black median age could be lower than the white median age.</li> <li>Does the lower median age for blacks necessarily mean that blacks die younger than whites? Why or why not?</li> <li>How might it be possible for blacks and whites to die at approximately the same age, but for the median age for whites to be higher?</li> </ol> </div> </div> </div> <p>Answers to odd questions</p> <p>1)</p> <ol id="element-295a" type="a"><li>1 – (0.02+0.09+0.19+0.26+0.18+0.17+0.02+0.01) = 0.06</li> <li>0.19+0.26+0.18 = 0.63</li> <li>Check student’s solution.</li> <li><p id="eip-idp139654864">40<sup>th</sup> percentile will fall between 30,000 and 40,000</p> <p id="eip-idp139655632">80<sup>th</sup> percentile will fall between 50,000 and 75,000</p> </li> <li>Check student’s solution.</li> </ol> <p>3)</p> <ol type="a" data-mark-suffix="."><li>more children; the left whisker shows that 25% of the population are children 17 and younger. 
The right whisker shows that 25% of the population are adults 50 and older, so adults 65 and over represent less than 25%.</li> <li>62.4%</li> </ol> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="iqr"><dt>Interquartile Range</dt> <dd id="id15896860">or <em data-effect="italics">IQR</em>, is the range of the middle 50 percent of the data values; the <em data-effect="italics">IQR</em> is found by subtracting the first quartile from the third quartile.</dd> </dl> <dl id="outlier"><dt>Outlier</dt> <dd id="id1171166689919">an observation that does not fit the rest of the data</dd> </dl> <dl id="percentile"><dt>Percentile</dt> <dd id="id19436015">a number that divides ordered data into hundredths; percentiles may or may not be part of the data. The median of the data is the second quartile and the 50<sup>th</sup> percentile. The first and third quartiles are the 25<sup>th</sup> and the 75<sup>th</sup> percentiles, respectively.</dd> </dl> <dl id="quartiles"><dt>Quartiles</dt> <dd id="id1164416504778">the numbers that separate the data into quarters; quartiles may or may not be part of the data. The second quartile is the median of the data.</dd> </dl> </div> </div></div>
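<div class="textbox" data-type="note"><p>The quartile and <em data-effect="italics">IQR</em> definitions in the glossary can be checked with a short program. The following is a minimal sketch in Python, not part of the original text. It uses the car-sales data from the practice exercises and assumes the median-of-halves rule, in which the middle value is left out of both halves when the number of values is odd. The function name <code>quartiles</code> is hypothetical, and calculators or other software may use a slightly different quartile rule.</p>

```python
from statistics import median

# Car-sales data from the practice exercises, as value: number of salespersons.
counts = {3: 14, 4: 19, 5: 12, 6: 9, 7: 11}

# Expand the frequency table into a sorted list of 65 observations.
data = sorted(value for value, n in counts.items() for _ in range(n))


def quartiles(sorted_values):
    """Return (Q1, median, Q3) using the median-of-halves rule.

    Assumption: when the number of values is odd, the middle value is
    left out of both halves. Other tools may use a different rule.
    """
    n = len(sorted_values)
    half = n // 2
    lower = sorted_values[:half]      # values below the median position
    upper = sorted_values[n - half:]  # values above the median position
    return median(lower), median(sorted_values), median(upper)


q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # interquartile range: third quartile minus first quartile
print("Q1:", q1, "median:", q2, "Q3:", q3, "IQR:", iqr)
```

<p>Under this rule the first quartile and the median are both four, the third quartile is six, and the <em data-effect="italics">IQR</em> is two, which agrees with the answers given for the practice exercises.</p></div>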
<div class="chapter standard" id="chapter-box-plots" title="Chapter 2.8: Box Plots"><div class="chapter-title-wrap"><h3 class="chapter-number">14</h3><h2 class="chapter-title"><span class="display-none">Chapter 2.8: Box Plots</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p><span data-type="term">Box plots</span> (also called <span data-type="term">box-and-whisker plots</span> or <span data-type="term">box-whisker plots</span>) give a good graphical image of the concentration of the data. They also show how far the extreme values are from most of the data. A box plot is constructed from five values: the minimum value, the first quartile, the median, the third quartile, and the maximum value. We use these values to compare how close other data values are to them.</p> <p>To construct a box plot, use a horizontal or vertical number line and a rectangular box. The smallest and largest data values label the endpoints of the axis. The first quartile marks one end of the box and the third quartile marks the other end of the box. Approximately <strong>the middle 50 percent of the data fall inside the box.</strong> The &#8220;whiskers&#8221; extend from the ends of the box to the smallest and largest data values. The median or second quartile can be between the first and third quartiles, or it can be one, or the other, or both. The box plot gives a good, quick picture of the data.</p> <div id="eip-724" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idp95320688">You may encounter box-and-whisker plots that have dots marking outlier values. 
In those cases, the whiskers do not extend to the minimum and maximum values.</p> </div> <p>Consider, again, this dataset.</p> <p id="element-238907"><span id="set-476" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">1</span>  <span data-type="item">1</span>  <span data-type="item">2</span>  <span data-type="item">2</span>  <span data-type="item">4</span>  <span data-type="item">6</span>  <span data-type="item">6.8</span>  <span data-type="item">7.2 </span> <span data-type="item">8</span>  <span data-type="item">8.3 </span> <span data-type="item">9</span>  <span data-type="item">10</span>  <span data-type="item">10</span>  <span data-type="item">11.5</span> </span></p> <p id="element-23123">The first quartile is two, the median is seven, and the third quartile is nine. The smallest value is one, and the largest value is 11.5. The following image shows the constructed box plot.</p> <div id="fs-idp40306080" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idp22327712">See the calculator instructions on the <a href="http://education.ti.com/educationportal/sites/US/sectionHome/support.html">TI web site</a> or in the appendix.</p> </div> <div id="fs-idp11793488" class="bc-figure figure"><span id="id8621327" data-type="media" data-alt="Horizontal boxplot's first whisker extends from the smallest value, 1, to the first quartile, 2, the box begins at the first quartile and extends to the third quartile, 9, a vertical dashed line is drawn at the median, 7, and the second whisker extends from the third quartile to the largest value of 11.5." 
data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch_02_05_01-1.jpg" alt="Horizontal boxplot's first whisker extends from the smallest value, 1, to the first quartile, 2, the box begins at the first quartile and extends to the third quartile, 9, a vertical dashed line is drawn at the median, 7, and the second whisker extends from the third quartile to the largest value of 11.5." width="420" data-media-type="image/jpg" /></span></div> <p>The two whiskers extend from the first quartile to the smallest value and from the third quartile to the largest value. The median is shown with a dashed line.</p> <div id="fs-idp59493440" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idp69379904">It is important to start a box plot with a <strong data-effect="bold">scaled number line</strong>. Otherwise the box plot may not be useful.</p> </div> <div id="element-32" class="textbox textbox--examples" data-type="example"><p>The following data are the heights of 40 students in a statistics class.</p> <p id="element-731"><span id="element-2134" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">59</span>  <span data-type="item">60 </span> <span data-type="item">61 </span> <span data-type="item">62 </span> <span data-type="item">62 </span> <span data-type="item">63 </span> <span data-type="item">63 </span> <span data-type="item">64 </span> <span data-type="item">64</span>  <span data-type="item">64 </span> <span data-type="item">65</span>  <span data-type="item">65</span>  <span data-type="item">65</span>  <span data-type="item">65</span>  <span data-type="item">65</span>  <span data-type="item">65</span>  <span data-type="item">65</span>  <span data-type="item">65 </span> <span data-type="item">65 </span> <span data-type="item">66 </span> <span data-type="item">66 </span> <span data-type="item">67</span>  <span 
data-type="item">67</span>  <span data-type="item">68</span>  <span data-type="item">68 </span> <span data-type="item">69 </span> <span data-type="item">70 </span> <span data-type="item">70 </span> <span data-type="item">70</span>  <span data-type="item">70</span>  <span data-type="item">70</span>  <span data-type="item">71</span>  <span data-type="item">71 </span> <span data-type="item">72</span>  <span data-type="item">72</span>  <span data-type="item">73</span>  <span data-type="item">74</span>  <span data-type="item">74 </span> <span data-type="item">75 </span> <span data-type="item">77</span> </span></p> <p id="element-483">Construct a box plot with the following properties. The calculator instructions for finding the minimum and maximum values and the quartiles follow the example.</p> <ul id="element-172"><li>Minimum value = 59</li> <li>Maximum value = 77</li> <li><em data-effect="italics">Q</em><sub>1</sub>: First quartile = 64.5</li> <li><em data-effect="italics">Q</em><sub>2</sub>: Second quartile or median = 66</li> <li><em data-effect="italics">Q</em><sub>3</sub>: Third quartile = 70</li> </ul> <div id="fs-idm18897872" class="bc-figure figure"><span id="id1164416794841" data-type="media" data-alt="Horizontal boxplot with first whisker extending from smallest value, 59, to Q1, 64.5, box beginning from Q1 to Q3, 70, median dashed line at Q2, 66, and second whisker extending from Q3 to largest value, 77." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch_02_05_02-1.jpg" alt="Horizontal boxplot with first whisker extending from smallest value, 59, to Q1, 64.5, box beginning from Q1 to Q3, 70, median dashed line at Q2, 66, and second whisker extending from Q3 to largest value, 77." 
width="420" data-media-type="image/jpg" /></span></div> <ol id="element-754" type="a" data-mark-suffix="."><li>Each quarter has approximately 25% of the data.</li> <li>The spreads of the four quarters are 64.5 – 59 = 5.5 (first quarter), 66 – 64.5 = 1.5 (second quarter), 70 – 66 = 4 (third quarter), and 77 – 70 = 7 (fourth quarter). So, the second quarter has the smallest spread and the fourth quarter has the largest spread.</li> <li>Range = maximum value – minimum value = 77 – 59 = 18</li> <li>Interquartile Range: <em data-effect="italics">IQR</em> = <em data-effect="italics">Q</em><sub>3</sub> – <em data-effect="italics">Q</em><sub>1</sub> = 70 – 64.5 = 5.5.</li> <li>The interval 59–65 has more than 25% of the data, so it has more data in it than the interval 66 through 70, which has 25% of the data.</li> <li>The middle 50% (middle half) of the data has a range of 5.5 inches.</li> </ol> </div> <div id="fs-idp87772032" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm41967520">To find the minimum, maximum, and quartiles:</p> <p id="fs-idp68357280">Enter data into the list editor (Press STAT 1:EDIT). If you need to clear the list, arrow up to the name L1, press CLEAR, and then arrow down.</p> <p id="fs-idp62743968">Put the data values into the list L1.</p> <p id="fs-idp69578208">Press STAT and arrow to CALC. Press 1:1-VarStats. Enter L1.</p> <p id="fs-idp74492720">Press ENTER.</p> <p id="fs-idp69067168">Use the down and up arrow keys to scroll.</p> <p id="fs-idp53473856">Smallest value = 59.</p> <p id="fs-idp58572832">Largest value = 77.</p> <p id="fs-idp87802880"><em data-effect="italics">Q</em><sub>1</sub>: First quartile = 64.5.</p> <p id="fs-idp30335392"><em data-effect="italics">Q</em><sub>2</sub>: Second quartile or median = 66.</p> <p id="fs-idp53554416"><em data-effect="italics">Q</em><sub>3</sub>: Third quartile = 70.</p> <p>&nbsp;</p> <p id="fs-idp67418928">To construct the box plot:</p> <p id="fs-idp65417472">Press 4:Plotsoff. 
Press ENTER.</p> <p id="fs-idm19087248">Arrow down and then use the right arrow key to go to the fifth picture, which is the box plot. Press ENTER.</p> <p id="fs-idp74884208">Arrow down to Xlist: Press 2nd 1 for L1</p> <p id="fs-idp72873744">Arrow down to Freq: Press ALPHA. Press 1.</p> <p id="fs-idp75665280">Press Zoom. Press 9: ZoomStat.</p> <p id="fs-idp65135968">Press TRACE, and use the arrow keys to examine the box plot.</p> </div> <div id="fs-idp31328448" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp52595328" data-type="exercise"><div id="fs-idm27882400" data-type="problem"><p id="fs-idp69268960">The following data are the number of pages in 40 books on a shelf. Construct a box plot using a graphing calculator, and state the interquartile range.</p> <p id="fs-idm41586448"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">136</span>  <span data-type="item">140 </span> <span data-type="item">178 </span> <span data-type="item">190 </span> <span data-type="item">205 </span> <span data-type="item">215</span>  <span data-type="item">217</span>  <span data-type="item">218</span>  <span data-type="item">232 </span> <span data-type="item">234 </span> <span data-type="item">240 </span> <span data-type="item">255 </span> <span data-type="item">270 </span> <span data-type="item">275 </span> <span data-type="item">290 </span> <span data-type="item">301</span>  <span data-type="item">303 </span> <span data-type="item">315 </span> <span data-type="item">317 </span> <span data-type="item">318 </span> <span data-type="item">326 </span> <span data-type="item">333 </span> <span data-type="item">343 </span> <span data-type="item">349</span>  <span data-type="item">360</span>  <span data-type="item">369 </span> <span data-type="item">377 </span> <span data-type="item">388 </span> <span data-type="item">391 </span> <span data-type="item">392 </span> 
<span data-type="item">398 </span> <span data-type="item">400</span>  <span data-type="item">402 </span> <span data-type="item">405 </span> <span data-type="item">408 </span> <span data-type="item">422 </span> <span data-type="item">429</span>  <span data-type="item">450</span>  <span data-type="item">475 </span> <span data-type="item">512</span> </span></p> </div> </div> </div> <p id="element-155">For some sets of data, some of the largest value, smallest value, first quartile, median, and third quartile may be the same. For instance, you might have a data set in which the median and the third quartile are the same. In this case, the diagram would not have a dotted line inside the box displaying the median. The right side of the box would display both the third quartile and the median. For example, if the smallest value and the first quartile were both one, the median and the third quartile were both five, and the largest value was seven, the box plot would look like:</p> <div id="fs-idm16338080" class="bc-figure figure"><span id="id1164414369705" data-type="media" data-alt="Horizontal boxplot box begins at the smallest value and Q1, 1, until the Q3 and median, 5, no median line is designated, and has its lone whisker extending from the Q3 to the largest value, 7." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch_02_05_03-1.jpg" alt="Horizontal boxplot box begins at the smallest value and Q1, 1, until the Q3 and median, 5, no median line is designated, and has its lone whisker extending from the Q3 to the largest value, 7." width="380" data-media-type="image/jpg" /></span></div> <p id="fs-idm26870032">In this case, at least 25% of the values are equal to one. Twenty-five percent of the values are between one and five, inclusive. At least 25% of the values are equal to five. 
The top 25% of the values fall between five and seven, inclusive.</p> <div id="element-583" class="textbox textbox--examples" data-type="example"><p id="element-601">Test scores for a college statistics class held during the day are:</p> <p id="element-891"><span id="element-127" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">99</span>  <span data-type="item">56</span>  <span data-type="item">78</span>  <span data-type="item">55.5 </span> <span data-type="item">32 </span> <span data-type="item">90 </span> <span data-type="item">80</span>  <span data-type="item">81</span>  <span data-type="item">56</span>  <span data-type="item">59</span>  <span data-type="item">45 </span> <span data-type="item">77 </span> <span data-type="item">84.5 </span> <span data-type="item">84 </span> <span data-type="item">70 </span> <span data-type="item">72 </span> <span data-type="item">68</span>  <span data-type="item">32</span>  <span data-type="item">79</span>  <span data-type="item">90</span> </span></p> <p id="element-212">Test scores for a college statistics class held during the evening are:</p> <p id="element-763"><span id="element-711" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">98 </span> <span data-type="item">78</span>  <span data-type="item">68</span>  <span data-type="item">83</span>  <span data-type="item">81 </span> <span data-type="item">89</span>  <span data-type="item">88</span>  <span data-type="item">76</span>  <span data-type="item">65</span>  <span data-type="item">45</span>  <span data-type="item">98</span>  <span data-type="item">90 </span> <span data-type="item">80</span>  <span data-type="item">84.5 </span> <span data-type="item">85</span>  <span data-type="item">79</span>  <span data-type="item">78</span>  <span data-type="item">98</span>  <span data-type="item">90</span>  <span data-type="item">79</span>  <span data-type="item">81</span>  <span 
data-type="item">25.5</span> </span></p> <div id="element-23526" data-type="exercise"><div id="id1164420007817" data-type="problem"><ol id="element-778" type="a"><li>Find the smallest and largest values, the median, and the first and third quartile for the day class.</li> <li>Find the smallest and largest values, the median, and the first and third quartile for the night class.</li> <li>For each data set, what percentage of the data is between the smallest value and the first quartile? the first quartile and the median? the median and the third quartile? the third quartile and the largest value? What percentage of the data is between the first quartile and the largest value?</li> <li>Create a box plot for each set of data. Use one number line for both box plots.</li> <li>Which box plot has the widest spread for the middle 50% of the data (the data between the first and third quartiles)? What does this mean for that set of data in comparison to the other set of data?</li> </ol> </div> <div id="element-601s" data-type="solution" data-print-placement="end"><ol type="a"><li><ul id="eip-idp9991488" data-labeled-item="true"><li>Min = 32</li> <li><em data-effect="italics">Q</em><sub>1</sub> = 56</li> <li><em data-effect="italics">M</em> = 74.5</li> <li><em data-effect="italics">Q</em><sub>3</sub> = 82.5</li> <li>Max = 99</li> </ul> </li> <li><ul id="element-1062" data-labeled-item="true"><li>Min = 25.5</li> <li><em data-effect="italics">Q</em><sub>1</sub> = 78</li> <li><em data-effect="italics">M</em> = 81</li> <li><em data-effect="italics">Q</em><sub>3</sub> = 89</li> <li>Max = 98</li> </ul> </li> <li>Day class: There are six data values ranging from 32 to 56: 30%. There are six data values ranging from 56 to 74.5: 30%. There are five data values ranging from 74.5 to 82.5: 25%. There are five data values ranging from 82.5 to 99: 25%. There are 16 data values between the first quartile, 56, and the largest value, 99: 75%. 
Night class:</li> <li><div id="fs-idp65829088" class="bc-figure figure"><span id="id1164411411845" data-type="media" data-alt="Two box plots over a number line from 0 to 100. The top plot shows a whisker from 32 to 56, a solid line at 56, a dashed line at 74.5, a solid line at 82.5, and a whisker from 82.5 to 99. The lower plot shows a whisker from 25.5 to 78, solid line at 78, dashed line at 81, solid line at 89, and a whisker from 89 to 98." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_sol_01-1.jpg" alt="Two box plots over a number line from 0 to 100. The top plot shows a whisker from 32 to 56, a solid line at 56, a dashed line at 74.5, a solid line at 82.5, and a whisker from 82.5 to 99. The lower plot shows a whisker from 25.5 to 78, solid line at 78, dashed line at 81, solid line at 89, and a whisker from 89 to 98." width="380" data-media-type="image/jpg" /></span></div> </li> <li>The first data set has the wider spread for the middle 50% of the data. The <em data-effect="italics">IQR</em> for the first data set is greater than the <em data-effect="italics">IQR</em> for the second set. This means that there is more variability in the middle 50% of the first data set.</li> </ol> </div> </div> </div> <div id="fs-idp63834272" class="statistics try finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm19314304" data-type="exercise"><div id="fs-idm1151200" data-type="problem"><p id="fs-idp29298016">The following data set shows the heights in inches for the boys in a class of 40 students.</p> <p id="fs-idm35019456">66;  66;  67;  67;  68;  68;  68;  68;  68;  69;  69;  69;  70;  71;  72;  72;  72;  73;  73;  74 <span data-type="newline"><br /> </span>The following data set shows the heights in inches for the girls in a class of 40 students. 
<span data-type="newline"><br /> </span>61;  61;  62;  62;  63;  63;  63;  65;  65;  65;  66;  66;  66;  67;  68;  68;  68;  69;  69;  69 <span data-type="newline"><br /> </span>Construct a box plot using a graphing calculator for each data set, and state which box plot has the wider spread for the middle 50% of the data.</p> </div> </div> </div> <div id="fs-idm38612912" class="textbox textbox--examples" data-type="example"><p id="fs-idm15332992">Graph a box-and-whisker plot for the data values shown.</p> <p id="fs-idp63131872"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">10</span>  <span data-type="item">10</span>  <span data-type="item">10</span>  <span data-type="item">15</span>  <span data-type="item">35</span>  <span data-type="item">75</span>  <span data-type="item">90</span>  <span data-type="item">95</span>  <span data-type="item">100</span>  <span data-type="item">175</span>  <span data-type="item">420</span>  <span data-type="item">490</span>  <span data-type="item">515</span>  <span data-type="item">515</span>  <span data-type="item">790</span> </span></p> <p id="fs-idp59445184">The five numbers used to create a box-and-whisker plot are:</p> <ul id="fs-idp3956176" data-labeled-item="true"><li>Min: 10</li> <li><em data-effect="italics">Q</em><sub>1</sub>: 15</li> <li>Med: 95</li> <li><em data-effect="italics">Q</em><sub>3</sub>: 490</li> <li>Max: 790</li> </ul> <p id="fs-idp68810672">The following graph shows the box-and-whisker plot.</p> <div id="fs-idm67197680" class="bc-figure figure"><span id="fs-idp52050960" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C02_M05_015-1.jpg" alt="" width="420" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idp43137808" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-idp22858976" data-type="exercise"><div 
id="eip-idp22859232" data-type="problem"><p id="fs-idm36238000">Follow the steps you used to graph a box-and-whisker plot for the data values shown.</p> <p id="fs-idm33689504"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">0</span>  <span data-type="item">5</span>  <span data-type="item">5</span>  <span data-type="item">15</span>  <span data-type="item">30</span>  <span data-type="item">30</span>  <span data-type="item">45</span>  <span data-type="item">50</span>  <span data-type="item">50</span>  <span data-type="item">60</span>  <span data-type="item">75</span>  <span data-type="item">110</span>  <span data-type="item">140</span>  <span data-type="item">240</span>  <span data-type="item">330</span> </span></p> </div> </div> </div> <div id="fs-idp34607616" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idm91067600">Data from <em data-effect="italics">West Magazine</em>.</p> </div> <div id="fs-idp44166016" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp62162768">Box plots are a type of graph that can help visually organize data. To graph a box plot, the following five values must be calculated: the minimum value, the first quartile, the median, the third quartile, and the maximum value. Once the box plot is graphed, you can display and compare distributions of data.</p> </div> <div id="fs-idm61864272" class="practice" data-depth="1"><p id="fs-idp75507648">Use the following information to answer the next two exercises. Sixty-five randomly selected car salespersons were asked the number of cars they generally sell in one week. 
Fourteen people answered that they generally sell three cars; nineteen generally sell four cars; twelve generally sell five cars; nine generally sell six cars; eleven generally sell seven cars.</p> <div id="fs-idm23427584" data-type="exercise"><div id="fs-idm58196288" data-type="problem"><p id="element-580">Construct a box plot below. Use a ruler to measure and scale accurately.</p> </div> </div> <div id="fs-idm39140224" data-type="exercise"><div id="fs-idp35412208" data-type="problem"><p>Looking at your box plot, does it appear that the data are concentrated together, spread out evenly, or concentrated in some areas, but not in others? How can you tell?</p> </div> <div id="fs-idp72108720" data-type="solution"><p id="fs-idp104832256">More than 25% of salespersons sell four cars in a typical week. You can see this concentration in the box plot because the first quartile is equal to the median. The top 25% and the bottom 25% are spread out evenly; the whiskers have the same length.</p> </div> </div> </div> <div id="fs-idp27456352" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div id="fs-idp68490208" data-type="exercise"><div id="fs-idp68490464" data-type="problem"><p>&nbsp;</p> </div> </div> <div id="element-606" data-type="exercise"><div id="id4047646" data-type="problem"><p id="fs-idm12685520">1)  Given the following box plot, answer the questions.</p> <div id="fs-idp53170192" class="bc-figure figure"><span id="id4992419" data-type="media" data-alt="This is a boxplot graphed over a number line from 0 to 150. There is no first, or left, whisker. The box starts at the first quartile, 0, and ends at the third quartile, 80. A vertical, dashed line marks the median, 20. The second whisker extends the third quartile to the largest value, 150." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_13_03-1.jpg" alt="This is a boxplot graphed over a number line from 0 to 150. 
There is no first, or left, whisker. The box starts at the first quartile, 0, and ends at the third quartile, 80. A vertical, dashed line marks the median, 20. The second whisker extends the third quartile to the largest value, 150." width="400" data-media-type="image/jpg" /></span></div> <ol id="element-554" type="a" data-mark-suffix="."><li>Think of an example (in words) where the data might fit into the above box plot. In 2–5 sentences, write down the example.</li> <li>What does it mean to have the first and second quartiles so close together, while the second to third quartiles are far apart?</li> </ol> </div> </div> </div> <p>&nbsp;</p> <p>&nbsp;</p> <div class="free-response" data-depth="1"><div id="element-990" data-type="exercise"><div id="id9907048" data-type="problem"><p>2) Given the following box plots, answer the questions.</p> <div id="fs-idm76469696" class="bc-figure figure"><span id="id4260732" data-type="media" data-alt="This shows two boxplots graphed over number lines from 0 to 7. The first whisker in the data 1 boxplot extends from 0 to 2. The box begins at the first quartile, 2, and ends at the third quartile, 5. A vertical, dashed line marks the median at 4. The second whisker extends from the third quartile to the largest value, 7. The first whisker in the data 2 box plot extends from 0 to 1.3. The box begins at the first quartile, 1.3, and ends at the third quartile, 2.5. A vertical, dashed line marks the median at 2. The second whisker extends from the third quartile to the largest value, 7." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_13_06-1.jpg" alt="This shows two boxplots graphed over number lines from 0 to 7. The first whisker in the data 1 boxplot extends from 0 to 2. The box begins at the first quartile, 2, and ends at the third quartile, 5. A vertical, dashed line marks the median at 4. 
The second whisker extends from the third quartile to the largest value, 7. The first whisker in the data 2 box plot extends from 0 to 1.3. The box begins at the first quartile, 1.3, and ends at the third quartile, 2.5. A vertical, dashed line marks the median at 2. The second whisker extends from the third quartile to the largest value, 7." width="400" data-media-type="image/jpg" /></span></div> <ol id="element-718" type="a" data-mark-suffix="."><li>In complete sentences, explain why each statement is false. <ol id="nestlist13" type="i" data-mark-suffix="."><li><strong>Data 1</strong> has more data values above two than <strong>Data 2</strong> has above two.</li> <li>The data sets cannot have the same mode.</li> <li>For <strong>Data 1</strong>, there are more data values below four than there are above four.</li> </ol> </li> <li>For which group, Data 1 or Data 2, is the value of “7” more likely to be an outlier? Explain why in complete sentences.</li> </ol> </div> </div> <div id="fs-idp23710160" data-type="exercise"><div id="fs-idp61030656" data-type="problem"><p>&nbsp;</p> <p id="fs-idp61030912">3) A survey was conducted of 130 purchasers of new BMW 3 series cars, 130 purchasers of new BMW 5 series cars, and 130 purchasers of new BMW 7 series cars. In it, people were asked the age they were when they purchased their car. The following box plots display the results.</p> <div id="fs-idm78146976" class="bc-figure figure"><span id="fs-idp33210608" data-type="media" data-alt="This shows three boxplots graphed over a number line from 25 to 80. The first whisker on the BMW 3 plot extends from 25 to 30. The box begins at the first quartile, 30, and ends at the third quartile, 41. A vertical, dashed line marks the median at 34. The second whisker extends from the third quartile to 66. The first whisker on the BMW 5 plot extends from 31 to 40. The box begins at the first quartile, 40, and ends at the third quartile, 55. A vertical, dashed line marks the median at 41. 
The second whisker extends from 55 to 64. The first whisker on the BMW 7 plot extends from 35 to 41. The box begins at the first quartile, 41, and ends at the third quartile, 59. A vertical, dashed line marks the median at 46. The second whisker extends from 59 to 68." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_13_07-1.jpg" alt="This shows three boxplots graphed over a number line from 25 to 80. The first whisker on the BMW 3 plot extends from 25 to 30. The box begins at the first quartile, 30, and ends at the third quartile, 41. A vertical, dashed line marks the median at 34. The second whisker extends from the third quartile to 66. The first whisker on the BMW 5 plot extends from 31 to 40. The box begins at the first quartile, 40, and ends at the third quartile, 55. A vertical, dashed line marks the median at 41. The second whisker extends from 55 to 64. The first whisker on the BMW 7 plot extends from 35 to 41. The box begins at the first quartile, 41, and ends at the third quartile, 59. A vertical, dashed line marks the median at 46. The second whisker extends from 59 to 68." width="400" data-media-type="image/jpg" /></span></div> <ol id="element-24" type="a"><li>In complete sentences, describe what the shape of each box plot implies about the distribution of the data collected for that car series.</li> <li>Which group is most likely to have an outlier? Explain how you determined that.</li> <li>Compare the three box plots. What do they imply about the ages of buyers of each series when compared to each other?</li> <li>Look at the BMW 5 series. Which quarter has the smallest spread of data? What is the spread?</li> <li>Look at the BMW 5 series. Which quarter has the largest spread of data? What is the spread?</li> <li>Look at the BMW 5 series. Estimate the interquartile range (IQR).</li> <li>Look at the BMW 5 series. 
Are there more data in the interval 31 to 38 or in the interval 45 to 55? How do you know this?</li> <li>Look at the BMW 5 series. Which interval has the fewest data in it? How do you know this? <ol id="fs-idp34051952" type="i" data-mark-suffix="."><li>31–35</li> <li>38–41</li> <li>41–64</li> </ol> </li> </ol> </div> </div> </div> <p>&nbsp;</p> <p>&nbsp;</p> <div class="free-response" data-depth="1"><div id="element-833" data-type="exercise"><div id="id3776067" data-type="problem"><p id="element-202">4)  Twenty-five randomly selected students were asked the number of movies they watched the previous week. The results are as follows:</p> <table id="fs-idm41926784" summary="The table presents the number of movies 25 students watched in the previous week. The first column lists the number of movies from 0-4, the second column lists the frequency with the values of 5, 9, 6, 4, 1, the third column is for relative frequency and is blank, and the fourth column is for cumulative relative frequency and is blank."><thead><tr><th># of movies</th> <th>Frequency</th> </tr> </thead> <tbody><tr><td>0</td> <td>5</td> </tr> <tr><td>1</td> <td>9</td> </tr> <tr><td>2</td> <td>6</td> </tr> <tr><td>3</td> <td>4</td> </tr> <tr><td>4</td> <td>1</td> </tr> </tbody> </table> <p id="fs-idp36227856">Construct a box plot of the data.</p> <p>&nbsp;</p> </div> </div> </div> <div id="fs-idm80158176" class="bring-together-homework" data-depth="1"><h3 data-type="title">Bringing It Together</h3> <div id="fs-idp67679120" data-type="exercise"><div id="fs-idp67679376" data-type="problem"><p id="fs-idp43357184">5) Santa Clara County, CA, has approximately 27,873 Japanese-Americans. Their ages are as follows:</p> <table id="fs-idm41920944" summary="This table presents Japanese-Americans and their ages from Santa Clara County. The first column lists the age group and the second column lists the percent of the community. 
There are 7 rows."><thead><tr><th>Age Group</th> <th>Percent of Community</th> </tr> </thead> <tbody><tr><td>0–17</td> <td>18.9</td> </tr> <tr><td>18–24</td> <td>8.0</td> </tr> <tr><td>25–34</td> <td>22.8</td> </tr> <tr><td>35–44</td> <td>15.0</td> </tr> <tr><td>45–54</td> <td>13.1</td> </tr> <tr><td>55–64</td> <td>11.9</td> </tr> <tr><td>65+</td> <td>10.3</td> </tr> </tbody> </table> <ol id="fs-idm34981104" type="a"><li>Construct a histogram of the Japanese-American community in Santa Clara County, CA. The bars will <strong>not</strong> be the same width for this example. Why not? What impact does this have on the reliability of the graph?</li> <li>What percentage of the community is under age 35?</li> <li>Which box plot most resembles the information above?</li> </ol> <div id="fs-idm2636592" class="bc-figure figure"><span id="fs-idm26491920" data-type="media" data-alt="Three box plots with values between 0 and 100. Plot i has Q1 at 24, M at 34, and Q3 at 53; Plot ii has Q1 at 18, M at 34, and Q3 at 45; Plot iii has Q1 at 24, M at 25, and Q3 at 54." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_13_04-1.jpg" alt="Three box plots with values between 0 and 100. Plot i has Q1 at 24, M at 34, and Q3 at 53; Plot ii has Q1 at 18, M at 34, and Q3 at 45; Plot iii has Q1 at 24, M at 25, and Q3 at 54." width="450" data-media-type="image/jpg" /></span></div> </div> </div> </div> <p>&nbsp;</p> <p>&nbsp;</p> <div class="bring-together-homework" data-depth="1"><div data-type="exercise"><div id="fs-idp100901312" data-type="solution"><p>6)  In a survey of 20-year-olds in China, Germany, and the United States, people were asked the number of foreign countries they had visited in their lifetime. 
The following box plots display the results.</p> <div id="fs-idm75185920" class="bc-figure figure"><span id="fs-idm9643216" data-type="media" data-alt="This shows three boxplots graphed over a number line from 0 to 11. The boxplots match the supplied data, and compare the countries' results. The China boxplot has a single whisker from 0 to 5. The Germany box plot's median is equal to the third quartile, so there is a dashed line at right edge of box. The America boxplot does not have a left whisker." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch02_13_01_rev-1.jpg" alt="This shows three boxplots graphed over a number line from 0 to 11. The boxplots match the supplied data, and compare the countries' results. The China boxplot has a single whisker from 0 to 5. The Germany box plot's median is equal to the third quartile, so there is a dashed line at right edge of box. The America boxplot does not have a left whisker." width="420" data-media-type="image/jpg" /></span></div> <ol id="fs-idm29105040" type="a"><li>In complete sentences, describe what the shape of each box plot implies about the distribution of the data collected.</li> <li>Have more Americans or more Germans surveyed been to over eight foreign countries?</li> <li>Compare the three box plots. What do they imply about the foreign travel of 20-year-old residents of the three countries when compared to each other?</li> </ol> <p><strong>Answers to odd questions</strong></p> <p>1)</p> <ol id="fs-idp69928608" type="a"><li>Answers will vary. Possible answer: State University conducted a survey to see how involved its students are in community service. The box plot shows the number of community service hours logged by participants over the past year.</li> <li>Because the first and second quartiles are close, the data in this quarter is very similar. There is not much variation in the values. 
The data in the third quarter is much more variable, or spread out. This is clear because the second quartile is so far away from the third quartile.</li> </ol> </div> </div> </div> <p>3)</p> <ol id="fs-idp2465552" type="a"><li>Each box plot is spread out more in the greater values. Each plot is skewed to the right, so the ages of the top 50% of buyers are more variable than the ages of the lower 50%.</li> <li>The BMW 3 series is most likely to have an outlier. It has the longest whisker.</li> <li>Comparing the median ages, younger people tend to buy the BMW 3 series, while older people tend to buy the BMW 7 series. However, this is not a rule, because there is so much variability in each data set.</li> <li>The second quarter has the smallest spread. There seems to be only a three-year difference between the first quartile and the median.</li> <li>The third quarter has the largest spread. There seems to be approximately a 14-year difference between the median and the third quartile.</li> <li><em data-effect="italics">IQR</em> ~ 17 years</li> <li>There is not enough information to tell. Each interval lies within a quarter, so we cannot tell exactly where the data in that quarter is concentrated.</li> <li>The interval from 31 to 35 years has the fewest data values. Twenty-five percent of the values fall in the interval 38 to 41, and 25% fall between 41 and 64. 
Since 25% of values fall between 31 and 38, we know that fewer than 25% fall between 31 and 35.</li> </ol> <p>5)</p> <ol id="fs-idp68335440" type="a" data-mark-suffix="."><li>For the graph, check the student&#8217;s solution.</li> <li>49.7% of the community is under the age of 35.</li> <li>Based on the information in the table, graph (a) most closely represents the data.</li> </ol> <p>&nbsp;</p> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="fs-idp72865536"><dt>Box plot</dt> <dd id="fs-idp43755328">a graph that gives a quick picture of the middle 50% of the data</dd> </dl> <dl id="fs-idp102508064"><dt>First Quartile</dt> <dd id="fs-idp111069232">the value that is the median of the lower half of the ordered data set</dd> </dl> <dl id="fs-idp97747904"><dt>Frequency Polygon</dt> <dd id="fs-idm46549456">looks like a line graph but uses intervals to display ranges of large amounts of data</dd> </dl> <dl id="fs-idp84101520"><dt>Interval</dt> <dd id="fs-idm8914352">also called a class interval; an interval represents a range of data and is used when displaying large data sets</dd> </dl> <dl id="fs-idm26718304"><dt>Paired Data Set</dt> <dd id="fs-idp116205376">two data sets that have a one-to-one relationship so that: <ul id="fs-idp85390096"><li>both data sets are the same size, and</li> <li>each data point in one data set is matched with exactly one point from the other set.</li> </ul> </dd> </dl> <dl id="fs-idp31125760"><dt>Skewed</dt> <dd id="fs-idp57238128">used to describe data that is not symmetrical; when the right side of a graph looks “chopped off” compared to the left side, we say the data is “skewed to the left.” When the left side of the graph looks “chopped off” compared to the right side, we say the data is “skewed to the right.” Alternatively: when the lower values of the data are more spread out, we say the data are skewed to the left. 
When the greater values are more spread out, the data are skewed to the right.</dd> </dl> </div> </div></div>
<div class="chapter standard" id="chapter-descriptive-statistics" title="Activity 2.9: Descriptive Statistics"><div class="chapter-title-wrap"><h3 class="chapter-number">15</h3><h2 class="chapter-title"><span class="display-none">Activity 2.9: Descriptive Statistics</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1169266836319" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Descriptive Statistics</div> <p id="element-674">Class Time:</p> <p>Names:</p> <div id="element-397s" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will construct a histogram and a box plot.</li> <li>The student will calculate univariate statistics.</li> <li>The student will examine the graphs to interpret what the data implies.</li> </ul> </div> <p id="element-324"><span data-type="title">Collect the Data</span> Record the number of pairs of shoes you own.</p> <ol id="listhioaasdfadsf"><li>Randomly survey 30 classmates about the number of pairs of shoes they own. Record their values.<br /> <table id="element-30" summary="This is a blank table with 30 cells for recording values."><caption><span data-type="title">Survey Results</span></caption> <tbody><tr><td>_____</td> <td>_____</td> <td>_____</td> <td>_____</td> <td>_____</td> </tr> <tr><td>_____</td> <td>_____</td> <td>_____</td> <td>_____</td> <td>_____</td> </tr> <tr><td>_____</td> <td>_____</td> <td>_____</td> <td>_____</td> <td>_____</td> </tr> <tr><td>_____</td> <td>_____</td> <td>_____</td> <td>_____</td> <td>_____</td> </tr> <tr><td>_____</td> <td>_____</td> <td>_____</td> <td>_____</td> <td>_____</td> </tr> <tr><td>_____</td> <td>_____</td> <td>_____</td> <td>_____</td> <td>_____</td> </tr> </tbody> </table> </li> <li>Construct a histogram. Make five to six intervals. Sketch the graph using a ruler and pencil and scale the axes. 
<div id="element-2356" class="bc-figure figure"><span id="id7552200" data-type="media" data-alt="A blank graph template for use with this problem."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch02_14_01-1.png" alt="A blank graph template for use with this problem." width="380" data-media-type="image/png" /></span></div> </li> <li>Calculate the following values. <ol id="eip-idp140444857889360" type="a"><li>\(\overline{x}\) = _____</li> <li><em data-effect="italics">s</em> = _____</li> </ol> </li> <li>Are the data discrete or continuous? How do you know?</li> <li>In complete sentences, describe the shape of the histogram.</li> <li>Are there any potential outliers? List the value(s) that could be outliers. Use a formula to check the end values to determine if they are potential outliers.</li> </ol> <div id="list6798" data-type="list"><div data-type="title">Analyze the Data</div> <ol><li>Determine the following values. <ol id="eip-idp41229376" type="a"><li>Min = _____</li> <li><em data-effect="italics">M</em> = _____</li> <li>Max = _____</li> <li><em data-effect="italics">Q</em><sub>1</sub> = _____</li> <li><em data-effect="italics">Q</em><sub>3</sub> = _____</li> <li><em data-effect="italics">IQR</em> = _____</li> </ol> </li> <li>Construct a box plot of the data.</li> <li>What does the shape of the box plot imply about the concentration of data? Use complete sentences.</li> <li>Using the box plot, how can you determine if there are potential outliers?</li> <li>How does the standard deviation help you to determine the concentration of the data and whether or not there are potential outliers?</li> <li>What does the <em data-effect="italics">IQR</em> represent in this problem?</li> <li>Show your work to find the value that is 1.5 standard deviations: <ol id="nestlist2" type="a"><li>above the mean.</li> <li>below the mean.</li> </ol> </li> </ol> </div> </div> </div></div>
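<div class="textbox shaded" data-type="note"><p>As a reminder for the outlier checks in this activity, here is a sketch of the two calculations involved. The numbers are hypothetical, chosen only for illustration, and are not survey results. Assume a data set with \(Q_1 = 4\), \(Q_3 = 10\), \(\overline{x} = 8\), and \(s = 3\). A common rule flags a value as a potential outlier if it lies more than 1.5 times the <em data-effect="italics">IQR</em> below the first quartile or above the third quartile:</p> <div data-type="equation">\(IQR = Q_3 - Q_1 = 10 - 4 = 6\)</div> <div data-type="equation">\(Q_1 - 1.5(IQR) = 4 - 9 = -5 \qquad Q_3 + 1.5(IQR) = 10 + 9 = 19\)</div> <p>With these assumed values, any data value below –5 or above 19 would be a potential outlier. Because a count of pairs of shoes cannot be negative, only the upper bound matters in this illustration. The values that are 1.5 standard deviations from the mean are found in the same way:</p> <div data-type="equation">\(\overline{x} + 1.5s = 8 + 4.5 = 12.5 \qquad \overline{x} - 1.5s = 8 - 4.5 = 3.5\)</div> </div>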
<div class="part " id="part-linear-regression-and-correlation"><div class="part-title-wrap"><h3 class="part-number">III</h3><h1 class="part-title">Chapter 3: Linear Regression and Correlation</h1></div><div class="ugc part-ugc"></div></div>
<div class="chapter standard" id="chapter-introduction-24" title="Chapter 3.1: Introduction"><div class="chapter-title-wrap"><h3 class="chapter-number">16</h3><h2 class="chapter-title"><span class="display-none">Chapter 3.1: Introduction</span></h2></div><div class="ugc chapter-ugc"><p>[latexpage]</p> <div id="fs-idm37352640" class="splash"><div class="bc-figcaption figcaption">Linear regression and correlation can help you determine if an auto mechanic’s salary is related to his work experience. (credit: Joshua Rothhaas)</div> <p><span id="fs-idm47477408" data-type="media" data-alt="This is a photo of a car mechanic’s shop. There are three United States Postal Services trucks being serviced, and one not being serviced."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C12_CO-1.jpg" alt="This is a photo of a car mechanic’s shop. There are three United States Postal Services trucks being serviced, and one not being serviced." width="380" data-media-type="image/jpeg" /></span></p> </div> <div id="fs-idm16563264" class="chapter-objectives" data-type="note" data-has-label="true" data-label=""><div data-type="title">Chapter Objectives</div> <p>By the end of this chapter, the student should be able to:</p> <ul><li>Discuss basic ideas of linear regression and correlation.</li> <li>Create and interpret a line of best fit.</li> <li>Calculate and interpret the correlation coefficient.</li> <li>Calculate and interpret outliers.</li> </ul> </div> <p>Professionals often want to know how two or more numeric variables are related. For example, is there a relationship between the grade on the second math exam a student takes and the grade on the final exam? If there is a relationship, what is the relationship and how strong is it?</p> <p>In another example, your income may be determined by your education, your profession, your years of experience, and your ability. 
The amount you pay a repair person for labor is often determined by an initial amount plus an hourly fee.</p> <p>The type of data described in the examples is <span data-type="term">bivariate</span> data, with &#8220;bi&#8221; meaning two variables. In practice, statisticians often use <span data-type="term">multivariate</span> data, meaning many variables.</p> <p>In this chapter, you will study the simplest form of regression: linear regression with one independent variable (<em data-effect="italics">x</em>). This involves data that can be modeled by a line in two dimensions. You will also study correlation, which measures how strong the linear relationship is.</p> </div></div>
<div class="chapter standard" id="chapter-linear-equations" title="Chapter 3.2: Linear Equations"><div class="chapter-title-wrap"><h3 class="chapter-number">17</h3><h2 class="chapter-title"><span class="display-none">Chapter 3.2: Linear Equations</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>Linear regression for two variables is based on a linear equation with one independent variable. The equation has the form:</p> <div data-type="equation">\(y = a + bx\)</div> <p><span data-type="newline"><br /> </span>where <em data-effect="italics">a</em> and <em data-effect="italics">b</em> are constant numbers.</p> <p>The variable <strong><em data-effect="italics">x</em> is the independent variable, and <em data-effect="italics">y</em> is the dependent variable.</strong> Typically, you choose a value to substitute for the independent variable and then solve for the dependent variable.</p> <div class="textbox textbox--examples" data-type="example"><p>The following examples are linear equations.</p> <div id="element-12495" data-type="equation">\(y = 3 + 2x\)</div> <div id="element-357238" data-type="equation">\(y = -0.01 + 1.2x\)</div> </div> <div id="fs-idp88007136" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p>Is the following an example of a linear equation?</p> <p id="eip-idp75670576"><em data-effect="italics">y</em> = –0.125 – 3.5<em data-effect="italics">x</em></p> </div> </div> </div> <p id="eip-498">The graph of a linear equation of the form <em data-effect="italics">y</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em> is a <strong>straight line</strong>. 
Any line that is not vertical can be described by this equation.</p> <div class="textbox textbox--examples" data-type="example"><p>Graph the equation <em data-effect="italics">y</em> = –1 + 2<em data-effect="italics">x</em>.</p> <div id="linrgs_lineq1" class="bc-figure figure"><span id="idp38882144" data-type="media" data-alt="Graph of the equation y = -1 + 2x. This is a straight line that crosses the y-axis at -1 and is sloped up and to the right, rising 2 units for every one unit of run."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch12_02_01-1.jpg" alt="Graph of the equation y = -1 + 2x. This is a straight line that crosses the y-axis at -1 and is sloped up and to the right, rising 2 units for every one unit of run." width="380" data-media-type="image/jpeg" /></span></div> </div> <div id="fs-idp3088096" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div id="eip-373" data-type="problem"><p>Is the following an example of a linear equation? Why or why not?</p> <div id="fs-idp22692736" class="bc-figure figure"><span id="eip-idp139727523885744" data-type="media" data-alt="This is a graph of an equation. The x-axis is labeled in intervals of 2 from 0 - 14; the y-axis is labeled in intervals of 2 from 0 - 12. The equation's graph is a curve that crosses the y-axis at 2 and curves upward and to the right." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C012_M02_tryit001-1.png" alt="This is a graph of an equation. The x-axis is labeled in intervals of 2 from 0 - 14; the y-axis is labeled in intervals of 2 from 0 - 12. The equation's graph is a curve that crosses the y-axis at 2 and curves upward and to the right." 
width="250" data-media-type="image/png" /></span></div> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p>Aaron&#8217;s Word Processing Service (AWPS) does word processing. The rate for services is \$32 per hour plus a \$31.50 one-time charge. The total cost to a customer depends on the number of hours it takes to complete the job.</p> <div data-type="exercise"><div id="idp59356256" data-type="problem"><p>Find the equation that expresses the <strong>total cost</strong> in terms of the <strong>number of hours</strong> required to complete the job.</p> </div> <div id="idp59359008" data-type="solution"><p>Let <em data-effect="italics">x</em> = the number of hours it takes to get the job done. <span data-type="newline"><br /> </span>Let <em data-effect="italics">y</em> = the total cost to the customer.</p> <p>The \$31.50 is a fixed cost. If it takes <em data-effect="italics">x</em> hours to complete the job, then (32)(<em data-effect="italics">x</em>) is the cost of the word processing only. The total cost is: <em data-effect="italics">y</em> = 31.50 + 32<em data-effect="italics">x</em></p> </div> </div> </div> <div id="fs-idm27179040" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p>Emma’s Extreme Sports hires hang-gliding instructors and pays them a fee of \$50 per class as well as \$20 per student in the class. The total cost Emma pays depends on the number of students in a class. 
Find the equation that expresses the total cost in terms of the number of students in a class.</p> </div> </div> </div> <div id="fs-idm35899664" class="bc-section section" data-depth="1"><h3 data-type="title">Slope and <em data-effect="italics">Y</em>-Intercept of a Linear Equation</h3> <p id="fs-idp15969440">For the linear equation <em data-effect="italics">y</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em>, <em data-effect="italics">b</em> = slope and <em data-effect="italics">a</em> = <em data-effect="italics">y</em>-intercept. From algebra recall that the slope is a number that describes the steepness of a line, and the <em data-effect="italics">y</em>-intercept is the <em data-effect="italics">y</em> coordinate of the point (0, <em data-effect="italics">a</em>) where the line crosses the <em data-effect="italics">y</em>-axis.</p> <div id="linrgs_slope1" class="bc-figure figure"><div class="bc-figcaption figcaption">Three possible graphs of <em data-effect="italics">y</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em>. (a) If <em data-effect="italics">b</em> &gt; 0, the line slopes upward to the right. (b) If <em data-effect="italics">b</em> = 0, the line is horizontal. (c) If <em data-effect="italics">b</em> &lt; 0, the line slopes downward to the right.</div> <p><span id="idp66522560a" data-type="media"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch12_03_01-1.jpg" data-media-type="image/jpg" alt="Three possible graphs of the equation y = a + bx. In graph (a), b is greater than 0 and the line slopes upward to the right. In graph (b), b = 0 and the line is horizontal. In graph (c), b is less than 0 and the line slopes downward to the right." /></span></p> </div> <div class="textbox textbox--examples" data-type="example"><p>Svetlana tutors to make extra money for college. 
For each tutoring session, she charges a one-time fee of \$25 plus \$15 per hour of tutoring. A linear equation that expresses the total amount of money Svetlana earns for each session she tutors is <em data-effect="italics">y</em> = 25 + 15<em data-effect="italics">x</em>.</p> <div id="element-00112" data-type="exercise"><div id="idp139982000" data-type="problem"><p>What are the independent and dependent variables? What is the <em data-effect="italics">y</em>-intercept and what is the slope? Interpret them using complete sentences.</p> </div> <div id="idp27992816" data-type="solution"><p>The independent variable (<em data-effect="italics">x</em>) is the number of hours Svetlana tutors each session. The dependent variable (<em data-effect="italics">y</em>) is the amount, in dollars, Svetlana earns for each session.</p> <p>The <em data-effect="italics">y</em>-intercept is 25 (<em data-effect="italics">a</em> = 25). At the start of the tutoring session, Svetlana charges a one-time fee of \$25 (this is when <em data-effect="italics">x</em> = 0). The slope is 15 (<em data-effect="italics">b</em> = 15). For each session, Svetlana earns \$15 for each hour she tutors.</p> </div> </div> </div> <div id="fs-idp5590144" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p id="eip-379">Ethan repairs household appliances like dishwashers and refrigerators. For each visit, he charges \$25 plus \$20 per hour of work. A linear equation that expresses the total amount of money Ethan earns per visit is <em data-effect="italics">y</em> = 25 + 20<em data-effect="italics">x</em>.</p> <p id="eip-idm158116832">What are the independent and dependent variables? What is the <em data-effect="italics">y</em>-intercept and what is the slope? 
Interpret them using complete sentences.</p> </div> </div> </div> </div> <div class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="eip-idp123435600">Data from the Centers for Disease Control and Prevention.</p> <p id="eip-idp89356864">Data from the National Center for agency reporting flu cases and TB Prevention.</p> </div> <div class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm110721248">The most basic type of association is a linear association. This type of relationship can be defined algebraically by the equations used, numerically with actual or predicted data values, or graphically from a plotted curve. (Lines are classified as straight curves.) Algebraically, a linear equation typically takes the form <strong><em data-effect="italics">y = mx + b</em></strong>, where <strong><em data-effect="italics">m</em></strong> and <strong><em data-effect="italics">b</em></strong> are constants, <strong><em data-effect="italics">x</em></strong> is the independent variable, and <strong><em data-effect="italics">y</em></strong> is the dependent variable. In a statistical context, a linear equation is written in the form <strong><em data-effect="italics">y = a + bx</em></strong>, where <strong><em data-effect="italics">a</em></strong> and <strong><em data-effect="italics">b</em></strong> are the constants. This form is used to help readers distinguish the statistical context from the algebraic context. In the equation <em data-effect="italics">y = a + bx</em>, the constant <em data-effect="italics">b</em> that multiplies the <strong><em data-effect="italics">x</em></strong> variable (<em data-effect="italics">b</em> is called a coefficient) is called the <strong>slope</strong>. The slope describes the rate of change between the independent and dependent variables; in other words, it describes the change that occurs in the dependent variable as the independent variable is changed. 
In the equation <em data-effect="italics">y = a + bx</em>, the constant <em data-effect="italics">a</em> is called the <em data-effect="italics">y</em>-intercept. Graphically, the <em data-effect="italics">y</em>-intercept is the <em data-effect="italics">y</em> coordinate of the point where the graph of the line crosses the <em data-effect="italics">y</em>-axis. At this point <em data-effect="italics">x</em> = 0.</p> <p>The <strong>slope of a line</strong> is a value that describes the rate of change between the independent and dependent variables. The <strong>slope</strong> tells us how the dependent variable (<em data-effect="italics">y</em>) changes for every one unit increase in the independent (<em data-effect="italics">x</em>) variable, on average. The <strong><em data-effect="italics">y</em>-intercept</strong> is used to describe the dependent variable when the independent variable equals zero. Graphically, a line takes one of three forms depending on its slope: it rises to the right when the slope is positive, is horizontal when the slope is zero, and falls to the right when the slope is negative.</p> </div> <div class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p><em data-effect="italics">y</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em> where <em data-effect="italics">a</em> is the <em data-effect="italics">y</em>-intercept and <em data-effect="italics">b</em> is the slope. The variable <em data-effect="italics">x</em> is the independent variable and <em data-effect="italics">y</em> is the dependent variable.</p> </div> <div class="practice" data-depth="1"><p><em data-effect="italics">Use the following information to answer the next three exercises</em>. A vacation resort rents SCUBA equipment to certified divers. 
The resort charges an up-front fee of \$25 and another fee of \$12.50 an hour.</p> <div data-type="exercise"><div data-type="problem"><p>What are the dependent and independent variables?</p> </div> <div data-type="solution"><p>dependent variable: fee amount; independent variable: time</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Find the equation that expresses the total fee in terms of the number of hours the equipment is rented.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Graph the equation from <a class="autogenerated-content" href="#eip-683">(Figure)</a>.</p> </div> <div data-type="solution"><div id="fs-idm133416208" class="bc-figure figure"><span id="eip-idp89831856" data-type="media" data-alt="This is a graph of the equation y = 25 + 12.50x. The x-axis is labeled in intervals of 1 from 0 - 7; the y-axis is labeled in intervals of 25 from 0 - 100. The equation's graph is a line that crosses the y-axis at 25 and is sloped up and to the right, rising 12.50 units for every one unit of run." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C012_M02_item001anno-1.jpg" alt="This is a graph of the equation y = 25 + 12.50x. The x-axis is labeled in intervals of 1 from 0 - 7; the y-axis is labeled in intervals of 25 from 0 - 100. The equation's graph is a line that crosses the y-axis at 25 and is sloped up and to the right, rising 12.50 units for every one unit of run." width="380" data-media-type="image/jpeg" /></span></div> </div> </div> <p id="eip-128"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next two exercises</em>. 
A credit card company charges \$10 when a payment is late, and \$5 a day each day the payment remains unpaid.</p> <div data-type="exercise"><div id="eip-775" data-type="problem"><p>Find the equation that expresses the total fee in terms of the number of days the payment is late.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Graph the equation from <a class="autogenerated-content" href="#eip-23">(Figure)</a>.</p> </div> <div data-type="solution"><div id="fs-idp120210480" class="bc-figure figure"><span id="eip-idp116656032" data-type="media" data-alt="This is a graph of the equation y = 10 + 5x. The x-axis is labeled in intervals of 1 from 0 - 7; the y-axis is labeled in intervals of 10 from 0 - 50. The equation's graph is a line that crosses the y-axis at 10 and is sloped up and to the right, rising 5 units for every one unit of run." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C012_M02_item002anno-1.jpg" alt="This is a graph of the equation y = 10 + 5x. The x-axis is labeled in intervals of 1 from 0 - 7; the y-axis is labeled in intervals of 10 from 0 - 50. The equation's graph is a line that crosses the y-axis at 10 and is sloped up and to the right, rising 5 units for every one unit of run." width="380" data-media-type="image/jpeg" /></span></div> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Is the equation <em data-effect="italics">y</em> = 10 + 5<em data-effect="italics">x</em> – 3<em data-effect="italics">x</em><sup>2</sup> linear? Why or why not?</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Which of the following equations are linear?</p> <p id="eip-idm139280304">a. <em data-effect="italics">y</em> = 6<em data-effect="italics">x</em> + 8</p> <p id="eip-idm178729312">b. <em data-effect="italics">y</em> + 7 = 3<em data-effect="italics">x</em></p> <p id="eip-idm156515616">c. 
<em data-effect="italics">y</em> – <em data-effect="italics">x</em> = 8<em data-effect="italics">x</em><sup>2</sup></p> <p id="eip-idm183078944">d. 4<em data-effect="italics">y</em> = 8</p> </div> <div data-type="solution"><p><em data-effect="italics">y</em> = 6<em data-effect="italics">x</em> + 8, 4<em data-effect="italics">y</em> = 8, and <em data-effect="italics">y</em> + 7 = 3<em data-effect="italics">x</em> are all linear equations.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Does the graph show a linear equation? Why or why not?</p> <div id="fs-idm48879424" class="bc-figure figure"><span id="eip-idp1549088" data-type="media" data-alt="This is a graph of an equation. The x-axis is labeled in intervals of 1 from -5 to 5; the y-axis is labeled in intervals of 1 from 0 - 8. The equation's graph is a parabola, a u-shaped curve that has a minimum value at (0, 0)." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C012_M02_item003-1.jpg" alt="This is a graph of an equation. The x-axis is labeled in intervals of 1 from -5 to 5; the y-axis is labeled in intervals of 1 from 0 - 8. The equation's graph is a parabola, a u-shaped curve that has a minimum value at (0, 0)." 
width="380" data-media-type="image/jpeg" /></span></div> </div> </div> <p><a class="autogenerated-content" href="#element-806">(Figure)</a> contains real data for the first two decades of flu reporting.</p> <table summary="This table presents the year of reporting flu cases and deaths in the first column, number of flu cases diagnosed in the second column, and number of flu deaths in the third column."><caption><span data-type="title">Adults and Adolescents only, United States </span></caption> <tbody><tr><td><strong>Year </strong></td> <td><strong># flu cases diagnosed</strong></td> <td><strong># flu deaths </strong></td> </tr> <tr><td>Pre-1981</td> <td>91</td> <td>29</td> </tr> <tr><td>1981</td> <td>319</td> <td>121</td> </tr> <tr><td>1982</td> <td>1,170</td> <td>453</td> </tr> <tr><td>1983</td> <td>3,076</td> <td>1,482</td> </tr> <tr><td>1984</td> <td>6,240</td> <td>3,466</td> </tr> <tr><td>1985</td> <td>11,776</td> <td>6,878</td> </tr> <tr><td>1986</td> <td>19,032</td> <td>11,987</td> </tr> <tr><td>1987</td> <td>28,564</td> <td>16,162</td> </tr> <tr><td>1988</td> <td>35,447</td> <td>20,868</td> </tr> <tr><td>1989</td> <td>42,674</td> <td>27,591</td> </tr> <tr><td>1990</td> <td>48,634</td> <td>31,335</td> </tr> <tr><td>1991</td> <td>59,660</td> <td>36,560</td> </tr> <tr><td>1992</td> <td>78,530</td> <td>41,055</td> </tr> <tr><td>1993</td> <td>78,834</td> <td>44,730</td> </tr> <tr><td>1994</td> <td>71,874</td> <td>49,095</td> </tr> <tr><td>1995</td> <td>68,505</td> <td>49,456</td> </tr> <tr><td>1996</td> <td>59,347</td> <td>38,510</td> </tr> <tr><td>1997</td> <td>47,149</td> <td>20,736</td> </tr> <tr><td>1998</td> <td>38,393</td> <td>19,005</td> </tr> <tr><td>1999</td> <td>25,174</td> <td>18,454</td> </tr> <tr><td>2000</td> <td>25,522</td> <td>17,347</td> </tr> <tr><td>2001</td> <td>25,643</td> <td>17,402</td> </tr> <tr><td>2002</td> <td>26,464</td> <td>16,371</td> </tr> <tr><td><strong>Total</strong></td> <td><strong>802,118</strong></td> 
<td><strong>489,093</strong></td> </tr> </tbody> </table> <div data-type="exercise"><div id="id3310573" data-type="problem"><p>Use the columns &#8220;year&#8221; and &#8220;# flu cases diagnosed.&#8221; Why is &#8220;year&#8221; the independent variable and &#8220;# flu cases diagnosed&#8221; the dependent variable (instead of the reverse)?</p> </div> <div id="fs-idm135570288" data-type="solution"><p id="fs-idm96391920">The number of flu cases depends on the year. Therefore, year becomes the independent variable and the number of flu cases is the dependent variable.</p> </div> </div> <p><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next two exercises</em>. A specialty cleaning company charges an equipment fee and an hourly labor fee. A linear equation that expresses the total amount of the fee the company charges for each session is <em data-effect="italics">y</em> = 50 + 100<em data-effect="italics">x</em>.</p> <div data-type="exercise"><div id="eip-319" data-type="problem"><p id="eip-869a">What are the independent and dependent variables?</p> </div> </div> <div data-type="exercise"><div id="eip-782" data-type="problem"><p>What is the <em data-effect="italics">y</em>-intercept and what is the slope? Interpret them using complete sentences.</p> </div> <div data-type="solution"><p>The <em data-effect="italics">y</em>-intercept is 50 (<em data-effect="italics">a</em> = 50). At the start of the cleaning, the company charges a one-time fee of \$50 (this is when <em data-effect="italics">x</em> = 0). The slope is 100 (<em data-effect="italics">b</em> = 100). In each session, the company charges \$100 for each hour of cleaning.</p> </div> </div> <p><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next three exercises</em>. Due to erosion, a river shoreline is losing several thousand pounds of soil each year. 
A linear equation that expresses the total amount of soil lost per year is <em data-effect="italics">y</em> = 12,000<em data-effect="italics">x</em>.</p> <div data-type="exercise"><div data-type="problem"><p id="eip-161a">What are the independent and dependent variables?</p> </div> </div> <div id="eip-160" data-type="exercise"><div id="eip-362" data-type="problem"><p id="eip-149">How many pounds of soil does the shoreline lose in a year?</p> </div> <div data-type="solution"><p>12,000 pounds of soil</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-924a">What is the <em data-effect="italics">y</em>-intercept? Interpret its meaning.</p> </div> </div> <p><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next two exercises</em>. The price of a single issue of stock can fluctuate throughout the day. A linear equation that represents the price of stock for Shipment Express is <em data-effect="italics">y</em> = 15 – 1.5<em data-effect="italics">x</em> where <em data-effect="italics">x</em> is the number of hours passed in an eight-hour day of trading.</p> <div data-type="exercise"><div data-type="problem"><p>What are the slope and <em data-effect="italics">y</em>-intercept? Interpret their meaning.</p> </div> <div data-type="solution"><p>The slope is –1.5 (<em data-effect="italics">b</em> = –1.5). This means the stock is losing value at a rate of \$1.50 per hour. The <em data-effect="italics">y</em>-intercept is \$15 (<em data-effect="italics">a</em> = 15). This means the price of stock before the trading day was \$15.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>If you owned this stock, would you want a positive or negative slope? 
Why?</p> </div> </div> </div> <div id="fs-idm70124224" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div data-type="exercise"><div id="id44884101" data-type="problem"><p>1) For each of the following situations, state the independent variable and the dependent variable.</p> <ol type="a"><li>A study is done to determine if elderly drivers are involved in more motor vehicle fatalities than other drivers. The number of fatalities per 100,000 drivers is compared to the age of drivers.</li> <li>A study is done to determine if the weekly grocery bill changes based on the number of family members.</li> <li>Insurance companies base life insurance premiums partially on the age of the applicant.</li> <li>Utility bills vary according to power consumption.</li> <li>A study is done to determine if a higher education reduces the crime rate in a population.</li> </ol> <p>&nbsp;</p> </div> <div id="id44884342" data-type="solution"></div> </div> <div id="fs-idm114941024" data-type="exercise"><div id="fs-idm130118464" data-type="problem"><p id="fs-idm128829552">2) Piece-rate systems are widely debated incentive payment plans. In a recent study of loan officer effectiveness, the following piece-rate system was examined:</p> <table id="fs-idm107180912" summary=".."><caption> </caption> <tbody><tr><td>% of goal reached</td> <td>&lt; 80</td> <td>80</td> <td>100</td> <td>120</td> </tr> <tr><td>Incentive</td> <td>n/a</td> <td>\$4,000 with an additional \$125 added per percentage point from 81–99%</td> <td>\$6,500 with an additional \$125 added per percentage point from 101–119%</td> <td>\$9,500 with an additional \$125 added per percentage point starting at 121%</td> </tr> </tbody> </table> <p id="fs-idm114950432">If a loan officer makes 95% of his or her goal, write the linear function that applies based on the incentive plan table. 
In context, explain the <em data-effect="italics">y</em>-intercept and slope.</p> <p><strong>Answers to odd questions</strong></p> <p>1)</p> <ol id="fs-idm263168" type="a"><li>independent variable: age; dependent variable: fatalities</li> <li>independent variable: number of family members; dependent variable: grocery bill</li> <li>independent variable: age of applicant; dependent variable: insurance premium</li> <li>independent variable: power consumption; dependent variable: utility bill</li> <li>independent variable: higher education (years); dependent variable: crime rate</li> </ol> </div> </div> </div> </div></div>
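<p>The slope-intercept form <em data-effect="italics">y</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em> from this chapter can be sketched in a few lines of Python. This is only an illustration, assuming the SCUBA rental line <em data-effect="italics">y</em> = 25 + 12.50<em data-effect="italics">x</em> shown in the practice graph; the function name <code>total_fee</code> and its parameter names are hypothetical and do not come from the text.</p>

```python
# Hypothetical helper: the name total_fee and its default values are
# illustrative assumptions based on the SCUBA line y = 25 + 12.50x.
def total_fee(hours, intercept=25.00, slope=12.50):
    """Return y = a + b*x, the total fee for x hours of rental."""
    return intercept + slope * hours


# Tabulate the line for whole-hour rentals from 0 to 4 hours.
fees = {hours: total_fee(hours) for hours in range(5)}
for hours, fee in fees.items():
    print(f"x = {hours} hours, y = {fee:.2f} dollars")
```

<p>At <em data-effect="italics">x</em> = 0 the function returns the <em data-effect="italics">y</em>-intercept, and each additional hour adds the slope. Passing a different intercept and slope, for example 10 and 5 for the late-payment line <em data-effect="italics">y</em> = 10 + 5<em data-effect="italics">x</em>, reuses the same sketch for another linear equation.</p>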
<div class="chapter standard" id="chapter-scatter-plots" title="Chapter 3.3: Scatter Plots"><div class="chapter-title-wrap"><h3 class="chapter-number">18</h3><h2 class="chapter-title"><span class="display-none">Chapter 3.3: Scatter Plots</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>Before we take up the discussion of linear regression and correlation, we need to examine a way to display the relation between two variables <em data-effect="italics">x</em> and <em data-effect="italics">y</em>. The most common and easiest way is a <strong>scatter plot</strong>. The following example illustrates a scatter plot.</p> <div id="element-777" class="textbox textbox--examples" data-type="example"><p>In Europe and Asia, m-commerce is popular. M-commerce users have special mobile phones that work like electronic wallets as well as provide phone and Internet services. Users can do everything from paying for parking to buying a TV set or soda from a machine to banking to checking sports scores on the Internet. For the years 2000 through 2004, was there a relationship between the year and the number of m-commerce users? Construct a scatter plot. Let <em data-effect="italics">x</em> = the year and let <em data-effect="italics">y</em> = the number of m-commerce users, in millions.</p> <table id="linrgs_scater1" summary=""><caption>Table showing the number of m-commerce users (in millions) by year.</caption> <thead><tr><th>\(x\) (year)</th> <th>\(y\) (# of users)</th> </tr> </thead> <tbody><tr><td>2000</td> <td>0.5</td> </tr> <tr><td>2002</td> <td>20.0</td> </tr> <tr><td>2003</td> <td>33.0</td> </tr> <tr><td>2004</td> <td>47.0</td> </tr> </tbody> </table> <div id="linrgs_scater12" class="bc-figure figure"><div class="bc-figcaption figcaption">Scatter plot showing the number of m-commerce users (in millions) by year.</div> <p><span id="id1171452159644" data-type="media" data-alt="This is a scatter plot for the data provided. 
The x-axis represents the year and the y-axis represents the number of m-commerce users in millions. There are four points plotted, at (2000, 0.5), (2002, 20.0), (2003, 33.0), (2004, 47.0)."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch12_04_01-1.jpg" alt="This is a scatter plot for the data provided. The x-axis represents the year and the y-axis represents the number of m-commerce users in millions. There are four points plotted, at (2000, 0.5), (2002, 20.0), (2003, 33.0), (2004, 47.0)." width="380" data-media-type="image/jpeg" /></span></p> </div> </div> <div id="fs-idp52837344" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p>To create a scatter plot on a graphing calculator:</p> <ol id="fs-idp71144528"><li>Enter your X data into list L1 and your Y data into list L2.</li> <li>Press 2nd STATPLOT ENTER to use Plot 1. On the input screen for PLOT 1, highlight On and press ENTER. (Make sure the other plots are OFF.)</li> <li>For TYPE: highlight the very first icon, which is the scatter plot, and press ENTER.</li> <li>For Xlist:, enter L1 and press ENTER; for Ylist:, enter L2 and press ENTER.</li> <li>For Mark: it does not matter which symbol you highlight, but the square is the easiest to see. Press ENTER.</li> <li>Make sure there are no other equations that could be plotted. Press Y = and clear any equations out.</li> <li>Press the ZOOM key and then the number 9 (for menu item &#8220;ZoomStat&#8221;); the calculator will fit the window to the data. You can press WINDOW to see the scaling of the axes.</li> </ol> </div> <div id="fs-idm16826784" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p>Amelia plays basketball for her high school. She wants to improve to play at the college level. 
She notices that the number of points she scores in a game goes up in response to the number of hours she practices her jump shot each week. She records the following data:</p> <table id="eip-idp1116432" summary=""><thead><tr><th><em data-effect="italics">X</em> (hours practicing jump shot)</th> <th><em data-effect="italics">Y</em> (points scored in a game)</th> </tr> </thead> <tbody><tr><td>5</td> <td>15</td> </tr> <tr><td>7</td> <td>22</td> </tr> <tr><td>9</td> <td>28</td> </tr> <tr><td>10</td> <td>31</td> </tr> <tr><td>11</td> <td>33</td> </tr> <tr><td>12</td> <td>36</td> </tr> </tbody> </table> <p id="eip-idm36661536">Construct a scatter plot and state if what Amelia thinks appears to be true.</p> </div> </div> </div> <p>A scatter plot shows the <strong>direction</strong> of a relationship between the variables. A clear direction happens when there is either:</p> <ul><li>High values of one variable occurring with high values of the other variable or low values of one variable occurring with low values of the other variable.</li> <li>High values of one variable occurring with low values of the other variable.</li> </ul> <p>You can determine the <strong data-effect="bold">strength</strong> of the relationship by looking at the scatter plot and seeing how close the points are to a line, a power function, an exponential function, or to some other type of function. For a linear relationship there is an exception. Consider a scatter plot where all the points fall on a horizontal line providing a &#8220;perfect fit.&#8221; The horizontal line would in fact show no relationship.</p> <p>When you look at a scatterplot, you want to notice the <strong>overall pattern</strong> and any <strong>deviations</strong> from the pattern. The following scatterplot examples illustrate these concepts.</p> <div id="lingrgs10" class="bc-figure figure"><span id="id1171450550347" data-type="media" data-alt="The first graph is a scatter plot with 6 points plotted. 
The points form a pattern that moves upward to the right, almost in a straight line. The second graph is a scatter plot with the same 6 points as the first graph. A 7th point is plotted in the top left corner of the quadrant. It falls outside the general pattern set by the other 6 points."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch12_04_02-1.jpg" alt="The first graph is a scatter plot with 6 points plotted. The points form a pattern that moves upward to the right, almost in a straight line. The second graph is a scatter plot with the same 6 points as the first graph. A 7th point is plotted in the top left corner of the quadrant. It falls outside the general pattern set by the other 6 points." width="380" data-media-type="image/jpeg" /></span></div> <div id="lingrgs20" class="bc-figure figure"><span id="id1171453496318" data-type="media" data-alt="The first graph is a scatter plot with 6 points plotted. The points form a pattern that moves downward to the right, almost in a straight line. The second graph is a scatter plot of 8 points. These points form a general downward pattern, but the points do not align in a tight pattern."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch12_04_03-1.jpg" alt="The first graph is a scatter plot with 6 points plotted. The points form a pattern that moves downward to the right, almost in a straight line. The second graph is a scatter plot of 8 points. These points form a general downward pattern, but the points do not align in a tight pattern." width="380" data-media-type="image/jpeg" /></span></div> <div id="lingrgs30" class="bc-figure figure"><span id="id1171454842074" data-type="media" data-alt="The first graph is a scatter plot of 7 points in an exponential pattern. The pattern of the points begins along the x-axis and curves steeply upward to the right side of the quadrant. 
The second graph shows a scatter plot with many points scattered everywhere, exhibiting no pattern."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch12_04_04-1.jpg" alt="The first graph is a scatter plot of 7 points in an exponential pattern. The pattern of the points begins along the x-axis and curves steeply upward to the right side of the quadrant. The second graph shows a scatter plot with many points scattered everywhere, exhibiting no pattern." width="380" data-media-type="image/jpeg" /></span></div> <p>In this chapter, we are interested in scatter plots that show a linear pattern. Linear patterns are quite common. The linear relationship is strong if the points are close to a straight line, except in the case of a horizontal line where there is no relationship. If we think that the points show a linear relationship, we would like to draw a line on the scatter plot. This line can be calculated through a process called <span data-type="term">linear regression</span>. However, we only calculate a regression line if one of the variables helps to explain or predict the other variable. If <em data-effect="italics">x</em> is the independent variable and <em data-effect="italics">y</em> the dependent variable, then we can use a regression line to predict <em data-effect="italics">y</em> for a given value of <em data-effect="italics">x</em>.</p> <div class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm18585216">Scatter plots are particularly helpful graphs when we want to see if there is a linear relationship among data points. They indicate both the direction of the relationship between the <em data-effect="italics">x</em> variable and the <em data-effect="italics">y</em> variable, and the strength of the relationship. 
We calculate the strength of the relationship between an independent variable and a dependent variable using linear regression.</p> </div> <div class="practice" data-depth="1"><div data-type="exercise"><div data-type="problem"><p id="eip-780">Does the scatter plot appear linear? Strong or weak? Positive or negative?</p> <div id="fs-idp130114112" class="bc-figure figure"><span id="eip-idm143400624" data-type="media" data-alt="This is a scatterplot with several points plotted in the first quadrant. The points form a clear pattern, moving upward to the right. The points do not line up, but the overall pattern can be modeled with a line." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C012_M04_item001-1.jpg" alt="This is a scatterplot with several points plotted in the first quadrant. The points form a clear pattern, moving upward to the right. The points do not line up, but the overall pattern can be modeled with a line." width="450" data-media-type="image/jpeg" /></span></div> </div> <div data-type="solution"><p>The data appear to be linear with a strong, positive correlation.</p> </div> </div> <div id="eip-281" data-type="exercise"><div id="eip-986" data-type="problem"><p>Does the scatter plot appear linear? Strong or weak? Positive or negative?</p> <div id="fs-idp68059856" class="bc-figure figure"><span id="eip-idp14193472" data-type="media" data-alt="This is a scatterplot with several points plotted in the first quadrant. The points move downward to the right. The overall pattern can be modeled with a line, but the points are widely scattered." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C012_M04_item002-1.jpg" alt="This is a scatterplot with several points plotted in the first quadrant. The points move downward to the right. 
The overall pattern can be modeled with a line, but the points are widely scattered." width="450" data-media-type="image/jpeg" /></span></div> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Does the scatter plot appear linear? Strong or weak? Positive or negative?</p> <div id="fs-idp63135408" class="bc-figure figure"><span id="eip-idp1376736" data-type="media" data-alt="This is a scatter plot with several points plotted all over the first quadrant. There is no pattern." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C012_M04_item003-1.jpg" alt="This is a scatter plot with several points plotted all over the first quadrant. There is no pattern." width="450" data-media-type="image/jpeg" /></span></div> </div> <div id="eip-921" data-type="solution"><p>The data appear to have no correlation.</p> </div> </div> </div> <div id="fs-idm75649888" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div id="fs-idp128138576" data-type="exercise"><div id="fs-idp35640784" data-type="problem"><p id="fs-idp63162736">1) The Gross Domestic Product Purchasing Power Parity is an indication of a country’s currency value compared to another country. <a class="autogenerated-content" href="#fs-idm59733200">(Figure)</a> shows the GDP PPP of Cuba as compared to US dollars. 
Construct a scatter plot of the data.</p> <table id="fs-idm59733200" summary=".."><caption> </caption> <thead><tr><th>Year</th> <th>Cuba’s PPP</th> <th>Year</th> <th>Cuba’s PPP</th> </tr> </thead> <tbody><tr><td>1999</td> <td>1,700</td> <td>2006</td> <td>4,000</td> </tr> <tr><td>2000</td> <td>1,700</td> <td>2007</td> <td>11,000</td> </tr> <tr><td>2002</td> <td>2,300</td> <td>2008</td> <td>9,500</td> </tr> <tr><td>2003</td> <td>2,900</td> <td>2009</td> <td>9,700</td> </tr> <tr><td>2004</td> <td>3,000</td> <td>2010</td> <td>9,900</td> </tr> <tr><td>2005</td> <td>3,500</td> <td></td> <td></td> </tr> </tbody> </table> </div> <div id="fs-idm27096576" data-type="solution"><p id="fs-idm47595680"></p></div> </div> <div id="fs-idm75154176" data-type="exercise"><div id="fs-idm56634624" data-type="problem"><p id="fs-idp103485488">2) The following table shows the poverty rates and cell phone usage in the United States. Construct a scatter plot of the data.</p> <table id="fs-idm6414736" summary=".."><thead><tr><th>Year</th> <th>Poverty Rate</th> <th>Cellular Usage per Capita</th> </tr> </thead> <tbody><tr><td>2003</td> <td>12.7</td> <td>54.67</td> </tr> <tr><td>2005</td> <td>12.6</td> <td>74.19</td> </tr> <tr><td>2007</td> <td>12</td> <td>84.86</td> </tr> <tr><td>2009</td> <td>12</td> <td>90.82</td> </tr> </tbody> </table> <p>&nbsp;</p> </div> </div> <div id="fs-idp83642048" data-type="exercise"><div id="fs-idp127740240" data-type="problem"><p id="fs-idm35544512">3) Does the higher cost of tuition translate into higher-paying jobs? The table lists the top ten colleges based on mid-career salary and the associated yearly tuition costs. 
Construct a scatter plot of the data.</p> <table id="fs-idp39411984" summary=".."><caption> </caption> <thead><tr><th>School</th> <th>Mid-Career Salary (in thousands)</th> <th>Yearly Tuition</th> </tr> </thead> <tbody><tr><td>Princeton</td> <td>137</td> <td>28,540</td> </tr> <tr><td>Harvey Mudd</td> <td>135</td> <td>40,133</td> </tr> <tr><td>CalTech</td> <td>127</td> <td>39,900</td> </tr> <tr><td>US Naval Academy</td> <td>122</td> <td>0</td> </tr> <tr><td>West Point</td> <td>120</td> <td>0</td> </tr> <tr><td>MIT</td> <td>118</td> <td>42,050</td> </tr> <tr><td>Lehigh University</td> <td>118</td> <td>43,220</td> </tr> <tr><td>NYU-Poly</td> <td>117</td> <td>39,565</td> </tr> <tr><td>Babson College</td> <td>117</td> <td>40,400</td> </tr> <tr><td>Stanford</td> <td>114</td> <td>54,506</td> </tr> </tbody> </table> </div> <div id="fs-idp37956800" data-type="solution"><p id="fs-idp99428928"></p></div> </div> <div id="eip-201" data-type="exercise"><div data-type="problem"><p>4) If the level of significance is 0.05 and the <em data-effect="italics">p</em>-value is 0.06, what conclusion can you draw?</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="eip-idm68833888" data-type="problem"><p id="eip-idm163228576">5) If there are 15 data points in a set of data, what is the number of degrees of freedom?</p> </div> <div id="eip-idm194939824" data-type="solution"><p><strong>Answers to odd questions</strong></p> <p>3) Note that tuition is the independent variable and salary is the dependent variable.</p> <p>5) 13</p> </div> </div> </div> </div></div>
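<p>The idea of direction described in Chapter 3.3 can be checked numerically as well as by eye. The sketch below uses Amelia's practice data from the Try It exercise and adds up the products of each point's deviations from the mean of <em data-effect="italics">x</em> and the mean of <em data-effect="italics">y</em>. This particular measure and the function name <code>direction</code> are assumptions chosen for illustration; the chapter itself only asks you to judge direction from the plot.</p>

```python
from statistics import mean

# Amelia's data from the Try It exercise.
hours = [5, 7, 9, 10, 11, 12]      # x: hours practicing jump shot
points = [15, 22, 28, 31, 33, 36]  # y: points scored in a game


def direction(xs, ys):
    """Classify the direction of a relationship (illustrative helper).

    High x values paired with high y values give positive products of
    deviations; high x values paired with low y values give negative ones.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if total > 0:
        return "positive"
    if total == 0:
        return "none"
    return "negative"


print(direction(hours, points))
```

<p>A set of points lying on a horizontal line gives a total of zero, which agrees with the chapter's remark that a horizontal line shows no relationship. The sketch reports direction only; it says nothing about how tightly the points cluster around a line.</p>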
<div class="chapter standard" id="chapter-the-regression-equation" title="Chapter 3.4: The Regression Equation"><div class="chapter-title-wrap"><h3 class="chapter-number">19</h3><h2 class="chapter-title"><span class="display-none">Chapter 3.4: The Regression Equation</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>Data rarely fit a straight line exactly. Usually, you must be satisfied with rough predictions. Typically, you have a set of data whose scatter plot appears to <strong>&#8220;fit&#8221;</strong> a straight line. The line that fits such data most closely is called a <span data-type="term">Line of Best Fit</span> <strong>or</strong> <span data-type="term">Least-Squares Line</span>.</p> <div id="fs-idm66386384" class="statistics collab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Exercise</div> <p id="element-900">If you know a person&#8217;s pinky (smallest) finger length, do you think you could predict that person&#8217;s height? Collect data from your class (pinky finger length and height, in inches). The independent variable, <em data-effect="italics">x</em>, is pinky finger length and the dependent variable, <em data-effect="italics">y</em>, is height. For each set of data, plot the points on graph paper. Make your graph big enough and <strong>use a ruler</strong>. Then &#8220;by eye&#8221; draw a line that appears to &#8220;fit&#8221; the data. For your line, pick two convenient points and use them to find the slope of the line. Find the <em data-effect="italics">y</em>-intercept of the line by extending your line so it crosses the <em data-effect="italics">y</em>-axis. Using the slope and the <em data-effect="italics">y</em>-intercept, write your equation of &#8220;best fit.&#8221; Do you think everyone will have the same equation? Why or why not? 
According to your equation, what is the predicted height for a pinky length of 2.5 inches?</p> </div> <div class="textbox textbox--examples" data-type="example"><p id="element-998">A random sample of 11 statistics students produced the following data, where <em data-effect="italics">x</em> is the third exam score out of 80, and <em data-effect="italics">y</em> is the final exam score out of 200. Can you predict the final exam score of a random student if you know the third exam score?</p> <table summary=""><caption>Table showing the scores on the final exam based on scores from the third exam.</caption> <thead><tr><th>x (third exam score)</th> <th>y (final exam score)</th> </tr> </thead> <tbody><tr><td>65</td> <td>175</td> </tr> <tr><td>67</td> <td>133</td> </tr> <tr><td>71</td> <td>185</td> </tr> <tr><td>71</td> <td>163</td> </tr> <tr><td>66</td> <td>126</td> </tr> <tr><td>75</td> <td>198</td> </tr> <tr><td>67</td> <td>153</td> </tr> <tr><td>70</td> <td>163</td> </tr> <tr><td>71</td> <td>159</td> </tr> <tr><td>69</td> <td>151</td> </tr> <tr><td>69</td> <td>159</td> </tr> </tbody> </table> <div id="eip-idm440920048" class="bc-figure figure"><div class="bc-figcaption figcaption">Scatter plot showing the scores on the final exam based on scores from the third exam.</div> <p><span id="id1164262330756" data-type="media" data-alt="This is a scatter plot of the data provided. The third exam score is plotted on the x-axis, and the final exam score is plotted on the y-axis. The points form a strong, positive, linear pattern."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch12_05_01-1.jpg" alt="This is a scatter plot of the data provided. The third exam score is plotted on the x-axis, and the final exam score is plotted on the y-axis. The points form a strong, positive, linear pattern." 
width="380" data-media-type="image/jpeg" /></span></p> </div> </div> <div id="fs-idp122176240" class="statistics try finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-397" data-type="exercise"><div data-type="problem"><p>SCUBA divers have maximum dive times they cannot exceed when going to different depths. The data in <a class="autogenerated-content" href="#eip-idm70490496">(Figure)</a> show different depths with the maximum dive times in minutes. Use your calculator to find the least squares regression line and predict the maximum dive time for 110 feet.</p> <table id="eip-idm70490496" summary=""><thead><tr><th><em data-effect="italics">X</em> (depth in feet)</th> <th><em data-effect="italics">Y</em> (maximum dive time)</th> </tr> </thead> <tbody><tr><td>50</td> <td>80</td> </tr> <tr><td>60</td> <td>55</td> </tr> <tr><td>70</td> <td>45</td> </tr> <tr><td>80</td> <td>35</td> </tr> <tr><td>90</td> <td>25</td> </tr> <tr><td>100</td> <td>22</td> </tr> </tbody> </table> </div> </div> </div> <p>The third exam score, <em data-effect="italics">x</em>, is the independent variable and the final exam score, <em data-effect="italics">y</em>, is the dependent variable. We will plot a regression line that best &#8220;fits&#8221; the data. If each of you were to fit a line &#8220;by eye,&#8221; you would draw different lines. We can use what is called a <span data-type="term">least-squares regression line</span> to obtain the best fit line.</p> <p>Consider the following diagram. 
Each data point has the form (<em data-effect="italics">x</em>, <em data-effect="italics">y</em>), and each point on the line of best fit obtained by least-squares linear regression has the form (<em data-effect="italics">x</em>, <em data-effect="italics">ŷ</em>).</p> <p>The <em data-effect="italics">ŷ</em> is read <strong>&#8220;<em data-effect="italics">y</em> hat&#8221;</strong> and is the <strong>estimated value of <em data-effect="italics">y</em></strong>. It is the value of <em data-effect="italics">y</em> obtained using the regression line. It is generally not equal to the <em data-effect="italics">y</em> observed in the data.</p> <div id="linrgs_regeq2" class="bc-figure figure"><span id="id1164271221679" data-type="media" data-alt="The scatter plot of exam scores with a line of best fit. One data point is highlighted along with the corresponding point on the line of best fit. Both points have the same x-coordinate. The distance between these two points illustrates how to compute the sum of squared errors."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch12_05_02-1.jpg" alt="The scatter plot of exam scores with a line of best fit. One data point is highlighted along with the corresponding point on the line of best fit. Both points have the same x-coordinate. The distance between these two points illustrates how to compute the sum of squared errors." width="380" data-media-type="image/jpeg" /></span></div> <p>The term <em data-effect="italics">y</em><sub>0</sub> – <em data-effect="italics">ŷ</em><sub>0</sub> = <em data-effect="italics">ε</em><sub>0</sub> is called the <strong>&#8220;error&#8221; or</strong> <span data-type="term">residual</span>. It is not an error in the sense of a mistake. The <span data-type="term">absolute value of a residual</span> measures the vertical distance between the actual value of <em data-effect="italics">y</em> and the estimated value of <em data-effect="italics">y</em>. 
In other words, it measures the vertical distance between the actual data point and the predicted point on the line.</p> <p>If the observed data point lies above the line, the residual is positive, and the line underestimates the actual data value for <em data-effect="italics">y</em>. If the observed data point lies below the line, the residual is negative, and the line overestimates the actual data value for <em data-effect="italics">y</em>.</p> <p>In the diagram in <a class="autogenerated-content" href="#linrgs_regeq2">(Figure)</a>, <em data-effect="italics">y</em><sub>0</sub> – <em data-effect="italics">ŷ</em><sub>0</sub> = <em data-effect="italics">ε</em><sub>0</sub> is the residual for the point shown. Here the point lies above the line and the residual is positive.</p> <p><em data-effect="italics">ε</em> = the Greek letter <strong>epsilon</strong></p> <p>For each data point, you can calculate the residuals or errors, <em data-effect="italics">y</em><sub>i</sub> &#8211; <em data-effect="italics">ŷ</em><sub>i</sub> = <em data-effect="italics">ε</em><sub>i</sub> for <em data-effect="italics">i</em> = 1, 2, 3, &#8230;, 11.</p> <p id="element-670">Each |<em data-effect="italics">ε</em>| is a vertical distance.</p> <p>For the example about the third exam scores and the final exam scores for the 11 statistics students, there are 11 data points. Therefore, there are 11 <em data-effect="italics">ε</em> values. If you square each <em data-effect="italics">ε</em> and add, you get</p> <p>\({\left({\epsilon }_{1}\right)}^{2}+{\left({\epsilon }_{2}\right)}^{2}+\dots +{\left({\epsilon }_{11}\right)}^{2}=\sum _{i=1}^{11}{\epsilon }_{i}^{2}\)</p> <p>This is called the <span data-type="term">Sum of Squared Errors (SSE)</span>.</p> <p>Using calculus, you can determine the values of the intercept <em data-effect="italics">a</em> and the slope <em data-effect="italics">b</em> that make the <strong>SSE</strong> a minimum. 
When the <strong>SSE</strong> is at its minimum, you have determined the line of best fit. It turns out that the line of best fit has the equation:</p> <div data-type="equation">\(\hat{y}=a+bx\)</div> <p>where \(a=\overline{y}-b\overline{x}\) and \(b=\frac{\Sigma \left(x-\overline{x}\right)\left(y-\overline{y}\right)}{\Sigma {\left(x-\overline{x}\right)}^{2}}\).</p> <p>The sample means of the <em data-effect="italics">x</em> values and the <em data-effect="italics">y</em> values are \(\overline{x}\) and \(\overline{y}\), respectively. The best-fit line always passes through the point \(\left(\overline{x},\overline{y}\right)\).</p> <p>The slope <em data-effect="italics">b</em> can be written as \(b=r\left(\frac{{s}_{y}}{{s}_{x}}\right)\) where <em data-effect="italics">s</em><sub><em data-effect="italics">y</em></sub> = the standard deviation of the <em data-effect="italics">y</em> values and <em data-effect="italics">s</em><sub><em data-effect="italics">x</em></sub> = the standard deviation of the <em data-effect="italics">x</em> values. Here <em data-effect="italics">r</em> is the correlation coefficient, which is discussed later in this chapter.</p> <div id="fs-idm120169264" class="bc-section section" data-depth="1"><h3 data-type="title">Least Squares Criteria for Best Fit</h3> <p>The process of fitting the best-fit line is called <strong>linear regression</strong>. The idea behind finding the best-fit line is based on the assumption that the data are scattered about a straight line. The criterion for the best-fit line is that the sum of the squared errors (SSE) is minimized, that is, made as small as possible. Any other line you might choose would have a higher SSE than the best-fit line. 
This best-fit line is called the <strong>least-squares regression line</strong>.</p> <div id="id1164273503037" class="finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idp15798576">Computer spreadsheets, statistical software, and many calculators can quickly calculate the best-fit line and create the graphs. The calculations tend to be tedious if done by hand. Instructions to use the TI-83, TI-83+, and TI-84+ calculators to find the best-fit line and create a scatterplot are shown at the end of this section.</p> </div> <p><span data-type="title">THIRD EXAM vs FINAL EXAM EXAMPLE:</span> The graph of the line of best fit for the third-exam/final-exam example is as follows:</p> <div id="linrgs_regeq3" class="bc-figure figure"><span id="id1164250764889" data-type="media" data-alt="The scatter plot of exam scores with a line of best fit. One data point is highlighted along with the corresponding point on the line of best fit."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch12_05_03-1.jpg" alt="The scatter plot of exam scores with a line of best fit. One data point is highlighted along with the corresponding point on the line of best fit." width="380" data-media-type="image/jpeg" /></span></div> <p>The least squares regression line (best-fit line) for the third-exam/final-exam example has the equation:</p> <div data-type="equation">\(\hat{y}=-173.51+4.83x\)</div> <div data-type="note" data-has-label="true" data-label=""><div data-type="title">Reminder</div> <p id="fs-idp965680">Remember, it is always important to plot a scatter diagram first. 
If the scatter plot indicates that there is a linear relationship between the variables, then it is reasonable to use a best fit line to make predictions for <em data-effect="italics">y</em> given <em data-effect="italics">x</em> within the domain of <em data-effect="italics">x</em>-values in the sample data, <strong>but not necessarily for <em data-effect="italics">x</em>-values outside that domain.</strong> You could use the line to predict the final exam score for a student who earned a grade of 73 on the third exam. You should NOT use the line to predict the final exam score for a student who earned a grade of 50 on the third exam, because 50 is not within the domain of the <em data-effect="italics">x</em>-values in the sample data, which are between 65 and 75.</p> </div> </div> <div id="fs-idp153906304" class="bc-section section" data-depth="1"><h3 data-type="title">UNDERSTANDING SLOPE</h3> <p>The slope of the line, <em data-effect="italics">b</em>, describes how changes in the variables are related. It is important to interpret the slope of the line in the context of the situation represented by the data. You should be able to write a sentence interpreting the slope in plain English.</p> <p><strong>INTERPRETATION OF THE SLOPE:</strong> The slope of the best-fit line tells us how the dependent variable (<em data-effect="italics">y</em>) changes for every one unit increase in the independent (<em data-effect="italics">x</em>) variable, on average.</p> <p id="fs-idm33827856"><span data-type="title">THIRD EXAM vs FINAL EXAM EXAMPLE </span>Slope: The slope of the line is <em data-effect="italics">b</em> = 4.83. 
<span data-type="newline"><br /> </span>Interpretation: For a one-point increase in the score on the third exam, the final exam score increases by 4.83 points, on average.</p> <div id="fs-idm68284544" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p>Using the Linear Regression T Test: LinRegTTest</p> <p>&nbsp;</p> <ol type="1"><li>In the STAT list editor, enter the X data in list L1 and the Y data in list L2, paired so that the corresponding (<em data-effect="italics">x</em>, <em data-effect="italics">y</em>) values are next to each other in the lists. (If a particular pair of values is repeated, enter it as many times as it appears in the data.)</li> <li>On the STAT TESTS menu, scroll down with the cursor to select the LinRegTTest. (Be careful to select LinRegTTest, as some calculators may also have a different item called LinRegTInt.)</li> <li>On the LinRegTTest input screen enter: Xlist: L1 ; Ylist: L2 ; Freq: 1</li> <li>On the next line, at the prompt <em data-effect="italics">β</em> or <em data-effect="italics">ρ</em>, highlight &#8220;≠ 0&#8221; and press ENTER</li> <li>Leave the line for &#8220;RegEq:&#8221; blank</li> <li>Highlight Calculate and press ENTER.</li> </ol> <div id="linregttestscreens" class="bc-figure figure"><span id="id53013501" data-type="media" data-alt="1. Image of calculator input screen for LinRegTTest with input matching the instructions above. 2. Image of corresponding calculator output screen for LinRegTTest: Output screen shows: Line 1. LinRegTTest; Line 2. y = a + bx; Line 3. beta does not equal 0 and rho does not equal 0; Line 4. t = 2.657560155; Line 5. df = 9; Line 6. a = –173.513363; Line 7. b = 4.827394209; Line 8. s = 16.41237711; Line 9. r squared = .4396931104; Line 10. r = .663093591"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch12_05_04-1.jpg" alt="1. 
Image of calculator input screen for LinRegTTest with input matching the instructions above. 2. Image of corresponding calculator output screen for LinRegTTest: Output screen shows: Line 1. LinRegTTest; Line 2. y = a + bx; Line 3. beta does not equal 0 and rho does not equal 0; Line 4. t = 2.657560155; Line 5. df = 9; Line 6. a = –173.513363; Line 7. b = 4.827394209; Line 8. s = 16.41237711; Line 9. r squared = .4396931104; Line 10. r = .663093591" width="380" data-media-type="image/jpeg" /></span></div> <p>The output screen contains a lot of information. For now we will focus on a few items from the output, and will return later to the other items. <span data-type="newline"><br /> </span>The second line says <em data-effect="italics">y</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em>. Scroll down to find the values <em data-effect="italics">a</em> = –173.513, and <em data-effect="italics">b</em> = 4.8273; the equation of the best fit line is <em data-effect="italics">ŷ</em> = –173.51 + 4.83<em data-effect="italics">x</em> <span data-type="newline"><br /> </span>The two items at the bottom are <em data-effect="italics">r</em><sup>2</sup> = 0.43969 and <em data-effect="italics">r</em> = 0.663. 
For now, just note where to find these values; we will discuss them in the next two sections.</p> <p>Graphing the Scatterplot and Regression Line</p> <p>&nbsp;</p> <ol type="1"><li>We are assuming your X data is already entered in list L1 and your Y data is in list L2</li> <li>Press 2nd STATPLOT ENTER to use Plot 1</li> <li>On the input screen for PLOT 1, highlight <strong>On</strong>, and press ENTER</li> <li>For TYPE: highlight the very first icon which is the scatterplot and press ENTER</li> <li>Indicate Xlist: L1 and Ylist: L2</li> <li>For Mark: it does not matter which symbol you highlight.</li> <li>Press the ZOOM key and then the number 9 (for menu item &#8220;ZoomStat&#8221;) ; the calculator will fit the window to the data</li> <li>To graph the best-fit line, press the &#8220;Y=&#8221; key and type the equation –173.5 + 4.83X into equation Y1. (The X key is immediately left of the STAT key). Press ZOOM 9 again to graph it.</li> <li>Optional: If you want to change the viewing window, press the WINDOW key. Enter your desired window using Xmin, Xmax, Ymin, Ymax</li> </ol> </div> <div id="fs-idm135623232" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idm70635232">Another way to graph the line after you create a scatter plot is to use LinRegTTest.</p> <ol id="eip-idm46438656" type="1"><li>Make sure you have done the scatter plot. Check it on your screen.</li> <li>Go to LinRegTTest and enter the lists.</li> <li>At RegEq: press VARS and arrow over to Y-VARS. Press 1 for 1:Function. Press 1 for 1:Y1. Then arrow down to Calculate and do the calculation for the line of best fit.</li> <li>Press Y = (you will see the regression equation).</li> <li>Press GRAPH. 
The line will be drawn.</li> </ol> </div> </div> <div class="bc-section section" data-depth="1"><h3 data-type="title">The Correlation Coefficient <em data-effect="italics">r</em></h3> <p id="element-12345">Besides looking at the scatter plot and seeing that a line seems reasonable, how can you tell if the line is a good predictor? Use the correlation coefficient as another indicator (besides the scatterplot) of the strength of the relationship between <em data-effect="italics">x</em> and <em data-effect="italics">y</em>.</p> <p id="eip-357">The <strong>correlation coefficient, <em data-effect="italics">r</em>, </strong> developed by Karl Pearson in the early 1900s, is numerical and provides a measure of strength and direction of the linear association between the independent variable <em data-effect="italics">x</em> and the dependent variable <em data-effect="italics">y</em>.</p> <p>The correlation coefficient is calculated as</p> <div id="eip-id4561593" data-type="equation">\(r=\frac{n\Sigma \left(xy\right)-\left(\Sigma x\right)\left(\Sigma y\right)}{\sqrt{\left[n\Sigma {x}^{2}-{\left(\Sigma x\right)}^{2}\right]\left[n\Sigma {y}^{2}-{\left(\Sigma y\right)}^{2}\right]}}\)</div> <p>where <em data-effect="italics">n</em> = the number of data points.</p> <p>If you suspect a linear relationship between <em data-effect="italics">x</em> and <em data-effect="italics">y</em>, then <em data-effect="italics">r</em> can measure how strong the linear relationship is.</p> <p id="fs-idm19432640"><span data-type="title">What the VALUE of <em data-effect="italics">r</em> tells us:</span></p> <ul><li>The value of <em data-effect="italics">r</em> is always between –1 and +1: –1 ≤ <em data-effect="italics">r</em> ≤ 1.</li> <li>The size of the correlation <em data-effect="italics">r</em> indicates the strength of the linear relationship between <em data-effect="italics">x</em> and <em data-effect="italics">y</em>. 
Values of <em data-effect="italics">r</em> close to –1 or to +1 indicate a stronger linear relationship between <em data-effect="italics">x</em> and <em data-effect="italics">y</em>.</li> <li>If <em data-effect="italics">r</em> = 0 there is likely no linear correlation. It is important to view the scatterplot, however, because data that exhibit a curved or horizontal pattern may have a correlation of 0.</li> <li>If <em data-effect="italics">r</em> = 1, there is perfect positive correlation. If <em data-effect="italics">r</em> = –1, there is perfect negative correlation. In both these cases, all of the original data points lie on a straight line. Of course, in the real world, this will not generally happen.</li> </ul> <p id="fs-idp154948544"><span data-type="title">What the SIGN of <em data-effect="italics">r</em> tells us</span></p> <ul><li>A positive value of <em data-effect="italics">r</em> means that when <em data-effect="italics">x</em> increases, <em data-effect="italics">y</em> tends to increase and when <em data-effect="italics">x</em> decreases, <em data-effect="italics">y</em> tends to decrease <strong>(positive correlation)</strong>.</li> <li>A negative value of <em data-effect="italics">r</em> means that when <em data-effect="italics">x</em> increases, <em data-effect="italics">y</em> tends to decrease and when <em data-effect="italics">x</em> decreases, <em data-effect="italics">y</em> tends to increase <strong>(negative correlation)</strong>.</li> <li>The sign of <em data-effect="italics">r</em> is the same as the sign of the slope, <em data-effect="italics">b</em>, of the best-fit line.</li> </ul> <div data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p>Strong correlation does not suggest that <em data-effect="italics">x</em> causes <em data-effect="italics">y</em> or <em data-effect="italics">y</em> causes <em data-effect="italics">x</em>. 
We say <strong>&#8220;correlation does not imply causation.&#8221;</strong></p> </div> <div id="linrgs_facts_pics" class="bc-figure figure"><div class="bc-figcaption figcaption">(a) A scatter plot showing data with a positive correlation. 0 &lt; <em data-effect="italics">r</em> &lt; 1 (b) A scatter plot showing data with a negative correlation. –1 &lt; <em data-effect="italics">r</em> &lt; 0 (c) A scatter plot showing data with zero correlation. <em data-effect="italics">r</em> = 0</div> <p><span id="eip-idp8185984" data-type="media" data-alt="Three scatter plots with lines of best fit. The first scatterplot shows points ascending from the lower left to the upper right. The line of best fit has positive slope. The second scatter plot shows points descending from the upper left to the lower right. The line of best fit has negative slope. The third scatter plot of points form a horizontal pattern. The line of best fit is a horizontal line."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch12_06_01-1.jpg" alt="Three scatter plots with lines of best fit. The first scatterplot shows points ascending from the lower left to the upper right. The line of best fit has positive slope. The second scatter plot shows points descending from the upper left to the lower right. The line of best fit has negative slope. The third scatter plot of points form a horizontal pattern. The line of best fit is a horizontal line." width="380" data-media-type="image/jpeg" /></span></p> </div> </div> <p>The formula for <em data-effect="italics">r</em> looks formidable. However, computer spreadsheets, statistical software, and many calculators can quickly calculate <em data-effect="italics">r</em>. 
The correlation coefficient <em data-effect="italics">r</em> is the bottom item in the output screens for the LinRegTTest on the TI-83, TI-83+, or TI-84+ calculator (see previous section for instructions).</p> <div class="bc-section section" data-depth="1"><h3 data-type="title">The Coefficient of Determination</h3> <p><strong>The variable <em data-effect="italics">r</em><sup>2</sup> is called the</strong> <span data-type="term">coefficient of determination</span> and is the square of the correlation coefficient, but is usually stated as a percent, rather than in decimal form. It has an interpretation in the context of the data:</p> <ul id="eip-728"><li>\({r}^{2}\), when expressed as a percent, represents the percent of variation in the dependent (predicted) variable <em data-effect="italics">y</em> that can be explained by variation in the independent (explanatory) variable <em data-effect="italics">x</em> using the regression (best-fit) line.</li> <li>1 – \({r}^{2}\), when expressed as a percentage, represents the percent of variation in <em data-effect="italics">y</em> that is NOT explained by variation in <em data-effect="italics">x</em> using the regression line. 
This can be seen as the scattering of the observed data points about the regression line.</li> </ul> <p id="fs-idp19642608">Consider the <a href="#element-22">third exam/final exam example</a> introduced in the previous section:<span data-type="newline"><br /> </span></p> <ul><li>The line of best fit is: <em data-effect="italics">ŷ</em> = –173.51 + 4.83<em data-effect="italics">x</em></li> <li>The correlation coefficient is <em data-effect="italics">r</em> = 0.6631</li> <li>The coefficient of determination is <em data-effect="italics">r</em><sup>2</sup> = 0.6631<sup>2</sup> = 0.4397</li> <li><strong>Interpretation of <em data-effect="italics">r</em><sup>2</sup> in the context of this example:</strong></li> <li>Approximately 44% of the variation (0.4397 is approximately 0.44) in the final-exam grades can be explained by the variation in the grades on the third exam, using the best-fit regression line.</li> <li>Therefore, approximately 56% of the variation (1 – 0.44 = 0.56) in the final exam grades can NOT be explained by the variation in the grades on the third exam, using the best-fit regression line. (This is seen as the scattering of the points about the line.)</li> </ul> </div> <div class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm125375712">A regression line, or a line of best fit, can be drawn on a scatter plot and used to predict values of <em data-effect="italics">y</em> from values of <em data-effect="italics">x</em> in a given data set or sample data. There are several ways to find a regression line, but usually the least-squares regression line is used because it is the line that minimizes the sum of the squared errors. Residuals, also called “errors,” are the differences between the actual values of <em data-effect="italics">y</em> and the estimated values of <em data-effect="italics">y</em>. Minimizing the Sum of Squared Errors determines the line of best fit. 
Regression lines can be used to predict values within the given set of data, but should not be used to make predictions for values outside the set of data.</p> <p id="fs-idm137061632">The correlation coefficient <em data-effect="italics">r</em> measures the strength of the linear association between <em data-effect="italics">x</em> and <em data-effect="italics">y</em>. The variable <em data-effect="italics">r</em> has to be between –1 and +1. When <em data-effect="italics">r</em> is positive, the <em data-effect="italics">x</em> and <em data-effect="italics">y</em> will tend to increase and decrease together. When <em data-effect="italics">r</em> is negative, <em data-effect="italics">x</em> will increase and <em data-effect="italics">y</em> will decrease, or the opposite, <em data-effect="italics">x</em> will decrease and <em data-effect="italics">y</em> will increase. The coefficient of determination <em data-effect="italics">r</em><sup>2</sup>, is equal to the square of the correlation coefficient. When expressed as a percent, <em data-effect="italics">r</em><sup>2</sup> represents the percent of variation in the dependent variable <em data-effect="italics">y</em> that can be explained by variation in the independent variable <em data-effect="italics">x</em> using the regression line.</p> </div> <div id="eip-234" class="practice" data-depth="1"><p><em data-effect="italics">Use the following information to answer the next five exercises</em>. 
A random sample of ten professional athletes produced the following data where <em data-effect="italics">x</em> is the number of endorsements the player has and <em data-effect="italics">y</em> is the amount of money made (in millions of dollars).</p> <table id="fs-idm76330016" summary=".."><caption> </caption> <thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">y</em></th> <th><em data-effect="italics">x</em></th> <th><em data-effect="italics">y</em></th> </tr> </thead> <tbody><tr><td>0</td> <td>2</td> <td>5</td> <td>12</td> </tr> <tr><td>3</td> <td>8</td> <td>4</td> <td>9</td> </tr> <tr><td>2</td> <td>7</td> <td>3</td> <td>9</td> </tr> <tr><td>1</td> <td>3</td> <td>0</td> <td>3</td> </tr> <tr><td>5</td> <td>13</td> <td>4</td> <td>10</td> </tr> </tbody> </table> <div data-type="exercise"><div data-type="problem"><p>Draw a scatter plot of the data.</p> </div> </div> <div data-type="exercise"><div id="eip-445" data-type="problem"><p>Use regression to find the equation for the line of best fit.</p> </div> <div data-type="solution"><p><em data-effect="italics">ŷ</em> = 2.23 + 1.99<em data-effect="italics">x</em></p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Draw the line of best fit on the scatter plot.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>What is the slope of the line of best fit? What does it represent?</p> </div> <div data-type="solution"><p id="eip-540">The slope is 1.99 (<em data-effect="italics">b</em> = 1.99). It means that for every additional endorsement deal a professional player gets, the player earns, on average, another $1.99 million in pay each year.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>What is the <em data-effect="italics">y</em>-intercept of the line of best fit? 
What does it represent?</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>What does an <em data-effect="italics">r</em> value of zero mean?</p> </div> <div data-type="solution"><p>It means that there is no correlation between the data sets.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>When <em data-effect="italics">n</em> = 2 and <em data-effect="italics">r</em> = 1, are the data significant? Explain.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="fs-idp27546736">When <em data-effect="italics">n</em> = 100 and <em data-effect="italics">r</em> = -0.89, is there a significant correlation? Explain.</p> </div> <div data-type="solution"><p id="eip-240">Yes, there are enough data points and the value of r is strong enough to show that there is a strong negative correlation between the data sets.</p> </div> </div> </div> <div class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div id="eip-idp54658976" data-type="exercise"><div id="eip-idp54659232" data-type="problem"><p id="eip-idp106876496"><span style="font-size: 1em">1) Explain what it means when a correlation has an </span><em style="font-size: 1em" data-effect="italics">r</em><sup>2</sup> <span style="font-size: 1em">of 0.72.</span></p> </div> </div> <div data-type="exercise"><div data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>2) Can a coefficient of determination be negative? 
Why or why not?</p> <p><strong>Answers to odd questions</strong></p> <p>1) It means that 72% of the variation in the dependent variable (<em data-effect="italics">y</em>) can be explained by the variation in the independent variable (<em data-effect="italics">x</em>).</p> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="coeffcorr"><dt>Coefficient of Correlation</dt> <dd id="id1167925021109">a measure developed by Karl Pearson (early 1900s) that gives the strength of association between the independent variable and the dependent variable; the formula is: <div id="id5499555" data-type="equation">\(r=\frac{n\Sigma \left(xy\right)-\left(\Sigma x\right)\left(\Sigma y\right)}{\sqrt{\left[n\Sigma {x}^{2}-{\left(\Sigma x\right)}^{2}\right]\left[n\Sigma {y}^{2}-{\left(\Sigma y\right)}^{2}\right]}}\)</div> <p>where <em data-effect="italics">n</em> is the number of data points. The coefficient cannot be more than 1 or less than –1. The closer the coefficient is to ±1, the stronger the evidence of a significant linear relationship between <em data-effect="italics">x</em> and <em data-effect="italics">y</em>.</p></dd> </dl> </div> </div></div>
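<p>The formulas for <em data-effect="italics">a</em>, <em data-effect="italics">b</em>, the SSE, and <em data-effect="italics">r</em> given in this chapter can also be checked with a short program instead of a calculator. The following is a minimal Python sketch, not part of the original text, that applies those formulas to the third exam/final exam data. The function names <code>least_squares</code>, <code>sse</code>, and <code>correlation</code> are illustrative choices made here and do not come from any library.</p>

```python
import math

# Third-exam scores (x) and final-exam scores (y) from the example table.
x = [65, 67, 71, 71, 66, 75, 67, 70, 71, 69, 69]
y = [175, 133, 185, 163, 126, 198, 153, 163, 159, 151, 159]


def least_squares(xs, ys):
    """Return (a, b) for the line y-hat = a + b*x that minimizes the SSE."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # b = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    s_xx = sum((xi - x_bar) ** 2 for xi in xs)
    b = s_xy / s_xx
    # a = y_bar - b * x_bar, so the line passes through (x_bar, y_bar)
    a = y_bar - b * x_bar
    return a, b


def sse(xs, ys, a, b):
    """Sum of squared residuals (y - y-hat)^2 for the line y-hat = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(xs, ys))


def correlation(xs, ys):
    """Correlation coefficient r, using the computational formula in the text."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(xi * yi for xi, yi in zip(xs, ys))
    sum_x2 = sum(xi ** 2 for xi in xs)
    sum_y2 = sum(yi ** 2 for yi in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


if __name__ == "__main__":
    a, b = least_squares(x, y)
    r = correlation(x, y)
    print("intercept a:", round(a, 2))
    print("slope b:", round(b, 2))
    print("r:", round(r, 4), " r squared:", round(r ** 2, 4))
    print("SSE of best-fit line:", round(sse(x, y, a, b), 2))
    # Any other line, such as one with a slightly different slope, has a larger SSE.
    print("SSE with slope b + 0.1:", round(sse(x, y, a, b + 0.1), 2))
```

<p>The rounded values of <em data-effect="italics">a</em>, <em data-effect="italics">b</em>, and <em data-effect="italics">r</em> printed by this sketch should agree with the LinRegTTest output quoted earlier (<em data-effect="italics">a</em> = –173.51, <em data-effect="italics">b</em> = 4.83, <em data-effect="italics">r</em> = 0.6631). Comparing the two SSE values illustrates the least-squares criterion: moving the slope away from <em data-effect="italics">b</em> increases the sum of squared errors.</p>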
<div class="chapter standard" id="chapter-prediction" title="Chapter 3.5: Prediction"><div class="chapter-title-wrap"><h3 class="chapter-number">20</h3><h2 class="chapter-title"><span class="display-none">Chapter 3.5: Prediction</span></h2></div><div class="ugc chapter-ugc"> <p>&nbsp;</p> <p>Recall the <a href="#element-22" data-url="/contents/fd60680f-dbb7-4a97-84b8-f352c3a6c141#element-22">third exam/final exam example</a>.</p> <p>We examined the scatterplot and showed that the correlation coefficient is significant. We found the equation of the best-fit line for the final exam grade as a function of the grade on the third exam. We can now use the least-squares regression line for prediction.</p> <p id="element-12498">Suppose you want to estimate, or predict, the mean final exam score of statistics students who received 73 on the third exam. The exam scores <strong>(<em data-effect="italics">x</em>-values)</strong> range from 65 to 75. <strong>Since 73 is between the <em data-effect="italics">x</em>-values 65 and 75</strong>, substitute <em data-effect="italics">x</em> = 73 into the equation. Then:</p> <div data-type="equation">\(\hat{y}=-173.51+4.83\left(73\right)=179.08\)</div> <p id="fs-idm74881792">We predict that statistics students who earn a grade of 73 on the third exam will earn a grade of 179.08 on the final exam, on average.</p> <div class="textbox textbox--examples" data-type="example"><p>Recall the <a href="#element-22" data-url="/contents/fd60680f-dbb7-4a97-84b8-f352c3a6c141#element-22">third exam/final exam example</a>.</p> <p>&nbsp;</p> <div data-type="exercise"><div id="id1170592391444" data-type="problem"><p>a. What would you predict the final exam score to be for a student who scored a 66 on the third exam?</p> </div> <div id="id1170583280560" data-type="solution"><p>a. 145.27</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id1170601334360" data-type="problem"><p>b. 
What would you predict the final exam score to be for a student who scored a 90 on the third exam?</p> </div> <div id="id3981938" data-type="solution" data-print-placement="end"><p>b. The <em data-effect="italics">x</em> values in the data are between 65 and 75. Ninety is outside of the domain of the observed <em data-effect="italics">x</em> values in the data (independent variable), so you cannot reliably predict the final exam score for this student. (Even though it is possible to enter 90 into the equation for <em data-effect="italics">x</em> and calculate a corresponding <em data-effect="italics">y</em> value, the <em data-effect="italics">y</em> value that you get will not be reliable.) <span data-type="newline"><br /> </span><span data-type="newline"><br /> </span> To see just how unreliable a prediction can be outside the range of the observed <em data-effect="italics">x</em> values, substitute <em data-effect="italics">x</em> = 90 into the equation. <span data-type="newline"><br /> </span><span data-type="newline"><br /> </span> \(\hat{y}=-173.51+4.83\left(90\right)=261.19\) <span data-type="newline"><br /> </span><span data-type="newline"><br /> </span> The final-exam score is predicted to be 261.19, but the largest possible final-exam score is 200. <span data-type="newline"><br /> </span></p> <div id="eip-id1168998191288" data-type="note" data-has-label="true" data-label=""><p><span data-type="title">Note</span></p> <p id="eip-idp117960960">The process of predicting inside the range of the observed <em data-effect="italics">x</em> values is called <span data-type="term">interpolation</span>. 
The process of predicting outside the range of <em data-effect="italics">x</em> values observed in the data is called <span data-type="term">extrapolation</span>.</p> </div> </div> </div> </div> <div id="fs-idp125531648" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p>Data are collected on the relationship between the number of hours per week practicing a musical instrument and scores on a math test. The line of best fit is as follows:</p> <p id="eip-idp180607312"><em data-effect="italics">ŷ</em> = 72.5 + 2.8<em data-effect="italics">x</em> <span data-type="newline"><br /> </span>What math test score would you predict for a student who practices a musical instrument for five hours a week?</p> </div> </div> </div> <div class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="eip-idp173225584">Data from the Centers for Disease Control and Prevention.</p> <p id="eip-idp173225968">Data from the National Center for agency reporting flu cases and TB Prevention.</p> <p id="eip-idp57756432">Data from the United States Census Bureau. Available online at http://www.census.gov/compendia/statab/cats/transportation/motor_vehicle_accidents_and_fatalities.html</p> <p id="eip-idm1014608">Data from the National Center for Health Statistics.</p> </div> <div class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm72353840">After establishing that the correlation coefficient is significant and calculating the line of best fit, you can use the least-squares regression line to make predictions within the range of the observed <em data-effect="italics">x</em> values.</p> </div> <div class="practice" data-depth="1"><p><em data-effect="italics">Use the following information to answer the next two exercises</em>. An electronics retailer used regression to find a simple model to predict sales growth in the first quarter of the new year (January through March). 
The model is good for 90 days, where <em data-effect="italics">x</em> is the day. The model can be written as follows:</p> <p><em data-effect="italics">ŷ</em> = 101.32 + 2.48<em data-effect="italics">x</em> where <em data-effect="italics">ŷ</em> is in thousands of dollars.</p> <div data-type="exercise"><div data-type="problem"><p>What would you predict the sales to be on day 60?</p> </div> <div data-type="solution"><p>\$250,120</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>What would you predict the sales to be on day 90?</p> </div> </div> <p><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next three exercises</em>. A landscaping company is hired to mow the grass for several large properties. The total area of the properties combined is 1,345 acres. The rate at which one person can mow is as follows:</p> <p><em data-effect="italics">ŷ</em> = 1350 – 1.2<em data-effect="italics">x</em> where <em data-effect="italics">x</em> is the number of hours and <em data-effect="italics">ŷ</em> represents the number of acres left to mow.</p> <div id="eip-584" data-type="exercise"><div data-type="problem"><p>How many acres will be left to mow after 20 hours of work?</p> </div> <div data-type="solution"><p>1,326 acres</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>How many acres will be left to mow after 100 hours of work?</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>How many hours will it take to mow all of the lawns? 
(When is <em data-effect="italics">ŷ</em> = 0?)</p> </div> <div id="eip-274" data-type="solution"><p>1,125 hours, or when <em data-effect="italics">x</em> = 1,125</p> </div> </div> <p><a class="autogenerated-content" href="#element-806" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/prediction/#element-806">(Figure)</a> contains real data for the first two decades of flu cases reporting.</p> <table summary="This table presents the year of reporting flu cases and deaths in the first column, number of flu cases diagnosed in the second column, and number of flu deaths in the third column."><caption><span data-type="title">Adults and Adolescents only, United States </span></caption> <tbody><tr><td><strong>Year </strong></td> <td><strong># flu cases diagnosed</strong></td> <td><strong># flu deaths </strong></td> </tr> <tr><td>Pre-1981</td> <td>91</td> <td>29</td> </tr> <tr><td>1981</td> <td>319</td> <td>121</td> </tr> <tr><td>1982</td> <td>1,170</td> <td>453</td> </tr> <tr><td>1983</td> <td>3,076</td> <td>1,482</td> </tr> <tr><td>1984</td> <td>6,240</td> <td>3,466</td> </tr> <tr><td>1985</td> <td>11,776</td> <td>6,878</td> </tr> <tr><td>1986</td> <td>19,032</td> <td>11,987</td> </tr> <tr><td>1987</td> <td>28,564</td> <td>16,162</td> </tr> <tr><td>1988</td> <td>35,447</td> <td>20,868</td> </tr> <tr><td>1989</td> <td>42,674</td> <td>27,591</td> </tr> <tr><td>1990</td> <td>48,634</td> <td>31,335</td> </tr> <tr><td>1991</td> <td>59,660</td> <td>36,560</td> </tr> <tr><td>1992</td> <td>78,530</td> <td>41,055</td> </tr> <tr><td>1993</td> <td>78,834</td> <td>44,730</td> </tr> <tr><td>1994</td> <td>71,874</td> <td>49,095</td> </tr> <tr><td>1995</td> <td>68,505</td> <td>49,456</td> </tr> <tr><td>1996</td> <td>59,347</td> <td>38,510</td> </tr> <tr><td>1997</td> <td>47,149</td> <td>20,736</td> </tr> <tr><td>1998</td> <td>38,393</td> <td>19,005</td> </tr> <tr><td>1999</td> <td>25,174</td> <td>18,454</td> </tr> <tr><td>2000</td> <td>25,522</td> <td>17,347</td> </tr> 
<tr><td>2001</td> <td>25,643</td> <td>17,402</td> </tr> <tr><td>2002</td> <td>26,464</td> <td>16,371</td> </tr> <tr><td><strong>Total</strong></td> <td><strong>802,118</strong></td> <td><strong>489,093</strong></td> </tr> </tbody> </table> <div id="fs-idm41333104" data-type="exercise"><div id="fs-idp95587552" data-type="problem"><p id="fs-idp149524096">Graph “year” versus “# flu cases diagnosed” (plot the scatter plot). Do not include pre-1981 data.</p> </div> </div> <div id="fs-idp60770720" data-type="exercise"><div id="fs-idm63008624" data-type="problem"><p id="fs-idp60410256">Perform linear regression. What is the linear equation? Round to the nearest whole number.</p> </div> <div id="fs-idp27267136" data-type="solution"><p id="fs-idp39386448">Check student’s solution.</p> </div> </div> <div id="fs-idm89824624" data-type="exercise"><div id="fs-idm35742144" data-type="problem"><p id="fs-idm38107680">Find the correlation coefficient.<span data-type="newline"><br /> </span></p> <ol id="eip-idm31798608" type="a"><li><em data-effect="italics">r</em> = ________</li> </ol> </div> </div> <div data-type="exercise"><div id="id3231872" data-type="problem"><p>Solve.</p> <ol id="eip-idp112356080" type="a"><li>When <em data-effect="italics">x</em> = 1985, <em data-effect="italics">ŷ</em> = _____</li> <li>When <em data-effect="italics">x</em> = 1990, <em data-effect="italics">ŷ</em> = _____</li> <li>When <em data-effect="italics">x</em> = 1970, <em data-effect="italics">ŷ</em> = _____ Why doesn’t this answer make sense?</li> </ol> </div> <div id="id3763418" data-type="solution"><ol id="eip-idp156589440" type="a"><li>When <em data-effect="italics">x</em> = 1985, <em data-effect="italics">ŷ</em> = 25,525</li> <li>When <em data-effect="italics">x</em> = 1990, <em data-effect="italics">ŷ</em> = 34,275</li> <li>When <em data-effect="italics">x</em> = 1970, <em data-effect="italics">ŷ</em> = –725 Why doesn’t this answer make sense? 
The range of <em data-effect="italics">x</em> values was 1981 to 2002; the year 1970 is not in this range. The regression equation does not apply, because predicting for the year 1970 is extrapolation, which requires a different process. Also, a negative number does not make sense in this context, where we are predicting flu cases diagnosed.</li> </ol> </div> </div> <div id="fs-idp123707680" data-type="exercise"><div id="fs-idp101389664" data-type="problem"><p id="fs-idm47746240">Does the line seem to fit the data? Why or why not?</p> </div> </div> <div id="fs-idp46020944" data-type="exercise"><div id="fs-idp87495152" data-type="problem"><p id="fs-idp59562368">What does the correlation imply about the relationship between time (years) and the number of diagnosed flu cases reported in the U.S.?</p> </div> <div id="fs-idp63232128" data-type="solution"><p id="fs-idm19062048">The correlation is <em data-effect="italics">r</em> = 0.4526. Compared with the critical value in the 95% Critical Values of the Sample Correlation Coefficient Table, <em data-effect="italics">r</em> &gt; 0.423, so <em data-effect="italics">r</em> is significant, and you might think that the line could be used for prediction. But the scatter plot indicates otherwise.</p> </div> </div> <p>&nbsp;</p> <div id="fs-idm79515504" data-type="exercise"><div id="fs-idm117830672" data-type="problem"><p id="element-263">Plot the two given points on the following graph. Then, connect the two points to form the regression line.</p> <div id="fs-idp74029648" class="bc-figure figure"><span id="id4253169" data-type="media" data-alt="Blank graph with horizontal and vertical axes." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch12_14_01-1.jpg" alt="Blank graph with horizontal and vertical axes." 
width="380" data-media-type="image/jpeg" /></span></div> <p id="element-2352642">Obtain the graph on your calculator or computer.</p> </div> </div> <p>&nbsp;</p> <div data-type="exercise"><div id="id3371067" data-type="problem"><p>Write the equation: <em data-effect="italics">ŷ</em>= ____________</p> </div> <div id="id3238552" data-type="solution"><p>\(\hat{y}\) = –3,448,225 + 1750<em data-effect="italics">x</em></p> </div> </div> <div data-type="exercise"><div id="id3281976" data-type="problem"><p id="element-659">Hand draw a smooth curve on the graph that shows the flow of the data.</p> </div> </div> <p>&nbsp;</p> <div data-type="exercise"><div id="id4193945" data-type="problem"><p>Does the line seem to fit the data? Why or why not?</p> </div> <div id="fs-idm125506640" data-type="solution"><p id="fs-idm47836464">There was an increase in flu cases diagnosed until 1993. After 1993, the number of flu cases diagnosed fell sharply through 1999 and then leveled off. It is not appropriate to use a linear regression line to fit the data.</p> </div> </div> <div data-type="exercise"><div id="id4247751" data-type="problem"><p>Do you think a linear fit is best? Why or why not?</p> </div> </div> <div data-type="exercise"><div id="id4227065" data-type="problem"><p>What does the correlation imply about the relationship between time (years) and the number of diagnosed flu cases reported in the U.S.?</p> </div> <div id="fs-idp25424512" data-type="solution"><p id="fs-idm57267184">Since there is no linear association between year and # of flu cases diagnosed, it is not appropriate to calculate a linear correlation coefficient. When there is a linear association and it is appropriate to calculate a correlation, we cannot say that one variable “causes” the other variable.</p> </div> </div> <p>&nbsp;</p> <div id="fs-idp87876464" data-type="exercise"><div id="fs-idm79455904" data-type="problem"><p>Graph “year” vs. “# flu cases diagnosed.” Do not include pre-1981. Label both axes with words. 
Scale both axes.</p> </div> </div> <div data-type="exercise"><div id="id4187966" data-type="problem"><p>Enter your data into your calculator or computer. The pre-1981 data should not be included. Why is that so?</p> <p>Write the linear equation, rounding to four decimal places:</p> </div> <div id="fs-idp16730128" data-type="solution"><p id="fs-idm46065584">We don’t know if the pre-1981 data was collected from a single year. So we don’t have an accurate <em data-effect="italics">x</em> value for this figure.</p> <p id="fs-idm126049344">Regression equation: <em data-effect="italics">ŷ</em> (#Flu Cases) = –3,448,225 + 1749.777 (year)</p> <table id="fs-idm85288224" summary=".."><thead><tr><th></th> <th>Coefficients</th> </tr> </thead> <tbody><tr><td>Intercept</td> <td>–3,448,225</td> </tr> <tr><td><em data-effect="italics">X</em> Variable 1</td> <td>1,749.777</td> </tr> </tbody> </table> </div> </div> <div data-type="exercise"><div id="id4176091" data-type="problem"><p id="element-50234233422">Find the correlation coefficient.</p> <ol id="list-234" type="a"><li>correlation = _____</li> </ol> </div> </div> </div> <div id="fs-idp66999104" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div id="element-401" data-type="exercise"><div id="id44884447" data-type="problem"><p>1) Recently, the annual number of driver deaths per 100,000 for the selected age groups was as follows:</p> <table id="fs-idp87463888" summary=".."><thead><tr><th>Age</th> <th>Number of Driver Deaths per 100,000</th> </tr> </thead> <tbody><tr><td>16–19</td> <td>38</td> </tr> <tr><td>20–24</td> <td>36</td> </tr> <tr><td>25–34</td> <td>24</td> </tr> <tr><td>35–54</td> <td>20</td> </tr> <tr><td>55–74</td> <td>18</td> </tr> <tr><td>75+</td> <td>28</td> </tr> </tbody> </table> <ol type="a"><li>For each age group, pick the midpoint of the interval for the <em data-effect="italics">x</em> value. 
(For the 75+ group, use 80.)</li> <li>Using “ages” as the independent variable and “Number of driver deaths per 100,000” as the dependent variable, make a scatter plot of the data.</li> <li>Calculate the least squares (best–fit) line. Put the equation in the form of: <em data-effect="italics">ŷ</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em></li> <li>Find the correlation coefficient. Is it significant?</li> <li>Predict the number of deaths for ages 40 and 60.</li> <li>Based on the given data, is there a linear relationship between age of a driver and driver fatality rate?</li> <li>What is the slope of the least squares (best-fit) line? Interpret the slope.</li> </ol> <p>&nbsp;</p> </div> <div id="fs-idm44998864" data-type="solution"></div> </div> <div data-type="exercise"><div id="id44888890" data-type="problem"><p>2) <a class="autogenerated-content" href="#idasdgf10411945" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/prediction/#idasdgf10411945">(Figure)</a> shows the life expectancy for an individual born in the United States in certain years.</p> <table id="idasdgf10411945" summary="This table presents year of birth in the first column and life expectancy in the second column."><thead><tr><th>Year of Birth</th> <th>Life Expectancy</th> </tr> </thead> <tbody><tr><td data-align="center">1930</td> <td data-align="center">59.7</td> </tr> <tr><td data-align="center">1940</td> <td data-align="center">62.9</td> </tr> <tr><td data-align="center">1950</td> <td data-align="center">70.2</td> </tr> <tr><td data-align="center">1965</td> <td data-align="center">69.7</td> </tr> <tr><td data-align="center">1973</td> <td data-align="center">71.4</td> </tr> <tr><td data-align="center">1982</td> <td data-align="center">74.5</td> </tr> <tr><td data-align="center">1987</td> <td data-align="center">75</td> </tr> <tr><td data-align="center">1992</td> <td data-align="center">75.7</td> </tr> <tr><td data-align="center">2010</td> <td 
data-align="center">78.7</td> </tr> </tbody> </table> <ol id="element-796" type="a"><li>Decide which variable should be the independent variable and which should be the dependent variable.</li> <li>Draw a scatter plot of the ordered pairs.</li> <li>Calculate the least squares line. Put the equation in the form of: <em data-effect="italics">ŷ</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em></li> <li>Find the correlation coefficient. Is it significant?</li> <li>Find the estimated life expectancy for an individual born in 1950 and for one born in 1982.</li> <li>Why aren’t the answers to part e the same as the values in <a class="autogenerated-content" href="#idasdgf10411945" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/prediction/#idasdgf10411945">(Figure)</a> that correspond to those years?</li> <li>Use the two points in part e to plot the least squares line on your graph from part b.</li> <li>Based on the data, is there a linear relationship between the year of birth and life expectancy?</li> <li>Are there any outliers in the data?</li> <li>Using the least squares line, find the estimated life expectancy for an individual born in 1850. Does the least squares line give an accurate estimate for that year? Explain why or why not.</li> <li>What is the slope of the least-squares (best-fit) line? 
Interpret the slope.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="element-682" data-type="exercise"><div id="id44892027" data-type="problem"><p>3) The maximum discount value of the Entertainment® card for the “Fine Dining” section, Edition ten, for various pages is given in <a class="autogenerated-content" href="#id9216hh080" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/prediction/#id9216hh080">(Figure)</a>.</p> <table id="id9216hh080" summary="This table presents the page number in the first column and the maximum value ($) in the second column."><thead><tr><th>Page number</th> <th>Maximum value (\$)</th> </tr> </thead> <tbody><tr><td data-align="center">4</td> <td data-align="center">16</td> </tr> <tr><td data-align="center">14</td> <td data-align="center">19</td> </tr> <tr><td data-align="center">25</td> <td data-align="center">15</td> </tr> <tr><td data-align="center">32</td> <td data-align="center">17</td> </tr> <tr><td data-align="center">43</td> <td data-align="center">19</td> </tr> <tr><td data-align="center">57</td> <td data-align="center">15</td> </tr> <tr><td data-align="center">72</td> <td data-align="center">16</td> </tr> <tr><td data-align="center">85</td> <td data-align="center">15</td> </tr> <tr><td data-align="center">90</td> <td data-align="center">17</td> </tr> </tbody> </table> <ol id="element-86" type="a"><li>Decide which variable should be the independent variable and which should be the dependent variable.</li> <li>Draw a scatter plot of the ordered pairs.</li> <li>Calculate the least-squares line. Put the equation in the form of: <em data-effect="italics">ŷ</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em></li> <li>Find the correlation coefficient. Is it significant?</li> <li>Find the estimated maximum values for the restaurants on page ten and on page 70.</li> <li>Does it appear that the restaurants giving the maximum value are placed in the beginning of the “Fine Dining” section? 
How did you arrive at your answer?</li> <li>Suppose that there were 200 pages of restaurants. What do you estimate to be the maximum value for a restaurant listed on page 200?</li> <li>Is the least squares line valid for page 200? Why or why not?</li> <li>What is the slope of the least-squares (best-fit) line? Interpret the slope.</li> </ol> </div> <div id="fs-idp21667088" data-type="solution"><ol id="fs-idp21667344" type="a"></ol> <p id="fs-idp70584384"></p></div> </div> <div id="element-263a" data-type="exercise"><div id="id44847508" data-type="problem"><p>4) <a class="autogenerated-content" href="#id9978lok311" data-url="https://pressbooks.ccconline.org/accintrostats/chapter/prediction/#id9978lok311">(Figure)</a> gives the gold medal times for every other Summer Olympics for the women’s 100-meter freestyle (swimming).</p> <table id="id9978lok311" summary="This table presents the summer olympics year in the first column and the women's 100 meter freestyle time in seconds in the second column."><thead><tr><th>Year</th> <th>Time (seconds)</th> </tr> </thead> <tbody><tr><td data-align="center">1912</td> <td data-align="center">82.2</td> </tr> <tr><td data-align="center">1924</td> <td data-align="center">72.4</td> </tr> <tr><td data-align="center">1932</td> <td data-align="center">66.8</td> </tr> <tr><td data-align="center">1952</td> <td data-align="center">66.8</td> </tr> <tr><td data-align="center">1960</td> <td data-align="center">61.2</td> </tr> <tr><td data-align="center">1968</td> <td data-align="center">60.0</td> </tr> <tr><td data-align="center">1976</td> <td data-align="center">55.65</td> </tr> <tr><td data-align="center">1984</td> <td data-align="center">55.92</td> </tr> <tr><td data-align="center">1992</td> <td data-align="center">54.64</td> </tr> <tr><td data-align="center">2000</td> <td data-align="center">53.8</td> </tr> <tr><td data-align="center">2008</td> <td data-align="center">53.1</td> </tr> </tbody> </table> <ol type="a"><li>Decide which variable 
should be the independent variable and which should be the dependent variable.</li> <li>Draw a scatter plot of the data.</li> <li>Does it appear from inspection that there is a relationship between the variables? Why or why not?</li> <li>Calculate the least squares line. Put the equation in the form of: <em data-effect="italics">ŷ</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em>.</li> <li>Find the correlation coefficient. Is the decrease in times significant?</li> <li>Find the estimated gold medal time for 1932. Find the estimated time for 1984.</li> <li>Why are the answers from part f different from the chart values?</li> <li>Does it appear that a line is the best way to fit the data? Why or why not?</li> <li>Use the least-squares line to estimate the gold medal time for the next Summer Olympics. Do you think that your answer is reasonable? Why or why not?</li> </ol> </div> </div> <div data-type="exercise"><div id="id44850206" data-type="problem"><table id="id10945556" summary="This table presents the state names in the first column, number of letters in the state name in the second column, year entered in the union in the third column, rank for entering the union in the fourth column, and state area in square miles in the last column."><thead><tr><th>State</th> <th># letters in name</th> <th>Year entered the Union</th> <th>Rank for entering the Union</th> <th>Area (square miles)</th> </tr> </thead> <tbody><tr><td>Alabama</td> <td>7</td> <td>1819</td> <td>22</td> <td>52,423</td> </tr> <tr><td>Colorado</td> <td>8</td> <td>1876</td> <td>38</td> <td>104,100</td> </tr> <tr><td>Hawaii</td> <td>6</td> <td>1959</td> <td>50</td> <td>10,932</td> </tr> <tr><td>Iowa</td> <td>4</td> <td>1846</td> <td>29</td> <td>56,276</td> </tr> <tr><td>Maryland</td> <td>8</td> <td>1788</td> <td>7</td> <td>12,407</td> </tr> <tr><td>Missouri</td> <td>8</td> <td>1821</td> <td>24</td> <td>69,709</td> </tr> <tr><td>New Jersey</td> <td>9</td> <td>1787</td> <td>3</td> 
<td>8,722</td> </tr> <tr><td>Ohio</td> <td>4</td> <td>1803</td> <td>17</td> <td>44,828</td> </tr> <tr><td>South Carolina</td> <td>13</td> <td>1788</td> <td>8</td> <td>32,008</td> </tr> <tr><td>Utah</td> <td>4</td> <td>1896</td> <td>45</td> <td>84,904</td> </tr> <tr><td>Wisconsin</td> <td>9</td> <td>1848</td> <td>30</td> <td>65,499</td> </tr> </tbody> </table> <p>5) We are interested in whether or not the number of letters in a state name depends upon the year the state entered the Union.</p> <ol id="element-0" type="a"><li>Decide which variable should be the independent variable and which should be the dependent variable.</li> <li>Draw a scatter plot of the data.</li> <li>Does it appear from inspection that there is a relationship between the variables? Why or why not?</li> <li>Calculate the least-squares line. Put the equation in the form of: <em data-effect="italics">ŷ</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em>.</li> <li>Find the correlation coefficient. What does it imply about the significance of the relationship?</li> <li>Find the estimated number of letters (to the nearest integer) a state would have if it entered the Union in 1900. Find the estimated number of letters a state would have if it entered the Union in 1940.</li> <li>Does it appear that a line is the best way to fit the data? Why or why not?</li> <li>Use the least-squares line to estimate the number of letters a new state that enters the Union this year would have. Can the least squares line be used to predict it? 
Why or why not?</li> </ol> </div> <div id="fs-idp67061424" data-type="solution"><p>&nbsp;</p> <p><strong>Answers to odd questions</strong></p> <p>1)</p> <ol id="fs-idp19420064" type="a"><li><table id="id108047k45" summary="This table presents the age groups in the first column and the number of driver deaths per 100,000 in the second column."><thead><tr><th>Age</th> <th>Number of Driver Deaths per 100,000</th> </tr> </thead> <tbody><tr><td>16–19</td> <td data-align="center">38</td> </tr> <tr><td>20–24</td> <td data-align="center">36</td> </tr> <tr><td>25–34</td> <td data-align="center">24</td> </tr> <tr><td>35–54</td> <td data-align="center">20</td> </tr> <tr><td>55–74</td> <td data-align="center">18</td> </tr> <tr><td>75+</td> <td data-align="center">28</td> </tr> </tbody> </table> </li> <li>Check student’s solution.</li> <li><em data-effect="italics">ŷ</em> = 35.5818045 – 0.19182491<em data-effect="italics">x</em></li> <li><em data-effect="italics">r</em> = –0.57874<span data-type="newline"><br /> </span>For four <em data-effect="italics">df</em> and alpha = 0.05, the LinRegTTest gives <em data-effect="italics">p</em>-value = 0.2288 so we do not reject the null hypothesis; there is not a significant linear relationship between deaths and age.<span data-type="newline"><br /> </span>Using the table of critical values for the correlation coefficient, with four <em data-effect="italics">df</em>, the critical value is 0.811. 
The correlation coefficient <em data-effect="italics">r</em> = –0.57874 is not less than –0.811, so we do not reject the null hypothesis.</li> <li>There is not a linear relationship between the two variables, as evidenced by a p-value greater than 0.05.</li> </ol> <p>3)</p> <ol type="a"><li>We wonder if the better discounts appear earlier in the book so we select page as <em data-effect="italics">X</em> and discount as <em data-effect="italics">Y</em>.</li> <li>Check student’s solution.</li> <li><em data-effect="italics">ŷ</em> = 17.21757 – 0.01412<em data-effect="italics">x</em></li> <li><em data-effect="italics">r</em> = –0.2752 <span data-type="newline"><br /> </span>For seven <em data-effect="italics">df</em> and alpha = 0.05, the LinRegTTest gives <em data-effect="italics">p</em>-value = 0.4736 so we do not reject the null hypothesis; there is not a significant linear relationship between page and discount.<span data-type="newline"><br /> </span>Using the table of critical values for the correlation coefficient, with seven <em data-effect="italics">df</em>, the critical value is 0.666. 
The correlation coefficient <em data-effect="italics">r</em> = –0.2752 is not less than –0.666, so we do not reject the null hypothesis.</li> <li>There is not a significant linear correlation so it appears there is no relationship between the page and the amount of the discount.</li> </ol> <p>As the page number increases by one page, the discount decreases by \$0.01412.</p> <p>5)</p> <ol id="fs-idp34721504" type="a"><li>Year is the independent or <em data-effect="italics">x</em> variable; the number of letters is the dependent or <em data-effect="italics">y</em> variable.</li> <li>Check student’s solution.</li> <li>no</li> <li><em data-effect="italics">ŷ</em> = 47.03 – 0.0216<em data-effect="italics">x</em></li> <li><em data-effect="italics">r</em> = –0.4280. The <em data-effect="italics">r</em>-value indicates that there is not a significant correlation between the year the state entered the union and the number of letters in the name.</li> <li>No, the relationship does not appear to be linear; the correlation is not significant.</li> </ol> <p>&nbsp;</p> </div> </div> </div> </div></div>
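<div><p>The prediction rule from Chapter 3.5 can also be sketched in code. The following is a minimal Python sketch, not part of the original text, that uses the third exam/final exam line <em data-effect="italics">ŷ</em> = –173.51 + 4.83<em data-effect="italics">x</em> and refuses to extrapolate outside the observed <em data-effect="italics">x</em> range of 65 to 75. The function name <code>predict_final_exam</code> and its parameters are hypothetical, chosen only for illustration.</p>

```python
def predict_final_exam(third_exam_score, x_min=65.0, x_max=75.0):
    """Predict the mean final exam score from a third exam score.

    Uses the chapter's least-squares line y-hat = -173.51 + 4.83x.
    The function name and the x_min/x_max parameters are illustrative
    assumptions; the defaults are the observed range of x values.
    """
    # Refuse to extrapolate: the line is only trusted inside the observed range.
    if third_exam_score > x_max or x_min > third_exam_score:
        raise ValueError(
            f"x = {third_exam_score} is outside the observed range "
            f"[{x_min}, {x_max}]; the prediction would be extrapolation."
        )
    return -173.51 + 4.83 * third_exam_score


if __name__ == "__main__":
    # Interpolation: 73 and 66 both lie inside the observed range.
    print(round(predict_final_exam(73), 2))
    print(round(predict_final_exam(66), 2))

    # Extrapolation: 90 lies outside the observed range, so no value is returned.
    try:
        predict_final_exam(90)
    except ValueError as error:
        print(error)
```

<p>Raising an error for <em data-effect="italics">x</em> = 90 mirrors the chapter's point: the equation will produce a number (261.19), but that number exceeds the maximum possible final exam score of 200 and should not be reported as a prediction.</p></div>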
<div class="chapter standard" id="chapter-regression-distance-from-school" title="Activity 3.6: Regression (Distance from School)"><div class="chapter-title-wrap"><h3 class="chapter-number">21</h3><h2 class="chapter-title"><span class="display-none">Activity 3.6: Regression (Distance from School)</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1172772301748" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Regression (Distance from School)</div> <p id="fs-idp105700624">Class Time:</p> <p id="fs-idp69476432">Names:</p> <div id="fs-idm39104336" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will calculate and construct the line of best fit between two variables.</li> <li>The student will evaluate the relationship between two variables to determine if that relationship is significant.</li> </ul> </div> <p><span data-type="title">Collect the Data</span> Use eight members of your class for the sample. Collect bivariate data (distance an individual lives from school, the cost of supplies for the current term).</p> <ol id="fs-idp17639616"><li>Complete the table.<br /> <table id="fs-idp6640" summary="Blank table with distance from school in the first column and cost of supplies this term in the second column. 16 empty cells"><thead><tr><th>Distance from school</th> <th>Cost of supplies this term</th> </tr> </thead> <tbody><tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> </tbody> </table> </li> <li>Which variable should be the dependent variable and which should be the independent variable? Why?</li> <li>Graph “distance” vs. “cost.” Plot the points on the graph. Label both axes with words. Scale both axes. 
<div id="id6749459" class="bc-figure figure"><span id="id6749463" data-type="media" data-alt="Blank graph with vertical and horizontal axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch12_14_01-1.png" alt="Blank graph with vertical and horizontal axes." width="380" data-media-type="image/png" /></span></div> </li> </ol> <p id="fs-idm26666928"><span data-type="title">Analyze the Data</span> Enter your data into your calculator or computer. Write the linear equation, rounding to four decimal places.</p> <ol id="fs-idm15728672" type="1"><li>Calculate the following: <ol id="fs-idp38070176" type="a"><li><em data-effect="italics">a</em> = ______</li> <li><em data-effect="italics">b</em> = ______</li> <li>correlation = ______</li> <li><em data-effect="italics">n</em> = ______</li> <li>equation: <em data-effect="italics">ŷ</em> = ______</li> <li>Is the correlation significant? Why or why not? (Answer in one to three complete sentences.)</li> </ol> </li> <li>Supply an answer for the following scenarios: <ol id="fs-idm25916576" type="a"><li>For a person who lives eight miles from campus, predict the total cost of supplies this term:</li> <li>For a person who lives eighty miles from campus, predict the total cost of supplies this term:</li> </ol> </li> <li>Obtain the graph on your calculator or computer. Sketch the regression line. <div id="id6749694" class="bc-figure figure"><span id="id6749698" data-type="media" data-alt="Blank graph with vertical and horizontal axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch12_14_02-1.png" alt="Blank graph with vertical and horizontal axes." width="380" data-media-type="image/png" /></span></div> </li> </ol> <div id="fs-idm72135792" data-type="list"><div data-type="title">Discussion Questions</div> <ol><li>Answer each question in complete sentences. <ol id="fs-idp21518976" type="a"><li>Does the line seem to fit the data? 
Why?</li> <li>What does the correlation imply about the relationship between the distance and the cost?</li> </ol> </li> <li>Are there any outliers? If so, which point is an outlier?</li> <li>Should the outlier, if it exists, be removed? Why or why not?</li> </ol> </div> </div> </div></div>
<div class="chapter standard" id="chapter-regression-textbook-cost" title="Activity 3.7: Regression (Textbook Cost)"><div class="chapter-title-wrap"><h3 class="chapter-number">22</h3><h2 class="chapter-title"><span class="display-none">Activity 3.7: Regression (Textbook Cost)</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1172137875645" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Regression (Textbook Cost)</div> <p id="fs-idm90448">Class Time:</p> <p id="fs-idp162776560">Names:</p> <div id="fs-idp101503968" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will calculate and construct the line of best fit between two variables.</li> <li>The student will evaluate the relationship between two variables to determine if that relationship is significant.</li> </ul> </div> <p id="fs-idp70571712"><span data-type="title">Collect the Data</span> Survey ten textbooks. Collect bivariate data (number of pages in a textbook, the cost of the textbook).</p> <ol id="fs-idp53322144"><li>Complete the table.<br /> <table id="fs-idm40594000" summary="The blank table has number of pages in the first column and cost of textbook in the second column. 20 empty cells."><thead><tr><th>Number of pages</th> <th>Cost of textbook</th> </tr> </thead> <tbody><tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> </tbody> </table> </li> <li>Which variable should be the dependent variable and which should be the independent variable? Why?</li> <li>Graph “pages” vs. “cost.” Plot the points on the graph in <a href="#fs-idm37313136">Analyze the Data</a>. Label both axes with words. 
Scale both axes.</li> </ol> <p id="fs-idm37313136"><span data-type="title">Analyze the Data</span> Enter your data into your calculator or computer. Write the linear equation, rounding to four decimal places.</p> <ol id="fs-idm130060688"><li>Calculate the following: <ol id="fs-idm69177152" type="a"><li><em data-effect="italics">a</em> = ______</li> <li><em data-effect="italics">b</em> = ______</li> <li>correlation = ______</li> <li><em data-effect="italics">n</em> = ______</li> <li>equation: <em data-effect="italics">ŷ</em> = ______</li> <li>Is the correlation significant? Why or why not? (Answer in complete sentences.)</li> </ol> </li> <li>Supply an answer for the following scenarios: <ol id="fs-idp3298784" type="a"><li>For a textbook with 400 pages, predict the cost.</li> <li>For a textbook with 600 pages, predict the cost.</li> </ol> </li> <li>Obtain the graph on your calculator or computer. Sketch the regression line. <div id="id19863024" class="bc-figure figure"><span id="id20268459" data-type="media" data-alt="Blank graph with vertical and horizontal axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch12_15_01-1.png" alt="Blank graph with vertical and horizontal axes." width="380" data-media-type="image/png" /></span></div> </li> </ol> <div id="fs-idp7458256" data-type="list"><div data-type="title">Discussion Questions</div> <ol><li>Answer each question in complete sentences. <ol id="fs-idm37663456" type="a"><li>Does the line seem to fit the data? Why or why not?</li> <li>What does the correlation imply about the relationship between the number of pages and the cost?</li> </ol> </li> <li>Are there any outliers? If so, which point or points are outliers?</li> <li>Should the outlier, if it exists, be removed? Why or why not?</li> </ol> </div> </div> </div></div>
<div class="chapter standard" id="chapter-regression-fuel-efficiency" title="Activity 3.8: Regression (Fuel Efficiency)"><div class="chapter-title-wrap"><h3 class="chapter-number">23</h3><h2 class="chapter-title"><span class="display-none">Activity 3.8: Regression (Fuel Efficiency)</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1170235891735" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Regression (Fuel Efficiency)</div> <p id="id7738260">Class Time:</p> <p id="id7843117">Names:</p> <div id="id7650443" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will calculate and construct the line of best fit between two variables.</li> <li>The student will evaluate the relationship between two variables to determine if that relationship is significant.</li> </ul> </div> <p id="fs-idp36246160"><span data-type="title">Collect the Data</span> Find a reputable source that provides information on total fuel efficiency (in miles per gallon) and weight (in pounds) of new model cars with automatic transmissions. We will use this data to determine the relationship, if any, between the fuel efficiency of a car and its weight.</p> <ol id="fs-idp98195168"><li>Using your random number generator, randomly select 20 cars from the list and record their weights and fuel efficiencies in <a class="autogenerated-content" href="#id7895697570">(Figure)</a>.<br /> <table id="id7895697570" summary="Blank table with weight in the first column and fuel efficiency in the second column. 
40 empty cells."><thead><tr><th>Weight</th> <th>Fuel Efficiency</th> </tr> </thead> <tbody><tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> </tbody> </table> </li> <li>Which variable should be the dependent variable and which should be the independent variable? Why?</li> <li>By hand, draw a scatter plot of “weight” vs. “fuel efficiency”. Plot the points on graph paper. Label both axes with words. Scale both axes accurately. <div id="id21784687" class="bc-figure figure"><span id="id21784693" data-type="media" data-alt="Blank graph with vertical and horizontal axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch12_16_01-1.png" alt="Blank graph with vertical and horizontal axes." width="380" data-media-type="image/png" /></span></div> </li> </ol> <p id="fs-idm70550048"><span data-type="title">Analyze the Data</span> Enter your data into your calculator or computer. Write the linear equation, rounding to four decimal places.</p> <ol id="fs-idp56557248"><li>Calculate the following: <ol id="fs-idp7856912" type="a"><li><em data-effect="italics">a</em> = ______</li> <li><em data-effect="italics">b</em> = ______</li> <li>correlation = ______</li> <li><em data-effect="italics">n</em> = ______</li> <li>equation: <em data-effect="italics">ŷ</em> = ______</li> </ol> </li> <li>Obtain the graph of the regression line on your calculator. 
Sketch the regression line on the same axes as your scatter plot.</li> </ol> <div id="fs-idp91842272" data-type="list"><div data-type="title">Discussion Questions</div> <ol data-mark-suffix="."><li>Is the correlation significant? Explain how you determined this in complete sentences.</li> <li>Is the relationship a positive one or a negative one? Explain how you can tell and what this means in terms of weight and fuel efficiency.</li> <li>In one or two complete sentences, what is the practical interpretation of the slope of the least squares line in terms of fuel efficiency and weight?</li> <li>For a car that weighs 4,000 pounds, predict its fuel efficiency. Include units.</li> <li>Can we predict the fuel efficiency of a car that weighs 10,000 pounds using the least squares line? Explain why or why not.</li> <li>Answer each question in complete sentences. <ol id="list-9769782345" type="a"><li>Does the line seem to fit the data? Why or why not?</li> <li>What does the correlation imply about the relationship between fuel efficiency and weight of a car? Is this what you expected?</li> </ol> </li> <li>Are there any outliers? If so, which point is an outlier?</li> </ol> </div> </div> </div></div>
<div class="part " id="part-probability-topics"><div class="part-title-wrap"><h3 class="part-number">IV</h3><h1 class="part-title">Chapter 4: Probability Topics</h1></div><div class="ugc part-ugc"></div></div>
<div class="chapter standard" id="chapter-introduction-15" title="Chapter 4.1: Introduction"><div class="chapter-title-wrap"><h3 class="chapter-number">24</h3><h2 class="chapter-title"><span class="display-none">Chapter 4.1: Introduction</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-idm115405360" class="splash"><div class="bc-figcaption figcaption">Meteor showers are rare, but the probability of them occurring can be calculated. (credit: Navicore/flickr)</div> <p><span id="fs-idm115166448" data-type="media" data-alt="This is a photo taken of the night sky. A meteor and its tail are shown entering the earth's atmosphere."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C03_CO-1.jpg" alt="This is a photo taken of the night sky. A meteor and its tail are shown entering the earth's atmosphere." width="500" data-media-type="image/jpg" /></span></p> </div> <div id="fs-idm7161872" class="chapter-objectives" data-type="note" data-has-label="true" data-label=""><div data-type="title">Chapter Objectives</div> <p id="fs-idm75575520">By the end of this chapter, the student should be able to:</p> <ul id="list1523423"><li>Understand and use the terminology of probability.</li> <li>Determine whether two events are mutually exclusive and whether two events are independent.</li> <li>Calculate probabilities using the Addition Rules and Multiplication Rules.</li> <li>Construct and interpret Contingency Tables.</li> <li>Construct and interpret Venn Diagrams.</li> <li>Construct and interpret Tree Diagrams.</li> </ul> </div> <p id="intro01">It is often necessary to &#8220;guess&#8221; about the outcome of an event in order to make a decision. Politicians study polls to guess their likelihood of winning an election. Teachers choose a particular course of study based on what they think students can comprehend. 
Doctors choose the treatments needed for various diseases based on their assessment of likely results. You may have visited a casino where people play games chosen because of the belief that the likelihood of winning is good. You may have chosen your course of study based on the probable availability of jobs.</p> <p>You have, more than likely, used probability. In fact, you probably have an intuitive sense of probability. Probability deals with the chance of an event occurring. Whenever you weigh the odds of whether or not to do your homework or to study for an exam, you are using probability. In this chapter, you will learn how to solve probability problems using a systematic approach.</p> <div id="fs-idm8566848" class="statistics collab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Exercise</div> <p>Your instructor will survey your class. Count the number of students in the class today.</p> <ul><li>Raise your hand if you have any change in your pocket or purse. Record the number of raised hands.</li> <li>Raise your hand if you rode a bus within the past month. Record the number of raised hands.</li> <li>Raise your hand if you answered &#8220;yes&#8221; to BOTH of the first two questions. Record the number of raised hands.</li> </ul> <p id="element-352">Use the class data as estimates of the following probabilities. <em data-effect="italics">P</em>(change) means the probability that a randomly chosen person in your class has change in his/her pocket or purse. <em data-effect="italics">P</em>(bus) means the probability that a randomly chosen person in your class rode a bus within the last month and so on. Discuss your answers.</p> <ul id="element-204"><li>Find <em data-effect="italics">P</em>(change).</li> <li>Find <em data-effect="italics">P</em>(bus).</li> <li>Find <em data-effect="italics">P</em>(change AND bus). 
Find the probability that a randomly chosen student in your class has change in his/her pocket or purse and rode a bus within the last month.</li> <li>Find <em data-effect="italics">P</em>(change|bus). Find the probability that a randomly chosen student has change given that he or she rode a bus within the last month. Count all the students that rode a bus. From the group of students who rode a bus, count those who have change. The probability is equal to those who have change and rode a bus divided by those who rode a bus.</li> </ul> </div> </div></div>
<div class="chapter standard" id="chapter-terminology" title="Chapter 4.2: Terminology"><div class="chapter-title-wrap"><h3 class="chapter-number">25</h3><h2 class="chapter-title"><span class="display-none">Chapter 4.2: Terminology</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>Probability is a measure that is associated with how certain we are of outcomes of a particular experiment or activity. An <span data-type="term">experiment</span> is a planned operation carried out under controlled conditions. If the result is not predetermined, then the experiment is said to be a <strong>chance</strong> experiment. Flipping one fair coin twice is an example of an experiment.</p> <p>A result of an experiment is called an <span data-type="term">outcome</span>. The <span data-type="term">sample space</span> of an experiment is the set of all possible outcomes. Three ways to represent a sample space are: to list the possible outcomes, to create a tree diagram, or to create a Venn diagram. The uppercase letter <em data-effect="italics">S</em> is used to denote the sample space. For example, if you flip one fair coin, <em data-effect="italics">S</em> = {<em data-effect="italics">H</em>, <em data-effect="italics">T</em>} where <em data-effect="italics">H</em> = heads and <em data-effect="italics">T</em> = tails are the outcomes.</p> <p id="element-214">An <span data-type="term">event</span> is any combination of outcomes. Upper case letters like <em data-effect="italics">A</em> and <em data-effect="italics">B</em> represent events. For example, if the experiment is to flip one fair coin, event <em data-effect="italics">A</em> might be getting at most one head. The probability of an event <em data-effect="italics">A</em> is written <em data-effect="italics">P</em>(<em data-effect="italics">A</em>).</p> <p>The <span data-type="term">probability</span> of any outcome is the <span data-type="term">long-term relative frequency</span> of that outcome. 
<strong>Probabilities are between zero and one, inclusive</strong> (that is, zero and one and all numbers between these values). <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = 0 means the event <em data-effect="italics">A</em> can never happen. <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = 1 means the event <em data-effect="italics">A</em> always happens. <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = 0.5 means the event <em data-effect="italics">A</em> is equally likely to occur or not to occur. For example, if you flip one fair coin repeatedly (from 20 to 2,000 to 20,000 times) the relative frequency of heads approaches 0.5 (the probability of heads).</p> <p id="eip-516"><span data-type="term">Equally likely</span> means that each outcome of an experiment occurs with equal probability. For example, if you toss a <span data-type="term">fair</span>, six-sided die, each face (1, 2, 3, 4, 5, or 6) is as likely to occur as any other face. If you toss a fair coin, a Head (<em data-effect="italics">H</em>) and a Tail (<em data-effect="italics">T</em>) are equally likely to occur. If you randomly guess the answer to a true/false question on an exam, you are equally likely to select a correct answer or an incorrect answer.</p> <p id="element-677"><strong>To calculate the probability of an event <em data-effect="italics">A</em> when all outcomes in the sample space are equally likely</strong>, count the number of outcomes for event <em data-effect="italics">A</em> and divide by the total number of outcomes in the sample space. For example, if you toss a fair dime and a fair nickel, the sample space is {<em data-effect="italics">HH</em>, <em data-effect="italics">TH</em>, <em data-effect="italics">HT</em>, <em data-effect="italics">TT</em>} where <em data-effect="italics">T</em> = tails and <em data-effect="italics">H</em> = heads. The sample space has four outcomes. 
<em data-effect="italics">A</em> = getting one head. There are two outcomes that meet this condition {<em data-effect="italics">HT</em>, <em data-effect="italics">TH</em>}, so <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = \(\frac{2}{4}\) = 0.5.</p> <p id="eip-405">Suppose you roll one fair six-sided die, with the numbers {1, 2, 3, 4, 5, 6} on its faces. Let event <em data-effect="italics">E</em> = rolling a number that is at least five. There are two outcomes {5, 6}. <em data-effect="italics">P</em>(<em data-effect="italics">E</em>) = \(\frac{2}{6}\). If you were to roll the die only a few times, you would not be surprised if your observed results did not match the probability. If you were to roll the die a very large number of times, you would expect that, overall, \(\frac{2}{6}\) of the rolls would result in an outcome of &#8220;at least five&#8221;. You would not expect exactly \(\frac{2}{6}\). The long-term relative frequency of obtaining this result would approach the theoretical probability of \(\frac{2}{6}\) as the number of repetitions grows larger and larger.</p> <p id="eip-664">This important characteristic of probability experiments is known as the <span data-type="term">law of large numbers</span> which states that as the number of repetitions of an experiment is increased, the relative frequency obtained in the experiment tends to become closer and closer to the theoretical probability. Even though the outcomes do not happen according to any set pattern or order, overall, the long-term observed relative frequency will approach the theoretical probability. (The word <strong>empirical</strong> is often used instead of the word observed.)</p> <p>It is important to realize that in many situations, the outcomes are not equally likely. A coin or die may be <span data-type="term">unfair</span>, or <strong>biased</strong>. 
Two math professors in Europe had their statistics students test the Belgian one-euro coin and discovered that in 250 trials, a head was obtained 56% of the time and a tail was obtained 44% of the time. The data seem to show that the coin is not a fair coin; more repetitions would be helpful to draw a more accurate conclusion about such bias. Some dice may be biased. Look at the dice in a game you have at home; the spots on each face are usually small holes carved out and then painted to make the spots visible. Your dice may or may not be biased; it is possible that the outcomes may be affected by the slight weight differences due to the different numbers of holes in the faces. Gambling casinos make a lot of money depending on outcomes from rolling dice, so casino dice are made differently to eliminate bias. Casino dice have flat faces; the holes are completely filled with paint having the same density as the material that the dice are made of, so that each face is equally likely to occur. Later we will learn techniques for working with probabilities for events that are not equally likely.</p> <p>&nbsp;</p> <p><span data-type="title">&#8220;OR&#8221; Event:</span> An outcome is in the event <em data-effect="italics">A</em> OR <em data-effect="italics">B</em> if the outcome is in <em data-effect="italics">A</em> or is in <em data-effect="italics">B</em> or is in both <em data-effect="italics">A</em> and <em data-effect="italics">B</em>. For example, let <em data-effect="italics">A</em> = {1, 2, 3, 4, 5} and <em data-effect="italics">B</em> = {4, 5, 6, 7, 8}. <em data-effect="italics">A</em> OR <em data-effect="italics">B</em> = {1, 2, 3, 4, 5, 6, 7, 8}. 
Notice that 4 and 5 are NOT listed twice.</p> <p>&nbsp;</p> <p id="element-713"><span data-type="title">&#8220;AND&#8221; Event:</span> An outcome is in the event <em data-effect="italics">A</em> AND <em data-effect="italics">B</em> if the outcome is in both <em data-effect="italics">A</em> and <em data-effect="italics">B</em> at the same time. For example, let <em data-effect="italics">A</em> and <em data-effect="italics">B</em> be {1, 2, 3, 4, 5} and {4, 5, 6, 7, 8}, respectively. Then <em data-effect="italics">A</em> AND <em data-effect="italics">B</em> = {4, 5}.</p> <p>The <span data-type="term">complement</span> of event <em data-effect="italics">A</em> is denoted <em data-effect="italics">A′</em> (read &#8220;<em data-effect="italics">A</em> prime&#8221;). <em data-effect="italics">A′</em> consists of all outcomes that are <strong>NOT</strong> in <em data-effect="italics">A</em>. Notice that <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">A′</em>) = 1. For example, let <em data-effect="italics">S</em> = {1, 2, 3, 4, 5, 6} and let <em data-effect="italics">A</em> = {1, 2, 3, 4}. Then, <em data-effect="italics">A′</em> = {5, 6}. <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = \(\frac{4}{6}\), <em data-effect="italics">P</em>(<em data-effect="italics">A′</em>) = \(\frac{2}{6}\), and <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">A′</em>) = \(\frac{4}{6}+\frac{2}{6}\) = 1.</p> <p>The <span data-type="term">conditional probability</span> of <em data-effect="italics">A</em> given <em data-effect="italics">B</em> is written <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>). 
<em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) is the probability that event <em data-effect="italics">A</em> will occur given that the event <em data-effect="italics">B</em> has already occurred. <strong>A conditional reduces the sample space</strong>. We calculate the probability of <em data-effect="italics">A</em> from the reduced sample space <em data-effect="italics">B</em>. The formula to calculate <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) is <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = \(\frac{P\left(A\phantom{\rule{2pt}{0ex}}\text{AND}\phantom{\rule{2pt}{0ex}}B\right)}{P\left(B\right)}\) where <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) is greater than zero.</p> <p>For example, suppose we toss one fair, six-sided die. The sample space <em data-effect="italics">S</em> = {1, 2, 3, 4, 5, 6}. Let <em data-effect="italics">A</em> = face is 2 or 3 and <em data-effect="italics">B</em> = face is even (2, 4, 6). To calculate <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>), we count the number of outcomes that are 2 or 3 in the reduced sample space <em data-effect="italics">B</em> = {2, 4, 6}. Then we divide that by the number of outcomes in <em data-effect="italics">B</em> (rather than <em data-effect="italics">S</em>). Only the outcome 2 qualifies, and <em data-effect="italics">B</em> has three outcomes, so <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = \(\frac{1}{3}\).</p> <p>We get the same result by using the formula. 
Remember that <em data-effect="italics">S</em> has six outcomes.</p> <p><em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = \(\frac{P\left(A\phantom{\rule{2pt}{0ex}}\text{AND}\phantom{\rule{2pt}{0ex}}B\right)}{P\left(B\right)}=\frac{\frac{\left(\text{the number of outcomes that are 2 or 3 and even in}\phantom{\rule{2pt}{0ex}}S\right)}{6}}{\frac{\left(\text{the number of outcomes that are even in}\phantom{\rule{2pt}{0ex}}S\right)}{6}}=\frac{\frac{1}{6}}{\frac{3}{6}}=\frac{1}{3}\)</p> <p id="eip-615"><span data-type="title">Understanding Terminology and Symbols</span> It is important to read each problem carefully to think about and understand what the events are. Understanding the wording is the first very important step in solving probability problems. Reread the problem several times if necessary. Clearly identify the event of interest. Determine whether there is a condition stated in the wording that would indicate that the probability is conditional; carefully identify the condition, if any.</p> <div id="fs-idp18139744" class="textbox textbox--examples" data-type="example"><div id="fs-idp55046224" data-type="exercise"><div id="fs-idm2608960" data-type="problem"><p id="fs-idm9889072">The sample space <em data-effect="italics">S</em> is the whole numbers starting at one and less than 20.</p> <ol id="fs-idp18756240" type="a"><li><em data-effect="italics">S</em> = _____________________________ <p id="eip-idm125727456">Let event <em data-effect="italics">A</em> = the even numbers and event <em data-effect="italics">B</em> = numbers greater than 13.</p> </li> <li><em data-effect="italics">A</em> = _____________________, <em data-effect="italics">B</em> = _____________________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = _____________, <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = ________________</li> <li><em data-effect="italics">A</em> AND <em 
data-effect="italics">B</em> = ____________________, <em data-effect="italics">A</em> OR <em data-effect="italics">B</em> = ________________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = _________, <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>) = _____________</li> <li><em data-effect="italics">A′</em> = _____________, <em data-effect="italics">P</em>(<em data-effect="italics">A′</em>) = _____________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">A′</em>) = ____________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = ___________, <em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">A</em>) = _____________; are the probabilities equal?</li> </ol> </div> <div id="fs-idm6734480" data-type="solution"><ol id="fs-idp17343632" type="a"><li><em data-effect="italics">S</em> = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19}</li> <li><em data-effect="italics">A</em> = {2, 4, 6, 8, 10, 12, 14, 16, 18}, <em data-effect="italics">B</em> = {14, 15, 16, 17, 18, 19}</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = \(\frac{9}{19}\), <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = \(\frac{6}{19}\)</li> <li><em data-effect="italics">A</em> AND <em data-effect="italics">B</em> = {14, 16, 18}, <em data-effect="italics">A</em> OR <em data-effect="italics">B</em> = {2, 4, 6, 8, 10, 12, 14, 15, 16, 17, 18, 19}</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = \(\frac{3}{19}\), <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>) = \(\frac{12}{19}\)</li> <li><em data-effect="italics">A′</em> = {1, 3, 
5, 7, 9, 11, 13, 15, 17, 19}; <em data-effect="italics">P</em>(<em data-effect="italics">A′</em>) = \(\frac{10}{19}\)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">A′</em>) = 1 (\(\frac{9}{19}\) + \(\frac{10}{19}\) = 1)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = \(\frac{P\left(A\phantom{\rule{2pt}{0ex}}\text{AND}\phantom{\rule{2pt}{0ex}}B\right)}{P\left(B\right)}\) = \(\frac{3}{6}\), <em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">A</em>) = \(\frac{P\left(A\phantom{\rule{2pt}{0ex}}\text{AND}\phantom{\rule{2pt}{0ex}}B\right)}{P\left(A\right)}\) = \(\frac{3}{9}\). No, the probabilities are not equal.</li> </ol> </div> </div> </div> <div id="fs-idm38461776" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp18521824" data-type="exercise"><div id="fs-idp8232288" data-type="problem"><p id="fs-idp22298512">The sample space <em data-effect="italics">S</em> is all the ordered pairs of two whole numbers, the first from one to three and the second from one to four (Example: (1, 4)).</p> <p>&nbsp;</p> <ol id="fs-idm618192" type="a"><li><em data-effect="italics">S</em> = _____________________________ Let event <em data-effect="italics">A</em> = the sum is even and event <em data-effect="italics">B</em> = the first number is prime.</li> <li><em data-effect="italics">A</em> = _____________________, <em data-effect="italics">B</em> = _____________________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = _____________, <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = ________________</li> <li><em data-effect="italics">A</em> AND <em data-effect="italics">B</em> = ____________________, <em data-effect="italics">A</em> OR <em data-effect="italics">B</em> = ________________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 
_________, <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>) = _____________</li> <li><em data-effect="italics">B′</em> = _____________, <em data-effect="italics">P</em>(<em data-effect="italics">B′</em>) = _____________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">A′</em>) = ____________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = ___________, <em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">A</em>) = _____________; are the probabilities equal?</li> </ol> </div> </div> </div> <div id="fs-idp58540864" class="textbox textbox--examples" data-type="example"><div id="fs-idm2972704" data-type="exercise"><div id="fs-idp70495648" data-type="problem"><p id="fs-idp2040560">A fair, six-sided die is rolled. Describe the sample space <em data-effect="italics">S</em>, identify each of the following events with a subset of <em data-effect="italics">S</em> and compute its probability (an outcome is the number of dots that show up).</p> <ol id="fs-idp63287328" type="a"><li>Event <em data-effect="italics">T</em> = the outcome is two.</li> <li>Event <em data-effect="italics">A</em> = the outcome is an even number.</li> <li>Event <em data-effect="italics">B</em> = the outcome is less than four.</li> <li>The complement of <em data-effect="italics">A</em>.</li> <li><em data-effect="italics">A</em> GIVEN <em data-effect="italics">B</em></li> <li><em data-effect="italics">B</em> GIVEN <em data-effect="italics">A</em></li> <li><em data-effect="italics">A</em> AND <em data-effect="italics">B</em></li> <li><em data-effect="italics">A</em> OR <em data-effect="italics">B</em></li> <li><em data-effect="italics">A</em> OR <em data-effect="italics">B′</em></li> <li>Event <em data-effect="italics">N</em> = the outcome is a prime number.</li> 
<li>Event <em data-effect="italics">I</em> = the outcome is seven.</li> </ol> </div> <div id="fs-idp22590288" data-type="solution"><ol id="fs-idp47608624" type="a"><li><em data-effect="italics">T</em> = {2}, <em data-effect="italics">P</em>(<em data-effect="italics">T</em>) = \(\frac{1}{6}\)</li> <li><em data-effect="italics">A</em> = {2, 4, 6}, <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = \(\frac{1}{2}\)</li> <li><em data-effect="italics">B</em> = {1, 2, 3}, <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = \(\frac{1}{2}\)</li> <li><em data-effect="italics">A′</em> = {1, 3, 5}, <em data-effect="italics">P</em>(<em data-effect="italics">A′</em>) = \(\frac{1}{2}\)</li> <li><em data-effect="italics">A</em>|<em data-effect="italics">B</em> = {2}, <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = \(\frac{1}{3}\)</li> <li><em data-effect="italics">B</em>|<em data-effect="italics">A</em> = {2}, <em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">A</em>) = \(\frac{1}{3}\)</li> <li><em data-effect="italics">A</em> AND <em data-effect="italics">B</em> = {2}, <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = \(\frac{1}{6}\)</li> <li><em data-effect="italics">A</em> OR <em data-effect="italics">B</em> = {1, 2, 3, 4, 6}, <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>) = \(\frac{5}{6}\)</li> <li><em data-effect="italics">A</em> OR <em data-effect="italics">B′</em> = {2, 4, 5, 6}, <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B′</em>) = \(\frac{2}{3}\)</li> <li><em data-effect="italics">N</em> = {2, 3, 5}, <em data-effect="italics">P</em>(<em data-effect="italics">N</em>) = \(\frac{1}{2}\)</li> <li>A six-sided die does not have seven dots. 
<em data-effect="italics">P</em>(7) = 0.</li> </ol> </div> </div> </div> <div id="fs-idp6753008" class="textbox textbox--examples" data-type="example"><p id="fs-idp15243376"><a class="autogenerated-content" href="#ch03_M02-tbl001">(Figure)</a> describes the distribution of a random sample <em data-effect="italics">S</em> of 100 individuals, organized by gender and whether they are right- or left-handed.</p> <table id="ch03_M02-tbl001" summary="Example 3 Table"><thead><tr><th></th> <th>Right-handed</th> <th>Left-handed</th> </tr> </thead> <tbody><tr><td>Males</td> <td>43</td> <td>9</td> </tr> <tr><td>Females</td> <td>44</td> <td>4</td> </tr> </tbody> </table> <div id="fs-idp47478688" data-type="exercise"><div id="fs-idp17336896" data-type="problem"><p id="fs-idp16966080">Let’s denote the events <em data-effect="italics">M</em> = the subject is male, <em data-effect="italics">F</em> = the subject is female, <em data-effect="italics">R</em> = the subject is right-handed, <em data-effect="italics">L</em> = the subject is left-handed. 
Compute the following probabilities:</p> <ol id="fs-idm298416" type="a"><li><em data-effect="italics">P</em>(<em data-effect="italics">M</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">R</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">L</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">M</em> AND <em data-effect="italics">R</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">L</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">M</em> OR <em data-effect="italics">F</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">M</em> OR <em data-effect="italics">R</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em> OR <em data-effect="italics">L</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">M&#8217;</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">R</em>|<em data-effect="italics">M</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em>|<em data-effect="italics">L</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">L</em>|<em data-effect="italics">F</em>)</li> </ol> </div> <div id="fs-idm1054528" data-type="solution"><ol id="fs-idm30102752" type="a"><li><em data-effect="italics">P</em>(<em data-effect="italics">M</em>) = 0.52</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em>) = 0.48</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">R</em>) = 0.87</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">L</em>) = 0.13</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">M</em> AND <em data-effect="italics">R</em>) = 0.43</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">L</em>) 
= 0.04</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">M</em> OR <em data-effect="italics">F</em>) = 1</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">M</em> OR <em data-effect="italics">R</em>) = 0.96</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em> OR <em data-effect="italics">L</em>) = 0.57</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">M&#8217;</em>) = 0.48</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">R</em>|<em data-effect="italics">M</em>) = 0.8269 (rounded to four decimal places)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em>|<em data-effect="italics">L</em>) = 0.3077 (rounded to four decimal places)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">L</em>|<em data-effect="italics">F</em>) = 0.0833</li> </ol> </div> </div> </div> <div id="fs-idp2299568" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idm5532304">“Countries List by Continent.” Worldatlas, 2013. Available online at http://www.worldatlas.com/cntycont.htm (accessed May 2, 2013).</p> </div> <div id="fs-idp18168192" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp7404384">In this module we learned the basic terminology of probability. The set of all possible outcomes of an experiment is called the sample space. 
Events are subsets of the sample space, and they are assigned a probability that is a number between zero and one, inclusive.</p> </div> <div id="fs-idp22008272" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idp19318640"><em data-effect="italics">A</em> and <em data-effect="italics">B</em> are events</p> <p id="fs-idm18729872"><em data-effect="italics">P</em>(<em data-effect="italics">S</em>) = 1 where <em data-effect="italics">S</em> is the sample space</p> <p id="fs-idp16994864">0 ≤ <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) ≤ 1</p> <p id="fs-idp72159184"><em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = \(\frac{P(A\text{ AND }B)}{P(B)}\)</p> </div> <div id="eip-99" class="practice" data-depth="1"><div data-type="exercise"><div id="eip-id1164326849676" data-type="problem"><p id="eip-id1164335950747">In a particular college class, there are male and female students. Some students have long hair and some students have short hair. Write the <strong>symbols</strong> for the probabilities of the events for parts a through j. (Note that you cannot find numerical answers here. 
You were not given enough information to find any probability values yet; concentrate on understanding the symbols.)</p> <ul id="eip-id1164334783998"><li>Let <em data-effect="italics">F</em> be the event that a student is female.</li> <li>Let <em data-effect="italics">M</em> be the event that a student is male.</li> <li>Let <em data-effect="italics">S</em> be the event that a student has short hair.</li> <li>Let <em data-effect="italics">L</em> be the event that a student has long hair.</li> </ul> <ol type="a"><li>The probability that a student does not have long hair.</li> <li>The probability that a student is male or has short hair.</li> <li>The probability that a student is a female and has long hair.</li> <li>The probability that a student is male, given that the student has long hair.</li> <li>The probability that a student has long hair, given that the student is male.</li> <li>Of all the female students, the probability that a student has short hair.</li> <li>Of all students with long hair, the probability that a student is female.</li> <li>The probability that a student is female or has long hair.</li> <li>The probability that a randomly selected student is a male student with short hair.</li> <li>The probability that a student is female.</li> </ol> </div> <div id="eip-id1164308885144" data-type="solution"><ol id="eip-id1164326073304" type="a"><li><em data-effect="italics">P</em>(<em data-effect="italics">L′</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">S</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">M</em> OR <em data-effect="italics">S</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">L</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">M</em>|<em data-effect="italics">L</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">L</em>|<em data-effect="italics">M</em>)</li> <li><em 
data-effect="italics">P</em>(<em data-effect="italics">S</em>|<em data-effect="italics">F</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em>|<em data-effect="italics">L</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em> OR <em data-effect="italics">L</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">M</em> AND <em data-effect="italics">S</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em>)</li> </ol> </div> </div> <p id="fs-idp38442640"><em data-effect="italics">Use the following information to answer the next four exercises.</em> A box is filled with several party favors. It contains 12 hats, 15 noisemakers, ten finger traps, and five bags of confetti. <span data-type="newline"><br /> </span>Let <em data-effect="italics">H</em> = the event of getting a hat. <span data-type="newline"><br /> </span>Let <em data-effect="italics">N</em> = the event of getting a noisemaker. <span data-type="newline"><br /> </span>Let <em data-effect="italics">F</em> = the event of getting a finger trap. 
<span data-type="newline"><br /> </span>Let <em data-effect="italics">C</em> = the event of getting a bag of confetti.</p> <div id="fs-idm23159776" data-type="exercise"><div id="fs-idm49471840" data-type="problem"><p id="fs-idm86433696">Find <em data-effect="italics">P</em>(<em data-effect="italics">H</em>).</p> </div> </div> <div id="fs-idm49682256" data-type="exercise"><div id="fs-idm46637072" data-type="problem"><p id="fs-idp59284912">Find <em data-effect="italics">P</em>(<em data-effect="italics">N</em>).</p> </div> <div id="fs-idp61984096" data-type="solution"><p id="fs-idp24214560"><em data-effect="italics">P</em>(<em data-effect="italics">N</em>) = \(\frac{15}{42}\) = \(\frac{5}{14}\) = 0.36</p> </div> </div> <div id="fs-idm10330672" data-type="exercise"><div id="fs-idm86031936" data-type="problem"><p id="fs-idm8076048">Find <em data-effect="italics">P</em>(<em data-effect="italics">F</em>).</p> </div> </div> <div id="fs-idm119514368" data-type="exercise"><div id="fs-idm25820928" data-type="problem"><p id="fs-idm46325408">Find <em data-effect="italics">P</em>(<em data-effect="italics">C</em>).</p> </div> <div id="fs-idm26325552" data-type="solution"><p id="fs-idm68106736"><em data-effect="italics">P</em>(<em data-effect="italics">C</em>) = \(\frac{5}{42}\) = 0.12</p> </div> </div> <p id="fs-idm56753520"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next six exercises.</em> A jar of 150 jelly beans contains 22 red jelly beans, 38 yellow, 20 green, 28 purple, 26 blue, and the rest are orange. <span data-type="newline"><br /> </span>Let <em data-effect="italics">B</em> = the event of getting a blue jelly bean. <span data-type="newline"><br /> </span>Let <em data-effect="italics">G</em> = the event of getting a green jelly bean. <span data-type="newline"><br /> </span>Let <em data-effect="italics">O</em> = the event of getting an orange jelly bean. 
<span data-type="newline"><br /> </span>Let <em data-effect="italics">P</em> = the event of getting a purple jelly bean. <span data-type="newline"><br /> </span>Let <em data-effect="italics">R</em> = the event of getting a red jelly bean. <span data-type="newline"><br /> </span>Let <em data-effect="italics">Y</em> = the event of getting a yellow jelly bean.</p> <div id="fs-idm75082368" data-type="exercise"><div id="fs-idm50181456" data-type="problem"><p id="fs-idm93408512">Find <em data-effect="italics">P</em>(<em data-effect="italics">B</em>).</p> </div> </div> <div id="fs-idm80518976" data-type="exercise"><div id="fs-idm60746640" data-type="problem"><p id="fs-idp2053376">Find <em data-effect="italics">P</em>(<em data-effect="italics">G</em>).</p> </div> <div id="fs-idm50181744" data-type="solution"><p id="fs-idm21704752"><em data-effect="italics">P</em>(<em data-effect="italics">G</em>) = \(\frac{20}{150}\) = \(\frac{2}{15}\) = 0.13</p> </div> </div> <div id="fs-idm68528784" data-type="exercise"><div id="fs-idm61260992" data-type="problem"><p id="fs-idm87704720">Find <em data-effect="italics">P</em>(<em data-effect="italics">P</em>).</p> </div> </div> <div id="fs-idm48493856" data-type="exercise"><div id="fs-idm49941984" data-type="problem"><p id="fs-idm79884208">Find <em data-effect="italics">P</em>(<em data-effect="italics">R</em>).</p> </div> <div id="fs-idm50623312" data-type="solution"><p id="fs-idm28001664"><em data-effect="italics">P</em>(<em data-effect="italics">R</em>) = \(\frac{22}{150}\) = \(\frac{11}{75}\) = 0.15</p> </div> </div> <div id="fs-idm3176768" data-type="exercise"><div id="fs-idm59533344" data-type="problem"><p id="fs-idm47130992">Find <em data-effect="italics">P</em>(<em data-effect="italics">Y</em>).</p> </div> </div> <div id="fs-idm53901776" data-type="exercise"><div id="fs-idm69722992" data-type="problem"><p id="fs-idm51951232">Find <em data-effect="italics">P</em>(<em data-effect="italics">O</em>).</p> </div> <div id="fs-idm67360768" 
data-type="solution"><p id="fs-idm51927824"><em data-effect="italics">P</em>(<em data-effect="italics">O</em>) = \(\frac{150-22-38-20-28-26}{150}\) = \(\frac{16}{150}\) = \(\frac{8}{75}\) = 0.11</p> </div> </div> <p id="fs-idm25754160"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next six exercises.</em> There are 23 countries in North America, 12 countries in South America, 47 countries in Europe, 44 countries in Asia, 54 countries in Africa, and 14 in Oceania (Pacific Ocean region). <span data-type="newline"><br /> </span>Let <em data-effect="italics">A</em> = the event that a country is in Asia. <span data-type="newline"><br /> </span>Let <em data-effect="italics">E</em> = the event that a country is in Europe. <span data-type="newline"><br /> </span>Let <em data-effect="italics">F</em> = the event that a country is in Africa. <span data-type="newline"><br /> </span>Let <em data-effect="italics">N</em> = the event that a country is in North America. <span data-type="newline"><br /> </span>Let <em data-effect="italics">O</em> = the event that a country is in Oceania. 
<span data-type="newline"><br /> </span>Let <em data-effect="italics">S</em> = the event that a country is in South America.</p> <div id="fs-idm3484896" data-type="exercise"><div id="fs-idm47263488" data-type="problem"><p id="fs-idm51751696">Find <em data-effect="italics">P</em>(<em data-effect="italics">A</em>).</p> </div> </div> <div id="fs-idm9492064" data-type="exercise"><div id="fs-idm67449488" data-type="problem"><p id="fs-idm49904224">Find <em data-effect="italics">P</em>(<em data-effect="italics">E</em>).</p> </div> <div id="fs-idm9486496" data-type="solution"><p id="fs-idm1799840"><em data-effect="italics">P</em>(<em data-effect="italics">E</em>) = \(\frac{47}{194}\) = 0.24</p> </div> </div> <div id="fs-idm79250496" data-type="exercise"><div id="fs-idm80555184" data-type="problem"><p id="fs-idm66849952">Find <em data-effect="italics">P</em>(<em data-effect="italics">F</em>).</p> </div> </div> <div id="fs-idm3781600" data-type="exercise"><div id="fs-idp54591472" data-type="problem"><p id="fs-idm27995232">Find <em data-effect="italics">P</em>(<em data-effect="italics">N</em>).</p> </div> <div id="fs-idm67806368" data-type="solution"><p id="fs-idm45369376"><em data-effect="italics">P</em>(<em data-effect="italics">N</em>) = \(\frac{23}{194}\) = 0.12</p> </div> </div> <div id="fs-idm121890336" data-type="exercise"><div id="fs-idm47264784" data-type="problem"><p id="fs-idm19659264">Find <em data-effect="italics">P</em>(<em data-effect="italics">O</em>).</p> </div> </div> <div id="fs-idp14140208" data-type="exercise"><div id="fs-idm28717232" data-type="problem"><p id="fs-idm26564640">Find <em data-effect="italics">P</em>(<em data-effect="italics">S</em>).</p> </div> <div id="fs-idm51969536" data-type="solution"><p id="fs-idp78356208"><em data-effect="italics">P</em>(<em data-effect="italics">S</em>) = \(\frac{12}{194}\) = \(\frac{6}{97}\) = 0.06</p> </div> </div> <div id="fs-idm24247104" data-type="exercise"><div id="fs-idm21936736" data-type="problem"><p 
id="fs-idm73112992">What is the probability of drawing a red card in a standard deck of 52 cards?</p> </div> </div> <div id="fs-idm28330400" data-type="exercise"><div id="fs-idp20091712" data-type="problem"><p id="fs-idm24866720">What is the probability of drawing a club in a standard deck of 52 cards?</p> </div> <div id="fs-idm10488336" data-type="solution"><p id="fs-idm169440">\(\frac{13}{52}\) = \(\frac{1}{4}\) = 0.25</p> </div> </div> <div id="fs-idm3623360" data-type="exercise"><div id="fs-idm62001312" data-type="problem"><p id="fs-idm85725360">What is the probability of rolling an even number of dots with a fair, six-sided die numbered one through six?</p> </div> </div> <div id="fs-idm61066160" data-type="exercise"><div id="fs-idm109071856" data-type="problem"><p id="fs-idm92474096">What is the probability of rolling a prime number of dots with a fair, six-sided die numbered one through six?</p> </div> <div id="fs-idm86034528" data-type="solution"><p id="fs-idp53515360">\(\frac{3}{6}\) = \(\frac{1}{2}\) = 0.5</p> </div> </div> <p id="fs-idm83426832"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next two exercises.</em> You see a game at a local fair. You have to throw a dart at a color wheel. Each section on the color wheel is equal in area.</p> <div id="eip-idp31502288" class="bc-figure figure"><span id="fs-idm79611488" data-type="media" data-alt="" data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C03_M01_021-1.jpg" alt="" width="250" data-media-type="image/jpg" /></span></div> <p><span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">B</em> = the event of landing on blue. <span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">R</em> = the event of landing on red. 
<span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">G</em> = the event of landing on green. <span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">Y</em> = the event of landing on yellow.</p> <div id="fs-idm87354368" data-type="exercise"><div id="fs-idm25090880" data-type="problem"><p id="fs-idm84985120">If you land on <em data-effect="italics">Y</em>, you get the biggest prize. Find <em data-effect="italics">P</em>(<em data-effect="italics">Y</em>).</p> </div> </div> <div id="fs-idm87415968" data-type="exercise"><div id="fs-idm45178336" data-type="problem"><p id="fs-idm60563664">If you land on red, you don’t get a prize. What is <em data-effect="italics">P</em>(<em data-effect="italics">R</em>)?</p> </div> <div id="fs-idm86444976" data-type="solution"><p id="fs-idm10991472">\(P\left(R\right)=\frac{4}{8}=0.5\)</p> </div> </div> <p id="eip-321"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next ten exercises.</em> On a baseball team, there are infielders and outfielders. Some players are great hitters, and some players are not great hitters. <span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">I</em> = the event that a player is an infielder. <span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">O</em> = the event that a player is an outfielder. <span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">H</em> = the event that a player is a great hitter. 
<span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">N</em> = the event that a player is not a great hitter.</p> <div id="eip-386" data-type="exercise"><div data-type="problem"><p id="eip-717">Write the symbols for the probability that a player is not an outfielder.</p> </div> </div> <div data-type="exercise"><div id="eip-702" data-type="problem"><p>Write the symbols for the probability that a player is an outfielder or is a great hitter.</p> </div> <div id="eip-73" data-type="solution"><p><em data-effect="italics">P</em>(<em data-effect="italics">O</em> OR <em data-effect="italics">H</em>)</p> </div> </div> <div id="eip-863" data-type="exercise"><div id="eip-119" data-type="problem"><p>Write the symbols for the probability that a player is an infielder and is not a great hitter.</p> </div> </div> <div id="eip-94" data-type="exercise"><div data-type="problem"><p id="eip-856">Write the symbols for the probability that a player is a great hitter, given that the player is an infielder.</p> </div> <div id="eip-954" data-type="solution"><p id="eip-577"><em data-effect="italics">P</em>(<em data-effect="italics">H</em>|<em data-effect="italics">I</em>)</p> </div> </div> <div id="eip-16" data-type="exercise"><div id="eip-938" data-type="problem"><p id="eip-777">Write the symbols for the probability that a player is an infielder, given that the player is a great hitter.</p> </div> </div> <div data-type="exercise"><div id="eip-115" data-type="problem"><p>Write the symbols for the probability that of all the outfielders, a player is not a great hitter.</p> </div> <div data-type="solution"><p><em data-effect="italics">P</em>(<em data-effect="italics">N</em>|<em data-effect="italics">O</em>)</p> </div> </div> <div id="eip-266" data-type="exercise"><div id="eip-25" data-type="problem"><p id="eip-823">Write the symbols for the probability that of all the great hitters, a player is an outfielder.</p> </div> </div> <div id="eip-542" 
data-type="exercise"><div id="eip-998" data-type="problem"><p>Write the symbols for the probability that a player is an infielder or is not a great hitter.</p> </div> <div id="eip-38" data-type="solution"><p id="eip-401"><em data-effect="italics">P</em>(<em data-effect="italics">I</em> OR <em data-effect="italics">N</em>)</p> </div> </div> <div id="eip-355" data-type="exercise"><div id="eip-636" data-type="problem"><p>Write the symbols for the probability that a player is an outfielder and is a great hitter.</p> </div> </div> <div id="eip-311" data-type="exercise"><div id="eip-622" data-type="problem"><p id="eip-627">Write the symbols for the probability that a player is an infielder.</p> </div> <div data-type="solution"><p><em data-effect="italics">P</em>(<em data-effect="italics">I</em>)</p> </div> </div> <div data-type="exercise"><div id="eip-640" data-type="problem"><p>What is the word for the set of all possible outcomes?</p> </div> </div> <div data-type="exercise"><div id="eip-334" data-type="problem"><p id="eip-743">What is conditional probability?</p> </div> <div id="eip-926" data-type="solution"><p id="eip-977">The likelihood that an event will occur given that another event has already occurred.</p> </div> </div> <div data-type="exercise"><div id="eip-547" data-type="problem"><p id="eip-468">A shelf holds 12 books. Eight are fiction and the rest are nonfiction. Each is a different book with a unique title. The fiction books are numbered one to eight. The nonfiction books are numbered one to four. 
Randomly select one book. <span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">F</em> = the event that the book is fiction. <span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">N</em> = the event that the book is nonfiction. <span data-type="newline" data-count="1"><br /> </span>What is the sample space?</p> </div> </div> <div id="eip-848" data-type="exercise"><div data-type="problem"><p id="eip-687">What is the sum of the probabilities of an event and its complement?</p> </div> <div id="eip-144" data-type="solution"><p id="eip-26">1</p> </div> </div> <p id="fs-idp8995312"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next two exercises.</em> You are rolling a fair, six-sided number cube. Let <em data-effect="italics">E</em> = the event that it lands on an even number. Let <em data-effect="italics">M</em> = the event that it lands on a multiple of three.</p> <div id="eip-460" data-type="exercise"><div id="eip-551" data-type="problem"><p id="eip-idm920816">What does <em data-effect="italics">P</em>(<em data-effect="italics">E</em>|<em data-effect="italics">M</em>) mean in words?</p> </div> </div> <div id="eip-185" data-type="exercise"><div data-type="problem"><p id="eip-idp41331392">What does <em data-effect="italics">P</em>(<em data-effect="italics">E</em> OR <em data-effect="italics">M</em>) mean in words?</p> </div> <div id="eip-505" data-type="solution"><p>the probability of landing on an even number or a multiple of three</p> </div> </div> </div> <div id="fs-idm3746928" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div id="eip-616" data-type="exercise"><div data-type="problem"><div id="ch03_M02-fig001" class="bc-figure figure"><div class="bc-figcaption figcaption"><span id="eip-idm59266352" data-type="media" data-alt="This is a bar graph with three bars for each category on the x-axis: age groups, gender, and total. 
The first bar shows the number of people in the category. The second bar shows the percent in the category that approve, and the third bar shows percent in the category that disapprove. The y-axis has intervals of 200 from 0–1200.">1) <img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_StatsC03_RWP_002-1.jpg" alt="This is a bar graph with three bars for each category on the x-axis: age groups, gender, and total. The first bar shows the number of people in the category. The second bar shows the percent in the category that approve, and the third bar shows percent in the category that disapprove. The y-axis has intervals of 200 from 0–1200." width="500" data-media-type="image/jpg" /></span></div> <p id="eip-id1171775545516">The graph in <a class="autogenerated-content" href="#ch03_M02-fig001">(Figure)</a> displays the sample sizes and percentages of people in different age and gender groups who were polled concerning their approval of Mayor Ford’s actions in office. The total number in the sample of all the age groups is 1,045.</p> <ol id="eip-idm36675712" type="a"><li>Define three events in the graph.</li> <li>Describe in words what the entry 40 means.</li> <li>Describe in words the complement of the entry in part b.</li> <li>Describe in words what the entry 30 means.</li> <li>Out of the males and females, what percent are males?</li> <li>Out of the females, what percent disapprove of Mayor Ford?</li> <li>Out of all the age groups, what percent approve of Mayor Ford?</li> <li>Find <em data-effect="italics">P</em>(Approve|Male).</li> <li>Out of the age groups, what percent are more than 44 years old?</li> <li>Find <em data-effect="italics">P</em>(Approve|Age &lt; 35).</li> </ol> <p>&nbsp;</p> </div> </div> <div id="element-858" data-type="exercise"><div id="id43572830" data-type="problem"><p>2) Explain what is wrong with the following statements. 
Use complete sentences.</p> <ol id="element-571" type="a"><li>If there is a 60% chance of rain on Saturday and a 70% chance of rain on Sunday, then there is a 130% chance of rain over the weekend.</li> <li>The probability that a baseball player hits a home run is greater than the probability that he gets a successful hit.</li> </ol> </div> <div id="fs-idm63594832" data-type="solution"><ol id="fs-idm95338544" type="a"><li>The two probabilities cannot simply be added. It can rain on both Saturday and Sunday, so finding the chance of rain over the weekend requires the probability that it rains on both days, which is not in the information given. In addition, a probability is never greater than 100%.</li> <li>A home run by definition is a successful hit, so he has to have at least as many successful hits as home runs.</li> </ol> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="condprob"><dt>Conditional Probability</dt> <dd id="id3663808">the likelihood that an event will occur given that another event has already occurred</dd> </dl> <dl id="eqlikly"><dt>Equally Likely</dt> <dd id="id3651637">Each outcome of an experiment has the same probability.</dd> </dl> <dl id="event"><dt>Event</dt> <dd id="id19906185">a subset of the set of all outcomes of an experiment; the set of all outcomes of an experiment is called a <strong>sample space</strong> and is usually denoted by <em data-effect="italics">S</em>. An event is an arbitrary subset in <em data-effect="italics">S</em>. It can contain one outcome, two outcomes, no outcomes (empty subset), the entire sample space, and the like. 
Standard notations for events are capital letters such as <em data-effect="italics">A</em>, <em data-effect="italics">B</em>, <em data-effect="italics">C</em>, and so on.</dd> </dl> <dl id="experiment"><dt>Experiment</dt> <dd id="id3579725">a planned activity carried out under controlled conditions</dd> </dl> <dl id="outcome"><dt>Outcome</dt> <dd id="id3610906">a particular result of an experiment</dd> </dl> <dl><dt>Probability</dt> <dd>a number between zero and one, inclusive, that gives the likelihood that a specific event will occur; the foundation of statistics is given by the following three axioms (A.N. Kolmogorov, 1930s): Let <em data-effect="italics">S</em> denote the sample space, and let <em data-effect="italics">A</em> and <em data-effect="italics">B</em> be two events in <em data-effect="italics">S</em>. Then: <ul id="fs-id6987848"><li>0 ≤ <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) ≤ 1</li> <li>If <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are any two mutually exclusive events, then <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">B</em>).</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">S</em>) = 1</li> </ul> </dd> </dl> <dl id="samplesp"><dt>Sample Space</dt> <dd id="id3455189">the set of all possible outcomes of an experiment</dd> </dl> <dl id="fs-idp211856"><dt>The AND Event</dt> <dd id="fs-idp35849872">An outcome is in the event <em data-effect="italics">A</em> AND <em data-effect="italics">B</em> if the outcome is in both <em data-effect="italics">A</em> and <em data-effect="italics">B</em> at the same time.</dd> </dl> <dl id="fs-idp6248000"><dt>The Complement Event</dt> <dd id="fs-idp20218352">The complement of event <em data-effect="italics">A</em> consists of all outcomes that are NOT in <em data-effect="italics">A</em>.</dd> 
</dl> <dl id="fs-idm8780960"><dt>The Conditional Probability of <em data-effect="italics">A</em> GIVEN <em data-effect="italics">B</em></dt> <dd id="fs-idm6980240"><em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) is the probability that event <em data-effect="italics">A</em> will occur given that the event <em data-effect="italics">B</em> has already occurred.</dd> </dl> <dl id="fs-idp40744624"><dt>The OR Event</dt> <dd id="fs-idp93196512">An outcome is in the event <em data-effect="italics">A</em> OR <em data-effect="italics">B</em> if the outcome is in <em data-effect="italics">A</em> or is in <em data-effect="italics">B</em> or is in both <em data-effect="italics">A</em> and <em data-effect="italics">B</em>.</dd> </dl> </div> </div> </div></div>
<div class="chapter standard" id="chapter-independent-and-mutually-exclusive-events" title="Chapter 4.3: Independent and Mutually Exclusive Events"><div class="chapter-title-wrap"><h3 class="chapter-number">26</h3><h2 class="chapter-title"><span class="display-none">Chapter 4.3: Independent and Mutually Exclusive Events</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>Independent and mutually exclusive do <strong>not</strong> mean the same thing.</p> <div class="bc-section section" data-depth="1"><h3 data-type="title">Independent Events</h3> <p>Two events are independent if the following are true:</p> <ul><li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">A</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">B</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>)</li> </ul> <p>Two events <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are <span data-type="term">independent</span> if the knowledge that one occurred does not affect the chance the other occurs. For example, the outcomes of two rolls of a fair die are independent events. The outcome of the first roll does not change the probability for the outcome of the second roll. To show two events are independent, you need to show <strong>only one</strong> of the above conditions. 
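As a quick check of the third condition (an illustration using events defined here for the purpose, not taken from the examples that follow): for two rolls of a fair die, let <em data-effect="italics">A</em> = the first roll is a six and <em data-effect="italics">B</em> = the second roll is a six. Of the 36 equally likely pairs of rolls, exactly one is (6, 6), so <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = \(\frac{1}{36}\) = \(\frac{1}{6}\cdot\frac{1}{6}\) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>). 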
If two events are NOT independent, then we say that they are <strong>dependent</strong>.</p> <p id="element-405">Sampling may be done <strong>with</strong> <span data-type="term">replacement</span> or <strong>without replacement</strong>.</p> <ul id="fs-idm1341072"><li><strong>With replacement</strong>: If each member of a population is replaced after it is picked, then that member has the possibility of being chosen more than once. When sampling is done with replacement, then events are considered to be independent, meaning the result of the first pick will not change the probabilities for the second pick.</li> <li><strong>Without replacement</strong>: When sampling is done without replacement, each member of a population may be chosen only once. In this case, the probabilities for the second pick are affected by the result of the first pick. The events are considered to be dependent or not independent.</li> </ul> <p id="element-835">If it is not known whether <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are independent or dependent, <strong>assume they are dependent until you can show otherwise</strong>.</p> <div id="fs-idm1963392" class="textbox textbox--examples" data-type="example"><p id="fs-idm4177088">You have a fair, well-shuffled deck of 52 cards. It consists of four suits. The suits are clubs, diamonds, hearts and spades. There are 13 cards in each suit consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, <em data-effect="italics">J</em> (jack), <em data-effect="italics">Q</em> (queen), <em data-effect="italics">K</em> (king) of that suit.</p> <p id="fs-idm5494304">a. Sampling with replacement: <span data-type="newline" data-count="1"><br /> </span>Suppose you pick three cards with replacement. The first card you pick out of the 52 cards is the <em data-effect="italics">Q</em> of spades. You put this card back, reshuffle the cards and pick a second card from the 52-card deck. It is the ten of clubs. 
You put this card back, reshuffle the cards and pick a third card from the 52-card deck. This time, the card is the <em data-effect="italics">Q</em> of spades again. Your picks are {<em data-effect="italics">Q</em> of spades, ten of clubs, <em data-effect="italics">Q</em> of spades}. You have picked the <em data-effect="italics">Q</em> of spades twice. You pick each card from the 52-card deck.</p> <p id="fs-idm176528">b. Sampling without replacement: <span data-type="newline" data-count="1"><br /> </span>Suppose you pick three cards without replacement. The first card you pick out of the 52 cards is the <em data-effect="italics">K</em> of hearts. You put this card aside and pick the second card from the 51 cards remaining in the deck. It is the three of diamonds. You put this card aside and pick the third card from the remaining 50 cards in the deck. The third card is the <em data-effect="italics">J</em> of spades. Your picks are {<em data-effect="italics">K</em> of hearts, three of diamonds, <em data-effect="italics">J</em> of spades}. Because you have picked the cards without replacement, you cannot pick the same card twice.</p> </div> <div id="fs-idm17445584" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm1262752" data-type="exercise"><div id="fs-idm318128" data-type="problem"><p id="fs-idp52703136">You have a fair, well-shuffled deck of 52 cards. It consists of four suits. The suits are clubs, diamonds, hearts and spades. There are 13 cards in each suit consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, <em data-effect="italics">J</em> (jack), <em data-effect="italics">Q</em> (queen), <em data-effect="italics">K</em> (king) of that suit. Three cards are picked at random.</p> <ol id="fs-idm4233984" type="a"><li>Suppose you know that the picked cards are <em data-effect="italics">Q</em> of spades, <em data-effect="italics">K</em> of hearts and <em data-effect="italics">Q</em> of spades. 
Can you decide if the sampling was with or without replacement?</li> <li>Suppose you know that the picked cards are <em data-effect="italics">Q</em> of spades, <em data-effect="italics">K</em> of hearts, and <em data-effect="italics">J</em> of spades. Can you decide if the sampling was with or without replacement?</li> </ol> </div> </div> </div> <div id="fs-idm1468672" class="textbox textbox--examples" data-type="example"><div id="fs-idm203264" data-type="exercise"><div id="fs-idm1147040" data-type="problem"><p id="fs-idm4186336">You have a fair, well-shuffled deck of 52 cards. It consists of four suits. The suits are clubs, diamonds, hearts, and spades. There are 13 cards in each suit consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, <em data-effect="italics">J</em> (jack), <em data-effect="italics">Q</em> (queen), and <em data-effect="italics">K</em> (king) of that suit. <em data-effect="italics">S</em> = spades, <em data-effect="italics">H</em> = Hearts, <em data-effect="italics">D</em> = Diamonds, <em data-effect="italics">C</em> = Clubs.</p> <ol id="fs-idm2936016" type="a"><li>Suppose you pick four cards, but do not put any cards back into the deck. Your cards are <em data-effect="italics">QS</em>, 1<em data-effect="italics">D</em>, 1<em data-effect="italics">C</em>, <em data-effect="italics">QD</em>.</li> <li>Suppose you pick four cards and put each card back before you pick the next card. Your cards are <em data-effect="italics">KH</em>, 7<em data-effect="italics">D</em>, 6<em data-effect="italics">D</em>, <em data-effect="italics">KH</em>.</li> </ol> <p id="fs-idm61714144">Which of a. or b. did you sample with replacement and which did you sample without replacement?</p> </div> <div id="fs-idm4167824" data-type="solution" data-label=""><p id="fs-idm4124432">a. Without replacement; b. 
With replacement</p> </div> </div> </div> <div id="fs-idm64621712" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm1421520" data-type="exercise"><div id="fs-idm5372176" data-type="problem"><p id="fs-idm328320">You have a fair, well-shuffled deck of 52 cards. It consists of four suits. The suits are clubs, diamonds, hearts, and spades. There are 13 cards in each suit consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, <em data-effect="italics">J</em> (jack), <em data-effect="italics">Q</em> (queen), and <em data-effect="italics">K</em> (king) of that suit. <em data-effect="italics">S</em> = spades, <em data-effect="italics">H</em> = Hearts, <em data-effect="italics">D</em> = Diamonds, <em data-effect="italics">C</em> = Clubs. Suppose that you sample four cards without replacement. Which of the following outcomes are possible? Answer the same question for sampling with replacement.</p> <ol id="fs-idm5107680" type="a"><li><em data-effect="italics">QS</em>, 1<em data-effect="italics">D</em>, 1<em data-effect="italics">C</em>, <em data-effect="italics">QD</em></li> <li><em data-effect="italics">KH</em>, 7<em data-effect="italics">D</em>, 6<em data-effect="italics">D</em>, <em data-effect="italics">KH</em></li> <li><em data-effect="italics">QS</em>, 7<em data-effect="italics">D</em>, 6<em data-effect="italics">D</em>, <em data-effect="italics">KS</em></li> </ol> </div> </div> </div> </div> <div id="element-462" class="bc-section section" data-depth="1"><h3 data-type="title">Mutually Exclusive Events</h3> <p id="element-548"><em data-effect="italics">A</em> and <em data-effect="italics">B</em> are <span data-type="term">mutually exclusive</span> events if they cannot occur at the same time. 
This means that <em data-effect="italics">A</em> and <em data-effect="italics">B</em> do not share any outcomes and <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0.</p> <p>For example, suppose the sample space <em data-effect="italics">S</em> = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Let <em data-effect="italics">A</em> = {1, 2, 3, 4, 5}, <em data-effect="italics">B</em> = {4, 5, 6, 7, 8}, and <em data-effect="italics">C</em> = {7, 9}. <em data-effect="italics">A</em> AND <em data-effect="italics">B</em> = {4, 5}. <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = \(\frac{2}{10}\) and is not equal to zero. Therefore, <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are not mutually exclusive. <em data-effect="italics">A</em> and <em data-effect="italics">C</em> do not have any numbers in common so <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">C</em>) = 0. Therefore, <em data-effect="italics">A</em> and <em data-effect="italics">C</em> are mutually exclusive.</p> <p id="element-529">If it is not known whether <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are mutually exclusive, <strong>assume they are not until you can show otherwise</strong>. The following examples illustrate these definitions and terms.</p> <div id="element-931" class="textbox textbox--examples" data-type="example"><p id="element-482">Flip two fair coins. (This is an experiment.)</p> <p id="element-956">The sample space is {<em data-effect="italics">HH</em>, <em data-effect="italics">HT</em>, <em data-effect="italics">TH</em>, <em data-effect="italics">TT</em>} where <em data-effect="italics">T</em> = tails and <em data-effect="italics">H</em> = heads. 
The outcomes are <em data-effect="italics">HH</em>, <em data-effect="italics">HT</em>, <em data-effect="italics">TH</em>, and <em data-effect="italics">TT</em>. The outcomes HT and TH are different. The <em data-effect="italics">HT</em> means that the first coin showed heads and the second coin showed tails. The <em data-effect="italics">TH</em> means that the first coin showed tails and the second coin showed heads.</p> <ul><li>Let <em data-effect="italics">A</em> = the event of getting <strong>at most one tail</strong>. (At most one tail means zero or one tail.) Then <em data-effect="italics">A</em> can be written as {<em data-effect="italics">HH</em>, <em data-effect="italics">HT</em>, <em data-effect="italics">TH</em>}. The outcome <em data-effect="italics">HH</em> shows zero tails. <em data-effect="italics">HT</em> and <em data-effect="italics">TH</em> each show one tail.</li> <li>Let <em data-effect="italics">B</em> = the event of getting all tails. <em data-effect="italics">B</em> can be written as {<em data-effect="italics">TT</em>}. <em data-effect="italics">B</em> is the <strong>complement</strong> of <em data-effect="italics">A</em>, so <em data-effect="italics">B</em> = <em data-effect="italics">A′</em>. Also, <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">A′</em>) = 1.</li> <li>The probabilities for <em data-effect="italics">A</em> and for <em data-effect="italics">B</em> are <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = \(\frac{3}{4}\) and <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = \(\frac{1}{4}\).</li> <li>Let <em data-effect="italics">C</em> = the event of getting all heads. <em data-effect="italics">C</em> = {<em data-effect="italics">HH</em>}. 
Since <em data-effect="italics">B</em> = {<em data-effect="italics">TT</em>}, <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">C</em>) = 0. <em data-effect="italics">B</em> and <em data-effect="italics">C</em> are mutually exclusive. (<em data-effect="italics">B</em> and <em data-effect="italics">C</em> have no members in common because you cannot have all tails and all heads at the same time.)</li> <li>Let <em data-effect="italics">D</em> = event of getting <strong>more than one</strong> tail. <em data-effect="italics">D</em> = {<em data-effect="italics">TT</em>}. <em data-effect="italics">P</em>(<em data-effect="italics">D</em>) = \(\frac{1}{4}\)</li> <li>Let <em data-effect="italics">E</em> = event of getting a head on the first flip. (This implies you can get either a head or tail on the second flip.) <em data-effect="italics">E</em> = {<em data-effect="italics">HT</em>, <em data-effect="italics">HH</em>}. <em data-effect="italics">P</em>(<em data-effect="italics">E</em>) = \(\frac{2}{4}\)</li> <li>Find the probability of getting <strong>at least one</strong> (one or two) tail in two flips. Let <em data-effect="italics">F</em> = event of getting at least one tail in two flips. <em data-effect="italics">F</em> = {<em data-effect="italics">HT</em>, <em data-effect="italics">TH</em>, <em data-effect="italics">TT</em>}. <em data-effect="italics">P</em>(<em data-effect="italics">F</em>) = \(\frac{3}{4}\)</li> </ul> </div> <div id="fs-idp18707552" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp250992" data-type="exercise"><div id="fs-idm1214672" data-type="problem"><p id="fs-idm5147344">Draw two cards from a standard 52-card deck with replacement. 
Find the probability of getting at least one black card.</p> </div> </div> </div> <div id="fs-idm3420000" class="textbox textbox--examples" data-type="example"><div id="fs-idm4639088" data-type="exercise"><div id="fs-idm173264" data-type="problem"><p id="fs-idm1266432">Flip two fair coins. Find the probabilities of the events.</p> <ol id="fs-idm3669424" type="a"><li>Let <em data-effect="italics">F</em> = the event of getting at most one tail (zero or one tail).</li> <li>Let <em data-effect="italics">G</em> = the event of getting two faces that are the same.</li> <li>Let <em data-effect="italics">H</em> = the event of getting a head on the first flip followed by a head or tail on the second flip.</li> <li>Are <em data-effect="italics">F</em> and <em data-effect="italics">G</em> mutually exclusive?</li> <li>Let <em data-effect="italics">J</em> = the event of getting all tails. Are <em data-effect="italics">J</em> and <em data-effect="italics">H</em> mutually exclusive?</li> </ol> </div> <div id="fs-idp66096" data-type="solution"><p id="fs-idm5054528">Look at the sample space in <a class="autogenerated-content" href="#element-931">(Figure)</a>.</p> <ol id="fs-idm4693632" type="a"><li>Zero (0) or one (1) tails occur when the outcomes <em data-effect="italics">HH</em>, <em data-effect="italics">TH</em>, <em data-effect="italics">HT</em> show up. <em data-effect="italics">P</em>(<em data-effect="italics">F</em>) = \(\frac{3}{4}\)</li> <li>Two faces are the same if <em data-effect="italics">HH</em> or <em data-effect="italics">TT</em> show up. <em data-effect="italics">P</em>(<em data-effect="italics">G</em>) = \(\frac{2}{4}\)</li> <li>A head on the first flip followed by a head or tail on the second flip occurs when <em data-effect="italics">HH</em> or <em data-effect="italics">HT</em> show up. 
<em data-effect="italics">P</em>(<em data-effect="italics">H</em>) = \(\frac{2}{4}\)</li> <li><em data-effect="italics">F</em> and <em data-effect="italics">G</em> share <em data-effect="italics">HH</em> so <em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">G</em>) is not equal to zero (0). <em data-effect="italics">F</em> and <em data-effect="italics">G</em> are not mutually exclusive.</li> <li>Getting all tails occurs when tails shows up on both coins (<em data-effect="italics">TT</em>). <em data-effect="italics">H</em>’s outcomes are <em data-effect="italics">HH</em> and <em data-effect="italics">HT</em>.</li> </ol> <p id="fs-idm12598624"><em data-effect="italics">J</em> and <em data-effect="italics">H</em> have nothing in common so <em data-effect="italics">P</em>(<em data-effect="italics">J</em> AND <em data-effect="italics">H</em>) = 0. <em data-effect="italics">J</em> and <em data-effect="italics">H</em> are mutually exclusive.</p> </div> </div> </div> <div id="fs-idm101726688" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm3945968" data-type="exercise"><div id="fs-idm4652384" data-type="problem"><p id="fs-idm1471712">A box has two balls, one white and one red. We select one ball, put it back in the box, and select a second ball (sampling with replacement). 
Find the probability of the following events:</p> <ol id="fs-idm2908208" type="a"><li>Let <em data-effect="italics">F</em> = the event of getting the white ball twice.</li> <li>Let <em data-effect="italics">G</em> = the event of getting two balls of different colors.</li> <li>Let <em data-effect="italics">H</em> = the event of getting white on the first pick.</li> <li>Are <em data-effect="italics">F</em> and <em data-effect="italics">G</em> mutually exclusive?</li> <li>Are <em data-effect="italics">G</em> and <em data-effect="italics">H</em> mutually exclusive?</li> </ol> </div> </div> </div> <div id="element-56" class="textbox textbox--examples" data-type="example"><p id="example-3-2-para">Roll one fair, six-sided die. The sample space is {1, 2, 3, 4, 5, 6}. Let event <em data-effect="italics">A</em> = a face is odd. Then <em data-effect="italics">A</em> = {1, 3, 5}. Let event <em data-effect="italics">B</em> = a face is even. Then <em data-effect="italics">B</em> = {2, 4, 6}.</p> <ul><li>Find the complement of <em data-effect="italics">A</em>, <em data-effect="italics">A′</em>. The complement of <em data-effect="italics">A</em>, <em data-effect="italics">A′</em>, is <em data-effect="italics">B</em> because <em data-effect="italics">A</em> and <em data-effect="italics">B</em> together make up the sample space. <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">A′</em>) = 1. Also, <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = \(\frac{3}{6}\) and <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = \(\frac{3}{6}\).</li> <li>Let event <em data-effect="italics">C</em> = odd faces larger than two. Then <em data-effect="italics">C</em> = {3, 5}. 
Let event <em data-effect="italics">D</em> = all even faces smaller than five. Then <em data-effect="italics">D</em> = {2, 4}. <em data-effect="italics">P</em>(<em data-effect="italics">C</em> AND <em data-effect="italics">D</em>) = 0 because you cannot have an odd and even face at the same time. Therefore, <em data-effect="italics">C</em> and <em data-effect="italics">D</em> are mutually exclusive events.</li> <li>Let event <em data-effect="italics">E</em> = all faces less than five. <em data-effect="italics">E</em> = {1, 2, 3, 4}.</li> </ul> <div id="element-12341" data-type="exercise"><div id="id9852986" data-type="problem"><p id="element-73455">Are <em data-effect="italics">C</em> and <em data-effect="italics">E</em> mutually exclusive events? (Answer yes or no.) Why or why not?</p> </div> <div id="id10118020" data-type="solution" data-label=""><p id="element-032545">No. <em data-effect="italics">C</em> = {3, 5} and <em data-effect="italics">E</em> = {1, 2, 3, 4}. <em data-effect="italics">P</em>(<em data-effect="italics">C</em> AND <em data-effect="italics">E</em>) = \(\frac{1}{6}\). To be mutually exclusive, <em data-effect="italics">P</em>(<em data-effect="italics">C</em> AND <em data-effect="italics">E</em>) must be zero.</p> </div> </div> <ul id="fs-idm67063616"><li>Find <em data-effect="italics">P</em>(<em data-effect="italics">C</em>|<em data-effect="italics">A</em>). This is a conditional probability. Recall that the event <em data-effect="italics">C</em> is {3, 5} and event <em data-effect="italics">A</em> is {1, 3, 5}. To find <em data-effect="italics">P</em>(<em data-effect="italics">C</em>|<em data-effect="italics">A</em>), find the probability of <em data-effect="italics">C</em> using the sample space <em data-effect="italics">A</em>. You have reduced the sample space from the original sample space {1, 2, 3, 4, 5, 6} to {1, 3, 5}. 
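To spell out the count (a brief cross-check, assuming the conditional probability formula \(P(C|A) = \frac{P(C\text{ AND }A)}{P(A)}\) applies here): two of the three outcomes in <em data-effect="italics">A</em>, namely 3 and 5, belong to <em data-effect="italics">C</em>, and \(\frac{P(C\text{ AND }A)}{P(A)} = \frac{2/6}{3/6} = \frac{2}{3}\). 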
So, <em data-effect="italics">P</em>(<em data-effect="italics">C</em>|<em data-effect="italics">A</em>) = \(\frac{2}{3}\).</li> </ul> </div> <div id="fs-idm51413200" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm3558336" data-type="exercise"><div id="fs-idm4126272" data-type="problem"><p id="fs-idm832352">Let event <em data-effect="italics">A</em> = learning Spanish. Let event <em data-effect="italics">B</em> = learning German. Then <em data-effect="italics">A</em> AND <em data-effect="italics">B</em> = learning Spanish and German. Suppose <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = 0.4 and <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = 0.2. <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0.08. Are events <em data-effect="italics">A</em> and <em data-effect="italics">B</em> independent? Hint: You must show ONE of the following:</p> <ul id="fs-idm143307488"><li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">A</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">B</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>)</li> </ul> </div> <div id="fs-idm4021072" data-type="solution" data-label=""><p id="fs-idm882096"><em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = \(\frac{P(A\text{ AND }B)}{P(B)} = \frac{0.08}{0.2} = 0.4 = P(A)\)</p> <p 
id="fs-idm682208">The events are independent because <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>).</p> </div> </div> </div> <div id="element-579" class="textbox textbox--examples" data-type="example"><p id="example-3-3-para">Let event <em data-effect="italics">G</em> = taking a math class. Let event <em data-effect="italics">H</em> = taking a science class. Then, <em data-effect="italics">G</em> AND <em data-effect="italics">H</em> = taking a math class and a science class. Suppose <em data-effect="italics">P</em>(<em data-effect="italics">G</em>) = 0.6, <em data-effect="italics">P</em>(<em data-effect="italics">H</em>) = 0.5, and <em data-effect="italics">P</em>(<em data-effect="italics">G</em> AND <em data-effect="italics">H</em>) = 0.3. Are <em data-effect="italics">G</em> and <em data-effect="italics">H</em> independent?</p> <p id="element-456">To show that <em data-effect="italics">G</em> and <em data-effect="italics">H</em> are independent, you must show <strong>ONE</strong> of the following:</p> <ul id="element-358"><li><em data-effect="italics">P</em>(<em data-effect="italics">G</em>|<em data-effect="italics">H</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">G</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">H</em>|<em data-effect="italics">G</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">H</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">G</em> AND <em data-effect="italics">H</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">G</em>)<em data-effect="italics">P</em>(<em data-effect="italics">H</em>)</li> </ul> <div id="id9660801" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idp52179728"><strong>The choice you make depends on the information you have.</strong> You could choose any of 
the methods here because you have the necessary information.</p> </div> <div id="element-3525" data-type="exercise"><div id="id9660827" data-type="problem"><p id="element-52356">a. Show that <em data-effect="italics">P</em>(<em data-effect="italics">G</em>|<em data-effect="italics">H</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">G</em>).</p> </div> <div id="id9660856" data-type="solution"><p id="fs-idm4674864"><em data-effect="italics">P</em>(<em data-effect="italics">G</em>|<em data-effect="italics">H</em>) = \(\frac{P(G\text{ AND }H)}{P(H)}\) = \(\frac{0.3}{0.5}\) = 0.6 = <em data-effect="italics">P</em>(<em data-effect="italics">G</em>)</p> </div> </div> <div id="element-32525" data-type="exercise"><div id="id9660930" data-type="problem"><p id="element-52456">b. Show <em data-effect="italics">P</em>(<em data-effect="italics">G</em> AND <em data-effect="italics">H</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">G</em>)<em data-effect="italics">P</em>(<em data-effect="italics">H</em>).</p> </div> <div id="id9660965" data-type="solution"><p id="element-3462"><em data-effect="italics">P</em>(<em data-effect="italics">G</em>)<em data-effect="italics">P</em>(<em data-effect="italics">H</em>) = (0.6)(0.5) = 0.3 = <em data-effect="italics">P</em>(<em data-effect="italics">G</em> AND <em data-effect="italics">H</em>)</p> </div> </div> <p>Since <em data-effect="italics">G</em> and <em data-effect="italics">H</em> are independent, knowing that a person is taking a science class does not change the chance that he or she is taking a math class. If the two events had not been independent (that is, they are dependent) then knowing that a person is taking a science class would change the chance he or she is taking math. 
For practice, show that <em data-effect="italics">P</em>(<em data-effect="italics">H</em>|<em data-effect="italics">G</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">H</em>) to show that <em data-effect="italics">G</em> and <em data-effect="italics">H</em> are independent events.</p> </div> <div id="fs-idm13687712" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm4055200" data-type="exercise"><div id="fs-idm3236672" data-type="problem"><p id="fs-idm3236416">In a bag, there are six red marbles and four green marbles. The red marbles are marked with the numbers 1, 2, 3, 4, 5, and 6. The green marbles are marked with the numbers 1, 2, 3, and 4.</p> <ul id="fs-idm5437072"><li><em data-effect="italics">R</em> = a red marble</li> <li><em data-effect="italics">G</em> = a green marble</li> <li><em data-effect="italics">O</em> = an odd-numbered marble</li> <li>The sample space is <em data-effect="italics">S</em> = {<em data-effect="italics">R</em>1, <em data-effect="italics">R</em>2, <em data-effect="italics">R</em>3, <em data-effect="italics">R</em>4, <em data-effect="italics">R</em>5, <em data-effect="italics">R</em>6, <em data-effect="italics">G</em>1, <em data-effect="italics">G</em>2, <em data-effect="italics">G</em>3, <em data-effect="italics">G</em>4}.</li> </ul> <p id="eip-idp4935776"><em data-effect="italics">S</em> has ten outcomes. What is <em data-effect="italics">P</em>(<em data-effect="italics">G</em> AND <em data-effect="italics">O</em>)?</p> </div> </div> </div> <div id="fs-idm793920" class="textbox textbox--examples" data-type="example"><div id="fs-idm3514480" data-type="exercise"><div id="fs-idm1757104" data-type="problem"><p id="fs-idm1756848">Let event <em data-effect="italics">C</em> = taking an English class. 
Let event <em data-effect="italics">D</em> = taking a speech class.</p> <p id="fs-idm293392">Suppose <em data-effect="italics">P</em>(<em data-effect="italics">C</em>) = 0.75, <em data-effect="italics">P</em>(<em data-effect="italics">D</em>) = 0.3, <em data-effect="italics">P</em>(<em data-effect="italics">C</em>|<em data-effect="italics">D</em>) = 0.75 and <em data-effect="italics">P</em>(<em data-effect="italics">C</em> AND <em data-effect="italics">D</em>) = 0.225.</p> <p id="fs-idm4675792">Justify your answers to the following questions numerically.</p> <ol id="fs-idm5240736" type="a"><li>Are <em data-effect="italics">C</em> and <em data-effect="italics">D</em> independent?</li> <li>Are <em data-effect="italics">C</em> and <em data-effect="italics">D</em> mutually exclusive?</li> <li>What is <em data-effect="italics">P</em>(<em data-effect="italics">D</em>|<em data-effect="italics">C</em>)?</li> </ol> </div> <div id="fs-idm4625248" data-type="solution" data-label=""><ol id="fs-idm316704" type="a"><li>Yes, because <em data-effect="italics">P</em>(<em data-effect="italics">C</em>|<em data-effect="italics">D</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">C</em>).</li> <li>No, because <em data-effect="italics">P</em>(<em data-effect="italics">C</em> AND <em data-effect="italics">D</em>) is not equal to zero.</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">D</em>|<em data-effect="italics">C</em>) = \(\frac{P(C\text{ AND }D)}{P(C)}\) = \(\frac{0.225}{0.75}\) = 0.3</li> </ol> </div> </div> </div> <div id="fs-idp11259824" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm3333552" data-type="exercise"><div id="fs-idm4100544" data-type="problem"><p id="fs-idm4100288">A student goes to the library. 
Let events <em data-effect="italics">B</em> = the student checks out a book and <em data-effect="italics">D</em> = the student checks out a DVD. Suppose that <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = 0.40, <em data-effect="italics">P</em>(<em data-effect="italics">D</em>) = 0.30 and <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">D</em>) = 0.20.</p> <ol id="fs-idm4647408" type="a"><li>Find <em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">D</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">D</em>|<em data-effect="italics">B</em>).</li> <li>Are <em data-effect="italics">B</em> and <em data-effect="italics">D</em> independent?</li> <li>Are <em data-effect="italics">B</em> and <em data-effect="italics">D</em> mutually exclusive?</li> </ol> </div> </div> </div> <div id="element-184" class="textbox textbox--examples" data-type="example"><p id="example-3-4-para">In a box there are three red cards and five blue cards. The red cards are marked with the numbers 1, 2, and 3, and the blue cards are marked with the numbers 1, 2, 3, 4, and 5. The cards are well-shuffled. You reach into the box (you cannot see into it) and draw one card.</p> <p id="fs-idm1909360">Let <em data-effect="italics">R</em> = red card is drawn, <em data-effect="italics">B</em> = blue card is drawn, <em data-effect="italics">E</em> = even-numbered card is drawn.</p> <p>The sample space <em data-effect="italics">S</em> = {<em data-effect="italics">R</em>1, <em data-effect="italics">R</em>2, <em data-effect="italics">R</em>3, <em data-effect="italics">B</em>1, <em data-effect="italics">B</em>2, <em data-effect="italics">B</em>3, <em data-effect="italics">B</em>4, <em data-effect="italics">B</em>5}. <em data-effect="italics">S</em> has eight outcomes.</p> <ul><li><em data-effect="italics">P</em>(<em data-effect="italics">R</em>) = \(\frac{3}{8}\). 
<em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = \(\frac{5}{8}\). <em data-effect="italics">P</em>(<em data-effect="italics">R</em> AND <em data-effect="italics">B</em>) = 0. (You cannot draw one card that is both red and blue.)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">E</em>) = \(\frac{3}{8}\). (There are three even-numbered cards, <em data-effect="italics">R</em>2, <em data-effect="italics">B</em>2, and <em data-effect="italics">B</em>4.)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">E</em>|<em data-effect="italics">B</em>) = \(\frac{2}{5}\). (There are five blue cards: <em data-effect="italics">B</em>1, <em data-effect="italics">B</em>2, <em data-effect="italics">B</em>3, <em data-effect="italics">B</em>4, and <em data-effect="italics">B</em>5. Out of the blue cards, there are two even cards: <em data-effect="italics">B</em>2 and <em data-effect="italics">B</em>4.)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">E</em>) = \(\frac{2}{3}\). (There are three even-numbered cards: <em data-effect="italics">R</em>2, <em data-effect="italics">B</em>2, and <em data-effect="italics">B</em>4. Out of the even-numbered cards, two are blue: <em data-effect="italics">B</em>2 and <em data-effect="italics">B</em>4.)</li> <li>The events <em data-effect="italics">R</em> and <em data-effect="italics">B</em> are mutually exclusive because <em data-effect="italics">P</em>(<em data-effect="italics">R</em> AND <em data-effect="italics">B</em>) = 0.</li> <li>Let <em data-effect="italics">G</em> = card with a number greater than 3. <em data-effect="italics">G</em> = {<em data-effect="italics">B</em>4, <em data-effect="italics">B</em>5}. <em data-effect="italics">P</em>(<em data-effect="italics">G</em>) = \(\frac{2}{8}\). Let <em data-effect="italics">H</em> = blue card numbered between one and four, inclusive. 
<em data-effect="italics">H</em> = {<em data-effect="italics">B</em>1, <em data-effect="italics">B</em>2, <em data-effect="italics">B</em>3, <em data-effect="italics">B</em>4}. <em data-effect="italics">P</em>(<em data-effect="italics">G</em>|<em data-effect="italics">H</em>) = \(\frac{1}{4}\). (The only card in <em data-effect="italics">H</em> that has a number greater than three is <em data-effect="italics">B</em>4.) Since \(\frac{2}{8}\) = \(\frac{1}{4}\), <em data-effect="italics">P</em>(<em data-effect="italics">G</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">G</em>|<em data-effect="italics">H</em>), which means that <em data-effect="italics">G</em> and <em data-effect="italics">H</em> are independent.</li> </ul> </div> <div id="fs-idm85337200" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm4149344" data-type="exercise"><div id="fs-idm5137744" data-type="problem"><p id="fs-idm3390208">In a basketball arena,</p> <ul id="fs-idm626832"><li>70% of the fans are rooting for the home team.</li> <li>25% of the fans are wearing blue.</li> <li>20% of the fans are wearing blue and are rooting for the away team.</li> <li>Of the fans rooting for the away team, 67% are wearing blue.</li> </ul> <p id="fs-idm2657904">Let <em data-effect="italics">A</em> be the event that a fan is rooting for the away team. <span data-type="newline"><br /> </span>Let <em data-effect="italics">B</em> be the event that a fan is wearing blue. <span data-type="newline"><br /> </span>Are the events of rooting for the away team and wearing blue independent? Are they mutually exclusive?</p> </div> </div> </div> <div id="eip-31" class="textbox textbox--examples" data-type="example"><p>In a particular college class, 60% of the students are female. Fifty percent of all students in the class have long hair. Forty-five percent of the students are female and have long hair. 
Of the female students, 75% have long hair. Let <em data-effect="italics">F</em> be the event that a student is female. Let <em data-effect="italics">L</em> be the event that a student has long hair. One student is picked randomly. Are the events of being female and having long hair independent?</p> <ul id="eip-637"><li>The following probabilities are given in this example:</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em>) = 0.60; <em data-effect="italics">P</em>(<em data-effect="italics">L</em>) = 0.50</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">L</em>) = 0.45</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">L</em>|<em data-effect="italics">F</em>) = 0.75</li> </ul> <div id="eip-843" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idm110347232"><strong>The choice you make depends on the information you have.</strong> You could use the first or last condition on the list for this example. You do not know <em data-effect="italics">P</em>(<em data-effect="italics">F</em>|<em data-effect="italics">L</em>) yet, so you cannot use the second condition.</p> </div> <p><span data-type="title">Solution 1</span>Check whether <em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">L</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">F</em>)<em data-effect="italics">P</em>(<em data-effect="italics">L</em>). We are given that <em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">L</em>) = 0.45, but <em data-effect="italics">P</em>(<em data-effect="italics">F</em>)<em data-effect="italics">P</em>(<em data-effect="italics">L</em>) = (0.60)(0.50) = 0.30. 
The events of being female and having long hair are not independent because <em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">L</em>) does not equal <em data-effect="italics">P</em>(<em data-effect="italics">F</em>)<em data-effect="italics">P</em>(<em data-effect="italics">L</em>).</p> <p><span data-type="title">Solution 2</span>Check whether <em data-effect="italics">P</em>(<em data-effect="italics">L</em>|<em data-effect="italics">F</em>) equals <em data-effect="italics">P</em>(<em data-effect="italics">L</em>). We are given that <em data-effect="italics">P</em>(<em data-effect="italics">L</em>|<em data-effect="italics">F</em>) = 0.75, but <em data-effect="italics">P</em>(<em data-effect="italics">L</em>) = 0.50; they are not equal. The events of being female and having long hair are not independent.</p> <p id="eip-103"><span data-type="title">Interpretation of Results</span>The events of being female and having long hair are not independent; knowing that a student is female changes the probability that a student has long hair.</p> </div> <div id="fs-idp15300304" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm4091904" data-type="exercise"><div id="fs-idm4091648" data-type="problem"><p id="fs-idm2370976">Mark is deciding which route to take to work. 
His choices are <em data-effect="italics">I</em> = the Interstate and <em data-effect="italics">F</em> = Fifth Street.</p> <ul id="fs-idm5051536"><li><em data-effect="italics">P</em>(<em data-effect="italics">I</em>) = 0.44 and <em data-effect="italics">P</em>(<em data-effect="italics">F</em>) = 0.56</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">I</em> AND <em data-effect="italics">F</em>) = 0 because Mark will take only one route to work.</li> </ul> <p id="fs-idm1878352">What is <em data-effect="italics">P</em>(<em data-effect="italics">I</em> OR <em data-effect="italics">F</em>)?</p> </div> </div> </div> <div id="fs-idm4667408" class="textbox textbox--examples" data-type="example"><div id="fs-idm4667152" data-type="exercise"><div id="fs-idm336096" data-type="problem"><ol id="fs-idm335840" type="a"><li>Toss one fair coin (the coin has two sides, <em data-effect="italics">H</em> and <em data-effect="italics">T</em>). The outcomes are ________. Count the outcomes. There are ____ outcomes.</li> <li>Toss one fair, six-sided die (the die has 1, 2, 3, 4, 5 or 6 dots on a side). The outcomes are ________________. Count the outcomes. There are ___ outcomes.</li> <li>Multiply the two numbers of outcomes. The answer is _______.</li> <li>If you flip one fair coin and follow it with the toss of one fair, six-sided die, the answer in part c. is the number of outcomes (size of the sample space). What are the outcomes? (Hint: Two of the outcomes are <em data-effect="italics">H</em>1 and <em data-effect="italics">T</em>6.)</li> <li>Event <em data-effect="italics">A</em> = heads (<em data-effect="italics">H</em>) on the coin followed by an even number (2, 4, 6) on the die. <span data-type="newline"><br /> </span><em data-effect="italics">A</em> = {_________________}. Find <em data-effect="italics">P</em>(<em data-effect="italics">A</em>).</li> <li>Event <em data-effect="italics">B</em> = heads on the coin followed by a three on the die. 
<em data-effect="italics">B</em> = {________}. Find <em data-effect="italics">P</em>(<em data-effect="italics">B</em>).</li> <li>Are <em data-effect="italics">A</em> and <em data-effect="italics">B</em> mutually exclusive? (Hint: What is <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>)? If <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0, then <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are mutually exclusive.)</li> <li>Are <em data-effect="italics">A</em> and <em data-effect="italics">B</em> independent? (Hint: Is <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>)? If <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>), then <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are independent. 
If not, then they are dependent.)</li> </ol> </div> <div id="fs-idm3508544" data-type="solution" data-label=""><ol id="fs-idm15568720" type="a"><li><em data-effect="italics">H</em> and <em data-effect="italics">T</em>; 2</li> <li>1, 2, 3, 4, 5, 6; 6</li> <li>2(6) = 12</li> <li><em data-effect="italics">T</em>1, <em data-effect="italics">T</em>2, <em data-effect="italics">T</em>3, <em data-effect="italics">T</em>4, <em data-effect="italics">T</em>5, <em data-effect="italics">T</em>6, <em data-effect="italics">H</em>1, <em data-effect="italics">H</em>2, <em data-effect="italics">H</em>3, <em data-effect="italics">H</em>4, <em data-effect="italics">H</em>5, <em data-effect="italics">H</em>6</li> <li><em data-effect="italics">A</em> = {<em data-effect="italics">H</em>2, <em data-effect="italics">H</em>4, <em data-effect="italics">H</em>6}; <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = \(\frac{\text{3}}{12}\)</li> <li><em data-effect="italics">B</em> = {<em data-effect="italics">H</em>3}; <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = \(\frac{\text{1}}{12}\)</li> <li>Yes, because <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0. <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = \(\left(\frac{\text{3}}{12}\right)\)\(\left(\frac{\text{1}}{12}\right)\). 
<em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) does not equal <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>), so <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are dependent.</li> </ol> </div> </div> </div> <div id="fs-idp34076080" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm5094416" data-type="exercise"><div id="fs-idm340208" data-type="problem"><p id="fs-idm339952">A box has two balls, one white and one red. We select one ball, put it back in the box, and select a second ball (sampling with replacement). Let <em data-effect="italics">T</em> be the event of getting the white ball twice, <em data-effect="italics">F</em> the event of picking the white ball first, and <em data-effect="italics">S</em> the event of picking the white ball in the second drawing.</p> <ol id="fs-idm5499808" type="a"><li>Compute <em data-effect="italics">P</em>(<em data-effect="italics">T</em>).</li> <li>Compute <em data-effect="italics">P</em>(<em data-effect="italics">T</em>|<em data-effect="italics">F</em>).</li> <li>Are <em data-effect="italics">T</em> and <em data-effect="italics">F</em> independent?</li> <li>Are <em data-effect="italics">F</em> and <em data-effect="italics">S</em> mutually exclusive?</li> <li>Are <em data-effect="italics">F</em> and <em data-effect="italics">S</em> independent?</li> </ol> </div> </div> </div> </div> <div id="fs-idm73253936" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idm119965792">Lopez, Shane, Preety Sidhu. “U.S. Teachers Love Their Lives, but Struggle in the Workplace.” Gallup Wellbeing, 2013. http://www.gallup.com/poll/161516/teachers-love-lives-struggle-workplace.aspx (accessed May 2, 2013).</p> <p id="fs-idp18246464">Data from Gallup. 
Available online at www.gallup.com/ (accessed May 2, 2013).</p> </div> <div id="fs-idm1912144" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm750784">Two events <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are independent if the knowledge that one occurred does not affect the chance the other occurs. If two events are not independent, then we say that they are dependent.</p> <p id="fs-idm3534560">In sampling with replacement, each member of a population is replaced after it is picked, so that member has the possibility of being chosen more than once, and the events are considered to be independent. In sampling without replacement, each member of a population may be chosen only once, and the events are considered not to be independent. When events do not share outcomes, they are mutually exclusive of each other.</p> </div> <div id="fs-idm4116704" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idm3335456">If <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are independent, <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>), <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) and <em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">A</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">B</em>).</p> <p id="fs-idm2879760">If <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are mutually exclusive, <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em 
data-effect="italics">P</em>(<em data-effect="italics">B</em>) and <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0.</p> </div> <div id="fs-idp62639840" class="practice" data-depth="1"><div id="element-863" data-type="exercise"><div id="id43759125" data-type="problem"><p><em data-effect="italics">E</em> and <em data-effect="italics">F</em> are mutually exclusive events. <em data-effect="italics">P</em>(<em data-effect="italics">E</em>) = 0.4; <em data-effect="italics">P</em>(<em data-effect="italics">F</em>) = 0.5. Find <em data-effect="italics">P</em>(<em data-effect="italics">E</em>∣<em data-effect="italics">F</em>).</p> </div> </div> <div data-type="exercise"><div id="id43759323" data-type="problem"><p><em data-effect="italics">J</em> and <em data-effect="italics">K</em> are independent events. <em data-effect="italics">P</em>(<em data-effect="italics">J</em>|<em data-effect="italics">K</em>) = 0.3. Find <em data-effect="italics">P</em>(<em data-effect="italics">J</em>).</p> </div> <div id="fs-idm2098384" data-type="solution" data-label=""><p id="fs-idp3820560"><em data-effect="italics">P</em>(<em data-effect="italics">J</em>) = 0.3</p> </div> </div> <div data-type="exercise"><div id="id43759426" data-type="problem"><p><em data-effect="italics">U</em> and <em data-effect="italics">V</em> are mutually exclusive events. <em data-effect="italics">P</em>(<em data-effect="italics">U</em>) = 0.26; <em data-effect="italics">P</em>(<em data-effect="italics">V</em>) = 0.37. 
Find:</p> <ol id="element-896" type="a"><li><em data-effect="italics">P</em>(<em data-effect="italics">U</em> AND <em data-effect="italics">V</em>) =</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">U</em>|<em data-effect="italics">V</em>) =</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">U</em> OR <em data-effect="italics">V</em>) =</li> </ol> </div> </div> <div id="element-930" data-type="exercise"><div id="id43759701" data-type="problem"><p id="element-445"><em data-effect="italics">Q</em> and <em data-effect="italics">R</em> are independent events. <em data-effect="italics">P</em>(<em data-effect="italics">Q</em>) = 0.4 and <em data-effect="italics">P</em>(<em data-effect="italics">Q</em> AND <em data-effect="italics">R</em>) = 0.1. Find <em data-effect="italics">P</em>(<em data-effect="italics">R</em>).</p> </div> <div id="fs-idp48870976" data-type="solution" data-label=""><p id="fs-idp48871360"><em data-effect="italics">P</em>(<em data-effect="italics">Q</em> AND <em data-effect="italics">R</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">Q</em>)<em data-effect="italics">P</em>(<em data-effect="italics">R</em>)</p> <p id="fs-idp19254592">0.1 = (0.4)<em data-effect="italics">P</em>(<em data-effect="italics">R</em>)</p> <p id="fs-idp19254976"><em data-effect="italics">P</em>(<em data-effect="italics">R</em>) = 0.25</p> </div> </div> </div> <div id="fs-idm88736" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <p id="eip-idp126196880"><em data-effect="italics">Use the following information to answer the next 12 exercises.</em> The graph shown is based on more than 170,000 interviews done by Gallup that took place from January through December 2012. The sample consists of employed Americans 18 years of age or older. The Emotional Health Index Scores are the sample space. 
We randomly sample one Emotional Health Index Score.</p> <div id="eip-id1171405227589" class="bc-figure figure"><span id="eip-id1171388484061" data-type="media" data-alt="emotional health index score"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C03_RWP_001-1.jpg" alt="emotional health index score" width="500" data-media-type="image/jpeg" /></span></div> <div data-type="exercise"><div id="eip-85" data-type="problem"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="eip-102" data-type="problem"><p>1) Find the probability that an Emotional Health Index Score is 81.0.</p> </div> <div id="eip-841" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="eip-538" data-type="exercise"><div id="eip-976" data-type="problem"><p>2) Find the probability that an Emotional Health Index Score is more than 81.</p> <p>&nbsp;</p> </div> </div> <div id="eip-42" data-type="exercise"><div id="eip-714" data-type="problem"><p id="eip-674">3) Find the probability that an Emotional Health Index Score is between 80.5 and 82.</p> </div> <div id="eip-118" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="eip-44" data-type="exercise"><div data-type="problem"><p id="eip-515">4) If we know an Emotional Health Index Score is 81.5 or more, what is the probability that it is 82.7?</p> <p>&nbsp;</p> </div> </div> <div id="eip-347" data-type="exercise"><div data-type="problem"><p>5) What is the probability that an Emotional Health Index Score is 80.7 or 82.7?</p> </div> <div id="eip-923" data-type="solution" data-label=""><p>&nbsp;</p> </div> </div> <div id="eip-197" data-type="exercise"><div id="eip-299" data-type="problem"><p id="eip-147">6) What is the probability that an Emotional Health Index Score is less than 80.2, given that it is already less than 81?</p> <p>&nbsp;</p> </div> </div> <div id="eip-545" data-type="exercise"><div id="eip-62" data-type="problem"><p>7) What occupation has the highest emotional index 
score?</p> </div> <div id="eip-427" data-type="solution" data-label=""><p>&nbsp;</p> </div> </div> <div id="eip-904" data-type="exercise"><div id="eip-638" data-type="problem"><p>8) What occupation has the lowest emotional index score?</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-533">9) What is the range of the data?</p> <p>&nbsp;</p> </div> </div> <div id="eip-221" data-type="exercise"><div data-type="problem"><p>10) Compute the average EHIS.</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="eip-520" data-type="problem"><p id="eip-314">11) If all occupations are equally likely for a certain individual, what is the probability that he or she will have an occupation with lower than average EHIS?</p> </div> <div id="eip-983" data-type="solution" data-label=""><p>12) Find the probability that an Emotional Health Index Score is 82.7.</p> <p>&nbsp;</p> </div> </div> </div> <div id="fs-idm4259888" class="bring-together-homework" data-depth="1"><h3 data-type="title">Bringing It Together</h3> <div id="element-738" data-type="exercise"><div id="id43569944" data-type="problem"><p id="element-699">A previous year, the weights of the members of the <strong>San Francisco 49ers</strong> and the <strong>Dallas Cowboys</strong> were published in the <cite><span data-type="cite-title">San Jose Mercury News</span></cite>. The factual data are compiled into <a class="autogenerated-content" href="#ch03_M03-tbl001">(Figure)</a>.</p> <table id="ch03_M03-tbl001" summary="This table presents weight in pounds by shirt number. The first column lists the shirt number, the second column lists weight ≤ 210, the third column lists 211-250, fourth column lists 251-290, and the fifth column lists 291 ≤. 
The first row lists shirt numbers 1-33, second row lists 34-66, and the third row lists 66-99."><thead><tr><th>Shirt#</th> <th>≤ 210</th> <th>211–250</th> <th>251–290</th> <th>290≤</th> </tr> </thead> <tbody><tr><td>1–33</td> <td>21</td> <td>5</td> <td>0</td> <td>0</td> </tr> <tr><td>34–66</td> <td>6</td> <td>18</td> <td>7</td> <td>4</td> </tr> <tr><td>66–99</td> <td>6</td> <td>12</td> <td>22</td> <td>5</td> </tr> </tbody> </table> <p>For the following, suppose that you randomly select one player from the 49ers or Cowboys.</p> <p id="eip-idm15421280">13) <span style="font-size: 1em">The probability that a male develops some form of cancer in his lifetime is 0.4567. The probability that a male has at least one false positive test result (meaning the test comes back positive for cancer when the man does not have it) is 0.51. Some of the following questions do not have enough information for you to answer them. Write “not enough information” for those answers. Let </span><em style="font-size: 1em" data-effect="italics">C</em> <span style="font-size: 1em">= a man develops cancer in his lifetime and </span><em style="font-size: 1em" data-effect="italics">P</em> <span style="font-size: 1em">= a man has at least one false positive.</span></p> </div> </div> <div id="element-750" data-type="exercise"><div id="id43570745" data-type="problem"><ol id="element-196" type="a"><li><em data-effect="italics">P</em>(<em data-effect="italics">C</em>) = ______</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">P</em>|<em data-effect="italics">C</em>) = ______</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">P</em>|<em data-effect="italics">C&#8217;</em>) = ______</li> <li>If a test comes up positive, based upon numerical values, can you assume that man has cancer? 
Justify numerically and explain why or why not.</li> </ol> <p>&nbsp;</p> </div> <div id="fs-idm6556256" data-type="solution" data-label=""></div> </div> <div data-type="exercise"><div id="eip-582" data-type="problem"><p>14) If having a shirt number from one to 33 and weighing at most 210 pounds were independent events, then what should be true about <em data-effect="italics">P</em>(Shirt# 1–33|≤ 210 pounds)?</p> <p>&nbsp;</p> </div> </div> <div id="eip-278" data-type="exercise"><div id="eip-id1164335822550" data-type="problem"><p>15) Given events <em data-effect="italics">J</em> and <em data-effect="italics">K</em>: <em data-effect="italics">P</em>(<em data-effect="italics">J</em>) = 0.18; <em data-effect="italics">P</em>(<em data-effect="italics">K</em>) = 0.37; <em data-effect="italics">P</em>(<em data-effect="italics">J</em> OR <em data-effect="italics">K</em>) = 0.45</p> <ol id="eip-id1164314961072" type="a"><li>Find <em data-effect="italics">P</em>(<em data-effect="italics">J</em> AND <em data-effect="italics">K</em>).</li> <li>Find the probability of the complement of event (<em data-effect="italics">J</em> AND <em data-effect="italics">K</em>).</li> <li>Find the probability of the complement of event (<em data-effect="italics">J</em> OR <em data-effect="italics">K</em>).</li> </ol> <p>&nbsp;</p> </div> <div id="eip-id1164325147049" data-type="solution" data-label=""><ol id="eip-id1164335526487" type="a"></ol> <p>16) Given events <em data-effect="italics">G</em> and <em data-effect="italics">H</em>: <em data-effect="italics">P</em>(<em data-effect="italics">G</em>) = 0.43; <em data-effect="italics">P</em>(<em data-effect="italics">H</em>) = 0.26; <em data-effect="italics">P</em>(<em data-effect="italics">H</em> AND <em data-effect="italics">G</em>) = 0.14</p> <ol id="eip-id8377009" type="a"><li>Find <em data-effect="italics">P</em>(<em data-effect="italics">H</em> OR <em data-effect="italics">G</em>).</li> <li>Find the probability of the complement of event 
(<em data-effect="italics">H</em> AND <em data-effect="italics">G</em>).</li> <li>Find the probability of the complement of event (<em data-effect="italics">H</em> OR <em data-effect="italics">G</em>).</li> </ol> <p><strong>Answers to odd questions</strong></p> <p>1) 0</p> <p>3) 0.3571</p> <p>5) 0.2142</p> <p>7) Physician (83.7)</p> <p>9) 83.7 − 79.6 = 4.1</p> <p>11) <em data-effect="italics">P</em>(Occupation &lt; 81.3) = 0.5</p> <p>13)</p> <ol id="fs-idm62553312" type="a"><li><em data-effect="italics">P</em>(<em data-effect="italics">C</em>) = 0.4567</li> <li>not enough information</li> <li>not enough information</li> <li>No, because over half (0.51) of men have at least one false positive test.</li> </ol> <p>15)</p> <ol type="a"><li><em data-effect="italics">P</em>(<em data-effect="italics">J</em> OR <em data-effect="italics">K</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">J</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">K</em>) − <em data-effect="italics">P</em>(<em data-effect="italics">J</em> AND <em data-effect="italics">K</em>); 0.45 = 0.18 + 0.37 &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">J</em> AND <em data-effect="italics">K</em>); solve to find <em data-effect="italics">P</em>(<em data-effect="italics">J</em> AND <em data-effect="italics">K</em>) = 0.10</li> <li><em data-effect="italics">P</em>(NOT (<em data-effect="italics">J</em> AND <em data-effect="italics">K</em>)) = 1 &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">J</em> AND <em data-effect="italics">K</em>) = 1 &#8211; 0.10 = 0.90</li> <li><em data-effect="italics">P</em>(NOT (<em data-effect="italics">J</em> OR <em data-effect="italics">K</em>)) = 1 &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">J</em> OR <em data-effect="italics">K</em>) = 1 &#8211; 0.45 = 0.55</li> </ol> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl 
id="fs-idm19024"><dt>Dependent Events</dt> <dd id="fs-idm18384">If two events are NOT independent, then we say that they are dependent.</dd> </dl> <dl id="fs-idm5236464"><dt>Sampling with Replacement</dt> <dd id="fs-idm5235824">If each member of a population is replaced after it is picked, then that member has the possibility of being chosen more than once.</dd> </dl> <dl id="fs-idm5235168"><dt>Sampling without Replacement</dt> <dd id="fs-idm5234528">When sampling is done without replacement, each member of a population may be chosen only once.</dd> </dl> <dl id="fs-idm5117088"><dt>The Conditional Probability of One Event Given Another Event</dt> <dd id="fs-idm5116448"><em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) is the probability that event <em data-effect="italics">A</em> will occur given that the event <em data-effect="italics">B</em> has already occurred.</dd> </dl> <dl id="fs-idm5233904"><dt>The OR of Two Events</dt> <dd id="fs-idm5117712">An outcome is in the event <em data-effect="italics">A</em> OR <em data-effect="italics">B</em> if the outcome is in <em data-effect="italics">A</em>, is in <em data-effect="italics">B</em>, or is in both <em data-effect="italics">A</em> and <em data-effect="italics">B</em>.</dd> </dl> </div> </div></div>
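<p>The two checks used throughout this chapter, <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>) for independence and <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0 for mutually exclusive events, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names <code>is_independent</code> and <code>is_mutually_exclusive</code> and the tolerance <code>tol</code> are assumptions introduced here, not part of the text. The inputs are taken from the worked examples above; the value 1/8 for <em data-effect="italics">P</em>(<em data-effect="italics">G</em> AND <em data-effect="italics">H</em>) follows from <em data-effect="italics">B</em>4 being the only card in both events.</p>

```python
import math


def is_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """Check whether P(A AND B) equals P(A)P(B), up to a small tolerance."""
    return math.isclose(p_a_and_b, p_a * p_b, abs_tol=tol)


def is_mutually_exclusive(p_a_and_b, tol=1e-9):
    """Check whether P(A AND B) equals zero, up to a small tolerance."""
    return math.isclose(p_a_and_b, 0.0, abs_tol=tol)


# Class example: P(F) = 0.60, P(L) = 0.50, P(F AND L) = 0.45
female_long_hair = is_independent(0.60, 0.50, 0.45)

# Card example: P(G) = 2/8, P(H) = 4/8, P(G AND H) = 1/8 (only B4 is in both)
cards_g_h = is_independent(2 / 8, 4 / 8, 1 / 8)

# Coin-and-die example: P(A) = 3/12, P(B) = 1/12, P(A AND B) = 0
coin_die_independent = is_independent(3 / 12, 1 / 12, 0.0)
coin_die_exclusive = is_mutually_exclusive(0.0)

print(female_long_hair, cards_g_h, coin_die_independent, coin_die_exclusive)
```

<p>Under these assumptions the sketch agrees with the worked solutions: being female and having long hair are not independent, <em data-effect="italics">G</em> and <em data-effect="italics">H</em> are independent, and the coin-and-die events <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are mutually exclusive but dependent. The tolerance is there because decimal probabilities are stored as floating-point numbers, so an exact equality test can fail for values that are equal on paper.</p>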
<div class="chapter standard" id="chapter-two-basic-rules-of-probability" title="Chapter 4.4: Two Basic Rules of Probability"><div class="chapter-title-wrap"><h3 class="chapter-number">27</h3><h2 class="chapter-title"><span class="display-none">Chapter 4.4: Two Basic Rules of Probability</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="fs-idp70908880">When calculating probability, there are two rules to consider when determining if two events are independent or dependent and if they are mutually exclusive or not.</p> <div class="bc-section section" data-depth="1"><h3 data-type="title">The Multiplication Rule</h3> <p>If <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are two events defined on a <span data-type="term">sample space</span>, then: <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">B</em>)<em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>).</p> <p id="element-423">This rule may also be written as: <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = \(\frac{P\left(A\text{ AND }B\right)}{P\left(B\right)}\)</p> <p id="element-800">(The probability of <em data-effect="italics">A</em> given <em data-effect="italics">B</em> equals the probability of <em data-effect="italics">A</em> and <em data-effect="italics">B</em> divided by the probability of <em data-effect="italics">B</em>.)</p> <p id="element-607">If <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are <span data-type="term">independent</span>, then <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>). 
Then <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>) becomes <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>).</p> </div> <div class="bc-section section" data-depth="1"><h3 data-type="title">The Addition Rule</h3> <p id="element-306">If <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are defined on a sample space, then: <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>).</p> <p>If <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are <span data-type="term">mutually exclusive</span>, then <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0. 
Then <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) becomes <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">B</em>).</p> <div id="element-898" class="textbox textbox--examples" data-type="example"><p id="fs-idm51279344">Klaus is trying to choose where to go on vacation. His two choices are: <em data-effect="italics">A</em> = New Zealand and <em data-effect="italics">B</em> = Alaska</p> <ul><li>Klaus can only afford one vacation. The probability that he chooses <em data-effect="italics">A</em> is <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = 0.6 and the probability that he chooses <em data-effect="italics">B</em> is <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = 0.35.</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0 because Klaus can only afford to take one vacation</li> <li>Therefore, the probability that he chooses either New Zealand or Alaska is <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = 0.6 + 0.35 = 0.95. Note that the probability that he does not choose to go anywhere on vacation must be 0.05.</li> </ul> </div> <div id="element-83" class="textbox textbox--examples" data-type="example"><p>Carlos plays college soccer. He makes a goal 65% of the time he shoots. 
Carlos is going to attempt two goals in a row in the next game. <em data-effect="italics">A</em> = the event Carlos is successful on his first attempt. <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = 0.65. <em data-effect="italics">B</em> = the event Carlos is successful on his second attempt. <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = 0.65. Carlos tends to shoot in streaks. The probability that he makes the second goal <strong>GIVEN</strong> that he made the first goal is 0.90.</p> <p>&nbsp;</p> <div data-type="exercise"><div id="id20946894" data-type="problem"><p id="element-639p">a. What is the probability that he makes both goals?</p> </div> <div id="id20946912" data-type="solution"><p id="element-639s">a. The problem is asking you to find <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">A</em>). Since <em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">A</em>) = 0.90: <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">A</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">A</em>) <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = (0.90)(0.65) = 0.585</p> <p id="fs-idm26792640">Carlos makes the first and second goals with probability 0.585.</p> <p>&nbsp;</p> </div> </div> <div id="element-101" data-type="exercise"><div id="id20947019" data-type="problem"><p id="element-101p">b. What is the probability that Carlos makes either the first goal or the second goal?</p> </div> <div id="id20947037" data-type="solution"><p id="element-101s">b. 
The problem is asking you to find <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>).</p> <p id="fs-idm17573264"><em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0.65 + 0.65 &#8211; 0.585 = 0.715</p> <p id="element-101s2">Carlos makes either the first goal or the second goal with probability 0.715.</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id20946309" data-type="problem"><p id="element-356">c. Are <em data-effect="italics">A</em> and <em data-effect="italics">B</em> independent?</p> </div> <div id="id20946336" data-type="solution"><p>c. No, they are not, because <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">A</em>) = 0.585.</p> <p id="fs-idm36249792"><em data-effect="italics">P</em>(<em data-effect="italics">B</em>)<em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = (0.65)(0.65) = 0.423</p> <p id="fs-idp6892608">0.423 ≠ 0.585 = <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">A</em>)</p> <p>So, <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">A</em>) is <strong>not</strong> equal to <em data-effect="italics">P</em>(<em data-effect="italics">B</em>)<em data-effect="italics">P</em>(<em data-effect="italics">A</em>).</p> <p>&nbsp;</p> </div> </div> <div id="element-102" data-type="exercise"><div id="id20946487" data-type="problem"><p id="element-102p">d. Are <em data-effect="italics">A</em> and <em data-effect="italics">B</em> mutually exclusive?</p> </div> <div id="id20946514" data-type="solution"><p id="element-102s">d. 
No, they are not, because <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0.585.</p> <p id="element-102s2">To be mutually exclusive, <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) must equal zero.</p> </div> </div> </div> <div id="fs-idm59654416" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm50677584" data-type="exercise"><div id="fs-idm61670400" data-type="problem"><p id="fs-idm37435328">Helen plays basketball. For free throws, she makes the shot 75% of the time. Helen must now attempt two free throws. <em data-effect="italics">C</em> = the event that Helen makes the first shot. <em data-effect="italics">P</em>(<em data-effect="italics">C</em>) = 0.75. <em data-effect="italics">D</em> = the event that Helen makes the second shot. <em data-effect="italics">P</em>(<em data-effect="italics">D</em>) = 0.75. The probability that Helen makes the second free throw given that she made the first is 0.85. What is the probability that Helen makes both free throws?</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p>A community swim team has <strong>150</strong> members. <strong>Seventy-five</strong> of the members are advanced swimmers. <strong>Forty-seven</strong> of the members are intermediate swimmers. The remainder are novice swimmers. <strong>Forty</strong> of the advanced swimmers practice four times a week. <strong>Thirty</strong> of the intermediate swimmers practice four times a week. <strong>Ten</strong> of the novice swimmers practice four times a week. Suppose one member of the swim team is chosen randomly.</p> <p>&nbsp;</p> <div id="element-201" data-type="exercise"><div id="id21150090" data-type="problem"><p id="element-201p">a. 
What is the probability that the member is a novice swimmer?</p> </div> <div id="id21150108" data-type="solution"><p id="element-201s">a. \(\frac{28}{150}\)</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id21150144" data-type="problem"><p id="element-202p">b. What is the probability that the member practices four times a week?</p> </div> <div id="id21150162" data-type="solution"><p id="element-202s">b. \(\frac{80}{150}\)</p> <p>&nbsp;</p> </div> </div> <div id="element-203" data-type="exercise"><div id="id21150198" data-type="problem"><p id="element-203p">c. What is the probability that the member is an advanced swimmer and practices four times a week?</p> </div> <div id="id21150218" data-type="solution"><p id="element-203s">c. \(\frac{40}{150}\)</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id21150254" data-type="problem"><p id="element-204p">d. What is the probability that a member is an advanced swimmer and an intermediate swimmer? Are being an advanced swimmer and an intermediate swimmer mutually exclusive? Why or why not?</p> </div> <div id="id21150274" data-type="solution"><p id="element-204s">d. <em data-effect="italics">P</em>(advanced AND intermediate) = 0, so these are mutually exclusive events. A swimmer cannot be an advanced swimmer and an intermediate swimmer at the same time.</p> <p>&nbsp;</p> </div> </div> <div id="element-205" data-type="exercise"><div id="id21163993" data-type="problem"><p id="eip-idm154998256">e. Are being a novice swimmer and practicing four times a week independent events? Why or why not?</p> </div> <div id="id21164012" data-type="solution"><p id="element-205s">e. No, these are not independent events. 
<span data-type="newline"><br /> </span><em data-effect="italics">P</em>(novice AND practices four times per week) = 0.0667 <span data-type="newline"><br /> </span><em data-effect="italics">P</em>(novice)<em data-effect="italics">P</em>(practices four times per week) = 0.0996 <span data-type="newline"><br /> </span>0.0667 ≠ 0.0996</p> </div> </div> </div> <div id="fs-idm17775392" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm36460112" data-type="exercise"><div id="fs-idm11280848" data-type="problem"><p id="fs-idp1767312">A school has 200 seniors of whom 140 will be going to college next year. Forty will be going directly to work. The remainder are taking a gap year. Fifty of the seniors going to college play sports. Thirty of the seniors going directly to work play sports. Five of the seniors taking a gap year play sports. What is the probability that a senior is taking a gap year?</p> </div> </div> </div> <div id="fs-idm15085168" class="textbox textbox--examples" data-type="example"><p id="fs-idm37045312">Felicity attends Modesto JC in Modesto, CA. The probability that Felicity enrolls in a math class is 0.2 and the probability that she enrolls in a speech class is 0.65. The probability that she enrolls in a math class GIVEN that she enrolls in speech class is 0.25.</p> <p id="fs-idm57022736">Let: <em data-effect="italics">M</em> = math class, <em data-effect="italics">S</em> = speech class, <em data-effect="italics">M</em>|<em data-effect="italics">S</em> = math given speech</p> <div id="fs-idp38089552" data-type="exercise"><div id="fs-idm12150000" data-type="problem"><ol id="fs-idm41716768" type="a"><li>What is the probability that Felicity enrolls in math and speech? 
<span data-type="newline" data-count="1"><br /> </span>Find <em data-effect="italics">P</em>(<em data-effect="italics">M</em> AND <em data-effect="italics">S</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">M</em>|<em data-effect="italics">S</em>)<em data-effect="italics">P</em>(<em data-effect="italics">S</em>).</li> <li>What is the probability that Felicity enrolls in math or speech classes? <span data-type="newline" data-count="1"><br /> </span>Find <em data-effect="italics">P</em>(<em data-effect="italics">M</em> OR <em data-effect="italics">S</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">M</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">S</em>) &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">M</em> AND <em data-effect="italics">S</em>).</li> <li>Are <em data-effect="italics">M</em> and <em data-effect="italics">S</em> independent? Is <em data-effect="italics">P</em>(<em data-effect="italics">M</em>|<em data-effect="italics">S</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">M</em>)?</li> <li>Are <em data-effect="italics">M</em> and <em data-effect="italics">S</em> mutually exclusive? Is <em data-effect="italics">P</em>(<em data-effect="italics">M</em> AND <em data-effect="italics">S</em>) = 0?</li> </ol> </div> <div id="fs-idp19721328" data-type="solution"><p id="fs-idm40435168">a. <em data-effect="italics">P</em>(<em data-effect="italics">M</em> AND <em data-effect="italics">S</em>) = (0.25)(0.65) = 0.1625; b. <em data-effect="italics">P</em>(<em data-effect="italics">M</em> OR <em data-effect="italics">S</em>) = 0.2 + 0.65 &#8211; 0.1625 = 0.6875; c. No, because <em data-effect="italics">P</em>(<em data-effect="italics">M</em>|<em data-effect="italics">S</em>) = 0.25 ≠ 0.2 = <em data-effect="italics">P</em>(<em data-effect="italics">M</em>); d. No, because <em data-effect="italics">P</em>(<em data-effect="italics">M</em> AND <em data-effect="italics">S</em>) = 0.1625 ≠ 0</p> </div> </div> </div> <div class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm68914864" data-type="exercise"><div id="fs-idm57353328" data-type="problem"><p id="fs-idm50508672">A student goes to the library. Let events <em data-effect="italics">B</em> = the student checks out a book and <em data-effect="italics">D</em> = the student checks out a DVD. 
Suppose that <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = 0.40, <em data-effect="italics">P</em>(<em data-effect="italics">D</em>) = 0.30 and <em data-effect="italics">P</em>(<em data-effect="italics">D</em>|<em data-effect="italics">B</em>) = 0.5.</p> <ol id="fs-idm35504336" type="a"><li>Find <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">D</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">B</em> OR <em data-effect="italics">D</em>).</li> </ol> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p id="element-472">Studies show that about one in seven women (approximately 14.3%) who live to be 90 will develop breast cancer. Suppose that of those women who develop breast cancer, a test is negative 2% of the time. Also suppose that in the general population of women, the test for breast cancer is negative about 85% of the time. Let <em data-effect="italics">B</em> = woman develops breast cancer and let <em data-effect="italics">N</em> = tests negative. Suppose one woman is selected at random.</p> <p>&nbsp;</p> <div data-type="exercise"><div id="id21164153" data-type="problem"><p id="element-301p">a. What is the probability that the woman develops breast cancer? What is the probability that the woman tests negative?</p> </div> <div id="id21164172" data-type="solution"><p id="element-301s">a. <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = 0.143; <em data-effect="italics">P</em>(<em data-effect="italics">N</em>) = 0.85</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id21164224" data-type="problem"><p id="element-302p">b. Given that the woman has breast cancer, what is the probability that she tests negative?</p> </div> <div id="id21164243" data-type="solution"><p id="element-302s">b. 
<em data-effect="italics">P</em>(<em data-effect="italics">N</em>|<em data-effect="italics">B</em>) = 0.02</p> <p>&nbsp;</p> </div> </div> <div id="element-307" data-type="exercise"><div id="id21164283" data-type="problem"><p id="element-307p">c. What is the probability that the woman has breast cancer AND tests negative?</p> </div> <div id="id21164302" data-type="solution"><p id="element-307s">c. <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">N</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">B</em>)<em data-effect="italics">P</em>(<em data-effect="italics">N</em>|<em data-effect="italics">B</em>) = (0.143)(0.02) = 0.0029</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id21164367" data-type="problem"><p id="element-303p">d. What is the probability that the woman has breast cancer or tests negative?</p> </div> <div id="id21164385" data-type="solution"><p id="element-303s">d. <em data-effect="italics">P</em>(<em data-effect="italics">B</em> OR <em data-effect="italics">N</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">N</em>) &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">N</em>) = 0.143 + 0.85 &#8211; 0.0029 = 0.9901</p> <p>&nbsp;</p> </div> </div> <div id="element-304" data-type="exercise"><div id="id21164462" data-type="problem"><p id="element-304p">e. Are having breast cancer and testing negative independent events?</p> </div> <div id="id21164481" data-type="solution"><p id="element-304s">e. No. <em data-effect="italics">P</em>(<em data-effect="italics">N</em>) = 0.85; <em data-effect="italics">P</em>(<em data-effect="italics">N</em>|<em data-effect="italics">B</em>) = 0.02. 
So, <em data-effect="italics">P</em>(<em data-effect="italics">N</em>|<em data-effect="italics">B</em>) does not equal <em data-effect="italics">P</em>(<em data-effect="italics">N</em>).</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id21001185" data-type="problem"><p id="element-305p">f. Are having breast cancer and testing negative mutually exclusive?</p> </div> <div id="id21001204" data-type="solution"><p id="element-305s">f. No. <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">N</em>) = 0.0029. For <em data-effect="italics">B</em> and <em data-effect="italics">N</em> to be mutually exclusive, <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">N</em>) must be zero.</p> </div> </div> </div> <div id="fs-idp45632144" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm57543872" data-type="exercise"><div id="fs-idm69233152" data-type="problem"><p id="fs-idm39897552">A school has 200 seniors of whom 140 will be going to college next year. Forty will be going directly to work. The remainder are taking a gap year. Fifty of the seniors going to college play sports. Thirty of the seniors going directly to work play sports. Five of the seniors taking a gap year play sports. What is the probability that a senior is going to college and plays sports?</p> </div> </div> </div> <div id="fs-idp15189248" class="textbox textbox--examples" data-type="example"><div id="fs-idm109502304" data-type="exercise"><div id="fs-idm59824144" data-type="problem"><p id="fs-idm8914880">Refer to the information in <a class="autogenerated-content" href="#example5">(Figure)</a>. <em data-effect="italics">P</em> = tests positive.</p> <ol id="fs-idm51077952" type="a"><li>Given that a woman develops breast cancer, what is the probability that she tests positive? 
Find <em data-effect="italics">P</em>(<em data-effect="italics">P</em>|<em data-effect="italics">B</em>) = 1 &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">N</em>|<em data-effect="italics">B</em>).</li> <li>What is the probability that a woman develops breast cancer and tests positive? Find <em data-effect="italics">P</em>(<em data-effect="italics">B</em> AND <em data-effect="italics">P</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">P</em>|<em data-effect="italics">B</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>).</li> <li>What is the probability that a woman does not develop breast cancer? Find <em data-effect="italics">P</em>(<em data-effect="italics">B′</em>) = 1 &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">B</em>).</li> <li>What is the probability that a woman tests positive for breast cancer? Find <em data-effect="italics">P</em>(<em data-effect="italics">P</em>) = 1 &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">N</em>).</li> </ol> </div> <div id="fs-idm59455360" data-type="solution"><p id="fs-idm40210528">a. 1 &#8211; 0.02 = 0.98; b. (0.98)(0.143) = 0.1401; c. 1 &#8211; 0.143 = 0.857; d. 1 &#8211; 0.85 = 0.15</p> </div> </div> </div> <div id="fs-idm25110496" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp6951808" data-type="exercise"><div id="fs-idm846464" data-type="problem"><p id="fs-idm37086800">A student goes to the library. Let events <em data-effect="italics">B</em> = the student checks out a book and <em data-effect="italics">D</em> = the student checks out a DVD. 
Suppose that <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) = 0.40, <em data-effect="italics">P</em>(<em data-effect="italics">D</em>) = 0.30 and <em data-effect="italics">P</em>(<em data-effect="italics">D</em>|<em data-effect="italics">B</em>) = 0.5.</p> <ol id="fs-idm40366288" type="a"><li>Find <em data-effect="italics">P</em>(<em data-effect="italics">B′</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">D</em> AND <em data-effect="italics">B</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">D</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">D</em> AND <em data-effect="italics">B′</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">D</em>|<em data-effect="italics">B′</em>).</li> </ol> </div> </div> </div> </div> <div id="fs-idm107103536" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idm35596400">DiCamillo, Mark, Mervin Field. “The Field Poll.” Field Research Corporation. Available online at http://www.field.com/fieldpollonline/subscribers/Rls2443.pdf (accessed May 2, 2013).</p> <p id="fs-idm130053392">Rider, David, “Ford support plummeting, poll suggests,” The Star, September 14, 2011. Available online at http://www.thestar.com/news/gta/2011/09/14/ford_support_plummeting_poll_suggests.html (accessed May 2, 2013).</p> <p id="fs-idm102616512">“Mayor’s Approval Down.” News Release by Forum Research Inc. Available online at http://www.forumresearch.com/forms/News Archives/News Releases/74209_TO_Issues_-_Mayoral_Approval_%28Forum_Research%29%2820130320%29.pdf (accessed May 2, 2013).</p> <p id="fs-idm172827024">“Roulette.” Wikipedia. Available online at http://en.wikipedia.org/wiki/Roulette (accessed May 2, 2013).</p> <p id="fs-idm67517376">Shin, Hyon B., Robert A. Kominski. “Language Use in the United States: 2007.” United States Census Bureau. 
Available online at http://www.census.gov/hhes/socdemo/language/data/acs/ACS-12.pdf (accessed May 2, 2013).</p> <p id="fs-idm161310304">Data from the Baseball-Almanac, 2013. Available online at www.baseball-almanac.com (accessed May 2, 2013).</p> <p id="fs-idm78623120">Data from U.S. Census Bureau.</p> <p id="fs-idm78622736">Data from the Wall Street Journal.</p> <p id="fs-idm132741488">Data from The Roper Center: Public Opinion Archives at the University of Connecticut. Available online at http://www.ropercenter.uconn.edu/ (accessed May 2, 2013).</p> <p id="fs-idm188880256">Data from Field Research Corporation. Available online at www.field.com/fieldpollonline (accessed May 2, 2013).</p> </div> <div id="fs-idp34842048" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm50027936">The multiplication rule and the addition rule are used for computing the probability of <em data-effect="italics">A</em> and <em data-effect="italics">B</em>, as well as the probability of <em data-effect="italics">A</em> or <em data-effect="italics">B</em> for two given events <em data-effect="italics">A</em>, <em data-effect="italics">B</em> defined on the sample space. In sampling with replacement, each member of a population is replaced after it is picked, so that member has the possibility of being chosen more than once, and the events are considered to be independent. In sampling without replacement, each member of a population may be chosen only once, and the events are considered to be not independent. 
The events <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are mutually exclusive events when they do not have any outcomes in common.</p> </div> <div id="fs-idp67099248" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idm56363920"><strong data-effect="bold">The multiplication rule:</strong> <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>)</p> <p id="fs-idm39113360"><strong data-effect="bold">The addition rule:</strong> <em data-effect="italics">P</em>(<em data-effect="italics">A</em> OR <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">B</em>) &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>)</p> </div> <div id="fs-idm40455392" class="practice" data-depth="1"><p id="eip-475"><em data-effect="italics">Use the following information to answer the next ten exercises.</em> Forty-eight percent of all California registered voters prefer life in prison without parole over the death penalty for a person convicted of first degree murder. Among Latino California registered voters, 55% prefer life in prison without parole over the death penalty for a person convicted of first degree murder. 
37.6% of all Californians are Latino.</p> <p id="element-40">In this problem, let:</p> <ul id="element-448"><li><em data-effect="italics">C</em> = Californians (registered voters) preferring life in prison without parole over the death penalty for a person convicted of first degree murder.</li> <li><em data-effect="italics">L</em> = Latino Californians</li> </ul> <p id="element-198">Suppose that one Californian is randomly selected.</p> <div id="exercise1" data-type="exercise"><div id="id13144382" data-type="problem"><p id="exercise1p">Find <em data-effect="italics">P</em>(<em data-effect="italics">C</em>).</p> </div> </div> <div id="exercise2" data-type="exercise"><div id="id13144446" data-type="problem"><p id="exercise2p">Find <em data-effect="italics">P</em>(<em data-effect="italics">L</em>).</p> </div> <div id="id13144481" data-type="solution"><p id="exercise2s">0.376</p> </div> </div> <div id="exercise3" data-type="exercise"><div id="id13144510" data-type="problem"><p id="exercise3p">Find <em data-effect="italics">P</em>(<em data-effect="italics">C</em>|<em data-effect="italics">L</em>).</p> </div> </div> <div id="exercise4" data-type="exercise"><div id="id13144579" data-type="problem"><p id="exercise4p">In words, what is <em data-effect="italics">C</em>|<em data-effect="italics">L</em>?</p> </div> <div id="fs-idp40726272" data-type="solution"><p id="fs-idm45305280"><em data-effect="italics">C</em>|<em data-effect="italics">L</em> means, given the person chosen is a Latino Californian, the person is a registered voter who prefers life in prison without parole for a person convicted of first degree murder.</p> </div> </div> <div id="exercise5" data-type="exercise"><div id="id13153551" data-type="problem"><p id="exercise5p">Find <em data-effect="italics">P</em>(<em data-effect="italics">L</em> AND <em data-effect="italics">C</em>).</p> </div> </div> <div id="exercise6" data-type="exercise"><div id="id13153618" data-type="problem"><p id="exercise6p">In words, 
what is <em data-effect="italics">L</em> AND <em data-effect="italics">C</em>?</p> </div> <div id="fs-idp16382448" data-type="solution"><p id="fs-idp16979504"><em data-effect="italics">L</em> AND <em data-effect="italics">C</em> is the event that the person chosen is a Latino California registered voter who prefers life without parole over the death penalty for a person convicted of first degree murder.</p> </div> </div> <div id="exercise7" data-type="exercise"><div id="id13153658" data-type="problem"><p id="exercise7p">Are <em data-effect="italics">L</em> and <em data-effect="italics">C</em> independent events? Show why or why not.</p> </div> </div> <div data-type="exercise"><div id="id13153716" data-type="problem"><p id="exercise8p">Find <em data-effect="italics">P</em>(<em data-effect="italics">L</em> OR <em data-effect="italics">C</em>).</p> </div> <div id="id13153756" data-type="solution"><p id="exercise8s">0.6492</p> </div> </div> <div id="exercise9" data-type="exercise"><div id="id13153785" data-type="problem"><p id="exercise9p">In words, what is <em data-effect="italics">L</em> OR <em data-effect="italics">C</em>?</p> </div> </div> <div data-type="exercise"><div id="id11298886" data-type="problem"><p id="exercise10p">Are <em data-effect="italics">L</em> and <em data-effect="italics">C</em> mutually exclusive events? 
Show why or why not.</p> </div> <div id="id11298914" data-type="solution"><p id="exercise10s">No, because <em data-effect="italics">P</em>(<em data-effect="italics">L</em> AND <em data-effect="italics">C</em>) does not equal 0.</p> </div> </div> </div> <div id="fs-idm12466720" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div data-type="exercise"><div id="eip-293" data-type="problem"><p>&nbsp;</p> </div> </div> <div id="eip-803" data-type="exercise"><div data-type="problem"><p id="eip-268">1) After Rob Ford, the mayor of Toronto, announced his plans to cut budget costs in late 2011, Forum Research polled 1,046 people to measure the mayor’s popularity. Everyone polled expressed either approval or disapproval. These are the results the poll produced:</p> <ul><li>In early 2011, 60 percent of the population approved of Mayor Ford’s actions in office.</li> <li>In mid-2011, 57 percent of the population approved of his actions.</li> <li>In late 2011, the percentage of popular approval was measured at 42 percent. <ol id="eip-idp103823808" type="a"><li>What is the sample size for this study?</li> <li>What proportion in the poll disapproved of Mayor Ford, according to the results from late 2011?</li> <li>How many people polled responded that they approved of Mayor Ford in late 2011?</li> <li>What is the probability that a person supported Mayor Ford, based on the data collected in mid-2011?</li> <li>What is the probability that a person supported Mayor Ford, based on the data collected in early 2011?</li> </ol> </li> </ul> <p>&nbsp;</p> </div> <div id="eip-333" data-type="solution"></div> </div> <p><em data-effect="italics">2) Use the following information to answer the next three exercises.</em> The casino game, roulette, allows the gambler to bet on the probability of a ball, which spins in the roulette wheel, landing on a particular color, number, or range of numbers. 
The table used to place bets contains 38 numbers, and each number is assigned to a color and a range.</p> <div id="M04_ch03-fig001" class="bc-figure figure"><div class="bc-figcaption figcaption">(credit: film8ker/wikibooks)</div> <p><span id="eip-idp90831520" data-type="media" data-alt="This is an image of a roulette table."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C03_RWP_003-1.jpg" alt="This is an image of a roulette table." width="450" data-media-type="image/jpg" /></span></p> </div> <div id="eip-628" data-type="exercise"><div data-type="problem"><ol id="eip-idp70489216" type="a"><li>List the sample space of the 38 possible outcomes in roulette.</li> <li>You bet on red. Find <em data-effect="italics">P</em>(red).</li> <li>You bet on -1st 12- (1st Dozen). Find <em data-effect="italics">P</em>(-1st 12-).</li> <li>You bet on an even number. Find <em data-effect="italics">P</em>(even number).</li> <li>Is getting an odd number the complement of getting an even number? 
Why?</li> <li>Find two mutually exclusive events.</li> <li>Are the events Even and 1st Dozen independent?</li> </ol> </div> </div> <div id="eip-374" data-type="exercise"><div id="eip-932" data-type="problem"><p id="eip-idm28370736">3) Compute the probability of winning the following types of bets:</p> <ol id="eip-idp36374016" type="a"><li>Betting on two lines that touch each other on the table as in 1-2-3-4-5-6</li> <li>Betting on three numbers in a line, as in 1-2-3</li> <li>Betting on one number</li> <li>Betting on four numbers that touch each other to form a square, as in 10-11-13-14</li> <li>Betting on two numbers that touch each other on the table, as in 10-11 or 10-13</li> <li>Betting on 0-00-1-2-3</li> <li>Betting on 0-1-2; or 0-00-2; or 00-2-3</li> </ol> </div> <p>&nbsp;</p> <div id="eip-81" data-type="solution"></div> </div> <div id="eip-95" data-type="exercise"><div id="eip-919" data-type="problem"><p id="eip-972">4) Compute the probability of winning the following types of bets:</p> <ol id="eip-idp87688432" type="a"><li>Betting on a color</li> <li>Betting on one of the dozen groups</li> <li>Betting on the range of numbers from 1 to 18</li> <li>Betting on the range of numbers 19–36</li> <li>Betting on one of the columns</li> <li>Betting on an even or odd number (excluding zero)</li> </ol> </div> </div> <div id="element-555" data-type="exercise"><div id="id43812808" data-type="problem"><p>5) Suppose that you have eight cards. Five are green and three are yellow. The five green cards are numbered 1, 2, 3, 4, and 5. The three yellow cards are numbered 1, 2, and 3. The cards are well shuffled. 
You randomly draw one card.</p> <ul id="element-2351"><li><em data-effect="italics">G</em> = card drawn is green</li> <li><em data-effect="italics">E</em> = card drawn is even-numbered <ol type="a"><li>List the sample space.</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">G</em>) = _____</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">G</em>|<em data-effect="italics">E</em>) = _____</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">G</em> AND <em data-effect="italics">E</em>) = _____</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">G</em> OR <em data-effect="italics">E</em>) = _____</li> <li>Are <em data-effect="italics">G</em> and <em data-effect="italics">E</em> mutually exclusive? Justify your answer numerically.</li> </ol> </li> </ul> <p>&nbsp;</p> </div> <div id="id43826106" data-type="solution"></div> </div> <div id="element-81" data-type="exercise"><div id="id43791232" data-type="problem"><p id="element-235">6) Roll two fair dice separately. Each die has six faces.</p> <ol type="a"><li>List the sample space.</li> <li>Let <em data-effect="italics">A</em> be the event that either a three or four is rolled first, followed by an even number. Find <em data-effect="italics">P</em>(<em data-effect="italics">A</em>).</li> <li>Let <em data-effect="italics">B</em> be the event that the sum of the two rolls is at most seven. Find <em data-effect="italics">P</em>(<em data-effect="italics">B</em>).</li> <li>In words, explain what “<em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>)” represents. Find <em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>).</li> <li>Are <em data-effect="italics">A</em> and <em data-effect="italics">B</em> mutually exclusive events? 
Explain your answer in one to three complete sentences, including numerical justification.</li> <li>Are <em data-effect="italics">A</em> and <em data-effect="italics">B</em> independent events? Explain your answer in one to three complete sentences, including numerical justification.</li> </ol> </div> </div> <div data-type="exercise"><div id="id43757922" data-type="problem"><p id="element-834">7) A special deck of cards has ten cards. Four are green, three are blue, and three are red. When a card is picked, its color is recorded. An experiment consists of first picking a card and then tossing a coin.</p> <ol id="element-274" type="a" data-mark-suffix="."><li>List the sample space.</li> <li>Let <em data-effect="italics">A</em> be the event that a blue card is picked first, followed by landing a head on the coin toss. Find <em data-effect="italics">P</em>(<em data-effect="italics">A</em>).</li> <li>Let <em data-effect="italics">B</em> be the event that a red or green card is picked, followed by landing a head on the coin toss. Are the events <em data-effect="italics">A</em> and <em data-effect="italics">B</em> mutually exclusive? Explain your answer in one to three complete sentences, including numerical justification.</li> <li>Let <em data-effect="italics">C</em> be the event that a red or blue card is picked, followed by landing a head on the coin toss. Are the events <em data-effect="italics">A</em> and <em data-effect="italics">C</em> mutually exclusive? 
Explain your answer in one to three complete sentences, including numerical justification.</li> </ol> </div> <div id="fs-idp74477648" data-type="solution"><div id="fs-idm40299392" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idm18483184">The coin toss is independent of the card picked first.</p> </div> <p>&nbsp;</p> </div> </div> <div id="element-52" data-type="exercise"><div id="id43758328" data-type="problem"><p>8) An experiment consists of first rolling a die and then tossing a coin.</p> <ol type="a"><li>List the sample space.</li> <li>Let <em data-effect="italics">A</em> be the event that either a three or a four is rolled first, followed by landing a head on the coin toss. Find <em data-effect="italics">P</em>(<em data-effect="italics">A</em>).</li> <li>Let <em data-effect="italics">B</em> be the event that the first and second tosses land on heads. Are the events <em data-effect="italics">A</em> and <em data-effect="italics">B</em> mutually exclusive? Explain your answer in one to three complete sentences, including numerical justification.</li> </ol> </div> </div> <div id="element-982" data-type="exercise"><div id="id43758438" data-type="problem"><p>9) An experiment consists of tossing a nickel, a dime, and a quarter. Of interest is the side the coin lands on.</p> <ol id="element-485" type="a"><li>List the sample space.</li> <li>Let <em data-effect="italics">A</em> be the event that there are at least two tails. Find <em data-effect="italics">P</em>(<em data-effect="italics">A</em>).</li> <li>Let <em data-effect="italics">B</em> be the event that the first and second tosses land on heads. Are the events <em data-effect="italics">A</em> and <em data-effect="italics">B</em> mutually exclusive? 
Explain your answer in one to three complete sentences, including justification.</li> </ol> <p>&nbsp;</p> </div> <div id="id43758538" data-type="solution"></div> </div> <div id="element-317" data-type="exercise"><div id="id43758917" data-type="problem"><p id="element-508">10) Consider the following scenario: <span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">P</em>(<em data-effect="italics">C</em>) = 0.4. <span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">P</em>(<em data-effect="italics">D</em>) = 0.5. <span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">P</em>(<em data-effect="italics">C</em>|<em data-effect="italics">D</em>) = 0.6.</p> <ol id="element-966" type="a"><li>Find <em data-effect="italics">P</em>(<em data-effect="italics">C</em> AND <em data-effect="italics">D</em>).</li> <li>Are <em data-effect="italics">C</em> and <em data-effect="italics">D</em> mutually exclusive? Why or why not?</li> <li>Are <em data-effect="italics">C</em> and <em data-effect="italics">D</em> independent events? 
Why or why not?</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">C</em> OR <em data-effect="italics">D</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">D</em>|<em data-effect="italics">C</em>).</li> </ol> </div> </div> <div id="element-134" data-type="exercise"><div id="id43759783" data-type="problem"><p id="element-469"><em data-effect="italics">11) Y</em> and <em data-effect="italics">Z</em> are independent events.</p> <ol id="listy" type="a"><li>Rewrite the basic Addition Rule <em data-effect="italics">P</em>(<em data-effect="italics">Y</em> OR <em data-effect="italics">Z</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">Y</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">Z</em>) &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">Y</em> AND <em data-effect="italics">Z</em>) using the information that <em data-effect="italics">Y</em> and <em data-effect="italics">Z</em> are independent events.</li> <li>Use the rewritten rule to find <em data-effect="italics">P</em>(<em data-effect="italics">Z</em>) if <em data-effect="italics">P</em>(<em data-effect="italics">Y</em> OR <em data-effect="italics">Z</em>) = 0.71 and <em data-effect="italics">P</em>(<em data-effect="italics">Y</em>) = 0.42.</li> </ol> <p>&nbsp;</p> </div> <div id="eip-idp48037008" data-type="solution"></div> </div> <div id="element-15" data-type="exercise"><div id="id43759983" data-type="problem"><p><em data-effect="italics">12) G</em> and <em data-effect="italics">H</em> are mutually exclusive events. 
<em data-effect="italics">P</em>(<em data-effect="italics">G</em>) = 0.5 <em data-effect="italics">P</em>(<em data-effect="italics">H</em>) = 0.3</p> <ol type="a"><li>Explain why the following statement MUST be false: <em data-effect="italics">P</em>(<em data-effect="italics">H</em>|<em data-effect="italics">G</em>) = 0.4.</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">H</em> OR <em data-effect="italics">G</em>).</li> <li>Are <em data-effect="italics">G</em> and <em data-effect="italics">H</em> independent or dependent events? Explain in a complete sentence.</li> </ol> </div> </div> <div id="fs-idp11038848" data-type="exercise"><div id="id43570398" data-type="problem"><p>13) Approximately 281,000,000 people over age five live in the United States. Of these people, 55,000,000 speak a language other than English at home. Of those who speak another language at home, 62.3% speak Spanish.</p> <p id="element-819">Let: <em data-effect="italics">E</em> = speaks English at home; <em data-effect="italics">E′</em> = speaks another language at home; <em data-effect="italics">S</em> = speaks Spanish;</p> <p id="element-554p">Finish each probability statement by matching the correct answer.</p> <table summary="This table presents probability statements in the first column and answers to those probability statements in the second column. Match each cell to its correct counterpart."><colgroup><col data-width="2in" /> <col data-width="2in" /></colgroup> <thead><tr><th>Probability Statements</th> <th>Answers</th> </tr> </thead> <tbody><tr><td>a. <em data-effect="italics">P</em>(<em data-effect="italics">E′</em>) =</td> <td>i. 0.8043</td> </tr> <tr><td>b. <em data-effect="italics">P</em>(<em data-effect="italics">E</em>) =</td> <td>ii. 0.623</td> </tr> <tr><td>c. <em data-effect="italics">P</em>(<em data-effect="italics">S</em> and <em data-effect="italics">E′</em>) =</td> <td>iii. 0.1957</td> </tr> <tr><td>d. 
<em data-effect="italics">P</em>(<em data-effect="italics">S</em>|<em data-effect="italics">E′</em>) =</td> <td>iv. 0.1219</td> </tr> </tbody> </table> </div> <div id="id43570647" data-type="solution"><p id="fs-idm41318944"></p></div> </div> <div data-type="exercise"><div id="id43570907" data-type="problem"><p id="element-623">14) In 1994, the U.S. government held a lottery to issue 55,000 Green Cards (permits for non-citizens to work legally in the U.S.). Renate Deutsch, from Germany, was one of approximately 6.5 million people who entered this lottery. Let <em data-effect="italics">G</em> = won a Green Card.</p> <ol id="element-897" type="a"><li>What was Renate’s chance of winning a Green Card? Write your answer as a probability statement.</li> <li>In the summer of 1994, Renate received a letter stating she was one of 110,000 finalists chosen. Once the finalists were chosen, assuming that each finalist had an equal chance to win, what was Renate’s chance of winning a Green Card? Write your answer as a conditional probability statement. Let <em data-effect="italics">F</em> = was a finalist.</li> <li>Are <em data-effect="italics">G</em> and <em data-effect="italics">F</em> independent or dependent events? Justify your answer numerically and also explain why.</li> <li>Are <em data-effect="italics">G</em> and <em data-effect="italics">F</em> mutually exclusive events? Justify your answer numerically and explain why.</li> </ol> </div> </div> <div data-type="exercise"><div id="id43571209" data-type="problem"><p id="element-384">15) Three professors at George Washington University did an experiment to determine if economists are more selfish than other people. They dropped 64 stamped, addressed envelopes with $10 cash in different classrooms on the George Washington campus. Overall, 44% were returned. From the economics classes, 56% of the envelopes were returned. 
From the business, psychology, and history classes 31% were returned.</p> <p>Let: <em data-effect="italics">R</em> = money returned; <em data-effect="italics">E</em> = economics classes; <em data-effect="italics">O</em> = other classes</p> <ol type="a"><li>Write a probability statement for the overall percent of money returned.</li> <li>Write a probability statement for the percent of money returned out of the economics classes.</li> <li>Write a probability statement for the percent of money returned out of the other classes.</li> <li>Is money being returned independent of the class? Justify your answer numerically and explain it.</li> <li>Based upon this study, do you think that economists are more selfish than other people? Explain why or why not. Include numbers to justify your answer.</li> </ol> <p>&nbsp;</p> </div> <div id="fs-idp46073040" data-type="solution"></div> </div> <div id="fs-idm127904" data-type="exercise"><div id="fs-idm56833280" data-type="problem"><p id="fs-idm27818464">16) The following table of data obtained from www.baseball-almanac.com shows hit information for four players. 
Suppose that one hit from the table is randomly selected.</p> <table id="fs-idp54590480" summary=""><thead><tr><th>Name</th> <th>Single</th> <th>Double</th> <th>Triple</th> <th>Home Run</th> <th>Total Hits</th> </tr> </thead> <tbody><tr><td>Babe Ruth</td> <td>1,517</td> <td>506</td> <td>136</td> <td>714</td> <td>2,873</td> </tr> <tr><td>Jackie Robinson</td> <td>1,054</td> <td>273</td> <td>54</td> <td>137</td> <td>1,518</td> </tr> <tr><td>Ty Cobb</td> <td>3,603</td> <td>174</td> <td>295</td> <td>114</td> <td>4,189</td> </tr> <tr><td>Hank Aaron</td> <td>2,294</td> <td>624</td> <td>98</td> <td>755</td> <td>3,771</td> </tr> <tr><td>Total</td> <td>8,471</td> <td>1,577</td> <td>583</td> <td>1,720</td> <td>12,351</td> </tr> </tbody> </table> <p id="fs-idm14793216">Are &#8220;the hit being made by Hank Aaron&#8221; and &#8220;the hit being a double&#8221; independent events?</p> <ol id="fs-idm49744624" type="a"><li>Yes, because <em data-effect="italics">P</em>(hit by Hank Aaron|hit is a double) = <em data-effect="italics">P</em>(hit by Hank Aaron)</li> <li>No, because <em data-effect="italics">P</em>(hit by Hank Aaron|hit is a double) ≠ <em data-effect="italics">P</em>(hit is a double)</li> <li>No, because <em data-effect="italics">P</em>(hit by Hank Aaron|hit is a double) ≠ <em data-effect="italics">P</em>(hit by Hank Aaron)</li> <li>Yes, because <em data-effect="italics">P</em>(hit by Hank Aaron|hit is a double) = <em data-effect="italics">P</em>(hit is a double)</li> </ol> </div> </div> <div id="eip-8" data-type="exercise"><div id="eip-874" data-type="problem"><p>17) United Blood Services is a blood bank that serves more than 500 hospitals in 18 states. According to their website, a person with type O blood and a negative Rh factor (Rh-) can donate blood to any person with any blood type. 
Their data show that 43% of people have type O blood and 15% of people have the Rh- factor; 52% of people have type O blood or the Rh- factor.</p> <ol id="eip-idm79628432" type="a"><li>Find the probability that a person has both type O blood and the Rh- factor.</li> <li>Find the probability that a person does NOT have both type O blood and the Rh- factor.</li> </ol> <p>&nbsp;</p> </div> <div id="eip-999" data-type="solution"></div> </div> <div id="eip-203" data-type="exercise"><div id="eip-id1873416" data-type="problem"><p id="eip-id1172778107766">18) At a college, 72% of courses have final exams and 46% of courses require research papers. Suppose that 32% of courses have a research paper and a final exam. Let <em data-effect="italics">F</em> be the event that a course has a final exam. Let <em data-effect="italics">R</em> be the event that a course requires a research paper.</p> <ol type="a"><li>Find the probability that a course has a final exam or a research paper.</li> <li>Find the probability that a course has NEITHER of these two requirements.</li> </ol> </div> </div> <div id="eip-913" data-type="exercise"><div id="eip-id1164887148050" data-type="problem"><p id="eip-id1164882729280">19) In a box of assorted cookies, 36% contain chocolate and 12% contain nuts. Of all the cookies in the box, 8% contain both chocolate and nuts. Sean is allergic to both chocolate and nuts.</p> <ol id="eip-id1164895994519" type="a"><li>Find the probability that a cookie contains chocolate or nuts (he can&#8217;t eat it).</li> <li>Find the probability that a cookie contains neither chocolate nor nuts (he can eat it).</li> </ol> <p>&nbsp;</p> </div> <div id="eip-id1164896038986" data-type="solution"></div> </div> <div data-type="exercise"><div id="eip-id1164258184377" data-type="problem"><p id="eip-id1164261703387">20) A college finds that 10% of students have taken a distance learning class and that 40% of students are part-time students. Of the part-time students, 20% have taken a distance learning class. 
Let <em data-effect="italics">D</em> = event that a student takes a distance learning class and <em data-effect="italics">E</em> = event that a student is a part time student</p> <ol id="eip-id4894720" type="a"><li>Find <em data-effect="italics">P</em>(<em data-effect="italics">D</em> AND <em data-effect="italics">E</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">E</em>|<em data-effect="italics">D</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">D</em> OR <em data-effect="italics">E</em>).</li> <li>Using an appropriate test, show whether <em data-effect="italics">D</em> and <em data-effect="italics">E</em> are independent.</li> <li>Using an appropriate test, show whether <em data-effect="italics">D</em> and <em data-effect="italics">E</em> are mutually exclusive.</li> </ol> <p id="eip-id1164603274351">21) On February 28, 2013, a Field Poll Survey reported that 61% of California registered voters approved of allowing two people of the same gender to marry and have regular marriage laws apply to them. Among 18 to 39 year olds (California registered voters), the approval rating was 78%. Six in ten California registered voters said that the upcoming Supreme Court’s ruling about the constitutionality of California’s Proposition 8 was either very or somewhat important to them. Out of those CA registered voters who support same-sex marriage, 75% say the ruling is important to them.</p> <p id="eip-id1170640566743">In this problem, let:</p> <ul><li><em data-effect="italics">C</em> = California registered voters who support same-sex marriage.</li> <li><em data-effect="italics">B</em> = California registered voters who say the Supreme Court’s ruling about the constitutionality of California’s Proposition 8 is very or somewhat important to them</li> <li><em data-effect="italics">A</em> = California registered voters who are 18 to 39 years old. 
<ol type="a"><li>Find <em data-effect="italics">P</em>(<em data-effect="italics">C</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">B</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">C</em>|<em data-effect="italics">A</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">C</em>).</li> <li>In words, what is <em data-effect="italics">C</em>|<em data-effect="italics">A</em>?</li> <li>In words, what is <em data-effect="italics">B</em>|<em data-effect="italics">C</em>?</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">C</em> AND <em data-effect="italics">B</em>).</li> <li>In words, what is <em data-effect="italics">C</em> AND <em data-effect="italics">B</em>?</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">C</em> OR <em data-effect="italics">B</em>).</li> <li>Are <em data-effect="italics">C</em> and <em data-effect="italics">B</em> mutually exclusive events? 
Show why or why not.</li> </ol> </li> </ul> <p><strong>Answers to odd questions</strong></p> <p>1)</p> <ol id="eip-idp17755920" type="a"><li>The Forum Research surveyed 1,046 Torontonians.</li> <li>58%</li> <li>42% of 1,046 = 439 (rounding to the nearest integer)</li> <li>0.57</li> <li>0.60</li> </ol> <p>3)</p> <ol id="eip-idp41310672" type="a"><li><em data-effect="italics">P</em>(Betting on two lines that touch each other on the table) = \(\frac{6}{38}\)</li> <li><em data-effect="italics">P</em>(Betting on three numbers in a line) = \(\frac{3}{38}\)</li> <li><em data-effect="italics">P</em>(Betting on one number) = \(\frac{1}{38}\)</li> <li><em data-effect="italics">P</em>(Betting on four numbers that touch each other to form a square) = \(\frac{4}{38}\)</li> <li><em data-effect="italics">P</em>(Betting on two numbers that touch each other on the table) = \(\frac{2}{38}\)</li> <li><em data-effect="italics">P</em>(Betting on 0-00-1-2-3) = \(\frac{5}{38}\)</li> <li><em data-effect="italics">P</em>(Betting on 0-1-2; or 0-00-2; or 00-2-3) = \(\frac{3}{38}\)</li> </ol> <p>5)</p> <ol id="element-823" type="a"><li>{<em data-effect="italics">G</em>1, <em data-effect="italics">G</em>2, <em data-effect="italics">G</em>3, <em data-effect="italics">G</em>4, <em data-effect="italics">G</em>5, <em data-effect="italics">Y</em>1, <em data-effect="italics">Y</em>2, <em data-effect="italics">Y</em>3}</li> <li>\(\frac{5}{8}\)</li> <li>\(\frac{2}{3}\)</li> <li>\(\frac{2}{8}\)</li> <li>\(\frac{6}{8}\)</li> <li>No, because <em data-effect="italics">P</em>(<em data-effect="italics">G</em> AND <em data-effect="italics">E</em>) does not equal 0.</li> </ol> <p>7)</p> <ol id="fs-idp29257280" type="a"><li>{(<em data-effect="italics">G</em>,<em data-effect="italics">H</em>) (<em data-effect="italics">G</em>,<em data-effect="italics">T</em>) (<em data-effect="italics">B</em>,<em data-effect="italics">H</em>) (<em data-effect="italics">B</em>,<em data-effect="italics">T</em>) (<em data-effect="italics">R</em>,<em data-effect="italics">H</em>) (<em data-effect="italics">R</em>,<em data-effect="italics">T</em>)}</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = <em data-effect="italics">P</em>(blue)<em data-effect="italics">P</em>(head) = \(\left(\frac{3}{10}\right)\)\(\left(\frac{1}{2}\right)\) = \(\frac{3}{20}\)</li> <li>Yes, <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are mutually exclusive because they cannot happen at the same time; you cannot pick a card that is both blue and also (red or green). <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0</li> <li>No, <em data-effect="italics">A</em> and <em data-effect="italics">C</em> are not mutually exclusive because they can occur at the same time. In fact, <em data-effect="italics">C</em> includes all of the outcomes of <em data-effect="italics">A</em>; if the card chosen is blue it is also (red or blue). <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">C</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>) = \(\frac{3}{20}\)</li> </ol> <p>9)</p> <ol id="element-281" type="a"><li><em data-effect="italics">S</em> = {(<em data-effect="italics">HHH</em>), (<em data-effect="italics">HHT</em>), (<em data-effect="italics">HTH</em>), (<em data-effect="italics">HTT</em>), (<em data-effect="italics">THH</em>), (<em data-effect="italics">THT</em>), (<em data-effect="italics">TTH</em>), (<em data-effect="italics">TTT</em>)}</li> <li>\(\frac{4}{8}\)</li> <li>Yes, because if <em data-effect="italics">A</em> has occurred (at least two of the three coins are tails), it is impossible for both the first and second tosses to land on heads. 
In other words, <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0.</li> </ol> <p>11)</p> <ol id="listy2" type="a"><li>If <em data-effect="italics">Y</em> and <em data-effect="italics">Z</em> are independent, then <em data-effect="italics">P</em>(<em data-effect="italics">Y</em> AND <em data-effect="italics">Z</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">Y</em>)<em data-effect="italics">P</em>(<em data-effect="italics">Z</em>), so <em data-effect="italics">P</em>(<em data-effect="italics">Y</em> OR <em data-effect="italics">Z</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">Y</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">Z</em>) &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">Y</em>)<em data-effect="italics">P</em>(<em data-effect="italics">Z</em>).</li> <li>0.5</li> </ol> <p>13) <span id="grpccery" data-type="list" data-list-type="enumerated" data-display="inline"><span data-type="item">a) iii b) i</span><span data-type="item"> c) </span><span data-type="item">iv d) </span><span data-type="item">ii</span></span></p> <p>15)</p> <ol id="fs-idp46073296" type="a"><li><em data-effect="italics">P</em>(<em data-effect="italics">R</em>) = 0.44</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">R</em>|<em data-effect="italics">E</em>) = 0.56</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">R</em>|<em data-effect="italics">O</em>) = 0.31</li> <li>No, whether the money is returned is not independent of which class the money was placed in. 
There are several ways to justify this mathematically, but one is that the money placed in economics classes is not returned at the same rate as the money overall; <em data-effect="italics">P</em>(<em data-effect="italics">R</em>|<em data-effect="italics">E</em>) ≠ <em data-effect="italics">P</em>(<em data-effect="italics">R</em>).</li> <li>No, this study definitely does not support that notion; <em data-effect="italics"><u data-effect="underline">in fact</u></em>, it suggests the opposite. The money placed in the economics classrooms was returned at a higher rate than the money placed in all classes collectively; <em data-effect="italics">P</em>(<em data-effect="italics">R</em>|<em data-effect="italics">E</em>) &gt; <em data-effect="italics">P</em>(<em data-effect="italics">R</em>).</li> </ol> <p>17)</p> <ol id="eip-idp78332576" type="a"><li><p id="eip-idp89451984"><em data-effect="italics">P</em>(type O OR Rh-) = <em data-effect="italics">P</em>(type O) + <em data-effect="italics">P</em>(Rh-) &#8211; <em data-effect="italics">P</em>(type O AND Rh-)</p> <p id="eip-idp89452368">0.52 = 0.43 + 0.15 &#8211; <em data-effect="italics">P</em>(type O AND Rh-); solve to find <em data-effect="italics">P</em>(type O AND Rh-) = 0.06</p> <p id="eip-idp68551952">6% of people have type O, Rh- blood</p> </li> <li><p id="eip-idp143998144"><em data-effect="italics">P</em>(NOT(type O AND Rh-)) = 1 &#8211; <em data-effect="italics">P</em>(type O AND Rh-) = 1 &#8211; 0.06 = 0.94</p> <p id="eip-idp143998528">94% of people do not have type O, Rh- blood</p> </li> </ol> <p>19)</p> <p>Let <em data-effect="italics">C</em> = the event that the cookie contains chocolate. 
Let <em data-effect="italics">N</em> = the event that the cookie contains nuts.</p> <ol id="eip-id1164893151639" type="a"><li><em data-effect="italics">P</em>(<em data-effect="italics">C</em> OR <em data-effect="italics">N</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">C</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">N</em>) &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">C</em> AND <em data-effect="italics">N</em>) = 0.36 + 0.12 &#8211; 0.08 = 0.40</li> <li><em data-effect="italics">P</em>(NEITHER chocolate NOR nuts) = 1 &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">C</em> OR <em data-effect="italics">N</em>) = 1 &#8211; 0.40 = 0.60</li> </ol> <p>&nbsp;</p> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="indevents"><dt>Independent Events</dt> <dd id="id9683146">The occurrence of one event has no effect on the probability of the occurrence of another event. Events <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are independent if one of the following is true: <ol id="fs-idp12847248" type="1"><li><em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">A</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">B</em>)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">A</em>)<em data-effect="italics">P</em>(<em data-effect="italics">B</em>)</li> </ol> </dd> </dl> <dl id="mutex"><dt>Mutually Exclusive</dt> <dd id="id9683312">Two events are mutually exclusive if the probability that they both happen at the same time is zero. 
If events <em data-effect="italics">A</em> and <em data-effect="italics">B</em> are mutually exclusive, then <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) = 0.</dd> </dl> </div> </div></div>
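<p>The glossary tests for independent and mutually exclusive events can also be checked numerically. The short Python sketch below is an illustration only and is not part of the original text; the function and variable names are hypothetical. It rearranges the Addition Rule to recover <em data-effect="italics">P</em>(<em data-effect="italics">A</em> AND <em data-effect="italics">B</em>) and then applies the two glossary tests to the blood type figures from exercise 17, assuming those percentages are exact.</p>

```python
import math

def p_and_from_addition_rule(p_a, p_b, p_a_or_b):
    """Rearranged Addition Rule: P(A AND B) = P(A) + P(B) - P(A OR B)."""
    return p_a + p_b - p_a_or_b

def is_mutually_exclusive(p_a_and_b):
    """Glossary test: mutually exclusive when P(A AND B) = 0."""
    return math.isclose(p_a_and_b, 0.0, abs_tol=1e-9)

def is_independent(p_a, p_b, p_a_and_b):
    """Glossary test 3: independent when P(A AND B) = P(A)P(B)."""
    return math.isclose(p_a_and_b, p_a * p_b, abs_tol=1e-9)

# Figures from exercise 17 (assumed exact for this illustration).
p_type_o = 0.43
p_rh_neg = 0.15
p_o_or_rh_neg = 0.52

p_both = p_and_from_addition_rule(p_type_o, p_rh_neg, p_o_or_rh_neg)

print("P(type O AND Rh-) =", round(p_both, 2))
print("Mutually exclusive?", is_mutually_exclusive(p_both))
print("Independent?", is_independent(p_type_o, p_rh_neg, p_both))
```

<p>With these inputs the sketch reports a joint probability of 0.06, matching the answer to exercise 17. Both tests come back false: the joint probability is not zero, and it differs from the product (0.43)(0.15) = 0.0645. Because published percentages are rounded, a small gap like this one is weak evidence on its own; the tolerance passed to <code>math.isclose</code> is an arbitrary choice for the illustration.</p>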
<div class="chapter standard" id="chapter-contingency-tables" title="Chapter 4.5: Contingency Tables"><div class="chapter-title-wrap"><h3 class="chapter-number">28</h3><h2 class="chapter-title"><span class="display-none">Chapter 4.5: Contingency Tables</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="element-864">A <span data-type="term">contingency table</span> provides a way of portraying data that can facilitate calculating probabilities. The table helps in determining conditional probabilities quite easily. The table displays sample values in relation to two different variables that may be dependent or contingent on one another. Later on, we will use contingency tables again, but in another manner.</p> <div id="element-775" class="textbox textbox--examples" data-type="example"><p id="element-557">Suppose a study of speeding violations and drivers who use cell phones produced the following fictional data:</p> <table id="element-838" summary="A study of speeding violations"><thead><tr><th></th> <th>Speeding violation in the last year</th> <th>No speeding violation in the last year</th> <th>Total</th> </tr> </thead> <tbody><tr><td>Uses cell phone while driving</td> <td>25</td> <td>280</td> <td>305</td> </tr> <tr><td>Does not use cell phone while driving</td> <td>45</td> <td>405</td> <td>450</td> </tr> <tr><td>Total</td> <td>70</td> <td>685</td> <td>755</td> </tr> </tbody> </table> <p id="element-42">The total number of people in the sample is 755. The row totals are 305 and 450. The column totals are 70 and 685. Notice that 305 + 450 = 755 and 70 + 685 = 755.</p> <p>Calculate the following probabilities using the table.</p> <p>&nbsp;</p> <p id="eip-150">a. Find <em data-effect="italics">P</em>(Driver is a cell phone user).<span data-type="newline"><br /> </span> b. Find <em data-effect="italics">P</em>(driver had no violation in the last year).<span data-type="newline"><br /> </span> c. 
Find <em data-effect="italics">P</em>(Driver had no violation in the last year AND was a cell phone user).<span data-type="newline"><br /> </span> d. Find <em data-effect="italics">P</em>(Driver is a cell phone user OR driver had no violation in the last year).<span data-type="newline"><br /> </span> e. Find <em data-effect="italics">P</em>(Driver is a cell phone user GIVEN driver had a violation in the last year).<span data-type="newline"><br /> </span> f. Find <em data-effect="italics">P</em>(Driver had no violation last year GIVEN driver was not a cell phone user)</p> <p id="eip-276"><span data-type="title">Solutions:</span>a. \(\frac{\text{number of cell phone users}}{\text{total number in study}}\text{ }=\text{ }\frac{305}{755}\)</p> <p>b. \(\frac{\text{number that had no violation}}{\text{total number in study}}\text{ }=\text{ }\frac{685}{755}\)</p> <p>c. \(\frac{280}{755}\)</p> <p>d. \(\left(\frac{305}{755}\text{ }+\text{ }\frac{685}{755}\right)\text{ }-\text{ }\frac{280}{755}\text{ }=\text{ }\frac{710}{755}\)</p> <p>e. \(\frac{25}{70}\) (The sample space is reduced to the number of drivers who had a violation.)</p> <p>f. 
\(\frac{405}{450}\) (The sample space is reduced to the number of drivers who were not cell phone users.)</p> </div> <div id="fs-idm8469088" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try it</div> <div id="fs-idm7340080" data-type="exercise"><div id="fs-idm53303712" data-type="problem"><p id="fs-idp10168704"><a class="autogenerated-content" href="#M05_ch03-tbl002">(Figure)</a> shows the number of athletes who stretch before exercising and how many had injuries within the past year.</p> <table id="M05_ch03-tbl002" summary=""><thead><tr><th></th> <th>Injury in last year</th> <th>No injury in last year</th> <th>Total</th> </tr> </thead> <tbody><tr><td>Stretches</td> <td>55</td> <td>295</td> <td>350</td> </tr> <tr><td>Does not stretch</td> <td>231</td> <td>219</td> <td>450</td> </tr> <tr><td>Total</td> <td>286</td> <td>514</td> <td>800</td> </tr> </tbody> </table> <ol id="fs-idm56428944" type="a"><li>What is <em data-effect="italics">P</em>(athlete stretches before exercising)?</li> <li>What is <em data-effect="italics">P</em>(athlete stretches before exercising|no injury in the last year)?</li> </ol> </div> </div> </div> <div id="element-511" class="textbox textbox--examples" data-type="example"><p id="element-98"><a class="autogenerated-content" href="#M05_ch03-tbl003">(Figure)</a> shows a random sample of 100 hikers and the areas of hiking they prefer.</p> <table id="M05_ch03-tbl003" summary=""><caption><span data-type="title">Hiking Area Preference</span></caption> <thead><tr><th>Sex</th> <th>The Coastline</th> <th>Near Lakes and Streams</th> <th>On Mountain Peaks</th> <th>Total</th> </tr> </thead> <tbody><tr><td>Female</td> <td>18</td> <td>16</td> <td>___</td> <td>45</td> </tr> <tr><td>Male</td> <td>___</td> <td>___</td> <td>14</td> <td>55</td> </tr> <tr><td>Total</td> <td>___</td> <td>41</td> <td>___</td> <td>___</td> </tr> </tbody> </table> <div data-type="exercise"><div id="id41647708" 
data-type="problem"><p id="element-665">a. Complete the table.</p> </div> <div id="id41647726" data-type="solution"><p id="fs-idm12745968">a.</p> <table id="element-850s" summary=""><caption><span data-type="title">Hiking Area Preference</span></caption> <thead><tr><th>Sex</th> <th>The Coastline</th> <th>Near Lakes and Streams</th> <th>On Mountain Peaks</th> <th>Total</th> </tr> </thead> <tbody><tr><td>Female</td> <td>18</td> <td>16</td> <td><strong>11</strong></td> <td>45</td> </tr> <tr><td>Male</td> <td><strong>16</strong></td> <td><strong>25</strong></td> <td>14</td> <td>55</td> </tr> <tr><td>Total</td> <td><strong>34</strong></td> <td>41</td> <td><strong>25</strong></td> <td><strong>100</strong></td> </tr> </tbody> </table> </div> </div> <div data-type="exercise"><div id="id41524865" data-type="problem"><p>b. Are the events &#8220;being female&#8221; and &#8220;preferring the coastline&#8221; independent events?</p> <p>Let <em data-effect="italics">F</em> = being female and let <em data-effect="italics">C</em> = preferring the coastline.</p> <ol id="element-1242"><li>Find <em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">C</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">F</em>)<em data-effect="italics">P</em>(<em data-effect="italics">C</em>)</li> </ol> <p id="element-3532">Are these two numbers the same? If they are, then <em data-effect="italics">F</em> and <em data-effect="italics">C</em> are independent. 
If they are not, then <em data-effect="italics">F</em> and <em data-effect="italics">C</em> are not independent.</p> </div> <div id="id41524993" data-type="solution" data-print-placement="end"><p id="eip-idp6169056">b.</p> <ol id="eip-idm12758816"><li><em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">C</em>) = \(\frac{18}{100}\) = 0.18</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em>)<em data-effect="italics">P</em>(<em data-effect="italics">C</em>) = \(\left(\frac{45}{100}\right)\left(\frac{34}{100}\right)\) = (0.45)(0.34) = 0.153</li> </ol> <p><em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">C</em>) ≠ <em data-effect="italics">P</em>(<em data-effect="italics">F</em>)<em data-effect="italics">P</em>(<em data-effect="italics">C</em>), so the events <em data-effect="italics">F</em> and <em data-effect="italics">C</em> are not independent.</p> <p>&nbsp;</p> </div> </div> <div id="element-414" data-type="exercise"><div id="id41525175" data-type="problem"><p id="element-717">c. Find the probability that a person is male given that the person prefers hiking near lakes and streams. Let <em data-effect="italics">M</em> = being male, and let <em data-effect="italics">L</em> = prefers hiking near lakes and streams.</p> <ol id="element-2341" type="1"><li>What word tells you this is a conditional?</li> <li>Fill in the blanks and calculate the probability: <em data-effect="italics">P</em>(___|___) = ___.</li> <li>Is the sample space for this problem all 100 hikers? 
If not, what is it?</li> </ol> </div> <div id="id41525272" data-type="solution" data-print-placement="end"><p id="fs-idm47502368">c.</p> <ol id="element-2341s" type="1"><li>The word &#8216;given&#8217; tells you that this is a conditional.</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">M</em>|<em data-effect="italics">L</em>) = \(\frac{25}{41}\)</li> <li>No, the sample space for this problem is the 41 hikers who prefer lakes and streams.</li> </ol> <p>&nbsp;</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id41525358" data-type="problem"><p id="element-715">d. Find the probability that a person is female or prefers hiking on mountain peaks. Let <em data-effect="italics">F</em> = being female, and let <em data-effect="italics">P</em> = prefers mountain peaks.</p> <ol id="list1213"><li>Find <em data-effect="italics">P</em>(<em data-effect="italics">F</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">P</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">P</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">F</em> OR <em data-effect="italics">P</em>).</li> </ol> </div> <div id="id41669265" data-type="solution" data-print-placement="end"><p id="fs-idm40596864">d.</p> <ol id="list1213s"><li><em data-effect="italics">P</em>(<em data-effect="italics">F</em>) = \(\frac{45}{100}\)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">P</em>) = \(\frac{25}{100}\)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em> AND <em data-effect="italics">P</em>) = \(\frac{11}{100}\)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em> OR <em data-effect="italics">P</em>) = \(\frac{45}{100}\) + \(\frac{25}{100}\) &#8211; \(\frac{11}{100}\) = \(\frac{59}{100}\)</li> </ol> </div> </div> </div> <div id="fs-idm2360992" class="statistics try" data-type="note" 
data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp7366752" data-type="exercise"><div id="fs-idm68351312" data-type="problem"><p><a class="autogenerated-content" href="#M05_ch03-tbl005">(Figure)</a> shows a random sample of 200 cyclists and the routes they prefer. Let <em data-effect="italics">M</em> = males and <em data-effect="italics">H</em> = hilly path.</p> <table id="M05_ch03-tbl005" summary=""><thead><tr><th>Gender</th> <th>Lake Path</th> <th>Hilly Path</th> <th>Wooded Path</th> <th>Total</th> </tr> </thead> <tbody><tr><td>Female</td> <td>45</td> <td>38</td> <td>27</td> <td>110</td> </tr> <tr><td>Male</td> <td>26</td> <td>52</td> <td>12</td> <td>90</td> </tr> <tr><td>Total</td> <td>71</td> <td>90</td> <td>39</td> <td>200</td> </tr> </tbody> </table> <ol id="fs-idm26020880" type="a"><li>Out of the males, what is the probability that the cyclist prefers a hilly path?</li> <li>Are the events “being male” and “preferring the hilly path” independent events?</li> </ol> </div> </div> </div> <div id="element-883" class="textbox textbox--examples" data-type="example"><p>Muddy Mouse lives in a cage with three doors. If Muddy goes out the first door, the probability that he gets caught by Alissa the cat is \(\frac{1}{5}\text{}\) and the probability he is not caught is \(\frac{4}{5}\text{}\). If he goes out the second door, the probability he gets caught by Alissa is \(\frac{1}{4}\) and the probability he is not caught is \(\frac{3}{4}\). The probability that Alissa catches Muddy coming out of the third door is \(\frac{1}{2}\) and the probability she does not catch Muddy is \(\frac{1}{2}\). 
It is equally likely that Muddy will choose any of the three doors so the probability of choosing each door is \(\frac{1}{3}\).</p> <table summary=""><caption><span data-type="title">Door Choice</span></caption> <thead><tr><th>Caught or Not</th> <th>Door One</th> <th>Door Two</th> <th>Door Three</th> <th>Total</th> </tr> </thead> <tbody><tr><td>Caught</td> <td>\(\frac{1}{15}\text{}\)</td> <td>\(\frac{1}{12}\text{}\)</td> <td>\(\frac{1}{6}\text{}\)</td> <td>____</td> </tr> <tr><td>Not Caught</td> <td>\(\frac{4}{15}\)</td> <td>\(\frac{3}{12}\)</td> <td>\(\frac{1}{6}\)</td> <td>____</td> </tr> <tr><td>Total</td> <td>____</td> <td>____</td> <td>____</td> <td>1</td> </tr> </tbody> </table> <ul id="element-791"><li>The first entry \(\frac{1}{15}=\left(\frac{1}{5}\right)\left(\frac{1}{3}\right)\) is <em data-effect="italics">P</em>(Door One AND Caught)</li> <li>The entry \(\frac{4}{15}=\left(\frac{4}{5}\right)\left(\frac{1}{3}\right)\) is <em data-effect="italics">P</em>(Door One AND Not Caught)</li> </ul> <p id="element-94">Verify the remaining entries.</p> <p>&nbsp;</p> <div data-type="exercise"><div id="id41669984" data-type="problem"><p>a. Complete the probability contingency table. Calculate the entries for the totals. 
Verify that the lower-right corner entry is 1.</p> </div> <div id="id41670004" data-type="solution" data-print-placement="end"><p id="fs-idm19141920">a.</p> <table id="element-6166" summary="Displayed is a contingency table showing all the entries for the Muddy Mouse problem."><caption><span data-type="title">Door Choice</span></caption> <thead><tr><th>Caught or Not</th> <th>Door One</th> <th>Door Two</th> <th>Door Three</th> <th>Total</th> </tr> </thead> <tbody><tr><td>Caught</td> <td>\(\frac{1}{15}\text{}\)</td> <td>\(\frac{1}{12}\text{}\)</td> <td>\(\frac{1}{6}\text{}\)</td> <td><strong>\(\frac{19}{60}\)</strong></td> </tr> <tr><td>Not Caught</td> <td>\(\frac{4}{15}\)</td> <td>\(\frac{3}{12}\)</td> <td>\(\frac{1}{6}\)</td> <td><strong>\(\frac{41}{60}\)</strong></td> </tr> <tr><td>Total</td> <td><strong>\(\frac{5}{15}\)</strong></td> <td><strong>\(\frac{4}{12}\)</strong></td> <td><strong>\(\frac{2}{6}\)</strong></td> <td>1</td> </tr> </tbody> </table> </div> </div> <div id="element-604" data-type="exercise"><div id="id41670368" data-type="problem"><p id="element-117">b. What is the probability that Alissa does not catch Muddy?</p> </div> <div id="id41670388" data-type="solution"><p id="element-70">b. \(\frac{41}{60}\)</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id41670423" data-type="problem"><p id="element-45">c. What is the probability that Muddy chooses Door One OR Door Two given that Muddy is caught by Alissa?</p> </div> <div id="id41670452" data-type="solution"><p>c. 
\(\frac{9}{19}\)</p> </div> </div> </div> <div id="fs-idm73418960" class="textbox textbox--examples" data-type="example"><p id="fs-idm37807440"><a class="autogenerated-content" href="#Ch03_M04_tbl007">(Figure)</a> contains the number of crimes per 100,000 inhabitants from 2008 to 2011 in the U.S.</p> <table id="Ch03_M04_tbl007" summary="United States Crime Index Rates Per 100,000 Inhabitants 2008 – 2011"><caption><span data-type="title">United States Crime Index Rates Per 100,000 Inhabitants 2008–2011</span></caption> <thead><tr><th>Year</th> <th>Robbery</th> <th>Burglary</th> <th>Rape</th> <th>Vehicle</th> <th>Total</th> </tr> </thead> <tbody><tr><td>2008</td> <td>145.7</td> <td>732.1</td> <td>29.7</td> <td>314.7</td> <td></td> </tr> <tr><td>2009</td> <td>133.1</td> <td>717.7</td> <td>29.1</td> <td>259.2</td> <td></td> </tr> <tr><td>2010</td> <td>119.3</td> <td>701</td> <td>27.7</td> <td>239.1</td> <td></td> </tr> <tr><td>2011</td> <td>113.7</td> <td>702.2</td> <td>26.8</td> <td>229.6</td> <td></td> </tr> <tr><td>Total</td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> </tbody> </table> <div id="fs-idm29224880" data-type="exercise"><div id="fs-idm58902352" data-type="problem"><p id="fs-idp18911776">TOTAL each column and each row. Total data = 4,520.7</p> <ol id="fs-idm6434864" type="a"><li>Find <em data-effect="italics">P</em>(2009 AND Robbery).</li> <li>Find <em data-effect="italics">P</em>(2010 AND Burglary).</li> <li>Find <em data-effect="italics">P</em>(2010 OR Burglary).</li> <li>Find <em data-effect="italics">P</em>(2011|Rape).</li> <li>Find <em data-effect="italics">P</em>(Vehicle|2008).</li> </ol> </div> <div id="fs-idm46779568" data-type="solution"><p id="fs-idm37451408">a. 0.0294, b. 0.1551, c. 0.7165, d. 0.2365, e. 
0.2575</p> </div> </div> </div> <div id="fs-idm31309776" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm30200592" data-type="exercise"><div id="fs-idm54491920" data-type="problem"><p id="fs-idm50410048"><a class="autogenerated-content" href="#M05_ch03-tbl009">(Figure)</a> relates the weights and heights of a group of individuals participating in an observational study.</p> <table id="M05_ch03-tbl009" summary=""><thead><tr><th>Weight/Height</th> <th>Tall</th> <th>Medium</th> <th>Short</th> <th>Totals</th> </tr> </thead> <tbody><tr><td>Obese</td> <td>18</td> <td>28</td> <td>14</td> <td></td> </tr> <tr><td>Normal</td> <td>20</td> <td>51</td> <td>28</td> <td></td> </tr> <tr><td>Underweight</td> <td>12</td> <td>25</td> <td>9</td> <td></td> </tr> <tr><td>Totals</td> <td></td> <td></td> <td></td> <td></td> </tr> </tbody> </table> <ol id="fs-idm43315168" type="a"><li>Find the total for each row and column.</li> <li>Find the probability that a randomly chosen individual from this group is Tall.</li> <li>Find the probability that a randomly chosen individual from this group is Obese and Tall.</li> <li>Find the probability that a randomly chosen individual from this group is Tall given that the individual is Obese.</li> <li>Find the probability that a randomly chosen individual from this group is Obese given that the individual is Tall.</li> <li>Find the probability that a randomly chosen individual from this group is Tall and Underweight.</li> <li>Are the events Obese and Tall independent?</li> </ol> </div> </div> </div> <div id="fs-idm122777312" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idp25484384">“Blood Types.” American Red Cross, 2013. 
Available online at http://www.redcrossblood.org/learn-about-blood/blood-types (accessed May 3, 2013).</p> <p id="fs-idm90987472">Data from the National Center for Health Statistics, part of the United States Department of Health and Human Services.</p> <p id="fs-idm79839440">Data from the United States Senate. Available online at www.senate.gov (accessed May 2, 2013).</p> <p id="fs-idm90988000">Haiman, Christopher A., Daniel O. Stram, Lynn R. Wilkens, Malcolm C. Pike, Laurence N. Kolonel, Brien E. Henderson, and Loïc Le Marchand. “Ethnic and Racial Differences in the Smoking-Related Risk of Lung Cancer.” The New England Journal of Medicine, 2013. Available online at http://www.nejm.org/doi/full/10.1056/NEJMoa033250 (accessed May 2, 2013).</p> <p id="eip-29">“Human Blood Types.” United Blood Services, 2011. Available online at http://www.unitedbloodservices.org/learnMore.aspx (accessed May 2, 2013).</p> <p id="fs-idm7900000">Samuel, T. M. “Strange Facts about RH Negative Blood.” eHow Health, 2013. Available online at http://www.ehow.com/facts_5552003_strange-rh-negative-blood.html (accessed May 2, 2013).</p> <p>“United States: Uniform Crime Report – State Statistics from 1960–2011.” The Disaster Center. Available online at http://www.disastercenter.com/crime/ (accessed May 2, 2013).</p> </div> <div id="fs-idm68934880" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp35891312">There are several tools you can use to help organize and sort data when calculating probabilities. 
Contingency tables help display data and are particularly useful when calculating probabilities that involve multiple dependent variables.</p> </div> <div id="fs-idp46156576" class="practice" data-depth="1"><h3 data-type="title"><em data-effect="italics">Use the following information to answer the next four exercises.</em></h3> <p><a class="autogenerated-content" href="#M05_ch03-tbl011">(Figure)</a> shows a random sample of musicians and how they learned to play their instruments.</p> <table id="M05_ch03-tbl011" summary=""><thead><tr><th>Gender</th> <th>Self-taught</th> <th>Studied in School</th> <th>Private Instruction</th> <th>Total</th> </tr> </thead> <tbody><tr><td>Female</td> <td>12</td> <td>38</td> <td>22</td> <td>72</td> </tr> <tr><td>Male</td> <td>19</td> <td>24</td> <td>15</td> <td>58</td> </tr> <tr><td>Total</td> <td>31</td> <td>62</td> <td>37</td> <td>130</td> </tr> </tbody> </table> <div data-type="exercise"><div data-type="problem"><p>Find <em data-effect="italics">P</em>(musician is a female).</p> </div> </div> </div> <div id="eip-388" data-type="exercise"><div id="eip-950" data-type="problem"><p id="eip-249">Find <em data-effect="italics">P</em>(musician is a male AND had private instruction).</p> </div> <div data-type="solution"><p id="eip-225"><em data-effect="italics">P</em>(musician is a male AND had private instruction) = \(\frac{15}{130}\) = \(\frac{3}{26}\) = 0.12</p> </div> </div> <div id="eip-631" data-type="exercise"><div id="eip-500" data-type="problem"><p>Find <em data-effect="italics">P</em>(musician is a female OR is self-taught).</p> </div> </div> <div id="eip-350" data-type="exercise"><div data-type="problem"><p>Are the events “being a female musician” and “learning music in school” independent events?</p> </div> <div data-type="solution"><p id="eip-609"><em data-effect="italics">P</em>(being a female musician AND learning music in school) = \(\frac{38}{130}\) = \(\frac{19}{65}\) = 0.29</p> <p id="eip-610"><em data-effect="italics">P</em>(being a female musician)<em data-effect="italics">P</em>(learning music in school) = \(\left(\frac{72}{130}\right)\left(\frac{62}{130}\right)\) = \(\frac{4,464}{16,900}\) = \(\frac{1,116}{4,225}\) = 0.26</p> <p>No, they are not independent because <em data-effect="italics">P</em>(being a female musician AND learning music in school) is not equal to <em data-effect="italics">P</em>(being a 
female musician)<em data-effect="italics">P</em>(learning music in school).</p> </div> </div> <div id="fs-idm32828224" class="bring-together-exercises" data-depth="1"><h3 data-type="title">Bringing It Together</h3> <p id="element-12">Use the following information to answer the next seven exercises. An article in the <cite><span data-type="cite-title">New England Journal of Medicine</span></cite> reported on a study of smokers in California and Hawaii. In one part of the report, the self-reported ethnicity and smoking levels per day were given. Of the people smoking at most ten cigarettes per day, there were 9,886 African Americans, 2,745 Native Hawaiians, 12,831 Latinos, 8,378 Japanese Americans, and 7,650 Whites. Of the people smoking 11 to 20 cigarettes per day, there were 6,514 African Americans, 3,062 Native Hawaiians, 4,932 Latinos, 10,680 Japanese Americans, and 9,877 Whites. Of the people smoking 21 to 30 cigarettes per day, there were 1,671 African Americans, 1,419 Native Hawaiians, 1,406 Latinos, 4,715 Japanese Americans, and 6,062 Whites. Of the people smoking at least 31 cigarettes per day, there were 759 African Americans, 788 Native Hawaiians, 800 Latinos, 2,305 Japanese Americans, and 3,970 Whites.</p> <div id="element-841" data-type="exercise"><div id="id24840283" data-type="problem"><p id="element-2314">Complete the table using the data provided.</p> <table id="element-762" summary="Partially filled ethnicity by smoking level table with the first column listing the smoking levels (4 rows plus the total), the blank second column lists African American values, blank third column for Native Hawaiians, blank fourth column for Latinos, blank fifth column for Japanese Americans, blank sixth column for Whites, and blank seventh column for the Total."><caption><span data-type="title">Smoking Levels by Ethnicity</span></caption> <thead><tr><th>Smoking Level</th> <th>African American</th> <th>Native Hawaiian</th> <th>Latino</th> <th>Japanese Americans</th> <th>White</th> <th>TOTALS</th> </tr> </thead> <tbody><tr><td>1–10</td> <td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td>11–20</td> <td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td>21–30</td> <td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td>31+</td> <td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td>TOTALS</td> <td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> </tbody> </table> </div> </div> <div id="eip-892" data-type="exercise"><div id="eip-189" data-type="problem"><p id="eip-241">Suppose that one person from the study is randomly selected. 
Find the probability that person smoked 11 to 20 cigarettes per day.</p> </div> <div id="id24840301" data-type="solution"><p id="element-603">\(\frac{35,065}{100,450}\)</p> </div> </div> <div id="element-120" data-type="exercise"><div id="id24840332" data-type="problem"><p>Find the probability that the person was Latino.</p> </div> </div> <div id="exercise-124" data-type="exercise"><div id="id24840396" data-type="problem"><p id="exercise-124p">In words, explain what it means to pick one person from the study who is “Japanese American <strong>AND</strong> smokes 21 to 30 cigarettes per day.” Also, find the probability.</p> </div> <div id="id25124759" data-type="solution"><p id="exercise-124s">To pick one person from the study who is Japanese American AND smokes 21 to 30 cigarettes per day means that the person has to meet both criteria: both Japanese American and smokes 21 to 30 cigarettes. The sample space should include everyone in the study. The probability is \(\frac{4,715}{100,450}\).</p> </div> </div> <div id="exercise-125" data-type="exercise"><div id="id25124791" data-type="problem"><p id="exercise-125p">In words, explain what it means to pick one person from the study who is “Japanese American <strong>OR</strong> smokes 21 to 30 cigarettes per day.” Also, find the probability.</p> </div> </div> <div id="exercise-126" data-type="exercise"><div id="id25124851" data-type="problem"><p id="exercise-126p">In words, explain what it means to pick one person from the study who is “Japanese American <strong>GIVEN</strong> that person smokes 21 to 30 cigarettes per day.” Also, find the probability.</p> </div> <div id="id25124879" data-type="solution"><p id="exercise-126s">To pick one person from the study who is Japanese American given that person smokes 21-30 cigarettes per day, means that the person must fulfill both criteria and the sample space is reduced to those who smoke 21-30 cigarettes per day. 
The probability is \(\frac{4715}{15,273}\).</p> </div> </div> <div id="exercise-127" data-type="exercise"><div id="id25124911" data-type="problem"><p id="exercise-127p">Prove that smoking level/day and ethnicity are dependent events.</p> </div> </div> </div> <div id="fs-idp18062160" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <p><em data-effect="italics">Use the information in the <a class="autogenerated-content" href="#M05_ch03-tbl13">(Figure)</a> to answer the next eight exercises.</em> The table shows the political party affiliation of each of 67 members of the US Senate in June 2012, and when they are up for reelection.</p> <table id="M05_ch03-tbl13" summary="..."><thead><tr><th>Up for reelection:</th> <th>Democratic Party</th> <th>Republican Party</th> <th>Other</th> <th>Total</th> </tr> </thead> <tbody><tr><td>November 2014</td> <td>20</td> <td>13</td> <td>0</td> <td></td> </tr> <tr><td>November 2016</td> <td>10</td> <td>24</td> <td>0</td> <td></td> </tr> <tr><td>Total</td> <td></td> <td></td> <td></td> <td></td> </tr> </tbody> </table> <div data-type="exercise"><div id="eip-939" data-type="problem"><p id="eip-356">1) What is the probability that a randomly selected senator has an “Other” affiliation?</p> </div> <div data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="eip-512" data-type="problem"><p>2) What is the probability that a randomly selected senator is up for reelection in November 2016?</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="eip-589" data-type="problem"><p id="eip-287">3) What is the probability that a randomly selected senator is a Democrat and up for reelection in November 2016?</p> </div> <div data-type="solution"><p>&nbsp;</p> </div> </div> <div id="eip-331" data-type="exercise"><div data-type="problem"><p id="eip-72">4) What is the probability that a randomly selected senator is a Republican or is up for reelection in November 2014?</p> <p>&nbsp;</p> </div> 
</div> <div data-type="exercise"><div id="eip-261" data-type="problem"><p id="eip-264">5) Suppose that a member of the US Senate is randomly selected. Given that the randomly selected senator is up for reelection in November 2016, what is the probability that this senator is a Democrat?</p> </div> <div data-type="solution"><p>&nbsp;</p> </div> </div> <div id="eip-164" data-type="exercise"><div data-type="problem"><p>6) Suppose that a member of the US Senate is randomly selected. What is the probability that the senator is up for reelection in November 2014, knowing that this senator is a Republican?</p> </div> </div> <div id="eip-88" data-type="exercise"><div id="eip-832" data-type="problem"><p>7) The events “Republican” and “Up for reelection in 2016” are ________</p> <ol id="eip-idp126651680" type="a"><li>mutually exclusive.</li> <li>independent.</li> <li>both mutually exclusive and independent.</li> <li>neither mutually exclusive nor independent.</li> </ol> </div> <div data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="eip-82" data-type="problem"><p id="eip-544">8) The events “Other” and “Up for reelection in November 2016” are ________</p> <ol id="eip-idp21360256" type="a"><li>mutually exclusive.</li> <li>independent.</li> <li>both mutually exclusive and independent.</li> <li>neither mutually exclusive nor independent.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="tablular-ex" data-type="exercise"><div id="id43571370" data-type="problem"><p>9) <a class="autogenerated-content" href="#M05_ch03-tbl020">(Figure)</a> gives the number of suicides estimated in the U.S. for a recent year by age, race (black or white), and sex. We are interested in possible relationships between age, race, and sex. We will let suicide victims be our population.</p> <table id="M05_ch03-tbl020" summary="This partially filled table presents the data of suicides by age and race and sex. 
The first column lists the race and sex, the second column lists ages 1-14, third column lists 15-24, fourth column lists 25-64, blank fifth column lists over 64, and the sixth column lists the totals. The first row lists white, male, the second row is white, female, the third row is black, male, the fourth row is black, female, the blank fifth row is all others, and the total is on the sixth row."><thead><tr><th>Race and Sex</th> <th>1–14</th> <th>15–24</th> <th>25–64</th> <th>over 64</th> <th>TOTALS</th> </tr> </thead> <tbody><tr><td>white, male</td> <td>210</td> <td>3,360</td> <td>13,610</td> <td></td> <td>22,050</td> </tr> <tr><td>white, female</td> <td>80</td> <td>580</td> <td>3,380</td> <td></td> <td>4,930</td> </tr> <tr><td>black, male</td> <td>10</td> <td>460</td> <td>1,060</td> <td></td> <td>1,670</td> </tr> <tr><td>black, female</td> <td>0</td> <td>40</td> <td>270</td> <td></td> <td>330</td> </tr> <tr><td>all others</td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td>TOTALS</td> <td>310</td> <td>4,650</td> <td>18,780</td> <td></td> <td>29,760</td> </tr> </tbody> </table> <p id="eip-idp27013472">Do not include &#8220;all others&#8221; for parts f and g.</p> <ol type="a"><li>Fill in the column for the suicides for individuals over age 64.</li> <li>Fill in the row for all other races.</li> <li>Find the probability that a randomly selected individual was a white male.</li> <li>Find the probability that a randomly selected individual was a black female.</li> <li>Find the probability that a randomly selected individual was black.</li> <li>Find the probability that a randomly selected individual was a black or white male.</li> <li>Out of the individuals over age 64, find the probability that a randomly selected individual was a black or white male.</li> </ol> <p>&nbsp;</p> </div> <div id="id43572030" data-type="solution"></div> </div> <p id="element-686"><em data-effect="italics">Use the following information to answer the next two 
exercises.</em> The table of data obtained from <cite><span data-type="cite-title">www.baseball-almanac.com</span></cite> shows hit information for four well known baseball players. Suppose that one hit from the table is randomly selected.</p> <table id="element-695" summary="This table presents data based on type of hit by baseball player. The first column lists the names of the baseball player, second column lists single hits, third column lists double, fourth column is triple, fifth column is home run, and sixth column is total hits. The first row is Babe Ruth, the second row is Jackie Robinson, third row is Ty Cobb, fourth row is Hank Aaron, and Total is in the fifth row."><thead><tr><th>NAME</th> <th>Single</th> <th>Double</th> <th>Triple</th> <th>Home Run</th> <th>TOTAL HITS</th> </tr> </thead> <tbody><tr><td>Babe Ruth</td> <td>1,517</td> <td>506</td> <td>136</td> <td>714</td> <td>2,873</td> </tr> <tr><td>Jackie Robinson</td> <td>1,054</td> <td>273</td> <td>54</td> <td>137</td> <td>1,518</td> </tr> <tr><td>Ty Cobb</td> <td>3,603</td> <td>174</td> <td>295</td> <td>114</td> <td>4,189</td> </tr> <tr><td>Hank Aaron</td> <td>2,294</td> <td>624</td> <td>98</td> <td>755</td> <td>3,771</td> </tr> <tr><td>TOTAL</td> <td>8,471</td> <td>1,577</td> <td>583</td> <td>1,720</td> <td>12,351</td> </tr> </tbody> </table> <div data-type="exercise"><div id="id43574123" data-type="problem"><p>10) Find <em data-effect="italics">P</em>(hit was made by Babe Ruth).</p> <ol type="a"><li>\(\frac{1518}{2873}\)</li> <li>\(\frac{2873}{12351}\)</li> <li>\(\frac{583}{12351}\)</li> <li>\(\frac{4189}{12351}\)</li> </ol> </div> </div> <div id="element-940" data-type="exercise"><div id="id43574389" data-type="problem"><p>11) Find <em data-effect="italics">P</em>(hit was made by Ty Cobb|The hit was a Home Run).</p> <ol id="element-901" type="a"><li>\(\frac{4189}{12351}\)</li> <li>\(\frac{114}{1720}\)</li> <li>\(\frac{1720}{4189}\)</li> <li>\(\frac{114}{12351}\)</li> </ol> </div> <div 
id="id43574628" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id43569269" data-type="problem"><p>12) <a class="autogenerated-content" href="#M05_ch03-tbl021">(Figure)</a> identifies a group of children by one of four hair colors, and by type of hair.</p> <table id="M05_ch03-tbl021" summary="A partially filled table for hair color by hair type. The first column lists hair type, the second column lists brown hair, third column lists blond hair, fourth column lists black hair, fifth column lists red hair, and the sixth column lists totals. The first row lists wavy hair, second row lists straight hair, and the third row lists the total. All values are listed in the brown column except the total, all values are listed in the blond column except for the first row, only the first row of the black column is filled in, all of the red column is filled in except for the total, and the first and third rows of the total column are filled in."><thead><tr><th>Hair Type</th> <th>Brown</th> <th>Blond</th> <th>Black</th> <th>Red</th> <th>Totals</th> </tr> </thead> <tbody><tr><td>Wavy</td> <td>20</td> <td></td> <td>15</td> <td>3</td> <td>43</td> </tr> <tr><td>Straight</td> <td>80</td> <td>15</td> <td></td> <td>12</td> <td></td> </tr> <tr><td>Totals</td> <td></td> <td>20</td> <td></td> <td></td> <td>215</td> </tr> </tbody> </table> <ol id="element-894" type="a"><li>Complete the table.</li> <li>What is the probability that a randomly selected child will have wavy hair?</li> <li>What is the probability that a randomly selected child will have either brown or blond hair?</li> <li>What is the probability that a randomly selected child will have wavy brown hair?</li> <li>What is the probability that a randomly selected child will have red hair, given that he or she has straight hair?</li> <li>If <em data-effect="italics">B</em> is the event of a child having brown hair, find the probability of the complement of <em data-effect="italics">B</em>.</li> 
<li>In words, what does the complement of <em data-effect="italics">B</em> represent?</li> </ol> </div> </div> <div data-type="exercise"><div data-type="problem"><p>13) In a previous year, the weights of the members of the <strong>San Francisco 49ers</strong> and the <strong>Dallas Cowboys</strong> were published in the <cite><span data-type="cite-title">San Jose Mercury News</span></cite>. The factual data were compiled into the following table.</p> <table id="element-13" summary="This table presents weight in pounds by shirt number. The first column lists the shirt number, the second column lists weight ≤ 210, the third column lists 211-250, fourth column lists 251-290, and the fifth column lists 291 ≤. The first row lists shirt numbers 1-33, second row lists 34-66, and the third row lists 66-99."><thead><tr><th>Shirt#</th> <th>≤ 210</th> <th>211–250</th> <th>251–290</th> <th>&gt; 290</th> </tr> </thead> <tbody><tr><td>1–33</td> <td>21</td> <td>5</td> <td>0</td> <td>0</td> </tr> <tr><td>34–66</td> <td>6</td> <td>18</td> <td>7</td> <td>4</td> </tr> <tr><td>66–99</td> <td>6</td> <td>12</td> <td>22</td> <td>5</td> </tr> </tbody> </table> <p>For the following, suppose that you randomly select one player from the 49ers or Cowboys.</p> <ol type="a"><li>Find the probability that his shirt number is from 1 to 33.</li> <li>Find the probability that he weighs at most 210 pounds.</li> <li>Find the probability that his shirt number is from 1 to 33 AND he weighs at most 210 pounds.</li> <li>Find the probability that his shirt number is from 1 to 33 OR he weighs at most 210 pounds.</li> <li>Find the probability that his shirt number is from 1 to 33 GIVEN that he weighs at most 210 pounds.</li> </ol> <p>&nbsp;</p> </div> <div id="fs-idm65248128" data-type="solution"><ol id="fs-idm65247872" type="a"></ol> <p><strong>Answers to odd questions</strong></p> <p>1) 0</p> <p>3) \(\frac{10}{67}\)</p> <p>5) \(\frac{10}{34}\)</p> <p>7) d</p> <p>9)</p> <ol id="element-271" 
type="a"><li><table id="fs-idm9443200" summary=""><thead><tr><th>Race and Sex</th> <th>1–14</th> <th>15–24</th> <th>25–64</th> <th>over 64</th> <th>TOTALS</th> </tr> </thead> <tbody><tr><td>white, male</td> <td>210</td> <td>3,360</td> <td>13,610</td> <td>4,870</td> <td>22,050</td> </tr> <tr><td>white, female</td> <td>80</td> <td>580</td> <td>3,380</td> <td>890</td> <td>4,930</td> </tr> <tr><td>black, male</td> <td>10</td> <td>460</td> <td>1,060</td> <td>140</td> <td>1,670</td> </tr> <tr><td>black, female</td> <td>0</td> <td>40</td> <td>270</td> <td>20</td> <td>330</td> </tr> <tr><td>all others</td> <td></td> <td></td> <td></td> <td>100</td> <td></td> </tr> <tr><td>TOTALS</td> <td>310</td> <td>4,650</td> <td>18,780</td> <td>6,020</td> <td>29,760</td> </tr> </tbody> </table> </li> <li><table id="fs-idm74994096" summary=""><thead><tr><th>Race and Sex</th> <th>1–14</th> <th>15–24</th> <th>25–64</th> <th>over 64</th> <th>TOTALS</th> </tr> </thead> <tbody><tr><td>white, male</td> <td>210</td> <td>3,360</td> <td>13,610</td> <td>4,870</td> <td>22,050</td> </tr> <tr><td>white, female</td> <td>80</td> <td>580</td> <td>3,380</td> <td>890</td> <td>4,930</td> </tr> <tr><td>black, male</td> <td>10</td> <td>460</td> <td>1,060</td> <td>140</td> <td>1,670</td> </tr> <tr><td>black, female</td> <td>0</td> <td>40</td> <td>270</td> <td>20</td> <td>330</td> </tr> <tr><td>all others</td> <td>10</td> <td>210</td> <td>460</td> <td>100</td> <td>780</td> </tr> <tr><td>TOTALS</td> <td>310</td> <td>4,650</td> <td>18,780</td> <td>6,020</td> <td>29,760</td> </tr> </tbody> </table> </li> <li>\(\frac{\text{22,050}}{\text{29,760}}\)</li> <li>\(\frac{\text{330}}{\text{29,760}}\)</li> <li>\(\frac{\text{2,000}}{\text{29,760}}\)</li> <li>\(\frac{23720}{\left(29760-780\right)}=\frac{23720}{28980}\)</li> <li>\(\frac{5010}{\left(6020-100\right)}=\frac{5010}{5920}\)</li> </ol> <p>11) b</p> <p>13)</p> <ol type="a"><li>\(\frac{26}{106}\)</li> <li>\(\frac{33}{106}\)</li> <li>\(\frac{21}{106}\)</li> 
<li>\(\left(\frac{26}{106}\right)\) + \(\left(\frac{33}{106}\right)\) &#8211; \(\left(\frac{21}{106}\right)\) = \(\left(\frac{38}{106}\right)\)</li> <li>\(\frac{21}{33}\)</li> </ol> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="contintable"><dt>contingency table</dt> <dd id="id17487593">the method of displaying a frequency distribution as a table with rows and columns to show how two variables may be dependent (contingent) upon each other; the table provides an easy way to calculate conditional probabilities.</dd> </dl> </div> </div></div>
<div class="chapter standard" id="chapter-tree-and-venn-diagrams" title="Chapter 4.6: Tree and Venn Diagrams"><div class="chapter-title-wrap"><h3 class="chapter-number">29</h3><h2 class="chapter-title"><span class="display-none">Chapter 4.6: Tree and Venn Diagrams</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="fs-idp115004208">Sometimes, when the probability problems are complex, it can be helpful to graph the situation. Tree diagrams and Venn diagrams are two tools that can be used to visualize and solve conditional probabilities.</p> <div id="fs-idp113760352" class="bc-section section" data-depth="1"><h3 data-type="title">Tree Diagrams</h3> <p id="fs-idp119305568">A <span data-type="term">tree diagram</span> is a special type of graph used to determine the outcomes of an experiment. It consists of &#8220;branches&#8221; that are labeled with either frequencies or probabilities. Tree diagrams can make some probability problems easier to visualize and solve. The following example illustrates how to use a tree diagram.</p> <div id="element-192" class="textbox textbox--examples" data-type="example"><p id="element-200">In an urn, there are 11 balls. Three balls are red (<em data-effect="italics">R</em>) and eight balls are blue (<em data-effect="italics">B</em>). Draw two balls, one at a time, <strong>with replacement</strong>. &#8220;With replacement&#8221; means that you put the first ball back in the urn before you select the second ball. The tree diagram using frequencies that show all the possible outcomes follows.</p> <div class="bc-figure figure"><div class="bc-figcaption figcaption">Total = 64 + 24 + 24 + 9 = 121</div> <p><span id="id47069242" data-type="media" data-alt="This is a tree diagram with branches showing frequencies of each draw. The first branch shows two lines: 8B and 3R. The second branch has a set of two lines (8B and 3R) for each line of the first branch. 
Multiply along each line to find 64BB, 24BR, 24RB, and 9RR."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch03_07_01N-1.jpg" alt="This is a tree diagram with branches showing frequencies of each draw. The first branch shows two lines: 8B and 3R. The second branch has a set of two lines (8B and 3R) for each line of the first branch. Multiply along each line to find 64BB, 24BR, 24RB, and 9RR." width="400" data-media-type="image/jpg" data-print-width="4in" /></span></p> </div> <p>The first set of branches represents the first draw. The second set of branches represents the second draw. Each of the outcomes is distinct. In fact, we can list each red ball as <em data-effect="italics">R</em>1, <em data-effect="italics">R</em>2, and <em data-effect="italics">R</em>3 and each blue ball as <em data-effect="italics">B</em>1, <em data-effect="italics">B</em>2, <em data-effect="italics">B</em>3, <em data-effect="italics">B</em>4, <em data-effect="italics">B</em>5, <em data-effect="italics">B</em>6, <em data-effect="italics">B</em>7, and <em data-effect="italics">B</em>8. 
Then the nine <em data-effect="italics">RR</em> outcomes can be written as:</p> <p id="element-1255"><span id="set-54_1" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item"><em data-effect="italics">R</em>1  <em data-effect="italics">R</em>1  </span><span data-type="item"><em data-effect="italics">R</em>1  <em data-effect="italics">R</em>2  </span><span data-type="item"><em data-effect="italics">R</em>1  <em data-effect="italics">R</em>3  </span><span data-type="item"><em data-effect="italics">R</em>2  <em data-effect="italics">R</em>1  </span><span data-type="item"><em data-effect="italics">R</em>2  <em data-effect="italics">R</em>2  </span><span data-type="item"><em data-effect="italics">R</em>2  <em data-effect="italics">R</em>3  </span><span data-type="item"><em data-effect="italics">R</em>3  <em data-effect="italics">R</em>1  </span><span data-type="item"><em data-effect="italics">R</em>3  <em data-effect="italics">R</em>2  </span><span data-type="item"><em data-effect="italics">R</em>3  <em data-effect="italics">R</em>3</span></span></p> <p id="element-5235">The other outcomes are similar.</p> <p id="element-767">There are a total of 11 balls in the urn. Draw two balls, one at a time, with replacement. There are 11(11) = 121 outcomes, the size of the <span data-type="term">sample space</span>.</p> <p>&nbsp;</p> <div id="element-135" data-type="exercise"><div id="id47069535" data-type="problem"><p id="element-638">a. List the 24 <em data-effect="italics">BR</em> outcomes: <em data-effect="italics">B</em>1<em data-effect="italics">R</em>1,  <em data-effect="italics">B</em>1<em data-effect="italics">R</em>2,  <em data-effect="italics">B</em>1<em data-effect="italics">R</em>3, &#8230;</p> </div> <div id="id47069675" data-type="solution" data-print-placement="end"><p id="element-63435">a. 
<span id="element-12351" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item"><em data-effect="italics">B</em>1<em data-effect="italics">R</em>1  </span><span data-type="item"><em data-effect="italics">B</em>1<em data-effect="italics">R</em>2  </span><span data-type="item"><em data-effect="italics">B</em>1<em data-effect="italics">R</em>3  </span><span data-type="item"><em data-effect="italics">B</em>2<em data-effect="italics">R</em>1  </span><span data-type="item"><em data-effect="italics">B</em>2<em data-effect="italics">R</em>2  </span><span data-type="item"><em data-effect="italics">B</em>2<em data-effect="italics">R</em>3  </span><span data-type="item"><em data-effect="italics">B</em>3<em data-effect="italics">R</em>1  </span><span data-type="item"><em data-effect="italics">B</em>3<em data-effect="italics">R</em>2  </span><span data-type="item"><em data-effect="italics">B</em>3<em data-effect="italics">R</em>3  </span><span data-type="item"><em data-effect="italics">B</em>4<em data-effect="italics">R</em>1  </span><span data-type="item"><em data-effect="italics">B</em>4<em data-effect="italics">R</em>2  </span><span data-type="item"><em data-effect="italics">B</em>4<em data-effect="italics">R</em>3  </span><span data-type="item"><em data-effect="italics">B</em>5<em data-effect="italics">R</em>1  </span><span data-type="item"><em data-effect="italics">B</em>5<em data-effect="italics">R</em>2  </span><span data-type="item"><em data-effect="italics">B</em>5<em data-effect="italics">R</em>3  </span><span data-type="item"><em data-effect="italics">B</em>6<em data-effect="italics">R</em>1  </span><span data-type="item"><em data-effect="italics">B</em>6<em data-effect="italics">R</em>2  </span><span data-type="item"><em data-effect="italics">B</em>6<em data-effect="italics">R</em>3  </span><span data-type="item"><em data-effect="italics">B</em>7<em data-effect="italics">R</em>1  </span><span data-type="item"><em 
data-effect="italics">B</em>7<em data-effect="italics">R</em>2  </span><span data-type="item"><em data-effect="italics">B</em>7<em data-effect="italics">R</em>3  </span><span data-type="item"><em data-effect="italics">B</em>8<em data-effect="italics">R</em>1  </span><span data-type="item"><em data-effect="italics">B</em>8<em data-effect="italics">R</em>2  </span><span data-type="item"><em data-effect="italics">B</em>8<em data-effect="italics">R</em>3</span></span></p> <p>&nbsp;</p> </div> </div> <div id="element-38" data-type="exercise"><div id="id47069956" data-type="problem"><p id="element-909">b. Using the tree diagram, calculate <em data-effect="italics">P</em>(<em data-effect="italics">RR</em>).</p> </div> <div id="id47069979" data-type="solution"><p>b. <em data-effect="italics">P</em>(<em data-effect="italics">RR</em>) = \(\left(\frac{3}{11}\right)\left(\frac{3}{11}\right)\) = \(\frac{9}{121}\)</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id47070044" data-type="problem"><p>c. Using the tree diagram, calculate <em data-effect="italics">P</em>(<em data-effect="italics">RB</em> OR <em data-effect="italics">BR</em>).</p> </div> <div id="id47070067" data-type="solution"><p>c. <em data-effect="italics">P</em>(<em data-effect="italics">RB</em> OR <em data-effect="italics">BR</em>) = \(\left(\frac{3}{11}\right)\left(\frac{8}{11}\right)\) + \(\left(\frac{8}{11}\right)\left(\frac{3}{11}\right)\) = \(\frac{48}{121}\)</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id46124799" data-type="problem"><p>d. Using the tree diagram, calculate <em data-effect="italics">P</em>(<em data-effect="italics">R</em> on 1st draw AND <em data-effect="italics">B</em> on 2nd draw).</p> </div> <div id="id46124822" data-type="solution"><p id="element-989">d. 
<em data-effect="italics">P</em>(<em data-effect="italics">R</em> on 1st draw AND <em data-effect="italics">B</em> on 2nd draw) = <em data-effect="italics">P</em>(<em data-effect="italics">RB</em>) = \(\left(\frac{3}{11}\right)\left(\frac{8}{11}\right)\) = \(\frac{24}{121}\)</p> <p>&nbsp;</p> </div> </div> <div id="element-59" data-type="exercise"><div id="id46124893" data-type="problem"><p id="element-239">e. Using the tree diagram, calculate <em data-effect="italics">P</em>(<em data-effect="italics">R</em> on 2nd draw GIVEN <em data-effect="italics">B</em> on 1st draw).</p> </div> <div id="id46124917" data-type="solution"><p>e. <em data-effect="italics">P</em>(<em data-effect="italics">R</em> on 2nd draw GIVEN <em data-effect="italics">B</em> on 1st draw) = <em data-effect="italics">P</em>(<em data-effect="italics">R</em> on 2nd|<em data-effect="italics">B</em> on 1st) = \(\frac{24}{88}\) = \(\frac{3}{11}\)</p> <p>This problem is a conditional one. The sample space has been reduced to those outcomes that already have a blue on the first draw. There are 24 + 64 = 88 possible outcomes (24 <em data-effect="italics">BR</em> and 64 <em data-effect="italics">BB</em>). Twenty-four of the 88 possible outcomes are <em data-effect="italics">BR</em>. \(\frac{24}{88}\) = \(\frac{3}{11}\).</p> <p>&nbsp;</p> </div> </div> <div id="element-946" data-type="exercise"><div id="id46125047" data-type="problem"><p>f. Using the tree diagram, calculate <em data-effect="italics">P</em>(<em data-effect="italics">BB</em>).</p> </div> <div id="id46125071" data-type="solution"><p id="element-925">f. <em data-effect="italics">P</em>(<em data-effect="italics">BB</em>) = \(\frac{64}{121}\)</p> <p>&nbsp;</p> </div> </div> <div id="element-285" data-type="exercise"><div id="id46125112" data-type="problem"><p>g. 
Using the tree diagram, calculate <em data-effect="italics">P</em>(<em data-effect="italics">B</em> on 2nd draw GIVEN <em data-effect="italics">R</em> on 1st draw).</p> </div> <div id="id46125136" data-type="solution" data-print-placement="end"><p id="element-602">g. <em data-effect="italics">P</em>(<em data-effect="italics">B</em> on 2nd draw|<em data-effect="italics">R</em> on 1st draw) = \(\frac{8}{11}\)</p> <p id="element-892">This problem is also a conditional one. There are 9 + 24 = 33 outcomes that have <em data-effect="italics">R</em> on the first draw (9 <em data-effect="italics">RR</em> and 24 <em data-effect="italics">RB</em>), so the reduced sample space has 33 outcomes. Twenty-four of the 33 outcomes have <em data-effect="italics">B</em> on the second draw. The probability is then \(\frac{24}{33}\) = \(\frac{8}{11}\).</p> </div> </div> </div> <div id="fs-idp102479616" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm22506096" data-type="exercise"><div id="fs-idp17589872" data-type="problem"><p id="fs-idp67142992">In a standard deck, there are 52 cards. Twelve cards are face cards (event <em data-effect="italics">F</em>) and 40 cards are not face cards (event <em data-effect="italics">N</em>). Draw two cards, one at a time, with replacement. All possible outcomes are shown in the tree diagram as frequencies. Using the tree diagram, calculate <em data-effect="italics">P</em>(<em data-effect="italics">FF</em>).</p> <div id="eip-idp41385280" class="bc-figure figure"><span id="fs-idm2866400" data-type="media" data-alt="This is a tree diagram with branches showing frequencies of each draw. The first branch shows two lines: 12F and 40N. The second branch has a set of two lines (12F and 40N) for each line of the first branch. Multiply along each line to find 144FF, 480FN, 480NF, and 1,600NN." 
data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C03_M07_tryit001N-1.jpg" alt="This is a tree diagram with branches showing frequencies of each draw. The first branch shows two lines: 12F and 40N. The second branch has a set of two lines (12F and 40N) for each line of the first branch. Multiply along each line to find 144FF, 480FN, 480NF, and 1,600NN." width="380" data-media-type="image/jpg" /></span></div> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p>An urn has three red marbles and eight blue marbles in it. Draw two marbles, one at a time, this time without replacement, from the urn. <strong>&#8220;Without replacement&#8221;</strong> means that you do not put the first marble back before you select the second marble. Following is a tree diagram for this situation. The branches are labeled with probabilities instead of frequencies. The numbers at the ends of the branches are calculated by multiplying the numbers on the two corresponding branches, for example, \(\left(\frac{3}{11}\right)\left(\frac{2}{10}\right)=\frac{6}{110}\).</p> <div id="element-325a" class="bc-figure figure"><div class="bc-figcaption figcaption">Total = \(\frac{56+24+24+6}{110}=\frac{110}{110}=1\)</div> <p><span id="id47078287" data-type="media" data-alt="This is a tree diagram with branches showing probabilities of each draw. The first branch shows 2 lines: B 8/11 and R 3/11. The second branch has a set of 2 lines for each first branch line. Below B 8/11 are B 7/10 and R 3/10. Below R 3/11 are B 8/10 and R 2/10. Multiply along each line to find BB 56/110, BR 24/110, RB 24/110, and RR 6/110."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch03_07_02-1.jpg" alt="This is a tree diagram with branches showing probabilities of each draw. The first branch shows 2 lines: B 8/11 and R 3/11. 
The second branch has a set of 2 lines for each first branch line. Below B 8/11 are B 7/10 and R 3/10. Below R 3/11 are B 8/10 and R 2/10. Multiply along each line to find BB 56/110, BR 24/110, RB 24/110, and RR 6/110." width="400" data-media-type="image/jpg" /></span></p> </div> <div id="id47078390" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idp61215856">If you draw a red on the first draw from the three red possibilities, there are two red marbles left to draw on the second draw. You do not put back or replace the first marble after you have drawn it. You draw <strong>without replacement</strong>, so that on the second draw there are ten marbles left in the urn.</p> </div> <p id="element-993"><span data-type="newline"><br /> </span>Calculate the following probabilities using the tree diagram.</p> <p>&nbsp;</p> <div data-type="exercise"><div id="id47078429" data-type="problem"><p id="element-452">a. <em data-effect="italics">P</em>(<em data-effect="italics">RR</em>) = ________</p> </div> <div id="id47078452" data-type="solution"><p id="element-659a">a. <em data-effect="italics">P</em>(<em data-effect="italics">RR</em>) = \(\left(\frac{3}{11}\right)\left(\frac{2}{10}\right)=\frac{6}{110}\)</p> <p>&nbsp;</p> </div> </div> <div id="element-238" data-type="exercise"><div id="id47078522" data-type="problem"><p id="element-608">b. Fill in the blanks:</p> <p><em data-effect="italics">P</em>(<em data-effect="italics">RB</em> OR <em data-effect="italics">BR</em>) = \(\left(\frac{3}{11}\right)\left(\frac{8}{10}\right)\text{ }+\text{ (___)(___) }=\text{ }\frac{48}{110}\)</p> </div> <div id="id47078597" data-type="solution" data-print-placement="end"><p id="fs-idm21770224">b. 
<em data-effect="italics">P</em>(<em data-effect="italics">RB</em> OR <em data-effect="italics">BR</em>) = \(\left(\frac{3}{11}\right)\left(\frac{8}{10}\right)\) + \(\left(\frac{8}{11}\right)\left(\frac{3}{10}\right)\) = \(\frac{48}{110}\)</p> <p>&nbsp;</p> </div> </div> <div id="element-921" data-type="exercise"><div id="id46086278" data-type="problem"><p id="element-8295">c. <em data-effect="italics">P</em>(<em data-effect="italics">R</em> on 2nd|<em data-effect="italics">B</em> on 1st) = ________</p> </div> <div id="id46086303" data-type="solution" data-print-placement="end"><p id="element-43">c. <em data-effect="italics">P</em>(<em data-effect="italics">R</em> on 2nd|<em data-effect="italics">B</em> on 1st) = \(\frac{3}{10}\)</p> <p>&nbsp;</p> </div> </div> <div id="fs-idm185198528" data-type="exercise"><div id="id46086349" data-type="problem"><p id="element-363">d. Fill in the blanks.</p> <p id="element-400"><em data-effect="italics">P</em>(<em data-effect="italics">R</em> on 1st AND <em data-effect="italics">B</em> on 2nd) = <em data-effect="italics">P</em>(<em data-effect="italics">RB</em>) = (___)(___) = \(\frac{24}{110}\)</p> </div> <div id="id46086410" data-type="solution"><p>d. <em data-effect="italics">P</em>(<em data-effect="italics">R</em> on 1st AND <em data-effect="italics">B</em> on 2nd) = <em data-effect="italics">P</em>(<em data-effect="italics">RB</em>) = \(\left(\frac{3}{11}\right)\left(\frac{8}{10}\right)\) = \(\frac{24}{110}\)</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id46086515" data-type="problem"><p>e. Find <em data-effect="italics">P</em>(<em data-effect="italics">BB</em>).</p> </div> <div id="id46086539" data-type="solution"><p id="element-547">e. <em data-effect="italics">P</em>(<em data-effect="italics">BB</em>) = \(\left(\frac{8}{11}\right)\left(\frac{7}{10}\right)\) = \(\frac{56}{110}\)</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id46086597" data-type="problem"><p id="element-64">f. 
Find <em data-effect="italics">P</em>(<em data-effect="italics">B</em> on 2nd|<em data-effect="italics">R</em> on 1st).</p> </div> <div id="id46086622" data-type="solution"><p id="element-851">f. Using the tree diagram, <em data-effect="italics">P</em>(<em data-effect="italics">B</em> on 2nd|<em data-effect="italics">R</em> on 1st) = <em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">R</em>) = \(\frac{8}{10}\).</p> </div> </div> <p>If we are using probabilities, we can label the tree in the following general way.</p> <p><span id="id46086747" data-type="media" data-alt="This is a tree diagram for a two-step experiment. The first branch shows first outcome: P(B) and P(R). The second branch has a set of 2 lines for each line of the first branch: the probability of B given B = P(BB), the probability of R given B = P(RB), the probability of B given R = P(BR), and the probability of R given R = P(RR)." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch03_07_03N-1.jpg" alt="This is a tree diagram for a two-step experiment. The first branch shows first outcome: P(B) and P(R). The second branch has a set of 2 lines for each line of the first branch: the probability of B given B = P(BB), the probability of R given B = P(RB), the probability of B given R = P(BR), and the probability of R given R = P(RR)." 
width="400" data-media-type="image/jpg" /></span></p> <ul id="element-425"><li><em data-effect="italics">P</em>(<em data-effect="italics">R</em>|<em data-effect="italics">R</em>) here means <em data-effect="italics">P</em>(<em data-effect="italics">R</em> on 2nd|<em data-effect="italics">R</em> on 1st)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">R</em>) here means <em data-effect="italics">P</em>(<em data-effect="italics">B</em> on 2nd|<em data-effect="italics">R</em> on 1st)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">R</em>|<em data-effect="italics">B</em>) here means <em data-effect="italics">P</em>(<em data-effect="italics">R</em> on 2nd|<em data-effect="italics">B</em> on 1st)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">B</em>|<em data-effect="italics">B</em>) here means <em data-effect="italics">P</em>(<em data-effect="italics">B</em> on 2nd|<em data-effect="italics">B</em> on 1st)</li> </ul> </div> <div id="fs-idp64832640" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp40630704" data-type="exercise"><div id="fs-idp516608" data-type="problem"><p id="fs-idp47082864">In a standard deck, there are 52 cards. Twelve cards are face cards (<em data-effect="italics">F</em>) and 40 cards are not face cards (<em data-effect="italics">N</em>). Draw two cards, one at a time, without replacement. The tree diagram is labeled with all possible probabilities.</p> <div id="eip-idm127200928" class="bc-figure figure"><span id="fs-idp88881552" data-type="media" data-alt="This is a tree diagram with branches showing probabilities of each draw. The first branch shows 2 lines: F 12/52 and N 40/52. The second branch has a set of 2 lines for each line of the first branch. Below F 12/52 are F 11/51 and N 40/51. Below N 40/52 are F 12/51 and N 39/51. Multiply along each line to find FF 132/2652, FN 480/2652, NF 480/2652, and NN 1560/2652." 
data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C03_M07_tryit002-1.jpg" alt="This is a tree diagram with branches showing probabilities of each draw. The first branch shows 2 lines: F 12/52 and N 40/52. The second branch has a set of 2 lines for each line of the first branch. Below F 12/52 are F 11/51 and N 40/51. Below N 40/52 are F 12/51 and N 39/51. Multiply along each line to find FF 132/2652, FN 480/2652, NF 480/2652, and NN 1560/2652." width="400" data-media-type="image/jpg" data-print-width="4in" /></span></div> <ol id="fs-idm44947616" type="a"><li>Find <em data-effect="italics">P</em>(<em data-effect="italics">FN</em> OR <em data-effect="italics">NF</em>).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">N</em>|<em data-effect="italics">F</em>).</li> <li>Find <em data-effect="italics">P</em>(at most one face card). <span data-type="newline"><br /> </span>Hint: &#8220;At most one face card&#8221; means zero or one face card.</li> <li>Find <em data-effect="italics">P</em>(at least one face card). <span data-type="newline"><br /> </span>Hint: &#8220;At least one face card&#8221; means one or two face cards.</li> </ol> </div> </div> </div> <div id="fs-idp72039056" class="textbox textbox--examples" data-type="example"><p id="fs-idp127859600">A litter of kittens available for adoption at the Humane Society has four tabby kittens and five black kittens. A family comes in and randomly selects two kittens (without replacement) for adoption.</p> <p><span id="fs-idp49760160" data-type="media" data-alt="This is a tree diagram with branches showing probabilities of kitten choices. The first branch shows two lines: T 4/9 and B 5/9. The second branch has a set of 2 lines for each first branch line. Below T 4/9 are T 3/8 and B 5/8. Below B 5/9 are T 4/8 and B 4/8. Multiply along each line to find probabilities of possible combinations." 
data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C03_M07_001N-1.jpg" alt="This is a tree diagram with branches showing probabilities of kitten choices. The first branch shows two lines: T 4/9 and B 5/9. The second branch has a set of 2 lines for each first branch line. Below T 4/9 are T 3/8 and B 5/8. Below B 5/9 are T 4/8 and B 4/8. Multiply along each line to find probabilities of possible combinations." width="400" data-media-type="image/jpg" data-print-width="4in" /></span></p> <div id="fs-idp40987552" data-type="exercise"><div id="fs-idp44496224" data-type="problem"><ol id="fs-idm1310240" type="a"><li>What is the probability that both kittens are tabby? <span data-type="newline"><br /> </span>a. \(\left(\frac{1}{2}\right)\left(\frac{1}{2}\right)\) b. \(\left(\frac{4}{9}\right)\left(\frac{4}{9}\right)\) c. \(\left(\frac{4}{9}\right)\left(\frac{3}{8}\right)\) d. \(\left(\frac{4}{9}\right)\left(\frac{5}{9}\right)\)</li> <li>What is the probability that one kitten of each coloring is selected? <span data-type="newline"><br /> </span>a. \(\left(\frac{4}{9}\right)\left(\frac{5}{9}\right)\) b. \(\left(\frac{4}{9}\right)\left(\frac{5}{8}\right)\) c. \(\left(\frac{4}{9}\right)\left(\frac{5}{9}\right)+\left(\frac{5}{9}\right)\left(\frac{4}{9}\right)\) d. \(\left(\frac{4}{9}\right)\left(\frac{5}{8}\right)+\left(\frac{5}{9}\right)\left(\frac{4}{8}\right)\)</li> <li>What is the probability that a tabby is chosen as the second kitten when a black kitten was chosen as the first?</li> <li>What is the probability of choosing two kittens of the same color?</li> </ol> </div> <div id="fs-idm2776368" data-type="solution"><p id="fs-idp7489584">a. c; b. d; c. \(\frac{4}{8}\); d. 
\(\frac{32}{72}\)</p> </div> </div> </div> <div id="fs-idp64465776" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm62426848" data-type="exercise"><div id="fs-idp61299152" data-type="problem"><p id="fs-idm3074160">Suppose there are four red balls and three yellow balls in a box. Two balls are drawn from the box without replacement. What is the probability that one ball of each coloring is selected?</p> </div> </div> </div> </div> <div class="bc-section section" data-depth="1"><h3 data-type="title">Venn Diagram</h3> <p>A <span data-type="term">Venn diagram</span> is a picture that represents the outcomes of an experiment. It generally consists of a box that represents the sample space S together with circles or ovals. The circles or ovals represent events.</p> <div class="textbox textbox--examples" data-type="example"><p>Suppose an experiment has the outcomes 1, 2, 3, &#8230; , 12 where each outcome has an equal chance of occurring. Let event <em data-effect="italics">A</em> = {1, 2, 3, 4, 5, 6} and event <em data-effect="italics">B</em> = {6, 7, 8, 9}. Then <em data-effect="italics">A</em> AND <em data-effect="italics">B</em> = {6} and <em data-effect="italics">A</em> OR <em data-effect="italics">B</em> = {1, 2, 3, 4, 5, 6, 7, 8, 9}. The Venn diagram is as follows:</p> <div id="eip-idm17287840" class="bc-figure figure"><span id="id18119489" data-type="media" data-alt="A Venn diagram. An oval representing set A contains the values 1, 2, 3, 4, 5, and 6. An oval representing set B also contains the 6, along with 7, 8, and 9. The values 10, 11, and 12 are present but not contained in either set." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch03_06_01-1.jpg" alt="A Venn diagram. An oval representing set A contains the values 1, 2, 3, 4, 5, and 6. An oval representing set B also contains the 6, along with 7, 8, and 9. 
The values 10, 11, and 12 are present but not contained in either set." width="380" data-media-type="image/png" /></span></div> </div> <div id="fs-idp42923984" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp129832272" data-type="exercise"><div id="fs-idp89658416" data-type="problem"><p id="fs-idp136729728">Suppose an experiment has outcomes black, white, red, orange, yellow, green, blue, and purple, where each outcome has an equal chance of occurring. Let event <em data-effect="italics">C</em> = {green, blue, purple} and event <em data-effect="italics">P</em> = {red, yellow, blue}. Then <em data-effect="italics">C</em> AND <em data-effect="italics">P</em> = {blue} and <em data-effect="italics">C</em> OR <em data-effect="italics">P</em> = {green, blue, purple, red, yellow}. Draw a Venn diagram representing this situation.</p> </div> </div> </div> <div id="element-872" class="textbox textbox--examples" data-type="example"><p id="element-491">Flip two fair coins. Let <em data-effect="italics">A</em> = tails on the first coin. Let <em data-effect="italics">B</em> = tails on the second coin. Then <em data-effect="italics">A</em> = {<em data-effect="italics">TT</em>, <em data-effect="italics">TH</em>} and <em data-effect="italics">B</em> = {<em data-effect="italics">TT</em>, <em data-effect="italics">HT</em>}. Therefore, <em data-effect="italics">A</em> AND <em data-effect="italics">B</em> = {<em data-effect="italics">TT</em>}. <em data-effect="italics">A</em> OR <em data-effect="italics">B</em> = {<em data-effect="italics">TH</em>, <em data-effect="italics">TT</em>, <em data-effect="italics">HT</em>}.</p> <p id="element-245">The sample space when you flip two fair coins is <em data-effect="italics">X</em> = {<em data-effect="italics">HH</em>, <em data-effect="italics">HT</em>, <em data-effect="italics">TH</em>, <em data-effect="italics">TT</em>}. 
The outcome <em data-effect="italics">HH</em> is in NEITHER <em data-effect="italics">A</em> NOR <em data-effect="italics">B</em>. The Venn diagram is as follows:</p> <div id="eip-idm154602320" class="bc-figure figure"><span id="id18154607" data-type="media" data-alt="This is a venn diagram. An oval representing set A contains Tails + Heads and Tails + Tails. An oval representing set B also contains Tails + Tails, along with Heads + Tails. The universe S contains Heads + Heads, but this value is not contained in either set A or B." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch03_06_02-1.jpg" alt="This is a venn diagram. An oval representing set A contains Tails + Heads and Tails + Tails. An oval representing set B also contains Tails + Tails, along with Heads + Tails. The universe S contains Heads + Heads, but this value is not contained in either set A or B." width="400" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idm20878064" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-idm77418224" data-type="exercise"><div id="eip-idm66827488" data-type="problem"><p id="eip-idp2578208">Roll a fair, six-sided die. Let <em data-effect="italics">A</em> = a prime number of dots is rolled. Let <em data-effect="italics">B</em> = an odd number of dots is rolled. Then <em data-effect="italics">A</em> = {2, 3, 5} and <em data-effect="italics">B</em> = {1, 3, 5}. Therefore, <em data-effect="italics">A</em> AND <em data-effect="italics">B</em> = {3, 5}. <em data-effect="italics">A</em> OR <em data-effect="italics">B</em> = {1, 2, 3, 5}. The sample space for rolling a fair die is <em data-effect="italics">S</em> = {1, 2, 3, 4, 5, 6}. 
Draw a Venn diagram representing this situation.</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p id="element-383"><strong>Forty percent</strong> of the students at a local college belong to a club and <strong>50%</strong> work part time. <strong>Five percent</strong> of the students work part time and belong to a club. Draw a Venn diagram showing the relationships. Let <em data-effect="italics">C</em> = student belongs to a club and <em data-effect="italics">PT</em> = student works part time.</p> <div id="fs-idp1123328" class="bc-figure figure"><span id="id18154697" data-type="media" data-alt="This is a venn diagram with one set containing students in clubs and another set containing students working part-time. Both sets share students who are members of clubs and also work part-time. The universe is labeled S."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch03_06_03N-1.jpg" alt="This is a venn diagram with one set containing students in clubs and another set containing students working part-time. Both sets share students who are members of clubs and also work part-time. The universe is labeled S." width="400" data-media-type="image/jpg" /></span></div> <p>If a student is selected at random, find</p> <ul><li>the probability that the student belongs to a club. <em data-effect="italics">P</em>(<em data-effect="italics">C</em>) = 0.40</li> <li>the probability that the student works part time. <em data-effect="italics">P</em>(<em data-effect="italics">PT</em>) = 0.50</li> <li>the probability that the student belongs to a club AND works part time. <em data-effect="italics">P</em>(<em data-effect="italics">C</em> AND <em data-effect="italics">PT</em>) = 0.05</li> <li>the probability that the student belongs to a club <strong>given</strong> that the student works part time. 
\(P\text{(}C\text{|}PT\text{)} = \frac{P\text{(}C\text{ AND }PT\text{)}}{P\text{(}PT\text{)}} = \frac{0.05}{0.50} = 0.1\)</li> <li>the probability that the student belongs to a club <strong>OR</strong> works part time. <em data-effect="italics">P</em>(<em data-effect="italics">C</em> OR <em data-effect="italics">PT</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">C</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">PT</em>) &#8211; <em data-effect="italics">P</em>(<em data-effect="italics">C</em> AND <em data-effect="italics">PT</em>) = 0.40 + 0.50 &#8211; 0.05 = 0.85</li> </ul> </div> <div id="fs-idm55345616" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp89579120" data-type="exercise"><div id="fs-idp89636640" data-type="problem"><p id="fs-idp120162528">Fifty percent of the workers at a factory work a second job, 25% have a spouse who also works, 5% work a second job and have a spouse who also works. Draw a Venn diagram showing the relationships. Let <em data-effect="italics">W</em> = works a second job and <em data-effect="italics">S</em> = spouse also works.</p> </div> </div> </div> <div id="fs-idp87467600" class="textbox textbox--examples" data-type="example"><div id="fs-idp88064048" data-type="exercise"><div id="fs-idp115024704" data-type="problem"><p id="fs-idp199220464">A person with type O blood and a negative Rh factor (Rh-) can donate blood to any person with any blood type. Four percent of African Americans have type O blood and a negative RH factor, 5−10% of African Americans have the Rh- factor, and 51% have type O blood.</p> <div id="fs-idp143600032" class="bc-figure figure"><span id="fs-idp78366192" data-type="media" data-alt="This is an empty Venn diagram showing two overlapping circles. 
The left circle is labeled O and the right circle is labeled RH-."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C03_M06_001f-1.jpg" alt="This is an empty Venn diagram showing two overlapping circles. The left circle is labeled O and the right circle is labeled RH-." width="400" data-media-type="image/jpg" /></span></div> <p id="fs-idp137061024">The “O” circle represents the African Americans with type O blood. The “Rh-” oval represents the African Americans with the Rh- factor.</p> <p id="fs-idp107178080">We will take the average of 5% and 10% and use 7.5% as the percent of African Americans who have the Rh- factor. Let <em data-effect="italics">O</em> = African American with Type O blood and <em data-effect="italics">R</em> = African American with Rh- factor.</p> <ol id="fs-idp55980272" type="a"><li><em data-effect="italics">P</em>(<em data-effect="italics">O</em>) = ___________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">R</em>) = ___________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">O</em> AND <em data-effect="italics">R</em>) = ___________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">O</em> OR <em data-effect="italics">R</em>) = ____________</li> <li>In the Venn Diagram, describe the overlapping area using a complete sentence.</li> <li>In the Venn Diagram, describe the area in the rectangle but outside both the circle and the oval using a complete sentence.</li> </ol> </div> <div id="fs-idp133204112" data-type="solution"><p id="fs-idp88524560">a. 0.51; b. 0.075; c. 0.04; d. 0.545; e. The area represents the African Americans who have type O blood and the Rh- factor. f. 
The area represents the African Americans who have neither type O blood nor the Rh- factor.</p> </div> </div> </div> <div id="fs-idp47601696" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp85340096" data-type="exercise"><div id="fs-idp129244656" data-type="problem"><p id="fs-idp200572912">In a bookstore, the probability that the customer buys a novel is 0.6, and the probability that the customer buys a non-fiction book is 0.4. Suppose that the probability that the customer buys both is 0.2.</p> <ol id="fs-idp93717888" type="a"><li>Draw a Venn diagram representing the situation.</li> <li>Find the probability that the customer buys either a novel or a non-fiction book.</li> <li>In the Venn diagram, describe the overlapping area using a complete sentence.</li> <li>Suppose that some customers buy only compact disks. Draw an oval in your Venn diagram representing this event.</li> </ol> </div> </div> </div> </div> <div id="fs-idp43368080" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idm13822272">Data from Clara County Public H.D.</p> <p id="fs-idm13821888">Data from the American Cancer Society.</p> <p id="fs-idm13822784">Data from The Data and Story Library, 1996. Available online at http://lib.stat.cmu.edu/DASL/ (accessed May 2, 2013).</p> <p id="fs-idp46541952">Data from the Federal Highway Administration, part of the United States Department of Transportation.</p> <p id="fs-idp46542448">Data from the United States Census Bureau, part of the United States Department of Commerce.</p> <p id="fs-idp46541568">Data from USA Today.</p> <p id="eip-478">“Environment.” The World Bank, 2013. Available online at http://data.worldbank.org/topic/environment (accessed May 2, 2013).</p> <p>“Search for Datasets.” Roper Center: Public Opinion Archives, University of Connecticut., 2013. 
Available online at http://www.ropercenter.uconn.edu/data_access/data/search_for_datasets.html (accessed May 2, 2013).</p> </div> <div id="fs-idp73675888" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp59827120">A tree diagram uses branches to show the different outcomes of experiments and makes complex probability questions easy to visualize.</p> <p id="fs-idp49604256">A Venn diagram is a picture that represents the outcomes of an experiment. It generally consists of a box that represents the sample space <em data-effect="italics">S</em> together with circles or ovals. The circles or ovals represent events. A Venn diagram is especially helpful for visualizing the OR event, the AND event, and the complement of an event and for understanding conditional probabilities.</p> </div> <div id="fs-idp14696688" class="practice" data-depth="1"><h3 data-type="title">The probability that a man develops some form of cancer in his lifetime is 0.4567. The probability that a man has at least one false positive test result (meaning the test comes back positive for cancer when the man does not have it) is 0.51. Let: <em data-effect="italics">C</em> = a man develops cancer in his lifetime; <em data-effect="italics">P</em> = man has at least one false positive. Construct a tree diagram of the situation.</h3> </div> <div id="fs-idp133440112" data-type="solution"><div id="eip-idm83918928" class="bc-figure figure"><span id="fs-idp90548816" data-type="media" data-alt="This is a tree diagram with two branches. The first branch, labeled Cancer, shows two lines: 0.4567 C and 0.5433 C'. The second branch is labeled False Positive. From C, there are two lines: 0 P and 1 P'. From C', there are two lines: 0.51 P and 0.49 P'." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C03_M07_101N-1.jpg" alt="This is a tree diagram with two branches. 
The first branch, labeled Cancer, shows two lines: 0.4567 C and 0.5433 C'. The second branch is labeled False Positive. From C, there are two lines: 0 P and 1 P'. From C', there are two lines: 0.51 P and 0.49 P'." width="380" data-media-type="image/jpg" /></span></div> </div> <div><p id="eip-id1171737024994">A box of cookies contains three chocolate and seven butter cookies. Miguel randomly selects a cookie and eats it. Then he randomly selects another cookie and eats it. (How many cookies did he take?)</p> <ol id="eip-id1171740927516" type="a"><li>Draw the tree that represents the possibilities for the cookie selections. Write the probabilities along each branch of the tree.</li> <li>Are the probabilities for the flavor of the SECOND cookie that Miguel selects independent of his first selection? Explain.</li> <li>For each complete path through the tree, write the event it represents and find the probabilities.</li> <li>Let S be the event that both cookies selected were the same flavor. Find <em data-effect="italics">P</em>(<em data-effect="italics">S</em>).</li> <li>Let <em data-effect="italics">T</em> be the event that the cookies selected were different flavors. Find <em data-effect="italics">P</em>(<em data-effect="italics">T</em>) by two different methods: by using the complement rule and by using the branches of the tree. Your answers should be the same with both methods.</li> <li>Let <em data-effect="italics">U</em> be the event that the second cookie selected is a butter cookie. 
Find <em data-effect="italics">P</em>(<em data-effect="italics">U</em>).</li> </ol> </div> <div id="fs-idp43118896" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <p id="fs-idp132890032"><em data-effect="italics">Use the following information to answer the next two exercises.</em> This tree diagram shows the tossing of an unfair coin followed by drawing one bead from a cup containing three red (<em data-effect="italics">R</em>), four yellow (<em data-effect="italics">Y</em>) and five blue (<em data-effect="italics">B</em>) beads. For the coin, <em data-effect="italics">P</em>(<em data-effect="italics">H</em>) = \(\frac{2}{3}\) and <em data-effect="italics">P</em>(<em data-effect="italics">T</em>) = \(\frac{1}{3}\) where <em data-effect="italics">H</em> is heads and <em data-effect="italics">T</em> is tails.</p> <div id="id43573084" class="bc-figure figure"><span id="id43573089" data-type="media" data-alt="Tree diagram with 2 branches. The first branch consists of 2 lines of H=2/3 and T=1/3. The second branch consists of 2 sets of 3 lines each with the both sets containing R=3/12, Y=4/12, and B=5/12."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch_03_11_01-1.jpg" alt="Tree diagram with 2 branches. The first branch consists of 2 lines of H=2/3 and T=1/3. The second branch consists of 2 sets of 3 lines each with the both sets containing R=3/12, Y=4/12, and B=5/12." 
width="180" data-media-type="image/jpg" /></span></div> <div id="element-684" data-type="exercise"><div id="id43573398" data-type="problem"><p id="element-573">1) Find <em data-effect="italics">P</em>(Blue bead).</p> <ol id="element-161" type="a"><li>\(\frac{15}{36}\)</li> <li>\(\frac{10}{36}\)</li> <li>\(\frac{10}{12}\)</li> <li>\(\frac{6}{36}\)</li> </ol> </div> <div id="id43573635" data-type="solution"><p id="element-4">2) Find <em data-effect="italics">P</em>(tossing a Head on the coin AND a Red bead)</p> <ol id="element-67" type="a"><li>\(\frac{2}{3}\)</li> <li>\(\frac{5}{15}\)</li> <li>\(\frac{6}{36}\)</li> <li>\(\frac{5}{36}\)</li> </ol> <p>&nbsp;</p> </div> </div> <div id="eip-346" data-type="exercise"><div id="eip-id1171744729232" data-type="problem"></div> </div> </div> <div id="fs-idm72330096" class="bring-together-homework" data-depth="1"><h3 data-type="title">Bringing It Together</h3> <p id="fs-idp109849696"><em data-effect="italics">Use the following information to answer the next two exercises.</em> Suppose that you have eight cards. Five are green and three are yellow. The cards are well shuffled.</p> <div data-type="exercise"><div id="eip-idm282416864" data-type="problem"><p id="element-137">3) Suppose that you randomly draw two cards, one at a time, <strong>with replacement</strong>. 
<span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">G</em><sub>1</sub> = first card is green <span data-type="newline" data-count="1"><br /> </span>Let <em data-effect="italics">G</em><sub>2</sub> = second card is green</p> <ol id="fs-idm25585520" type="a"><li>Draw a tree diagram of the situation.</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">G</em><sub>1</sub> AND <em data-effect="italics">G</em><sub>2</sub>).</li> <li>Find <em data-effect="italics">P</em>(at least one green).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">G</em><sub>2</sub>|<em data-effect="italics">G</em><sub>1</sub>).</li> <li>Are <em data-effect="italics">G</em><sub>2</sub> and <em data-effect="italics">G</em><sub>1</sub> independent events? Explain why or why not.</li> </ol> <p>&nbsp;</p> </div> <div id="fs-idp20236912" data-type="solution"></div> </div> <div data-type="exercise"><div id="eip-idm163410608" data-type="problem"><p>4) Suppose that you randomly draw two cards, one at a time, <strong>without replacement</strong>. <span data-type="newline" data-count="1"><br /> </span><em data-effect="italics">G<sub>1</sub></em> = first card is green <span data-type="newline" data-count="1"><br /> </span><em data-effect="italics">G<sub>2</sub></em> = second card is green</p> <ol type="a"><li>Draw a tree diagram of the situation.</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">G<sub>1</sub></em> AND <em data-effect="italics">G<sub>2</sub></em>).</li> <li>Find <em data-effect="italics">P</em>(at least one green).</li> <li>Find <em data-effect="italics">P</em>(<em data-effect="italics">G<sub>2</sub></em>|<em data-effect="italics">G<sub>1</sub></em>).</li> <li>Are <em data-effect="italics">G<sub>2</sub></em> and <em data-effect="italics">G<sub>1</sub></em> independent events? 
Explain why or why not.</li> </ol> <p>&nbsp;</p> </div> </div> <p id="fs-idp143517536"><em data-effect="italics">5) Use the following information to answer the next two exercises.</em> The percent of licensed U.S. drivers (from a recent year) that are female is 48.60. Of the females, 5.03% are age 19 and under; 81.36% are age 20–64; 13.61% are age 65 or over. Of the licensed U.S. male drivers, 5.04% are age 19 and under; 81.43% are age 20–64; 13.53% are age 65 or over.</p> <div id="element-950" data-type="exercise"><div id="eip-idm29557008" data-type="problem"><p>Complete the following.</p> <ol type="a"><li>Construct a table or a tree diagram of the situation.</li> <li>Find <em data-effect="italics">P</em>(driver is female).</li> <li>Find <em data-effect="italics">P</em>(driver is age 65 or over|driver is female).</li> <li>Find <em data-effect="italics">P</em>(driver is age 65 or over AND female).</li> <li>In words, explain the difference between the probabilities in part c and part d.</li> <li>Find <em data-effect="italics">P</em>(driver is age 65 or over).</li> <li>Are being age 65 or over and being female mutually exclusive events? How do you know?</li> </ol> <p>&nbsp;</p> </div> <div id="fs-idp149290096" data-type="solution"></div> </div> <div data-type="exercise"><div id="eip-idm102025184" data-type="problem"><p id="fs-idm73415264">6) Suppose that 10,000 U.S. licensed drivers are randomly selected.</p> <ol type="a"><li>How many would you expect to be male?</li> <li>Using the table or tree diagram, construct a contingency table of gender versus age group.</li> <li>Using the contingency table, find the probability that out of the age 20–64 group, a randomly selected driver is female.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="element-974" data-type="exercise"><div id="eip-idm45711968" data-type="problem"><p id="element-261">7) Approximately 86.5% of Americans commute to work by car, truck, or van. 
Out of that group, 84.6% drive alone and 15.4% drive in a carpool. Approximately 3.9% walk to work and approximately 5.3% take public transportation.</p> <ol id="element-742" type="a"><li>Construct a table or a tree diagram of the situation. Include a branch for all other modes of transportation to work.</li> <li>Assuming that the walkers walk alone, what percent of all commuters travel alone to work?</li> <li>Suppose that 1,000 workers are randomly selected. How many would you expect to travel alone to work?</li> <li>Suppose that 1,000 workers are randomly selected. How many would you expect to drive in a carpool?</li> </ol> <p>&nbsp;</p> </div> <div id="fs-idp100614736" data-type="solution"></div> </div> <div id="eip-272" data-type="exercise"><div data-type="problem"><p id="eip-216">8) When the Euro coin was introduced in 2002, two math professors had their statistics students test whether the Belgian one Euro coin was a fair coin. They spun the coin rather than tossing it and found that out of 250 spins, 140 showed a head (event <em data-effect="italics">H</em>) while 110 showed a tail (event <em data-effect="italics">T</em>). On that basis, they claimed that it is not a fair coin.</p> <ol id="eip-idp132357088" type="a"><li>Based on the given data, find <em data-effect="italics">P</em>(<em data-effect="italics">H</em>) and <em data-effect="italics">P</em>(<em data-effect="italics">T</em>).</li> <li>Use a tree to find the probabilities of each possible outcome for the experiment of tossing the coin twice.</li> <li>Use the tree to find the probability of obtaining exactly one head in two tosses of the coin.</li> <li>Use the tree to find the probability of obtaining at least one head.</li> </ol> </div> </div> <div data-type="exercise"><div id="eip-idp1286960" data-type="problem"><p id="fs-idm23436880"><em data-effect="italics"> Use the following information to answer the next two exercises.</em> The following are real data from Santa Clara County, CA. 
As of a certain time, there had been a total of 3,059 documented cases of AIDS in the county. They were grouped into the following categories:</p> <table id="element-436" summary="This table presents data of documented cases of AIDS with risk factor by gender. The first row lists the female values and the second row lists the male values. The first column lists the gender, the second column lists homosexual/bisexual, the third column lists IV drug user, the fourth column lists heterosexual contact, and the fifth column lists other."><caption><span data-type="title">* includes homosexual/bisexual IV drug users</span></caption> <thead><tr><th></th> <th>Homosexual/Bisexual</th> <th>IV Drug User*</th> <th>Heterosexual Contact</th> <th>Other</th> <th>Totals</th> </tr> </thead> <tbody><tr><td>Female</td> <td>0</td> <td>70</td> <td>136</td> <td>49</td> <td>____</td> </tr> <tr><td>Male</td> <td>2,146</td> <td>463</td> <td>60</td> <td>135</td> <td>____</td> </tr> <tr><td>Totals</td> <td>____</td> <td>____</td> <td>____</td> <td>____</td> <td>____</td> </tr> </tbody> </table> <p id="element-406">9)  Suppose a person with AIDS in Santa Clara County is randomly selected.</p> <ol id="element-232" type="a"><li>Find <em data-effect="italics">P</em>(Person is female).</li> <li>Find <em data-effect="italics">P</em>(Person has a risk factor heterosexual contact).</li> <li>Find <em data-effect="italics">P</em>(Person is female OR has a risk factor of IV drug user).</li> <li>Find <em data-effect="italics">P</em>(Person is female AND has a risk factor of homosexual/bisexual).</li> <li>Find <em data-effect="italics">P</em>(Person is male AND has a risk factor of IV drug user).</li> <li>Find <em data-effect="italics">P</em>(Person is female GIVEN person got the disease from heterosexual contact).</li> <li>Construct a Venn diagram. 
Make one group females and the other group heterosexual contact.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="element-148s" data-type="exercise"><div id="eip-idm140591216" data-type="problem"><p id="element-532">10) Answer these questions using probability rules. Do NOT use the contingency table. Three thousand fifty-nine cases of AIDS had been reported in Santa Clara County, CA, through a certain date. Those cases will be our population. Of those cases, 6.4% obtained the disease through heterosexual contact and 7.4% are female. Out of the females with the disease, 53.3% got the disease from heterosexual contact.</p> <ol id="element-664" type="a"><li>Find <em data-effect="italics">P</em>(Person is female).</li> <li>Find <em data-effect="italics">P</em>(Person obtained the disease through heterosexual contact).</li> <li>Find <em data-effect="italics">P</em>(Person is female GIVEN person got the disease from heterosexual contact)</li> <li>Construct a Venn diagram representing this situation. Make one group females and the other group heterosexual contact. Fill in all values as probabilities.</li> </ol> <p>&nbsp;</p> <p><strong>Answers to odd questions</strong></p> <p>1) a</p> <p>3)</p> <ol id="fs-idp23896432" type="a"><li><div id="eip-idp101566032" class="bc-figure figure"><span id="fs-idp99953792" data-type="media" data-alt="This is a tree diagram with branches showing probabilities of each draw. The first branch shows two lines: 5/8 Green and 3/8 Yellow. The second branch has a set of two lines (5/8 Green and 3/8 Yellow) for each line of the first branch." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C03_M07_100-1.jpg" alt="This is a tree diagram with branches showing probabilities of each draw. The first branch shows two lines: 5/8 Green and 3/8 Yellow. The second branch has a set of two lines (5/8 Green and 3/8 Yellow) for each line of the first branch." 
width="380" data-media-type="image/jpg" /></span></div> </li> <li><em data-effect="italics">P</em>(<em data-effect="italics">GG</em>) = \(\left(\frac{5}{8}\right)\left(\frac{5}{8}\right)\) = \(\frac{25}{64}\)</li> <li><em data-effect="italics">P</em>(at least one green) = <em data-effect="italics">P</em>(<em data-effect="italics">GG</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">GY</em>) + <em data-effect="italics">P</em>(<em data-effect="italics">YG</em>) = \(\frac{25}{64}\) + \(\frac{15}{64}\) + \(\frac{15}{64}\) = \(\frac{55}{64}\)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">G</em><sub>2</sub>|<em data-effect="italics">G</em><sub>1</sub>) = \(\frac{5}{8}\)</li> <li>Yes, they are independent because the first card is returned to the deck before the second card is drawn; the composition of the deck remains the same from draw one to draw two.</li> </ol> <p>5)</p> <ol id="fs-idp149290352" type="a"><li><table id="fs-idp154688496" summary=""><thead><tr><th></th> <th>&lt;20</th> <th>20–64</th> <th>&gt;64</th> <th>Totals</th> </tr> </thead> <tbody><tr><td><strong>Female</strong></td> <td>0.0244</td> <td>0.3954</td> <td>0.0661</td> <td>0.486</td> </tr> <tr><td><strong>Male</strong></td> <td>0.0259</td> <td>0.4186</td> <td>0.0695</td> <td>0.514</td> </tr> <tr><td><strong>Totals</strong></td> <td>0.0503</td> <td>0.8140</td> <td>0.1356</td> <td>1</td> </tr> </tbody> </table> </li> <li><em data-effect="italics">P</em>(<em data-effect="italics">F</em>) = 0.486</li> <li><em data-effect="italics">P</em>(&gt;64|<em data-effect="italics">F</em>) = 0.1361</li> <li><em data-effect="italics">P</em>(&gt;64 and <em data-effect="italics">F</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">F</em>) <em data-effect="italics">P</em>(&gt;64|<em data-effect="italics">F</em>) = (0.486)(0.1361) = 0.0661</li> <li><em data-effect="italics">P</em>(&gt;64|<em data-effect="italics">F</em>) is the percentage of female drivers who are 65 or older and <em 
data-effect="italics">P</em>(&gt;64 and <em data-effect="italics">F</em>) is the percentage of drivers who are female and 65 or older.</li> <li><em data-effect="italics">P</em>(&gt;<em data-effect="italics">64</em>) = <em data-effect="italics">P</em>(&gt;64 and <em data-effect="italics">F</em>) + <em data-effect="italics">P</em>(&gt;64 and <em data-effect="italics">M</em>) = 0.1356</li> <li>No, being female and 65 or older are not mutually exclusive because they can occur at the same time: <em data-effect="italics">P</em>(&gt;64 and <em data-effect="italics">F</em>) = 0.0661, which is greater than 0.</li> </ol> <p>7)</p> <ol id="fs-idp77575584" type="a"><li><table id="fs-idp116055648" summary=""><thead><tr><th></th> <th>Car, Truck or Van</th> <th>Walk</th> <th>Public Transportation</th> <th>Other</th> <th>Totals</th> </tr> </thead> <tbody><tr><td><strong>Alone</strong></td> <td>0.7318</td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td><strong>Not Alone</strong></td> <td>0.1332</td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td><strong>Totals</strong></td> <td>0.8650</td> <td>0.0390</td> <td>0.0530</td> <td>0.0430</td> <td>1</td> </tr> </tbody> </table> </li> <li>If we assume that all walkers are alone and that none from the other two groups travel alone (which is a big assumption), we have: <em data-effect="italics">P</em>(Alone) = 0.7318 + 0.0390 = 0.7708.</li> <li>Making the same assumptions as in (b), we have: (0.7708)(1,000) = 771</li> <li>(0.1332)(1,000) = 133</li> </ol> <p>9)</p> <p id="element-436p">The completed contingency table is as follows:</p> <table id="element-436s" summary="This table is similar to above except all blank values are now filled in."><caption><span data-type="title">* includes homosexual/bisexual IV drug users</span></caption> <thead><tr><th></th> <th>Homosexual/Bisexual</th> <th>IV Drug User*</th> <th>Heterosexual Contact</th> <th>Other</th> <th>Totals</th> </tr> </thead> <tbody><tr><td>Female</td> <td>0</td> <td>70</td> <td>136</td> <td>49</td> <td><strong 
data-effect="bold">255</strong></td> </tr> <tr><td>Male</td> <td>2,146</td> <td>463</td> <td>60</td> <td>135</td> <td><strong>2,804</strong></td> </tr> <tr><td>Totals</td> <td><strong>2,146</strong></td> <td><strong>533</strong></td> <td><strong>196</strong></td> <td><strong>184</strong></td> <td><strong>3,059</strong></td> </tr> </tbody> </table> <ol type="a"><li>\(\frac{255}{3059}\)</li> <li>\(\frac{196}{3059}\)</li> <li>\(\frac{718}{3059}\)</li> <li>0</li> <li>\(\frac{463}{3059}\)</li> <li>\(\frac{136}{196}\)</li> <li><div id="eip-idp75092976" class="bc-figure figure"><span id="eip-idm12430048" data-type="media" data-alt="" data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C03_M06_100N-1.jpg" alt="" width="350" data-media-type="image/jpg" /></span></div> </li> </ol> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="treediagram"><dt>Tree Diagram</dt> <dd id="id18749941">the useful visual representation of a sample space and events in the form of a “tree” with branches marked by possible outcomes together with associated probabilities (frequencies, relative frequencies)</dd> </dl> <dl id="vendiagram"><dt>Venn Diagram</dt> <dd id="id18154967">the visual representation of a sample space and events in the form of circles or ovals showing their intersections</dd> </dl> </div> </div></div>
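<p>The arithmetic in the club and part-time example from this chapter can be checked with a short script. This is a minimal sketch rather than part of the original text: the helper name <code>venn_regions</code> and the use of exact fractions are assumptions made for illustration. It splits a two-event Venn diagram into its four non-overlapping regions and applies the OR rule and the conditional probability formula used in the example.</p>

```python
from fractions import Fraction


def venn_regions(p_a, p_b, p_both):
    """Return the four non-overlapping regions of a two-event Venn diagram.

    venn_regions is a hypothetical helper written for this sketch.
    """
    return {
        "A only": p_a - p_both,
        "B only": p_b - p_both,
        "A and B": p_both,
        "neither": 1 - (p_a + p_b - p_both),
    }


# Values taken from the club (C) and part-time (PT) example.
p_c = Fraction(40, 100)
p_pt = Fraction(50, 100)
p_c_and_pt = Fraction(5, 100)

# OR rule: P(C OR PT) = P(C) + P(PT) - P(C AND PT)
p_c_or_pt = p_c + p_pt - p_c_and_pt

# Conditional probability: P(C|PT) = P(C AND PT) / P(PT)
p_c_given_pt = p_c_and_pt / p_pt

regions = venn_regions(p_c, p_pt, p_c_and_pt)

print("P(C OR PT) =", float(p_c_or_pt))
print("P(C|PT)    =", float(p_c_given_pt))
for name, value in regions.items():
    print(name, "=", float(value))
```

<p>The four regions always add to 1, which is a quick check that a Venn diagram has been filled in consistently.</p>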
<div class="chapter standard" id="chapter-probability-topics" title="Activity 4.7: Probability Topics"><div class="chapter-title-wrap"><h3 class="chapter-number">30</h3><h2 class="chapter-title"><span class="display-none">Activity 4.7: Probability Topics</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1172778073391" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Probability Topics</div> <p>Class time:</p> <p id="element-279">Names:</p> <p id="fs-idp7760432"><span data-type="title">Student Learning Outcomes</span></p> <ul><li>The student will use theoretical and empirical methods to estimate probabilities.</li> <li>The student will appraise the differences between the two estimates.</li> <li>The student will demonstrate an understanding of long-term relative frequencies.</li> </ul> <p id="element-410"><span data-type="title">Do the Experiment</span> Count out 40 mixed-color M&amp;Ms®, which is approximately one small bag’s worth. Record the number of each color in <a class="autogenerated-content" href="#M05_ch03-tbl009">(Figure)</a>. Use the information from this table to complete <a class="autogenerated-content" href="#M05_ch03-tbl010">(Figure)</a>. Next, put the M&amp;Ms in a cup. The experiment is to pick two M&amp;Ms, one at a time. Do <strong>not</strong> look at them as you pick them. The first time through, replace the first M&amp;M before picking the second one. Record the results in the “With Replacement” column of <a class="autogenerated-content" href="#M05_ch03-tbl011">(Figure)</a>. Do this 24 times. The second time through, after picking the first M&amp;M, do <strong>not</strong> replace it before picking the second one. Then, pick the second one. Record the results in the “Without Replacement” column of <a class="autogenerated-content" href="#M05_ch03-tbl012">(Figure)</a>. After you record the pick, put <strong>both</strong> M&amp;Ms back. 
Do this a total of 24 times, also. Use the data from <a class="autogenerated-content" href="#M05_ch03-tbl012">(Figure)</a> to calculate the empirical probability questions. Leave your answers in unreduced fractional form. Do <strong>not</strong> multiply out any fractions.</p> <table summary="Partially filled theoretical data table. The first column lists the color (6 rows) and the blank second column lists the quantity values."><caption><span data-type="title">Population</span></caption> <thead><tr><th>Color</th> <th>Quantity</th> </tr> </thead> <tbody><tr><td>Yellow (<em data-effect="italics">Y</em>)</td> <td></td> </tr> <tr><td>Green (<em data-effect="italics">G</em>)</td> <td></td> </tr> <tr><td>Blue (<em data-effect="italics">BL</em>)</td> <td></td> </tr> <tr><td>Brown (<em data-effect="italics">B</em>)</td> <td></td> </tr> <tr><td>Orange (<em data-effect="italics">O</em>)</td> <td></td> </tr> <tr><td>Red (<em data-effect="italics">R</em>)</td> <td></td> </tr> </tbody> </table> <table id="M05_ch03-tbl010" summary=""><caption><span data-type="title">Theoretical Probabilities</span></caption> <thead><tr><th></th> <th>With Replacement</th> <th>Without Replacement</th> </tr> </thead> <tbody><tr><td><em data-effect="italics">P</em>(2 reds)</td> <td></td> <td></td> </tr> <tr><td><em data-effect="italics">P</em>(<em data-effect="italics">R</em><sub>1</sub><em data-effect="italics">B</em><sub>2</sub> OR <em data-effect="italics">B</em><sub>1</sub><em data-effect="italics">R</em><sub>2</sub>)</td> <td></td> <td></td> </tr> <tr><td><em data-effect="italics">P</em>(<em data-effect="italics">R</em><sub>1</sub> AND <em data-effect="italics">G</em><sub>2</sub>)</td> <td></td> <td></td> </tr> <tr><td><em data-effect="italics">P</em>(<em data-effect="italics">G</em><sub>2</sub>|<em data-effect="italics">R</em><sub>1</sub>)</td> <td></td> <td></td> </tr> <tr><td><em data-effect="italics">P</em>(no yellows)</td> <td></td> <td></td> </tr> <tr><td><em 
data-effect="italics">P</em>(doubles)</td> <td></td> <td></td> </tr> <tr><td><em data-effect="italics">P</em>(no doubles)</td> <td></td> <td></td> </tr> </tbody> </table> <div id="fs-idp15354112" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idp27436528"><em data-effect="italics">G</em><sub>2</sub> = green on second pick; <em data-effect="italics">R</em><sub>1</sub> = red on first pick; <em data-effect="italics">B</em><sub>1</sub> = brown on first pick; <em data-effect="italics">B</em><sub>2</sub> = brown on second pick; doubles = both picks are the same colour.</p> </div> <table summary="Blank empirical results table with the first column designated for with replacement and the second column listed for without replacement. 24 empty cells."><caption><span data-type="title">Empirical Results</span></caption> <thead><tr><th>With Replacement</th> <th>Without Replacement</th> </tr> </thead> <tbody><tr><td>( __ , __ ) ( __ , __ )</td> <td>( __ , __ ) ( __ , __ )</td> </tr> <tr><td>( __ , __ ) ( __ , __ )</td> <td>( __ , __ ) ( __ , __ )</td> </tr> <tr><td>( __ , __ ) ( __ , __ )</td> <td>( __ , __ ) ( __ , __ )</td> </tr> <tr><td>( __ , __ ) ( __ , __ )</td> <td>( __ , __ ) ( __ , __ )</td> </tr> <tr><td>( __ , __ ) ( __ , __ )</td> <td>( __ , __ ) ( __ , __ )</td> </tr> <tr><td>( __ , __ ) ( __ , __ )</td> <td>( __ , __ ) ( __ , __ )</td> </tr> <tr><td>( __ , __ ) ( __ , __ )</td> <td>( __ , __ ) ( __ , __ )</td> </tr> <tr><td>( __ , __ ) ( __ , __ )</td> <td>( __ , __ ) ( __ , __ )</td> </tr> <tr><td>( __ , __ ) ( __ , __ )</td> <td>( __ , __ ) ( __ , __ )</td> </tr> <tr><td>( __ , __ ) ( __ , __ )</td> <td>( __ , __ ) ( __ , __ )</td> </tr> <tr><td>( __ , __ ) ( __ , __ )</td> <td>( __ , __ ) ( __ , __ )</td> </tr> <tr><td>( __ , __ ) ( __ , __ )</td> <td>( __ , __ ) ( __ , __ )</td> </tr> </tbody> </table> <table id="M05_ch03-tbl012" summary=""><caption><span data-type="title">Empirical 
Probabilities</span></caption> <thead><tr><th></th> <th>With Replacement</th> <th>Without Replacement</th> </tr> </thead> <tbody><tr><td><em data-effect="italics">P</em>(2 reds)</td> <td></td> <td></td> </tr> <tr><td><em data-effect="italics">P</em>(<em data-effect="italics">R</em><sub>1</sub><em data-effect="italics">B</em><sub>2</sub> OR <em data-effect="italics">B</em><sub>1</sub><em data-effect="italics">R</em><sub>2</sub>)</td> <td></td> <td></td> </tr> <tr><td><em data-effect="italics">P</em>(<em data-effect="italics">R</em><sub>1</sub> AND <em data-effect="italics">G</em><sub>2</sub>)</td> <td></td> <td></td> </tr> <tr><td><em data-effect="italics">P</em>(<em data-effect="italics">G</em><sub>2</sub>|<em data-effect="italics">R</em><sub>1</sub>)</td> <td></td> <td></td> </tr> <tr><td><em data-effect="italics">P</em>(no yellows)</td> <td></td> <td></td> </tr> <tr><td><em data-effect="italics">P</em>(doubles)</td> <td></td> <td></td> </tr> <tr><td><em data-effect="italics">P</em>(no doubles)</td> <td></td> <td></td> </tr> </tbody> </table> <p id="fs-idp8995504"><span data-type="title">Discussion Questions</span></p> <ol id="element-148"><li>Why are the “With Replacement” and “Without Replacement” probabilities different?</li> <li>Convert <em data-effect="italics">P</em>(no yellows) to decimal format for both Theoretical “With Replacement” and for Empirical “With Replacement”. Round to four decimal places. <ol id="sublist1" type="a"><li>Theoretical “With Replacement”: <em data-effect="italics">P</em>(no yellows) = _______</li> <li>Empirical “With Replacement”: <em data-effect="italics">P</em>(no yellows) = _______</li> <li>Are the decimal values “close”? Did you expect them to be closer together or farther apart? 
Why?</li> </ol> </li> <li>If you increased the number of times you picked two M&amp;Ms to 240 times, why would empirical probability values change?</li> <li>Would this change (see part 3) cause the empirical probabilities and theoretical probabilities to be closer together or farther apart? How do you know?</li> <li>Explain the differences in what <em data-effect="italics">P</em>(<em data-effect="italics">G</em><sub>1</sub> AND <em data-effect="italics">R</em><sub>2</sub>) and <em data-effect="italics">P</em>(<em data-effect="italics">R</em><sub>1</sub>|<em data-effect="italics">G</em><sub>2</sub>) represent. Hint: Think about the sample space for each probability.</li> </ol> </div> </div></div>
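<p>The theoretical "With Replacement" and "Without Replacement" columns in the lab above can be checked with a short Python sketch. The colour counts below are hypothetical placeholders, not data from this lab; substitute the counts from your own Population table. The function name <code>p_two_reds</code> is also an illustrative assumption. Note that Python's <code>Fraction</code> type reduces fractions automatically, whereas the lab asks for unreduced fractions, so use the sketch only to check your work.</p>

```python
from fractions import Fraction

# Hypothetical bag contents (NOT from the lab); replace with your own counts.
counts = {"Y": 10, "G": 8, "BL": 12, "B": 6, "O": 9, "R": 15}


def p_two_reds(counts, replace):
    """Return P(red on the first pick AND red on the second pick)."""
    total = sum(counts.values())
    first = Fraction(counts["R"], total)
    if replace:
        # The first candy goes back, so the second pick sees the same bag.
        second = Fraction(counts["R"], total)
    else:
        # One red candy is gone, so both the red count and the total drop by one.
        second = Fraction(counts["R"] - 1, total - 1)
    return first * second


print(p_two_reds(counts, replace=True))
print(p_two_reds(counts, replace=False))
```

<p>The only difference between the two cases is the second factor: without replacement, the first pick changes the contents of the bag, so the second probability is conditional on the first result.</p>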
<div class="part " id="part-discrete-random-variables"><div class="part-title-wrap"><h3 class="part-number">V</h3><h1 class="part-title">Chapter 5: Discrete Random Variables</h1></div><div class="ugc part-ugc"></div></div>
<div class="chapter standard" id="chapter-introduction-16" title="Chapter 5.1: Introduction"><div class="chapter-title-wrap"><h3 class="chapter-number">31</h3><h2 class="chapter-title"><span class="display-none">Chapter 5.1: Introduction</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-idp163541072" class="splash"><div class="bc-figcaption figcaption">You can use probability and discrete random variables to calculate the likelihood of lightning striking the ground five times during a half-hour thunderstorm. (Credit: Leszek Leszczynski)</div> <p><span id="fs-idp141532880" data-type="media" data-alt="This photo shows branch lightning coming from a dark cloud and hitting the ground."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C04_CON-1.jpg" alt="This photo shows branch lightning coming from a dark cloud and hitting the ground." width="500" data-media-type="image/jpg" /></span></p> </div> <div id="fs-idp120416736" class="chapter-objectives" data-type="note" data-has-label="true" data-label=""><div data-type="title">Chapter Objectives</div> <p>By the end of this chapter, the student should be able to:</p> <ul id="list14235"><li>Recognize and understand discrete probability distribution functions, in general.</li> <li>Calculate and interpret expected values.</li> <li>Recognize the binomial probability distribution and apply it appropriately.</li> <li>Recognize the Poisson probability distribution and apply it appropriately.</li> <li>Recognize the geometric probability distribution and apply it appropriately.</li> <li>Recognize the hypergeometric probability distribution and apply it appropriately.</li> <li>Classify discrete word problems by their distributions.</li> </ul> </div> <p id="introch04">A student takes a ten-question, true-false quiz. Because of a busy schedule, the student could not study and so guesses randomly at each answer. 
What is the probability of the student passing the test with at least a 70%?</p> <p id="element-804">Small companies might be interested in the number of long-distance phone calls their employees make during the peak time of the day. Suppose the average is 20 calls. What is the probability that the employees make more than 20 long-distance phone calls during the peak time?</p> <p>These two examples illustrate two different types of probability problems involving discrete random variables. Recall that discrete data are data that you can count. A <span data-type="term">random variable</span> describes the outcomes of a statistical experiment in words. The values of a random variable can vary with each repetition of an experiment.</p> <div id="randvarnot" class="bc-section section" data-depth="1"><h3 data-type="title">Random Variable Notation</h3> <p>Upper case letters such as <em data-effect="italics">X</em> or <em data-effect="italics">Y</em> denote a random variable. Lower case letters like <em data-effect="italics">x</em> or <em data-effect="italics">y</em> denote the value of a random variable. If <strong><em data-effect="italics">X</em> is a random variable, then <em data-effect="italics">X</em> is written in words, and <em data-effect="italics">x</em> is given as a number.</strong></p> <p>For example, let <em data-effect="italics">X</em> = the number of heads you get when you toss three fair coins. The sample space for the toss of three fair coins is <em data-effect="italics">TTT</em>; <em data-effect="italics">THH</em>; <em data-effect="italics">HTH</em>; <em data-effect="italics">HHT</em>; <em data-effect="italics">HTT</em>; <em data-effect="italics">THT</em>; <em data-effect="italics">TTH</em>; <em data-effect="italics">HHH</em>. Then, <em data-effect="italics">x</em> = 0, 1, 2, 3. <em data-effect="italics">X</em> is in words and <em data-effect="italics">x</em> is a number. 
Notice that for this example, the <em data-effect="italics">x</em> values are countable outcomes. Because you can count the possible values that <em data-effect="italics">X</em> can take on (the <em data-effect="italics">x</em> values 0, 1, 2, 3) and the outcomes are random, <em data-effect="italics">X</em> is a discrete random variable.</p> </div> <div id="fs-idp37582368" class="statistics collab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Exercise</div> <p>Toss a coin ten times and record the number of heads. After all members of the class have completed the experiment (tossed a coin ten times and counted the number of heads), fill in <a class="autogenerated-content" href="#M01_Ch04_tbl001">(Figure)</a>. Let <em data-effect="italics">X</em> = the number of heads in ten tosses of the coin.</p> <table id="M01_Ch04_tbl001" summary="Table showing x, frequency of x and relative frequency of x. x = the number of heads in 10 tosses of a fair coin."><thead><tr><th><strong><em data-effect="italics">x</em></strong></th> <th><strong>Frequency of <em data-effect="italics">x</em></strong></th> <th><strong>Relative Frequency of <em data-effect="italics">x</em></strong></th> </tr> </thead> <tbody><tr><td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> </tr> </tbody> </table> <ol type="a"><li>Which value(s) of <em data-effect="italics">x</em> occurred most frequently?</li> <li>If you tossed the coin 1,000 times, what values could <em data-effect="italics">x</em> take on? 
Which value(s) of <em data-effect="italics">x</em> do you think would occur most frequently?</li> <li>What does the relative frequency column sum to?</li> </ol> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="randvar"><dt>Random Variable (RV)</dt> <dd id="id1167466326893">a characteristic of interest in a population being studied; the common notation for variables is upper case Latin letters <em data-effect="italics">X</em>, <em data-effect="italics">Y</em>, <em data-effect="italics">Z</em>,&#8230;; the common notation for a specific value from the domain (set of all possible values of a variable) is lower case Latin letters <em data-effect="italics">x, y,</em> and <em data-effect="italics">z</em>. For example, if <em data-effect="italics">X</em> is the number of children in a family, then <em data-effect="italics">x</em> represents a specific integer 0, 1, 2, 3,&#8230;. Variables in statistics differ from variables in intermediate algebra in the two following ways. <ul id="arrvee"><li>The domain of the random variable (RV) is not necessarily a numerical set; the domain may be expressed in words; for example, if <em data-effect="italics">X</em> = hair color then the domain is {black, blond, gray, green, orange}.</li> <li>We can tell what specific value <em data-effect="italics">x</em> the random variable <em data-effect="italics">X</em> takes only after performing the experiment.</li> </ul> </dd> </dl> </div> </div></div>
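<p>The three-coin example from this chapter can be reproduced with a short Python sketch. It lists the sample space, applies the random variable <em data-effect="italics">X</em> = the number of heads to each outcome, and tallies the resulting <em data-effect="italics">x</em> values. The variable names (<code>sample_space</code>, <code>tally</code>, <code>pdf</code>) are illustrative choices, and the final step assumes fair coins, so that all eight outcomes are equally likely.</p>

```python
from collections import Counter
from itertools import product

# Every ordered outcome of tossing three coins (H = heads, T = tails).
sample_space = ["".join(toss) for toss in product("HT", repeat=3)]

# The random variable X assigns to each outcome its number of heads.
heads_per_outcome = [outcome.count("H") for outcome in sample_space]

# Count how many outcomes produce each value x.
tally = Counter(heads_per_outcome)

# Assuming fair coins, each of the 8 outcomes is equally likely,
# so P(x) is the share of outcomes that give that x.
pdf = {x: tally[x] / len(sample_space) for x in sorted(tally)}

for x, p in pdf.items():
    print(x, p)
```

<p>The keys of <code>pdf</code> are the values 0, 1, 2, 3 that <em data-effect="italics">x</em> can take on. They are countable, which is what makes <em data-effect="italics">X</em> a discrete random variable.</p>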
<div class="chapter standard" id="chapter-probability-distribution-function-pdf-for-a-discrete-random-variable" title="Chapter 5.2: Probability Distribution Function (PDF) for a Discrete Random Variable"><div class="chapter-title-wrap"><h3 class="chapter-number">32</h3><h2 class="chapter-title"><span class="display-none">Chapter 5.2: Probability Distribution Function (PDF) for a Discrete Random Variable</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>A discrete <span data-type="term">probability distribution function</span> has two characteristics:</p> <ol id="element-yu2" type="1"><li>Each probability is between zero and one, inclusive.</li> <li>The sum of the probabilities is one.</li> </ol> <div id="example1" class="textbox textbox--examples" data-type="example"><p id="element-165">A child psychologist is interested in the number of times a newborn baby&#8217;s crying wakes its mother after midnight. For a random sample of 50 mothers, the following information was obtained. Let <em data-effect="italics">X</em> = the number of times per week a newborn baby&#8217;s crying wakes its mother after midnight. 
For this example, <em data-effect="italics">x</em> = 0, 1, 2, 3, 4, 5.</p> <p id="fs-idp70402976"><em data-effect="italics">P</em>(<em data-effect="italics">x</em>) = probability that <em data-effect="italics">X</em> takes on a value <em data-effect="italics">x</em>.</p> <table id="M02_Ch04_tbl001" summary="PDF table for the number of times a newborn wakes its mother after midnight and probabilities."><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 0) = \(\frac{2}{50}\)</td> </tr> <tr><td>1</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 1) = \(\frac{11}{50}\)</td> </tr> <tr><td>2</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 2) = \(\frac{23}{50}\)</td> </tr> <tr><td>3</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 3) = \(\frac{9}{50}\)</td> </tr> <tr><td>4</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 4) = \(\frac{4}{50}\)</td> </tr> <tr><td>5</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 5) = \(\frac{1}{50}\)</td> </tr> </tbody> </table> <p id="element-260"><em data-effect="italics">X</em> takes on the values 0, 1, 2, 3, 4, 5. 
This is a discrete PDF because:</p> <ol id="enumprac" type="a"><li>Each <em data-effect="italics">P</em>(<em data-effect="italics">x</em>) is between zero and one, inclusive.</li> <li>The sum of the probabilities is one, that is,</li> </ol> <div id="fifsum" data-type="equation">\(\frac{2}{50}+\frac{11}{50}+\frac{23}{50}+\frac{9}{50}+\frac{4}{50}+\frac{1}{50}=1\)</div> </div> <div id="fs-idm96796576" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp42457440" data-type="exercise"><div id="fs-idp103638368" data-type="problem"><p id="fs-idp104932624">A hospital researcher is interested in the number of times the average post-op patient will ring the nurse during a 12-hour shift. For a random sample of 50 patients, the following information was obtained. Let <em data-effect="italics">X</em> = the number of times a patient rings the nurse during a 12-hour shift. For this exercise, <em data-effect="italics">x</em> = 0, 1, 2, 3, 4, 5. <em data-effect="italics">P</em>(<em data-effect="italics">x</em>) = the probability that <em data-effect="italics">X</em> takes on value <em data-effect="italics">x</em>. 
Why is this a discrete probability distribution function (two reasons)?</p> <table id="fs-idp71433824" summary="Exercise 1 Table"><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 0) = \(\frac{4}{50}\)</td> </tr> <tr><td>1</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 1) = \(\frac{8}{50}\)</td> </tr> <tr><td>2</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 2) = \(\frac{16}{50}\)</td> </tr> <tr><td>3</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 3) = \(\frac{14}{50}\)</td> </tr> <tr><td>4</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 4) = \(\frac{6}{50}\)</td> </tr> <tr><td>5</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 5) = \(\frac{2}{50}\)</td> </tr> </tbody> </table> </div> </div> </div> <div id="element-852" class="textbox textbox--examples" data-type="example"><p id="element-500">Suppose Nancy has classes <strong>three days</strong> a week. She attends classes three days a week <strong>80%</strong> of the time, <strong>two days 15%</strong> of the time, <strong>one day 4%</strong> of the time, and <strong>no days 1%</strong> of the time. Suppose one week is randomly selected.</p> <p>&nbsp;</p> <div data-type="exercise"><div id="eip-idp158961632" data-type="problem"><p id="eip-idp110729456">a. Let <em data-effect="italics">X</em> = the number of days Nancy ____________________.</p> </div> <div id="eip-idp214323360" data-type="solution"><p id="eip-idp133021120">a. Let <em data-effect="italics">X</em> = the number of days Nancy attends class per week.</p> <p>&nbsp;</p> </div> </div> <div id="eip-694" data-type="exercise"><div id="eip-idp2555296" data-type="problem"><p id="eip-idm52442560">b. 
<em data-effect="italics">X</em> takes on what values?</p> </div> <div id="eip-idm30850864" data-type="solution"><p id="eip-idm11988624">b. 0, 1, 2, and 3</p> <p>&nbsp;</p> </div> </div> <div id="eip-439" data-type="exercise"><div id="eip-idp58531312" data-type="problem"><p id="eip-idm2356304">c. Suppose one week is randomly chosen. Construct a probability distribution table (called a PDF table) like the one in <a class="autogenerated-content" href="#example1">(Figure)</a>. The table should have two columns labeled <em data-effect="italics">x</em> and <em data-effect="italics">P</em>(<em data-effect="italics">x</em>). What does the <em data-effect="italics">P</em>(<em data-effect="italics">x</em>) column sum to?</p> </div> <div id="eip-idp64718704" data-type="solution"><p id="eip-idp146953184">c.</p> <table id="eip-idp64719200" summary="PDF table of the number of times Nancy attends class per week and probabilities."><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.01</td> </tr> <tr><td>1</td> <td>0.04</td> </tr> <tr><td>2</td> <td>0.15</td> </tr> <tr><td>3</td> <td>0.80</td> </tr> </tbody> </table> </div> </div> </div> <div id="fs-idm157100592" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp17767152" data-type="exercise"><div id="fs-idm59642592" data-type="problem"><p id="fs-idp10711280">Jeremiah has basketball practice two days a week. Ninety percent of the time, he attends both practices. Eight percent of the time, he attends one practice. Two percent of the time, he does not attend either practice. 
What is <em data-effect="italics">X</em> and what values does it take on?</p> </div> </div> </div> <div id="fs-idp97044640" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm43618304">The characteristics of a probability distribution function (PDF) for a discrete random variable are as follows:</p> <ol id="fs-idm39706128" type="1"><li>Each probability is between zero and one, inclusive (<em data-effect="italics">inclusive</em> means to include zero and one).</li> <li>The sum of the probabilities is one.</li> </ol> </div> <div id="fs-idp69020960" class="practice" data-depth="1"><h3 data-type="title"><em data-effect="italics">Use the following information to answer the next five exercises:</em> A company wants to evaluate its attrition rate, in other words, how long new hires stay with the company. Over the years, they have established the following probability distribution. Let <em data-effect="italics">X</em> = the number of years a new hire will stay with the company. Let <em data-effect="italics">P</em>(<em data-effect="italics">x</em>) = the probability that a new hire will stay with the company <em data-effect="italics">x</em> years. Complete <a class="autogenerated-content" href="#M02_Ch03_tbl004">(Figure)</a> using the data provided.</h3> </div> <div id="fs-idp35913488" data-type="solution"><table id="fs-idm128424048" summary=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.12</td> </tr> <tr><td>1</td> <td>0.18</td> </tr> <tr><td>2</td> <td>0.30</td> </tr> <tr><td>3</td> <td>0.15</td> </tr> <tr><td>4</td> <td>0.10</td> </tr> <tr><td>5</td> <td>0.10</td> </tr> <tr><td>6</td> <td>0.05</td> </tr> </tbody> </table> </div> <div 
id="fs-idm3556768" data-type="exercise"><div id="fs-idm46583824" data-type="problem"><p id="fs-idm39567616"><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 4) = _______</p> </div> <p>0.10</p> </div> <div id="fs-idp21614176" data-type="exercise"><div id="fs-idp34115424" data-type="problem"><p id="fs-idp9314928"><em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≥ 5) = _______</p> </div> <div id="fs-idm43709152" data-type="solution"><p id="fs-idp98025120">0.10 + 0.05 = 0.15</p> </div> </div> <div id="fs-idp88464496" data-type="exercise"><div id="fs-idm14331664" data-type="problem"><p id="fs-idm3848192">On average, how long would you expect a new hire to stay with the company?</p> </div> <p>0 + 0.18 + 0.60 + 0.45 + 0.40 + 0.50 + 0.30 = 2.43 years</p> </div> <div id="fs-idp18018320" data-type="exercise"><div id="fs-idm39505296" data-type="problem"><p id="fs-idm15015472">What does the column “<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)” sum to?</p> </div> <div id="fs-idm46445984" data-type="solution"><p id="fs-idp107818160">1</p> </div> </div> <p id="fs-idm61093248"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next six exercises:</em> A baker is deciding how many batches of muffins to make to sell in his bakery. He wants to make enough to sell every one and no fewer. 
Through observation, the baker has established a probability distribution.</p> <table id="fs-idm43289680" summary=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>1</td> <td>0.15</td> </tr> <tr><td>2</td> <td>0.35</td> </tr> <tr><td>3</td> <td>0.40</td> </tr> <tr><td>4</td> <td>0.10</td> </tr> </tbody> </table> <div id="fs-idm2415264" data-type="exercise"><div id="fs-idm15926960" data-type="problem"><p id="fs-idm161865296">Define the random variable <em data-effect="italics">X</em>.</p> </div> <p>Let X = the number of batches that the baker will sell.</p> </div> <div id="fs-idm14204160" data-type="exercise"><div id="fs-idp91033728" data-type="problem"><p id="fs-idp57653024">What is the probability the baker will sell more than one batch? <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 1) = _______</p> </div> <div id="fs-idm77654992" data-type="solution"><p id="fs-idp67916960">0.35 + 0.40 + 0.10 = 0.85</p> </div> </div> <div id="fs-idp60250832" data-type="exercise"><div id="fs-idm40729216" data-type="problem"><p id="fs-idm50895232">What is the probability the baker will sell exactly one batch? <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 1) = _______</p> </div> <p>0.15</p> </div> <div id="fs-idp100680048" data-type="exercise"><div id="fs-idm6782176" data-type="problem"><p id="fs-idm12192384">On average, how many batches should the baker make?</p> </div> <div id="fs-idm19316912" data-type="solution"><p id="fs-idp105326384">1(0.15) + 2(0.35) + 3(0.40) + 4(0.10) = 0.15 + 0.70 + 1.20 + 0.40 = 2.45</p> </div> </div> <p id="fs-idp26455296"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next four exercises:</em> Ellen has music practice three days a week. 
She practices for all of the three days 85% of the time, two days 8% of the time, one day 4% of the time, and no days 3% of the time. One week is selected at random.</p> <div id="fs-idp80631856" data-type="exercise"><div id="fs-idm38076272" data-type="problem"><p id="fs-idp104793536">Define the random variable <em data-effect="italics">X</em>.</p> </div> <p>Let X = the number of days Ellen attends practice per week.</p> </div> <div id="fs-idp105752304" data-type="exercise"><div id="fs-idp98145520" data-type="problem"><p id="fs-idp107988576">Construct a probability distribution table for the data.</p> </div> <div id="fs-idp99205248" data-type="solution"><table id="fs-idp25484752" summary="Table..."><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.03</td> </tr> <tr><td>1</td> <td>0.04</td> </tr> <tr><td>2</td> <td>0.08</td> </tr> <tr><td>3</td> <td>0.85</td> </tr> </tbody> </table> </div> </div> <div id="fs-idm12290976" data-type="exercise"><div id="fs-idp102593056" data-type="problem"><p id="fs-idm54887552">We know that for a probability distribution function to be discrete, it must have two characteristics. One is that the sum of the probabilities is one. What is the other characteristic?</p> </div> <p>Each probability is between zero and one, inclusive.</p> </div> <p id="fs-idp62348192"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next five exercises:</em> Javier volunteers in community events each month. He does not do more than five events in a month. 
He attends exactly five events 35% of the time, four events 25% of the time, three events 20% of the time, two events 10% of the time, one event 5% of the time, and no events 5% of the time.</p> <div id="fs-idp45989040" data-type="exercise"><div id="fs-idp107021152" data-type="problem"><p id="fs-idp1699120">Define the random variable <em data-effect="italics">X</em>.</p> </div> <div id="fs-idp75230208" data-type="solution"><p id="fs-idp99642016">Let <em data-effect="italics">X</em> = the number of events Javier volunteers for each month.</p> </div> </div> <div id="fs-idp99208272" data-type="exercise"><div id="fs-idp102814688" data-type="problem"><p id="fs-idm38746304">What values does <em data-effect="italics">x</em> take on?</p> </div> <p>0, 1, 2, 3, 4, 5</p> </div> <div id="fs-idm44182368" data-type="exercise"><div id="fs-idm60355136" data-type="problem"><p id="fs-idp106732192">Construct a PDF table.</p> </div> <div id="fs-idp9913600" data-type="solution"><table id="fs-idm82394208" summary="Table..."><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.05</td> </tr> <tr><td>1</td> <td>0.05</td> </tr> <tr><td>2</td> <td>0.10</td> </tr> <tr><td>3</td> <td>0.20</td> </tr> <tr><td>4</td> <td>0.25</td> </tr> <tr><td>5</td> <td>0.35</td> </tr> </tbody> </table> </div> </div> <div id="fs-idm51474800" data-type="exercise"><div id="fs-idp105026400" data-type="problem"><p id="fs-idm62777200">Find the probability that Javier volunteers for fewer than three events each month. 
<em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 3) = _______</p> </div> <p>0.05 + 0.05 + 0.10 = 0.20</p> </div> <div id="fs-idm38193104" data-type="exercise"><div id="fs-idm1831824" data-type="problem"><p id="fs-idp54225264">Find the probability that Javier volunteers for at least one event each month. 
<em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 0) = _______</p> </div> <div id="fs-idm39812848" data-type="solution"><p id="fs-idp75344688">1 – 0.05 = 0.95</p> </div> </div> <div id="fs-idp99695744" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div id="fs-idp971520" data-type="exercise"><div id="fs-idp65369008" data-type="problem"><p id="fs-idp99566944">1)  Suppose that the PDF for the number of years it takes to earn a Bachelor of Science (B.S.) degree is given in <a class="autogenerated-content" href="#M03_Ch04_tbl010">(Figure)</a>.</p> <table id="M03_Ch04_tbl010" summary="Exercise 30 Table"><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>3</td> <td>0.05</td> </tr> <tr><td>4</td> <td>0.40</td> </tr> <tr><td>5</td> <td>0.30</td> </tr> <tr><td>6</td> <td>0.15</td> </tr> <tr><td>7</td> <td>0.10</td> </tr> </tbody> </table> <ol id="element-99" type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li>What does it mean that the values zero, one, and two are not included for <em data-effect="italics">x</em> in the PDF?</li> </ol> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="pdfelab"><dt>Probability Distribution Function (PDF)</dt> <dd id="id1165407642895">a mathematical description of a discrete random variable (<em data-effect="italics">RV</em>), given either in the form of an equation (formula) or in the form of a table listing all the possible outcomes of an experiment and the probability associated with each outcome.</dd> </dl> </div> </div></div>
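<p>The two characteristics of a discrete PDF, and the "on average" questions in the exercises above, can be checked with a minimal Python sketch. It uses the attrition distribution from the exercises. The function names <code>is_discrete_pdf</code> and <code>expected_value</code> are illustrative assumptions, not part of any library, and the sum is compared with a tolerance because decimal probabilities are not stored exactly in floating point.</p>

```python
import math

# Attrition distribution from the exercises: x = years a new hire stays, P(x).
attrition_pdf = {0: 0.12, 1: 0.18, 2: 0.30, 3: 0.15, 4: 0.10, 5: 0.10, 6: 0.05}


def is_discrete_pdf(pdf):
    """Check both characteristics: each P(x) lies in [0, 1] and the total is 1."""
    in_range = all(p >= 0 and 1 >= p for p in pdf.values())
    # Use a tolerance: decimal fractions are not exact in floating point.
    sums_to_one = math.isclose(sum(pdf.values()), 1.0)
    return in_range and sums_to_one


def expected_value(pdf):
    """Multiply each value x by its probability P(x) and add the products."""
    return sum(x * p for x, p in pdf.items())


print(is_discrete_pdf(attrition_pdf))
print(round(expected_value(attrition_pdf), 2))
```

<p>The same two functions can be applied to the baker, Ellen, and Javier tables by swapping in their <em data-effect="italics">x</em> and <em data-effect="italics">P</em>(<em data-effect="italics">x</em>) values.</p>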
<div class="chapter standard" id="chapter-mean-or-expected-value-and-standard-deviation" title="Chapter 5.3: Mean or Expected Value and Standard Deviation"><div class="chapter-title-wrap"><h3 class="chapter-number">33</h3><h2 class="chapter-title"><span class="display-none">Chapter 5.3: Mean or Expected Value and Standard Deviation</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>The <span data-type="term">expected value</span> is often referred to as the <strong>&#8220;long-term&#8221; average or mean</strong>. This means that if you repeat an experiment over and over, you would <strong>expect</strong> this average in the long term.</p> <p id="fs-idm122105344">You toss a coin and record the result. What is the probability that the result is heads? If you flip a coin two times, does probability tell you that these flips will result in one head and one tail? You might toss a fair coin ten times and record nine heads. As you learned in <a class="autogenerated-content" href="/contents/326ee2e0-0ccd-46ae-a776-f8857a5dad4c">(Figure)</a>, probability does not describe the short-term results of an experiment. It gives information about what can be expected in the long term. To demonstrate this, Karl Pearson once tossed a fair coin 24,000 times! He recorded the results of each toss, obtaining heads 12,012 times. <strong data-effect="bold">In his experiment, Pearson illustrated the Law of Large Numbers</strong>.</p> <p id="fs-idm32245968"><strong data-effect="bold">The Law of Large Numbers</strong> states that, as the number of trials in a probability experiment increases, the difference between the theoretical probability of an event and the relative frequency approaches zero <strong data-effect="bold">(the theoretical probability and the relative frequency get closer and closer together)</strong>. When evaluating the long-term results of statistical experiments, we often want to know the “average” outcome. 
This “long-term average” is known as the <span data-type="term">mean</span> or <span data-type="term">expected value</span> of the experiment and is denoted by the Greek letter <em data-effect="italics">μ</em>. In other words, after conducting many trials of an experiment, you would expect this average value.</p> <div id="id8737934" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idp95250624">To find the expected value or long-term average, <em data-effect="italics">μ</em>, simply multiply each value of the random variable by its probability and add the products.</p> </div> <div id="fs-idp56252608" class="textbox textbox--examples" data-type="example"><p id="fs-idm190895152">A men&#8217;s soccer team plays soccer zero, one, or two days a week. The probability that they play zero days is 0.2, the probability that they play one day is 0.5, and the probability that they play two days is 0.3. Find the long-term average or expected value, <em data-effect="italics">μ</em>, of the number of days per week the men&#8217;s soccer team plays soccer.</p> <p id="element-126">To do the problem, first let the random variable <em data-effect="italics">X</em> = the number of days the men&#8217;s soccer team plays soccer per week. <em data-effect="italics">X</em> takes on the values 0, 1, 2. Construct a PDF table adding a column <em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>). In this column, you will multiply each <em data-effect="italics">x</em> value by its probability.</p> <table id="element-749" summary="The PDF table contains the number of days a men's soccer team plays soccer per week and their probabilities."><caption><span data-type="title">Expected Value Table. This table is called an expected value table. 
The table helps you calculate the expected value or long-term average.</span></caption> <thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.2</td> <td>(0)(0.2) = 0</td> </tr> <tr><td>1</td> <td>0.5</td> <td>(1)(0.5) = 0.5</td> </tr> <tr><td>2</td> <td>0.3</td> <td>(2)(0.3) = 0.6</td> </tr> </tbody> </table> <p id="fs-idm87467792">Add the last column <em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>) to find the long term average or expected value: (0)(0.2) + (1)(0.5) + (2)(0.3) = 0 + 0.5 + 0.6 = 1.1.</p> <p>The expected value is 1.1. The men&#8217;s soccer team would, on the average, expect to play soccer 1.1 days per week. The number 1.1 is the long-term average or expected value if the men&#8217;s soccer team plays soccer week after week after week. We say <em data-effect="italics">μ</em> = 1.1.</p> </div> <div id="fs-idm82770544" class="textbox textbox--examples" data-type="example"><p id="fs-idm122112">Find the expected value of the number of times a newborn baby&#8217;s crying wakes its mother after midnight. The expected value is the expected number of times per week a newborn baby&#8217;s crying wakes its mother after midnight. 
Calculate the standard deviation of the variable as well.</p> <table summary="The PDF table contains the number of times a newborn wakes its mother after midnight, their probabilities and a column for each number multiplied by its probability."><caption><span data-type="title">You expect a newborn to wake its mother after midnight 2.1 times per week, on the average.</span></caption> <thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th>(<em data-effect="italics">x</em> – <em data-effect="italics">μ</em>)<sup>2</sup> ⋅ <em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 0) = \(\frac{2}{50}\)</td> <td>(0)\(\left(\frac{2}{50}\right)\) = 0</td> <td>(0 – 2.1)<sup>2</sup> ⋅ 0.04 = 0.1764</td> </tr> <tr><td>1</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 1) = \(\left(\frac{11}{50}\right)\)</td> <td>(1)\(\left(\frac{11}{50}\right)\) = \(\frac{11}{50}\)</td> <td>(1 – 2.1)<sup>2</sup> ⋅ 0.22 = 0.2662</td> </tr> <tr><td>2</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 2) = \(\frac{23}{50}\)</td> <td>(2)\(\left(\frac{23}{50}\right)\) = \(\frac{46}{50}\)</td> <td>(2 – 2.1)<sup>2</sup> ⋅ 0.46 = 0.0046</td> </tr> <tr><td>3</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 3) = \(\frac{9}{50}\)</td> <td>(3)\(\left(\frac{9}{50}\right)\) = \(\frac{27}{50}\)</td> <td>(3 – 2.1)<sup>2</sup> ⋅ 0.18 = 0.1458</td> </tr> <tr><td>4</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 4) = \(\frac{4}{50}\)</td> <td>(4)\(\left(\frac{4}{50}\right)\) = \(\frac{16}{50}\)</td> <td>(4 – 2.1)<sup>2</sup> ⋅ 0.08 = 0.2888</td> </tr> <tr><td>5</td> <td><em 
data-effect="italics">P</em>(<em data-effect="italics">x</em> = 5) = \(\frac{1}{50}\)</td> <td>(5)\(\left(\frac{1}{50}\right)\) = \(\frac{5}{50}\)</td> <td>(5 – 2.1)<sup>2</sup> ⋅ 0.02 = 0.1682</td> </tr> </tbody> </table> <p id="fs-idp11618480">Add the values in the third column of the table to find the expected value of <em data-effect="italics">X</em>: <span data-type="newline" data-count="1"><br /> </span><em data-effect="italics">μ</em> = Expected Value = \(\frac{105}{50}\) = 2.1</p> <p id="fs-idm128391792">Use <em data-effect="italics">μ</em> to complete the table. The fourth column of this table will provide the values you need to calculate the standard deviation. For each value <em data-effect="italics">x</em>, multiply the square of its deviation by its probability. (Each deviation has the form <em data-effect="italics">x</em> – <em data-effect="italics">μ</em>.)</p> <p id="fs-idm66531168">Add the values in the fourth column of the table:</p> <p id="fs-idm70422448">0.1764 + 0.2662 + 0.0046 + 0.1458 + 0.2888 + 0.1682 = 1.05</p> <p id="fs-idm146103552">The standard deviation of <em data-effect="italics">X</em> is the square root of this sum: <em data-effect="italics">σ</em> = \(\sqrt{1.05}\) ≈ 1.0247</p> <p id="eip-332">The mean, <em data-effect="italics">μ</em>, of a discrete probability function is the expected value.</p> <div id="eip-226" data-type="equation">\(\mu =\sum \left(x\bullet P\left(x\right)\right)\)</div> <p id="eip-436">The standard deviation, <em data-effect="italics">σ</em>, of the PDF is the square root of the variance.</p> <div data-type="equation">\(\sigma =\sqrt{\sum \left[{\left(x-\mu \right)}^{2}\bullet P\left(x\right)\right]}\)</div> <p id="eip-988">When all outcomes in the probability distribution are equally likely, these formulas coincide with the mean and standard deviation of the set of possible outcomes.</p> </div> <div id="fs-idm80414512" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> 
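<p>Before you start, here is an optional cross-check on the newborn example above. It uses the shortcut form of the variance, \({\sigma }^{2}=\sum {x}^{2}P\left(x\right)-{\mu }^{2}\), which is a standard identity and not a formula stated in this section, so treat it as a sketch for checking hand calculations rather than as the method the text uses. With the counts out of 50 from that table and <em data-effect="italics">μ</em> = 2.1:</p>
<div data-type="equation">\(\sum {x}^{2}P\left(x\right)=\frac{{0}^{2}\left(2\right)+{1}^{2}\left(11\right)+{2}^{2}\left(23\right)+{3}^{2}\left(9\right)+{4}^{2}\left(4\right)+{5}^{2}\left(1\right)}{50}=\frac{273}{50}=5.46\)</div>
<div data-type="equation">\({\sigma }^{2}=5.46-{\left(2.1\right)}^{2}=5.46-4.41=1.05\)</div>
<p>This agrees with the fourth-column sum of 1.05, so \(\sigma =\sqrt{1.05}\approx 1.0247\) either way.</p>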
<div id="fs-idm131173600" data-type="exercise"><div id="fs-idm112686784" data-type="problem"><p id="fs-idm86997008">A hospital researcher is interested in the number of times the average post-op patient will ring the nurse during a 12-hour shift. For a random sample of 50 patients, the following information was obtained. What is the expected value?</p> <table id="fs-idm56815952" summary="The number of times the average post-op patient will ring the nurse during a 12-hour shift"><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 0) = \(\frac{4}{50}\)</td> </tr> <tr><td>1</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 1) = \(\frac{8}{50}\)</td> </tr> <tr><td>2</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 2) = \(\frac{16}{50}\)</td> </tr> <tr><td>3</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 3) = \(\frac{14}{50}\)</td> </tr> <tr><td>4</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 4) = \(\frac{6}{50}\)</td> </tr> <tr><td>5</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 5) = \(\frac{2}{50}\)</td> </tr> </tbody> </table> </div> </div> </div> <div id="element-116" class="textbox textbox--examples" data-type="example"><p>Suppose you play a game of chance in which five numbers are chosen from 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. A computer randomly selects five numbers from zero to nine with replacement. You pay \$2 to play and could profit \$100,000 if you match all five numbers in order (you get your \$2 back plus \$100,000). 
Over the long term, what is your <strong>expected</strong> profit from playing the game?</p> <p>To do this problem, set up an expected value table for the amount of money you can profit.</p> <p id="element-382">Let <em data-effect="italics">X</em> = the amount of money you profit. The values of <em data-effect="italics">x</em> are not 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Since you are interested in your profit (or loss), the values of <em data-effect="italics">x</em> are 100,000 dollars and −2 dollars.</p> <p id="element-996">To win, you must get all five numbers correct, in order. The probability of choosing one correct number is \(\frac{1}{10}\) because there are ten numbers. You may choose a number more than once. The probability of choosing all five numbers correctly and in order is</p> <div data-type="equation">\(\left(\frac{1}{10}\right)\left(\frac{1}{10}\right)\left(\frac{1}{10}\right)\left(\frac{1}{10}\right)\left(\frac{1}{10}\right)=\left(1\right)\left({10}^{-5}\right)=0.00001.\)</div> <p>Therefore, the probability of winning is 0.00001 and the probability of losing is</p> <div id="eip-755" data-type="equation">\(1-0.00001=0.99999.\)</div> <p id="element-839">The expected value table is as follows:</p> <table id="element-480" summary="In the table, to find the expected value, add the last column."><caption><span data-type="title">Add the last column. –1.99998 + 1 = –0.99998</span></caption> <thead><tr><th></th> <th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>Loss</td> <td>–2</td> <td>0.99999</td> <td>(–2)(0.99999) = –1.99998</td> </tr> <tr><td>Profit</td> <td>100,000</td> <td>0.00001</td> <td>(100000)(0.00001) = 1</td> </tr> </tbody> </table> <p>Since –0.99998 is about –1, you would, on average, expect to lose approximately \$1 for each game you play. 
However, each time you play, you either lose \$2 or profit \$100,000. The \$1 is the average or expected LOSS per game after playing this game over and over.</p> </div> <div id="fs-idm7487840" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm58858880" data-type="exercise"><div id="fs-idm58346384" data-type="problem"><p id="fs-idm147804448">You are playing a game of chance in which four cards are drawn from a standard deck of 52 cards. You guess the suit of each card before it is drawn. The cards are replaced in the deck on each draw. You pay \$1 to play. If you guess the right suit every time, you get your money back and \$256. What is your expected profit of playing the game over the long term?</p> </div> </div> </div> <div id="element-341" class="textbox textbox--examples" data-type="example"><p id="element-408">Suppose you play a game with a biased coin. You play each game by tossing the coin once. <em data-effect="italics">P</em>(heads) = \(\frac{2}{3}\) and <em data-effect="italics">P</em>(tails) = \(\frac{1}{3}\). If you toss a head, you pay \$6. If you toss a tail, you win \$10. If you play this game many times, will you come out ahead?</p> <p>&nbsp;</p> <div id="element-193" data-type="exercise"><div id="id14631763" data-type="problem"><p id="element-653">a. Define a random variable <em data-effect="italics">X</em>.</p> </div> <div id="id14631784" data-type="solution"><p>a. <em data-effect="italics">X</em> = amount of profit</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id14631817" data-type="problem"><p>b. 
Complete the following expected value table.</p> <table id="element-644" summary="Problem 2 Table"><thead><tr><th></th> <th><em data-effect="italics">x</em></th> <th>____</th> <th>____</th> </tr> </thead> <tbody><tr><td>WIN</td> <td>10</td> <td>\(\frac{1}{3}\)</td> <td>____</td> </tr> <tr><td>LOSE</td> <td>____</td> <td>____</td> <td>\(\frac{–12}{3}\)</td> </tr> </tbody> </table> </div> <div id="id7273882" data-type="solution"><p id="fs-idm44109456">b.</p> <table id="element-644s" summary="In the PDF table, the outcomes (money you win or lose) of a game played with a biased coin are recorded. There is a column to record the probability and a column to record the outcome multiplied by its probability."><thead><tr><th></th> <th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">xP</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>WIN</td> <td>10</td> <td>\(\frac{1}{3}\)</td> <td>\(\frac{10}{3}\)</td> </tr> <tr><td>LOSE</td> <td>–6</td> <td>\(\frac{2}{3}\)</td> <td>\(\frac{–12}{3}\)</td> </tr> </tbody> </table> <p>&nbsp;</p> <p>&nbsp;</p> </div> </div> <div id="element-737" data-type="exercise"><div id="id10943564" data-type="problem"><p>c. What is the expected value, <em data-effect="italics">μ</em>? Do you come out ahead?</p> </div> <div id="id10943588" data-type="solution"><p id="element-535">c. Add the last column of the table. The expected value <em data-effect="italics">μ</em> = \(\frac{\text{–}2}{3}\). You lose, on average, about 67 cents each time you play the game so you do not come out ahead.</p> </div> </div> </div> <div id="fs-idp157498240" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm123810544" data-type="exercise"><div id="fs-idm43455984" data-type="problem"><p id="fs-idm133743184">Suppose you play a game with a spinner. 
You play each game by spinning the spinner once. <em data-effect="italics">P</em>(red) = \(\frac{2}{5}\), <em data-effect="italics">P</em>(blue) = \(\frac{2}{5}\), and <em data-effect="italics">P</em>(green) = \(\frac{1}{5}\). If you land on red, you pay \$10. If you land on blue, you don&#8217;t pay or win anything. If you land on green, you win \$10. Complete the following expected value table.</p> <table id="fs-idm71508016" summary="Exercise 3 Table 1"><thead><tr><th></th> <th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th></th> </tr> </thead> <tbody><tr><td>Red</td> <td></td> <td></td> <td>\(\text{–}\frac{20}{5}\)</td> </tr> <tr><td>Blue</td> <td></td> <td>\(\frac{2}{5}\)</td> <td></td> </tr> <tr><td>Green</td> <td>10</td> <td></td> <td></td> </tr> </tbody> </table> </div> </div> </div> <p id="element-903">Like data, probability distributions have standard deviations. To calculate the standard deviation (<em data-effect="italics">σ</em>) of a probability distribution, find each deviation from its expected value, square it, multiply it by its probability, add the products, and take the square root. To understand how to do the calculation, look at the table for the number of days per week a men&#8217;s soccer team plays soccer. 
To find the standard deviation, add the entries in the column labeled (<em data-effect="italics">x</em> – <em data-effect="italics">μ</em>)<sup>2</sup><em data-effect="italics">P</em>(<em data-effect="italics">x</em>) and take the square root.</p> <table summary="PDF table with the added columns for expected value and variance."><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th>(<em data-effect="italics">x</em> – <em data-effect="italics">μ</em>)<sup>2</sup><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.2</td> <td>(0)(0.2) = 0</td> <td>(0 – 1.1)<sup>2</sup>(0.2) = 0.242</td> </tr> <tr><td>1</td> <td>0.5</td> <td>(1)(0.5) = 0.5</td> <td>(1 – 1.1)<sup>2</sup>(0.5) = 0.005</td> </tr> <tr><td>2</td> <td>0.3</td> <td>(2)(0.3) = 0.6</td> <td>(2 – 1.1)<sup>2</sup>(0.3) = 0.243</td> </tr> </tbody> </table> <p id="eip-127">Add the last column in the table. 0.242 + 0.005 + 0.243 = 0.490. The standard deviation is the square root of 0.49, or <em data-effect="italics">σ</em> = \(\sqrt{0.49}\) = 0.7</p> <p id="element-873" class="finger">Generally for probability distributions, we use a calculator or a computer to calculate <em data-effect="italics">μ</em> and <em data-effect="italics">σ</em> to reduce roundoff error. For some probability distributions, there are short-cut formulas for calculating <em data-effect="italics">μ</em> and <em data-effect="italics">σ</em>.</p> <div id="fs-idp4384592" class="textbox textbox--examples" data-type="example"><div id="fs-idm87291728" data-type="exercise"><div id="fs-idm69882560" data-type="problem"><p id="fs-idm82328592">Toss a fair, six-sided die twice. Let <em data-effect="italics">X</em> = the number of faces that show an even number. 
Construct a table like <a class="autogenerated-content" href="#fs-idm71508016">(Figure)</a> and calculate the mean <em data-effect="italics">μ</em> and standard deviation <em data-effect="italics">σ</em> of <em data-effect="italics">X</em>.</p> </div> <div id="fs-idm46332752" data-type="solution"><p id="fs-idm86008592">Tossing one fair six-sided die twice has the same sample space as tossing two fair six-sided dice. The sample space has 36 outcomes:</p> <table id="fs-idm21999584" summary=""><tbody><tr><td>(1, 1)</td> <td>(1, 2)</td> <td>(1, 3)</td> <td>(1, 4)</td> <td>(1, 5)</td> <td>(1, 6)</td> </tr> <tr><td>(2, 1)</td> <td>(2, 2)</td> <td>(2, 3)</td> <td>(2, 4)</td> <td>(2, 5)</td> <td>(2, 6)</td> </tr> <tr><td>(3, 1)</td> <td>(3, 2)</td> <td>(3, 3)</td> <td>(3, 4)</td> <td>(3, 5)</td> <td>(3, 6)</td> </tr> <tr><td>(4, 1)</td> <td>(4, 2)</td> <td>(4, 3)</td> <td>(4, 4)</td> <td>(4, 5)</td> <td>(4, 6)</td> </tr> <tr><td>(5, 1)</td> <td>(5, 2)</td> <td>(5, 3)</td> <td>(5, 4)</td> <td>(5, 5)</td> <td>(5, 6)</td> </tr> <tr><td>(6, 1)</td> <td>(6, 2)</td> <td>(6, 3)</td> <td>(6, 4)</td> <td>(6, 5)</td> <td>(6, 6)</td> </tr> </tbody> </table> <p id="fs-idm130208640">Use the sample space to complete the following table:</p> <table id="fs-idp385088" summary="Calculating"><caption><span data-type="title">Calculating <em data-effect="italics">μ</em> and <em data-effect="italics">σ</em>.</span></caption> <thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th>(<em data-effect="italics">x</em> – <em data-effect="italics">μ</em>)<sup>2</sup> \(\cdot \) <em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>\(\frac{9}{36}\)</td> <td>0</td> <td>(0 – 1)<sup>2</sup> ⋅ \(\frac{9}{36}\) = \(\frac{9}{36}\)</td> </tr> <tr><td>1</td> 
<td>\(\frac{18}{36}\)</td> <td>\(\frac{18}{36}\)</td> <td>(1 – 1)<sup>2</sup> ⋅ \(\frac{18}{36}\) = 0</td> </tr> <tr><td>2</td> <td>\(\frac{9}{36}\)</td> <td>\(\frac{18}{36}\)</td> <td>(2 – 1)<sup>2</sup> ⋅ \(\frac{9}{36}\) = \(\frac{9}{36}\)</td> </tr> </tbody> </table> <p id="fs-idm8206048">Add the values in the third column to find the expected value: <em data-effect="italics">μ</em> = \(\frac{36}{36}\) = 1. Use this value to complete the fourth column.</p> <p id="fs-idm151081888">Add the values in the fourth column and take the square root of the sum: <em data-effect="italics">σ</em> = \(\sqrt{\frac{18}{36}}\) ≈ 0.7071.</p> </div> </div> </div> <div id="example1.8" class="textbox textbox--examples" data-type="example"><div id="fs-idm71286192" data-type="exercise"><div id="fs-idm79685344" data-type="problem"><p id="fs-idm64926544">On May 11, 2013 at 9:30 PM, the probability that moderate seismic activity (one moderate earthquake) would occur in the next 48 hours in Iran was about 21.42%. Suppose you make a bet that a moderate earthquake will occur in Iran during this period. If you win the bet, you win \$50. If you lose the bet, you pay \$20. Let <em data-effect="italics">X</em> = the amount of profit from a bet.</p> <p id="fs-idm74848144"><em data-effect="italics">P</em>(win) = <em data-effect="italics">P</em>(one moderate earthquake will occur) = 21.42%</p> <p id="fs-idm122921568"><em data-effect="italics">P</em>(loss) = <em data-effect="italics">P</em>(one moderate earthquake will <em data-effect="italics">not</em> occur) = 100% – 21.42% = 78.58%</p> <p id="fs-idm135341424">If you bet many times, will you come out ahead? Explain your answer in a complete sentence using numbers. What is the standard deviation of <em data-effect="italics">X</em>? 
Construct a table similar to <a class="autogenerated-content" href="#element-226">(Figure)</a> and <a class="autogenerated-content" href="#fs-idm21999584">(Figure)</a> to help you answer these questions.</p> </div> <div id="fs-idm128877280" data-type="solution"><table id="fs-idm70673696" summary=""><thead><tr><th></th> <th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th>(<em data-effect="italics">x</em> – <em data-effect="italics">μ</em>)<sup>2</sup><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>win</td> <td>50</td> <td>0.2142</td> <td>10.71</td> <td>[50 – (–5.006)]<sup>2</sup>(0.2142) = 648.0964</td> </tr> <tr><td>loss</td> <td>–20</td> <td>0.7858</td> <td>–15.716</td> <td>[–20 – (–5.006)]<sup>2</sup>(0.7858) = 176.6636</td> </tr> </tbody> </table> <p id="fs-idm111908496">Mean = Expected Value = 10.71 + (–15.716) = –5.006.</p> <p id="fs-idm85218944">If you make this bet many times under the same conditions, your long-term outcome will be an average <em data-effect="italics">loss</em> of \$5.01 per bet.</p> <p id="fs-idm50457088">\(\text{Standard Deviation = }\sqrt{648.0964+176.6636}\approx 28.7186\)</p> </div> </div> </div> <div id="fs-idp152488176" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm88593344" data-type="exercise"><div id="eip-idp52574576" data-type="problem"><p id="eip-idp52574832">On May 11, 2013 at 9:30 PM, the probability that moderate seismic activity (one moderate earthquake) would occur in the next 48 hours in Japan was about 1.08%. As in <a class="autogenerated-content" href="#example1.8">(Figure)</a>, you bet that a moderate earthquake will occur in Japan during this period. If you win the bet, you win \$100. If you lose the bet, you pay \$10. 
Let <em data-effect="italics">X</em> = the amount of profit from a bet. Find the mean and standard deviation of <em data-effect="italics">X</em>.</p> </div> </div> </div> <p id="element-1">Some of the more common discrete probability functions are binomial, geometric, hypergeometric, and Poisson. Most elementary courses do not cover the geometric, hypergeometric, and Poisson. Your instructor will let you know if he or she wishes to cover these distributions.</p> <p id="element-428">A probability distribution function is a pattern. You try to fit a probability problem into a <strong>pattern</strong> or distribution in order to perform the necessary calculations. These distributions are tools to make solving probability problems easier. Each distribution has its own special characteristics. Learning the characteristics enables you to distinguish among the different distributions.</p> <div id="fs-idp138592560" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idp181053488">Class Catalogue at the Florida State University. Available online at https://apps.oti.fsu.edu/RegistrarCourseLookup/SearchFormLegacy (accessed May 15, 2013).</p> <p id="fs-idp218399840">“World Earthquakes: Live Earthquake News and Highlights,” World Earthquakes, 2012. http://www.world-earthquakes.com/index.php?option=ethq_prediction (accessed May 15, 2013).</p> </div> <div id="fs-idp17559008" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm56942800">The expected value, or mean, of a discrete random variable predicts the long-term results of a statistical experiment that has been repeated many times. 
The standard deviation of a probability distribution is used to measure the variability of possible outcomes.</p> </div> <div id="fs-idp161572768" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idm6265952">Mean or Expected Value: \(\mu =\sum _{x\in X}xP\left(x\right)\)</p> <p id="fs-idm133403024">Standard Deviation: \(\sigma =\sqrt{\sum _{x\in X}{\left(x-\mu \right)}^{2}P\left(x\right)}\)</p> </div> <div id="fs-idm82465552" class="practice" data-depth="1"><div id="fs-idm139526544" data-type="exercise"><div id="fs-idm130140768" data-type="problem"><p id="fs-idm42698096">Complete the expected value table.</p> <table id="fs-idp12281248" summary=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.2</td> <td></td> </tr> <tr><td>1</td> <td>0.2</td> <td></td> </tr> <tr><td>2</td> <td>0.4</td> <td></td> </tr> <tr><td>3</td> <td>0.2</td> <td></td> </tr> </tbody> </table> </div> <p>&nbsp;</p> <table id="fs-idm70030416" summary=""><thead></thead> </table> <p>&nbsp;</p> </div> <div id="fs-idm146513696" data-type="exercise"><div id="fs-idm87572704" data-type="problem"><p id="fs-idp41672272">Find the expected value from the expected value table.</p> <table id="fs-idm121788096" summary=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>2</td> <td>0.1</td> <td>2(0.1) = 0.2</td> </tr> <tr><td>4</td> <td>0.3</td> <td>4(0.3) = 1.2</td> </tr> <tr><td>6</td> <td>0.4</td> <td>6(0.4) = 2.4</td> </tr> <tr><td>8</td> <td>0.2</td> <td>8(0.2) = 1.6</td> </tr> 
</tbody> </table> </div> <div id="fs-idp8231488" data-type="solution"><p id="fs-idm57490144">0.2 + 1.2 + 2.4 + 1.6 = 5.4</p> </div> </div> <div id="fs-idp39629888" data-type="exercise"><div id="fs-idm40222912" data-type="problem"><p id="fs-idm134246528">Find the standard deviation.</p> <table id="fs-idm133091552" summary="table of standard deviation"><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th>(<em data-effect="italics">x</em> – <em data-effect="italics">μ</em>)<sup>2</sup><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>2</td> <td>0.1</td> <td>2(0.1) = 0.2</td> <td>(2–5.4)<sup>2</sup>(0.1) = 1.156</td> </tr> <tr><td>4</td> <td>0.3</td> <td>4(0.3) = 1.2</td> <td>(4–5.4)<sup>2</sup>(0.3) = 0.588</td> </tr> <tr><td>6</td> <td>0.4</td> <td>6(0.4) = 2.4</td> <td>(6–5.4)<sup>2</sup>(0.4) = 0.144</td> </tr> <tr><td>8</td> <td>0.2</td> <td>8(0.2) = 1.6</td> <td>(8–5.4)<sup>2</sup>(0.2) = 1.352</td> </tr> </tbody> </table> </div> <p>\(\sigma =\sqrt{1.156+0.588+0.144+1.352}=\sqrt{3.24}=1.8\)</p> </div> <div id="fs-idm1819872" data-type="exercise"><div id="fs-idm39411248" data-type="problem"><p id="fs-idm54877632">Identify the mistake in the probability distribution table.</p> <table id="fs-idm39458912" summary=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>1</td> <td>0.15</td> <td>0.15</td> </tr> <tr><td>2</td> <td>0.25</td> <td>0.50</td> </tr> <tr><td>3</td> <td>0.30</td> <td>0.90</td> </tr> <tr><td>4</td> <td>0.20</td> <td>0.80</td> </tr> <tr><td>5</td> <td>0.15</td> <td>0.75</td> </tr> </tbody> </table> </div> <div 
id="fs-idp2651008" data-type="solution"><p id="fs-idm60042160">The values of <em data-effect="italics">P</em>(<em data-effect="italics">x</em>) do not sum to one.</p> </div> </div> <div id="fs-idp10316752" data-type="exercise"><div id="fs-idm56968336" data-type="problem"><p id="fs-idm15847472">Identify the mistake in the probability distribution table.</p> <table id="fs-idp41547808" summary=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>1</td> <td>0.15</td> <td>0.15</td> </tr> <tr><td>2</td> <td>0.25</td> <td>0.40</td> </tr> <tr><td>3</td> <td>0.25</td> <td>0.65</td> </tr> <tr><td>4</td> <td>0.20</td> <td>0.85</td> </tr> <tr><td>5</td> <td>0.15</td> <td>1</td> </tr> </tbody> </table> </div> <p>The values of xP(x) are not correct.</p> </div> <p id="fs-idp88127600"><em data-effect="italics">Use the following information to answer the next five exercises:</em> A physics professor wants to know what percent of physics majors will spend the next several years doing post-graduate research. 
He has the following probability distribution.</p> <table id="fs-idm82264048" summary=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>1</td> <td>0.35</td> <td></td> </tr> <tr><td>2</td> <td>0.20</td> <td></td> </tr> <tr><td>3</td> <td>0.15</td> <td></td> </tr> <tr><td>4</td> <td></td> <td></td> </tr> <tr><td>5</td> <td>0.10</td> <td></td> </tr> <tr><td>6</td> <td>0.05</td> <td></td> </tr> </tbody> </table> <div id="fs-idp57624976" data-type="exercise"><div id="fs-idm62879680" data-type="problem"><p id="fs-idm88304064">Define the random variable <em data-effect="italics">X</em>.</p> </div> <div id="fs-idp77077680" data-type="solution"><p id="fs-idp108818960">Let <em data-effect="italics">X</em> = the number of years a physics major will spend doing post-graduate research.</p> </div> </div> <div id="fs-idp12372960" data-type="exercise"><div id="fs-idm279920" data-type="problem"><p id="fs-idp35057888">Define <em data-effect="italics">P</em>(<em data-effect="italics">x</em>), or the probability of <em data-effect="italics">x</em>.</p> </div> <p>Let P(x) = the probability that a physics major will do post-graduate research for x years.</p> </div> <div id="fs-idp33661936" data-type="exercise"><div id="fs-idp81310336" data-type="problem"><p id="fs-idp110219840">Find the probability that a physics major will do post-graduate research for four years. 
<em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 4) = _______</p> </div> <div id="fs-idm6915408" data-type="solution"><p id="fs-idm41700928">1 – 0.35 – 0.20 – 0.15 – 0.10 – 0.05 = 0.15</p> </div> </div> <div id="fs-idm517184" data-type="exercise"><div id="fs-idp85965280" data-type="problem"><p id="fs-idp41023008">Find the probability that a physics major will do post-graduate research for at most three years. <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≤ 3) = _______</p> </div> <p>0.35 + 0.20 + 0.15 = 0.70</p> </div> <div id="fs-idp96964736" data-type="exercise"><div id="fs-idp69281968" data-type="problem"><p id="fs-idm16085360">On average, how many years would you expect a physics major to spend doing post-graduate research?</p> </div> <div id="fs-idp76360432" data-type="solution"><p id="fs-idm21541632">1(0.35) + 2(0.20) + 3(0.15) + 4(0.15) + 5(0.10) + 6(0.05) = 0.35 + 0.40 + 0.45 + 0.60 + 0.50 + 0.30 = 2.6 years</p> </div> </div> <p id="fs-idp155133792"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next seven exercises:</em> A ballet instructor is interested in knowing what percent of each year&#8217;s class will continue on to the next, so that she can plan what classes to offer. 
Over the years, she has established the following probability distribution.</p> <ul id="list-24215"><li>Let <em data-effect="italics">X</em> = the number of years a student will study ballet with the teacher.</li> <li>Let <em data-effect="italics">P</em>(<em data-effect="italics">x</em>) = the probability that a student will study ballet <em data-effect="italics">x</em> years.</li> </ul> <div id="fs-idp129333088" data-type="exercise"><div id="fs-idm75363680" data-type="problem"><p id="fs-idp78893136">Complete <a class="autogenerated-content" href="#M02_Ch04_tbl021">(Figure)</a> using the data provided.</p> <table id="M02_Ch04_tbl021" summary="PDF table for the number of years a student will study ballet with the teacher. It contains columns for the number of years and their probabilities with third column for each number multiplied by its probability."><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>1</td> <td>0.10</td> <td></td> </tr> <tr><td>2</td> <td>0.05</td> <td></td> </tr> <tr><td>3</td> <td>0.10</td> <td></td> </tr> <tr><td>4</td> <td></td> <td></td> </tr> <tr><td>5</td> <td>0.30</td> <td></td> </tr> <tr><td>6</td> <td>0.20</td> <td></td> </tr> <tr><td>7</td> <td>0.10</td> <td></td> </tr> </tbody> </table> </div> </div> <div data-type="exercise"><div id="id7565778" data-type="problem"><p>In words, define the random variable <em data-effect="italics">X</em>.</p> </div> <div id="fs-idp118291792" data-type="solution"><p id="fs-idp92658144"><em data-effect="italics">X</em> is the number of years a student studies ballet with the teacher.</p> </div> </div> <div data-type="exercise"><div id="id26516605" data-type="problem"><p><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 4) = _______</p> </div> <p>1 – 0.10 – 0.05 – 0.10 – 
0.30 – 0.20 – 0.10 = 0.15</p> </div> <div data-type="exercise"><div id="id17831251" data-type="problem"><p><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 4) = _______</p> </div> <div id="fs-idp138936192" data-type="solution"><p id="fs-idp138936448">0.10 + 0.05 + 0.10 = 0.25</p> </div> </div> <div data-type="exercise"><div id="id11755417" data-type="problem"><p>On average, how many years would you expect a child to study ballet with this teacher?</p> </div> <p>1(0.10) + 2(0.05) + 3(0.10) + 4(0.15) + 5(0.30) + 6(0.20) + 7(0.10) = 4.5 years</p> </div> <div data-type="exercise"><div id="id10416606" data-type="problem"><p>What does the column &#8220;<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)&#8221; sum to and why?</p> </div> <div id="fs-idp101715328" data-type="solution"><p id="fs-idp101715584">The probabilities sum to one because this is a probability distribution.</p> </div> </div> <div id="exercisesix" data-type="exercise"><div id="id26246789" data-type="problem"><p id="prob_6">What does the column &#8220;<em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>)&#8221; sum to and why?</p> </div> <p>The column <em data-effect="italics">x</em>*<em data-effect="italics">P</em>(<em data-effect="italics">x</em>) sums to 4.5; this sum is the mean of the distribution.</p> </div> <div id="fs-idm7617472" data-type="exercise"><div id="fs-idm79058544" data-type="problem"><p id="fs-idm77353440">You are playing a game by drawing a card from a standard deck and replacing it. If the card is a face card, you win \$30. If it is not a face card, you pay \$2. There are 12 face cards in a deck of 52 cards. 
What is the expected value of playing the game?</p> </div> <div id="fs-idm10944544" data-type="solution"><p id="fs-idm57969456">\(-2\left(\frac{40}{52}\right)+30\left(\frac{12}{52}\right)=-1.54+6.92=5.38\)</p> </div> </div> <div id="fs-idm132715600" data-type="exercise"><div id="fs-idm79958464" data-type="problem"><p id="fs-idm11733600">You are playing a game by drawing a card from a standard deck and replacing it. If the card is a face card, you win \$30. If it is not a face card, you pay \$2. There are 12 face cards in a deck of 52 cards. Should you play the game?</p> </div> <p>Yes, because the expected value is positive, and the more you play, the closer your average winnings per game are likely to get to the expected value.</p> </div> </div> <div id="fs-idm60120304" class="free-response" data-depth="1"><h3 data-type="title">HOMEWORK</h3> <div id="fs-idp10277168" data-type="exercise"><div id="fs-idm105952656" data-type="problem"><p id="fs-idp1332192">1) A theater group holds a fund-raiser. It sells 100 raffle tickets for \$5 apiece. Suppose you purchase four tickets. The prize is two passes to a Broadway show, worth a total of \$150.</p> <ol type="a"><li>What are you interested in here?</li> <li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li>List the values that <em data-effect="italics">X</em> may take on.</li> <li>Construct a PDF.</li> <li>If this fund-raiser is repeated often and you always purchase four tickets, what would be your expected average winnings per raffle?</li> </ol> </div> <p>Solution: I am interested in the average profit or loss. Let <em data-effect="italics">X</em> = the return from the raffle: win (\$150) or lose (\$0).</p> <table id="fs-idp121516080" summary=""><thead></thead> </table> <p>\(150\left(\frac{1}{100}\right)+0\left(\frac{99}{100}\right)-20=-\$18.50\)</p> </div> <div data-type="exercise"><div id="eip-id1164884114007" data-type="problem"><p id="eip-id1164893406210">2) A game involves selecting a card from a regular 52-card deck and tossing a coin. 
The coin is a fair coin and is equally likely to land on heads or tails.</p> <ul id="eip-id1164886929394"><li>If the card is a face card, and the coin lands on Heads, you win \$6.</li> <li>If the card is a face card, and the coin lands on Tails, you win \$2.</li> <li>If the card is not a face card, you lose \$2, no matter what the coin shows.</li> </ul> <ol id="eip-idm175700160" type="a"><li>Find the expected value for this game (expected net gain or loss).</li> <li>Explain what your calculations indicate about your long-term average profits and losses on this game.</li> <li>Should you play this game to win money?</li> </ol> </div> <div id="eip-id1164886786848" data-type="solution"><p id="eip-id1164900324701">The variable of interest is <em data-effect="italics">X</em>, the gain or loss in dollars.</p> <p id="eip-id1164893101611">The face cards are jack, queen, and king. There are (3)(4) = 12 face cards and 52 – 12 = 40 cards that are not face cards.</p> <p id="eip-id1164889621862">We first need to construct the probability distribution for <em data-effect="italics">X</em>. 
We use the card and coin events to determine the probability for each outcome, but we use the monetary value of <em data-effect="italics">X</em> to determine the expected value.</p> <table id="eip-id1164878717794" summary="Table.."><thead><tr><th>Card Event</th> <th><em data-effect="italics">X</em> net gain/loss</th> <th><em data-effect="italics">P</em>(<em data-effect="italics">X</em>)</th> </tr> </thead> <tbody><tr><td>Face Card and Heads</td> <td>6</td> <td>\(\left(\frac{12}{52}\right)\left(\frac{1}{2}\right)=\left(\frac{6}{52}\right)\)</td> </tr> <tr><td>Face Card and Tails</td> <td>2</td> <td>\(\left(\frac{12}{52}\right)\left(\frac{1}{2}\right)=\left(\frac{6}{52}\right)\)</td> </tr> <tr><td>(Not Face Card) and (H or T)</td> <td>–2</td> <td>\(\left(\frac{40}{52}\right)\left(1\right)=\left(\frac{40}{52}\right)\)</td> </tr> </tbody> </table> <ul id="eip-id1164900807448"><li>\(\text{Expected value}=\left(6\right)\left(\frac{6}{52}\right)+\left(2\right)\left(\frac{6}{52}\right)+\left(-2\right)\left(\frac{40}{52}\right)=-\frac{32}{52}\)</li> <li>Expected value = –\$0.62, rounded to the nearest cent</li> <li>If you play this game repeatedly, over a long string of games, you would expect to lose 62 cents per game, on average.</li> <li>You should not play this game to win money because the expected value indicates an expected average loss.</li> </ul> </div> </div> <div data-type="exercise"><div id="eip-id1170608817668" data-type="problem"><p id="eip-id1170601642785">You buy a ticket for a lottery that costs \$10 per ticket. There are only 100 tickets available to be sold in this lottery. In this lottery there are one \$500 prize, two \$100 prizes, and four \$25 prizes. Find your expected gain or loss.</p> </div> <p>Solution: Start by writing the probability distribution. 
<em data-effect="italics">X</em> is the net gain or loss, that is, the prize (if any) less the \$10 cost of the ticket.</p> <table id="eip-id1170611110640" summary=""><thead></thead> </table> <p>\(\text{Expected value}=\left(490\right)\left(\frac{1}{100}\right)+\left(90\right)\left(\frac{2}{100}\right)+\left(15\right)\left(\frac{4}{100}\right)+\left(-10\right)\left(\frac{93}{100}\right)=-\$2\). There is an expected loss of \$2 per ticket, on average.</p> </div> <div id="fs-idm153056144" data-type="exercise"><div id="fs-idm153056016" data-type="problem"><p id="element-640">Complete the PDF and answer the questions.</p> <table id="id4858kj4983" summary=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.3</td> <td></td> </tr> <tr><td>1</td> <td>0.2</td> <td></td> </tr> <tr><td>2</td> <td></td> <td></td> </tr> <tr><td>3</td> <td>0.4</td> <td></td> </tr> </tbody> </table> <ol id="element-229" type="a"><li>Find the probability that <em data-effect="italics">x</em> = 2.</li> <li>Find the expected value.</li> </ol> </div> <div id="fs-idm10292512" data-type="solution"><ol type="a"><li>0.1</li> <li>1.6</li> </ol> </div> </div> <div id="element-360" data-type="exercise"><div id="id18132670" data-type="problem"><p>Suppose that you are offered the following “deal.” You roll a die. If you roll a six, you win \$10. If you roll a four or five, you win \$5. If you roll a one, two, or three, you pay \$6.</p> <ol id="element-615" type="a"><li>What are you ultimately interested in here (the value of the roll or the money you win)?</li> <li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li>List the values that <em data-effect="italics">X</em> may take on.</li> <li>Construct a PDF.</li> <li>Over the long run of playing this game, what are your expected average winnings per game?</li> <li>Based on numerical values, should you take the deal? 
Explain your decision in complete sentences.</li> </ol> </div> <p>Solution: The money won. <em data-effect="italics">X</em> = the amount of money won or lost. <em data-effect="italics">X</em> takes on the values \$5, –\$6, and \$10.</p> <table id="fs-idp41914528" summary=""><thead></thead> </table> <p>\(\text{Expected value}=\left(10\right)\left(\frac{1}{6}\right)+\left(5\right)\left(\frac{2}{6}\right)-\left(6\right)\left(\frac{3}{6}\right)=0.33\). Yes, take the deal: the expected value is a gain of 33 cents per game, on average.</p> </div> <div data-type="exercise"><div id="id19288031" data-type="problem"><p id="element-866">A venture capitalist, willing to invest \$1,000,000, has three investments to choose from. The first investment, a software company, has a 10% chance of returning \$5,000,000 profit, a 30% chance of returning \$1,000,000 profit, and a 60% chance of losing the million dollars. The second company, a hardware company, has a 20% chance of returning \$3,000,000 profit, a 40% chance of returning \$1,000,000 profit, and a 40% chance of losing the million dollars. The third company, a biotech firm, has a 10% chance of returning \$6,000,000 profit, a 70% chance of no profit or loss, and a 20% chance of losing the million dollars.</p> <ol type="a"><li>Construct a PDF for each investment.</li> <li>Find the expected value for each investment.</li> <li>Which is the safest investment? Why do you think so?</li> <li>Which is the riskiest investment? 
Why do you think so?</li> <li>Which investment has the highest expected return, on average?</li> </ol> </div> <div id="id19291349" data-type="solution"><ol id="element-751" type="a"><li><table id="fs-idp178643952" summary=""><thead><tr><th colspan="2">Software Company</th> </tr> <tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>5,000,000</td> <td>0.10</td> </tr> <tr><td>1,000,000</td> <td>0.30</td> </tr> <tr><td>–1,000,000</td> <td>0.60</td> </tr> </tbody> </table> <table id="fs-idp140742016" summary=""><thead><tr><th colspan="2">Hardware Company</th> </tr> <tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>3,000,000</td> <td>0.20</td> </tr> <tr><td>1,000,000</td> <td>0.40</td> </tr> <tr><td>–1,000,000</td> <td>0.40</td> </tr> </tbody> </table> <table id="fs-idp135856704" summary=""><thead><tr><th colspan="2">Biotech Firm</th> </tr> <tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>6,000,000</td> <td>0.10</td> </tr> <tr><td>0</td> <td>0.70</td> </tr> <tr><td>–1,000,000</td> <td>0.20</td> </tr> </tbody> </table> </li> <li>\$200,000; \$600,000; \$400,000</li> <li>third investment because it has the lowest probability of loss</li> <li>first investment because it has the highest probability of loss</li> <li>second investment</li> </ol> </div> </div> <div id="fs-idm21991696" data-type="exercise"><div id="eip-idm5350592" data-type="problem"><p>Suppose that 20,000 married adults in the United States were randomly surveyed as to the number of children they have. The results are compiled and are used as theoretical probabilities. 
Let <em data-effect="italics">X</em> = the number of children married people have.</p> <table id="id48895k006" summary=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> <th><em data-effect="italics">x</em><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.10</td> <td></td> </tr> <tr><td>1</td> <td>0.20</td> <td></td> </tr> <tr><td>2</td> <td>0.30</td> <td></td> </tr> <tr><td>3</td> <td></td> <td></td> </tr> <tr><td>4</td> <td>0.10</td> <td></td> </tr> <tr><td>5</td> <td>0.05</td> <td></td> </tr> <tr><td>6 (or more)</td> <td>0.05</td> <td></td> </tr> </tbody> </table> <ol id="eip-idm15856832" type="a"><li>Find the probability that a married adult has three children.</li> <li>In words, what does the expected value in this example represent?</li> <li>Find the expected value.</li> <li>Is it more likely that a married adult will have two to three children or four to six children? How do you know?</li> </ol> </div> <ol type="a"><li>0.2</li> <li>The average number of children married adults have.</li> <li>2.35</li> <li>Two or three children, because <em data-effect="italics">P</em>(2 ≤ <em data-effect="italics">x</em> ≤ 3) = 0.30 + 0.20 = 0.50 is greater than <em data-effect="italics">P</em>(4 ≤ <em data-effect="italics">x</em> ≤ 6) = 0.10 + 0.05 + 0.05 = 0.20.</li> </ol> </div> <div id="fs-idp33692128" data-type="exercise"><div id="fs-idm139132608" data-type="problem"><p id="fs-idp13631152">Suppose that the PDF for the number of years it takes to earn a Bachelor of Science (B.S.) 
degree is given as in <a class="autogenerated-content" href="#id488jhjhj72479">(Figure)</a>.</p> <table id="id488jhjhj72479" summary=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>3</td> <td>0.05</td> </tr> <tr><td>4</td> <td>0.40</td> </tr> <tr><td>5</td> <td>0.30</td> </tr> <tr><td>6</td> <td>0.15</td> </tr> <tr><td>7</td> <td>0.10</td> </tr> </tbody> </table> <p id="eip-idp142551264">On average, how many years do you expect it to take for an individual to earn a B.S.?</p> </div> <div id="fs-idp175439088" data-type="solution"><p id="fs-idp126181696">4.85 years</p> </div> </div> <div id="fs-idp126182336" data-type="exercise"><div id="fs-idp126182592" data-type="problem"><p id="fs-idp126182848">People visiting video rental stores often rent more than one DVD at a time. The probability distribution for DVD rentals per customer at Video To Go is given in the following table. There is a five-video limit per customer at this store, so nobody ever rents more than five DVDs.</p> <table id="fs-idp126183520" summary="."><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.03</td> </tr> <tr><td>1</td> <td>0.50</td> </tr> <tr><td>2</td> <td>0.24</td> </tr> <tr><td>3</td> <td></td> </tr> <tr><td>4</td> <td>0.07</td> </tr> <tr><td>5</td> <td>0.04</td> </tr> </tbody> </table> <ol id="fs-idp118390688" type="a"><li>Describe the random variable <em data-effect="italics">X</em> in words.</li> <li>Find the probability that a customer rents three DVDs.</li> <li>Find the probability that a customer rents at least four DVDs.</li> <li>Find the probability that a customer rents at most two DVDs. <span data-type="newline"><br /> </span>Another shop, Entertainment Headquarters, rents DVDs and video games. 
The probability distribution for DVD rentals per customer at this shop is given as follows. They also have a five-DVD limit per customer. <table id="fs-idp138577888" summary=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.35</td> </tr> <tr><td>1</td> <td>0.25</td> </tr> <tr><td>2</td> <td>0.20</td> </tr> <tr><td>3</td> <td>0.10</td> </tr> <tr><td>4</td> <td>0.05</td> </tr> <tr><td>5</td> <td>0.05</td> </tr> </tbody> </table> </li> <li>At which store is the expected number of DVDs rented per customer higher?</li> <li>If Video To Go estimates that they will have 300 customers next week, how many DVDs do they expect to rent next week? Answer in sentence form.</li> <li>If Video To Go expects 300 customers next week, and Entertainment HQ projects that they will have 420 customers, for which store is the expected number of DVD rentals for next week higher? Explain.</li> <li>Which of the two video stores experiences more variation in the number of DVD rentals per customer? How do you know that?</li> </ol> </div> <ol type="a"><li><em data-effect="italics">X</em> = the number of video rentals per customer</li> <li>0.12</li> <li>0.11</li> <li>0.77</li> <li>Video To Go (1.82 expected value vs. 1.4 for Entertainment Headquarters)</li> <li>The expected number of videos rented to 300 Video To Go customers is 546.</li> <li>The expected number of videos rented to 420 Entertainment Headquarters customers is 588. Entertainment Headquarters will rent more videos.</li> <li>The standard deviation for the number of videos rented at Video To Go is 1.1609. The standard deviation for the number of videos rented at Entertainment Headquarters is 1.4283. Entertainment Headquarters has more variation.</li> </ol> </div> <p>Review Questions</p> <div data-type="exercise"><div id="id18734941" data-type="problem"><p id="element-633">A “friend” offers you the following “deal.” For a \$10 fee, you may pick an envelope from a box containing 100 seemingly identical envelopes. 
However, each envelope contains a coupon for a free gift.</p> <ul><li>Ten of the coupons are for a free gift worth \$6.</li> <li>Eighty of the coupons are for a free gift worth \$8.</li> <li>Six of the coupons are for a free gift worth \$12.</li> <li>Four of the coupons are for a free gift worth \$40.</li> </ul> <p id="element-380">Based upon the financial gain or loss over the long run, should you play the game?</p> <ol id="eip-idp53219808" type="a"><li>Yes, I expect to come out ahead in money.</li> <li>No, I expect to come out behind in money.</li> <li>It doesn’t matter. I expect to break even.</li> </ol> </div> <div id="id7842142" data-type="solution"><p id="element-970">b</p> </div> </div> <div id="fs-idm63057856" data-type="exercise"><div id="fs-idp361456" data-type="problem"><p id="fs-idm71157472">Florida State University has 14 statistics classes scheduled for its Summer 2013 term. One class has space available for 30 students, eight classes have space for 60 students, one class has space for 70 students, and four classes have space for 100 students.</p> <ol id="fs-idm80083648" type="a"><li>What is the average class size assuming each class is filled to capacity?</li> <li>Space is available for 980 students. Suppose that each class is filled to capacity and a statistics student is selected at random. Let the random variable <em data-effect="italics">X</em> equal the size of the student’s class. 
Define the PDF for <em data-effect="italics">X</em>.</li> <li>Find the mean of <em data-effect="italics">X</em>.</li> <li>Find the standard deviation of <em data-effect="italics">X</em>.</li> </ol> </div> <p><label>a</label> The average class size is \(\frac{30+8\left(60\right)+70+4\left(100\right)}{14}=70\). <label>b</label> \(P\left(x=30\right)=\frac{1}{14}\), \(P\left(x=60\right)=\frac{8}{14}\), \(P\left(x=70\right)=\frac{1}{14}\), \(P\left(x=100\right)=\frac{4}{14}\). Complete the following table to find the mean and standard deviation of <em data-effect="italics">X</em>.</p> <table id="fs-idp893824" summary="Table..."><thead></thead> </table> <p><label>c</label> Mean of <em data-effect="italics">X</em> = \(\frac{30}{14}+\frac{480}{14}+\frac{70}{14}+\frac{400}{14}=\frac{980}{14}=70\) <label>d</label> Standard deviation of <em data-effect="italics">X</em> = \(\sqrt{114.2857+57.1429+0+257.1429}=20.702\)</p> </div> <div id="fs-idp29002528" data-type="exercise"><div id="fs-idp10512912" data-type="problem"><p id="fs-idm46146656">In a lottery, there are 250 prizes of \$5, 50 prizes of \$25, and ten prizes of \$100. Assuming that 10,000 tickets are to be issued and sold, what is a fair price to charge to break even?</p> </div> <div id="fs-idp37641408" data-type="solution"><p id="fs-idp31192112">Let <em data-effect="italics">X</em> = the amount of money to be won on a ticket. The following table shows the PDF for <em data-effect="italics">X</em>.</p> <table id="fs-idm110559056" summary=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.969</td> </tr> <tr><td>5</td> <td>\(\frac{\text{250}}{\text{10,000}}\) = 0.025</td> </tr> <tr><td>25</td> <td>\(\frac{\text{50}}{\text{10,000}}\) = 0.005</td> </tr> <tr><td>100</td> <td>\(\frac{\text{10}}{\text{10,000}}\) = 0.001</td> </tr> </tbody> </table> <p id="fs-idm41233712">Calculate the expected value of <em data-effect="italics">X</em>.</p> <p id="fs-idm86179776">0(0.969) + 5(0.025) + 25(0.005) + 100(0.001) = 0.35</p> <p id="fs-idm45715696">A fair price for a ticket is \$0.35. 
Any price over \$0.35 will enable the lottery to raise money.</p> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="expectedv"><dt>Expected Value</dt> <dd id="id10864394">expected arithmetic average when an experiment is repeated many times; also called the mean. Notation: <em data-effect="italics">μ</em>. For a discrete random variable (RV) with probability distribution function <em data-effect="italics">P</em>(<em data-effect="italics">x</em>), the definition can also be written in the form <em data-effect="italics">μ</em> = \(\sum \)<em data-effect="italics">x</em><em data-effect="italics">P</em>(<em data-effect="italics">x</em>).</dd> </dl> <dl><dt>Mean</dt> <dd>a number that measures the central tendency; a common name for mean is ‘average.’ The term ‘mean’ is a shortened form of ‘arithmetic mean.’ By definition, the mean for a sample (denoted by \(\overline{x}\)) is \(\overline{x}=\frac{\text{Sum of all values in the sample}}{\text{Number of values in the sample}}\) and the mean for a population (denoted by <em data-effect="italics">μ</em>) is <em data-effect="italics">μ</em> = \(\frac{\text{Sum of all values in the population}}{\text{Number of values in the population}}\).</dd> </dl> <dl id="fs-idm144391840"><dt>Mean of a Probability Distribution</dt> <dd id="fs-idp39061776">the long-term average of many trials of a statistical experiment</dd> </dl> <dl id="fs-idm82850224"><dt>Standard Deviation of a Probability Distribution</dt> <dd id="fs-idm7483264">a number that measures how far the outcomes of a statistical experiment are from the mean of the distribution: \(\sigma =\sqrt{\sum \left[{\left(x-\mu \right)}^{2}\cdot P\left(x\right)\right]}\)</dd> </dl> <dl 
id="fs-idm140669088"><dt>The Law of Large Numbers</dt> <dd id="fs-idp31486048">As the number of trials in a probability experiment increases, the difference between the theoretical probability of an event and the relative frequency probability approaches zero.</dd> </dl> </div> </div></div>
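<p>The glossary formulas for the mean and standard deviation of a discrete probability distribution can be checked numerically. The following is a minimal sketch in Python, not part of the original text: the function name <code>mean_and_sd</code> and the dictionary <code>video_to_go</code> are illustrative assumptions, and the probabilities are those of the Video To Go homework exercise with <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 3) = 0.12 filled in so that the column sums to one.</p>

```python
import math


def mean_and_sd(distribution):
    """Return (mu, sigma) for a discrete distribution given as {x: P(x)}."""
    # mu is the sum of x * P(x), the expected value.
    mu = sum(x * p for x, p in distribution.items())
    # sigma is the square root of the sum of (x - mu)^2 * P(x).
    variance = sum((x - mu) ** 2 * p for x, p in distribution.items())
    return mu, math.sqrt(variance)


# DVD rentals per customer at Video To Go (hypothetical variable name).
video_to_go = {0: 0.03, 1: 0.50, 2: 0.24, 3: 0.12, 4: 0.07, 5: 0.04}

mu, sigma = mean_and_sd(video_to_go)
print(round(mu, 2), round(sigma, 4))  # prints: 1.82 1.1609
```

<p>The printed values agree with the expected value (1.82) and standard deviation (1.1609) quoted in the homework solution. Passing a different dictionary, such as the ballet distribution, gives the mean of that distribution in the same way.</p>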
<div class="chapter standard" id="chapter-binomial-distribution" title="Chapter 5.4: Binomial Distribution"><div class="chapter-title-wrap"><h3 class="chapter-number">34</h3><h2 class="chapter-title"><span class="display-none">Chapter 5.4: Binomial Distribution</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="element-999">There are three characteristics of a binomial experiment.</p> <ol><li>There are a fixed number of trials. Think of trials as repetitions of an experiment. The letter <em data-effect="italics">n</em> denotes the number of trials.</li> <li>There are only two possible outcomes, called &#8220;success&#8221; and &#8220;failure,&#8221; for each trial. The letter <em data-effect="italics">p</em> denotes the probability of a success on one trial, and <em data-effect="italics">q</em> denotes the probability of a failure on one trial. <em data-effect="italics">p</em> + <em data-effect="italics">q</em> = 1.</li> <li>The <em data-effect="italics">n</em> trials are independent and are repeated using identical conditions. Because the <em data-effect="italics">n</em> trials are independent, the outcome of one trial does not help in predicting the outcome of another trial. Another way of saying this is that for each individual trial, the probability, <em data-effect="italics">p</em>, of a success and probability, <em data-effect="italics">q</em>, of a failure remain the same. For example, randomly guessing at a true-false statistics question has only two outcomes. If a success is guessing correctly, then a failure is guessing incorrectly. Suppose Joe always guesses correctly on any statistics true-false question with probability <em data-effect="italics">p</em> = 0.6. Then, <em data-effect="italics">q</em> = 0.4. 
This means that for every true-false statistics question Joe answers, his probability of success (<em data-effect="italics">p</em> = 0.6) and his probability of failure (<em data-effect="italics">q</em> = 0.4) remain the same.</li> </ol> <p>The outcomes of a binomial experiment fit a <span data-type="term">binomial probability distribution</span>. The random variable <em data-effect="italics">X</em> = the number of successes obtained in the <em data-effect="italics">n</em> independent trials.</p> <p>The mean, <em data-effect="italics">μ</em>, and variance, <em data-effect="italics">σ</em><sup>2</sup>, for the binomial probability distribution are <em data-effect="italics">μ</em> = <em data-effect="italics">np</em> and <em data-effect="italics">σ</em><sup>2</sup> = <em data-effect="italics">npq</em>. The standard deviation, <em data-effect="italics">σ</em>, is then <em data-effect="italics">σ</em> = \(\sqrt{npq}\).</p> <p id="element-612">Any experiment that has characteristics two and three and where <em data-effect="italics">n</em> = 1 is called a <span data-type="term">Bernoulli Trial</span> (named after Jacob Bernoulli who, in the late 1600s, studied them extensively). A binomial experiment takes place when the number of successes is counted in one or more Bernoulli Trials.</p> <div id="element-375" class="textbox textbox--examples" data-type="example"><p>At ABC College, the withdrawal rate from an elementary physics course is 30% for any given term. This implies that, for any given term, 70% of the students stay in the class for the entire term. A &#8220;success&#8221; could be defined as an individual who withdrew. 
The random variable <em data-effect="italics">X</em> = the number of students who withdraw from the randomly selected elementary physics class.</p> </div> <div id="fs-idm41846320" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm40705376" data-type="exercise"><div id="fs-idm68456256" data-type="problem"><p id="fs-idm44525856">The state health board is concerned about the amount of fruit available in school lunches. Forty-eight percent of schools in the state offer fruit in their lunches every day. This implies that 52% do not. What would a &#8220;success&#8221; be in this case?</p> </div> </div> </div> <div id="fs-idm62623744" class="textbox textbox--examples" data-type="example"><p>Suppose you play a game that you can only either win or lose. The probability that you win any game is 55%, and the probability that you lose is 45%. Each game you play is independent. If you play the game 20 times, write the function that describes the probability that you win 15 of the 20 times. Here, if you define <em data-effect="italics">X</em> as the number of wins, then <em data-effect="italics">X</em> takes on the values 0, 1, 2, 3, &#8230;, 20. The probability of a success is <em data-effect="italics">p</em> = 0.55. The probability of a failure is <em data-effect="italics">q</em> = 0.45. The number of trials is <em data-effect="italics">n</em> = 20. The probability question can be stated mathematically as <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 15).</p> </div> <div id="fs-idm55323584" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm35562192" data-type="exercise"><div id="fs-idm43643408" data-type="problem"><p id="fs-idm43925792">A trainer is teaching a dolphin to do tricks. 
The probability that the dolphin successfully performs the trick is 35%, and the probability that the dolphin does not successfully perform the trick is 65%. Out of 20 attempts, you want to find the probability that the dolphin succeeds 12 times. State the probability question mathematically.</p> </div> </div> </div> <div id="element-167" class="textbox textbox--examples" data-type="example"><div id="fs-idm61431456" data-type="exercise"><div id="fs-idm3565248" data-type="problem"><p>A fair coin is flipped 15 times. Each flip is independent. What is the probability of getting more than ten heads? Let <em data-effect="italics">X</em> = the number of heads in 15 flips of the fair coin. <em data-effect="italics">X</em> takes on the values 0, 1, 2, 3, &#8230;, 15. Since the coin is fair, <em data-effect="italics">p</em> = 0.5 and <em data-effect="italics">q</em> = 0.5. The number of trials is <em data-effect="italics">n</em> = 15. State the probability question mathematically.</p> </div> </div> </div> <div id="fs-idm26333440" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm77060880" data-type="exercise"><div id="fs-idm123194928" data-type="problem"><p id="fs-idm39905824">A fair, six-sided die is rolled ten times. Each roll is independent. You want to find the probability of rolling a one more than three times. State the probability question mathematically.</p> </div> </div> </div> <div id="element-807" class="textbox textbox--examples" data-type="example"><p>Approximately 70% of statistics students do their homework in time for it to be collected and graded. Each student does homework independently. In a statistics class of 50 students, what is the probability that at least 40 will do their homework on time? Students are selected randomly.</p> <p>&nbsp;</p> <div id="element-1751" data-type="exercise"><div id="id5317083" data-type="problem"><p id="element-732">a. 
This is a binomial problem because there is only a success or a __________, there are a fixed number of trials, and the probability of a success is 0.70 for each trial.</p> </div> <div id="id5321627" data-type="solution"><p>a. failure</p> <p>&nbsp;</p> </div> </div> <div id="element-1752" data-type="exercise"><div id="id5318862" data-type="problem"><p id="element-73124">b. If we are interested in the number of students who do their homework on time, then how do we define <em data-effect="italics">X</em>?</p> </div> <div id="id5318289" data-type="solution"><p id="element-235352">b. <em data-effect="italics">X</em> = the number of statistics students who do their homework on time</p> <p>&nbsp;</p> </div> </div> <div id="element-17530" data-type="exercise"><div id="id5318085" data-type="problem"><p id="element-732324">c. What values does <em data-effect="italics">x</em> take on?</p> </div> <div id="id5318055" data-type="solution"><p id="element-23352">c. 0, 1, 2, …, 50</p> <p>&nbsp;</p> </div> </div> <div id="element-1753" data-type="exercise"><div id="id5317581" data-type="problem"><p id="element-7324">d. What is a &#8220;failure,&#8221; in words?</p> </div> <div id="id5317186" data-type="solution"><p>d. Failure is defined as a student who does not complete his or her homework on time.</p> <p id="element-8865">The probability of a success is <em data-effect="italics">p</em> = 0.70. The number of trials is <em data-effect="italics">n</em> = 50.</p> <p>&nbsp;</p> </div> </div> <div id="element-1753232" data-type="exercise"><div id="id5317836" data-type="problem"><p id="element-74">e. If <em data-effect="italics">p</em> + <em data-effect="italics">q</em> = 1, then what is <em data-effect="italics">q</em>?</p> </div> <div id="id5317308" data-type="solution"><p id="element-5352">e. <em data-effect="italics">q</em> = 0.30</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id5317006" data-type="problem"><p>f. 
The words &#8220;at least&#8221; translate as what kind of inequality for the probability question <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ____ 40).</p> </div> <div id="id5319380" data-type="solution"><p>f. greater than or equal to (≥) <span data-type="newline" data-count="1"><br /> </span>The probability question is <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≥ 40).</p> </div> </div> </div> <div id="fs-idm63118928" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp231440" data-type="exercise"><div id="fs-idm386096" data-type="problem"><p id="fs-idm59210304">Sixty-five percent of people pass the state driver’s exam on the first try. A group of 50 individuals who have taken the driver’s exam is randomly selected. Give two reasons why this is a binomial problem.</p> </div> </div> </div> <div id="element-501" class="bc-section section" data-depth="1"><h3 data-type="title">Notation for the Binomial: <em data-effect="italics">B</em> = Binomial Probability Distribution Function</h3> <p><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(<em data-effect="italics">n</em>, <em data-effect="italics">p</em>)</p> <p>Read this as &#8220;<em data-effect="italics">X</em> is a random variable with a binomial distribution.&#8221; The parameters are <em data-effect="italics">n</em> and <em data-effect="italics">p</em>; <em data-effect="italics">n</em> = number of trials, <em data-effect="italics">p</em> = probability of a success on each trial.</p> </div> <div class="textbox textbox--examples" data-type="example"><p id="fs-idp328960">It has been stated that about 41% of adult workers have a high school diploma but do not pursue any further education. If 20 adult workers are randomly selected, find the probability that at most 12 of them have a high school diploma but do not pursue any further education. 
How many adult workers do you expect to have a high school diploma but not pursue any further education?</p> <p>Let <em data-effect="italics">X</em> = the number of workers who have a high school diploma but do not pursue any further education.</p> <p id="element-570"><em data-effect="italics">X</em> takes on the values 0, 1, 2, &#8230;, 20 where <em data-effect="italics">n</em> = 20, <em data-effect="italics">p</em> = 0.41, and <em data-effect="italics">q</em> = 1 – 0.41 = 0.59. <em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(20, 0.41)</p> <p id="fs-idm48754320" class="finger">Find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≤ 12). <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≤ 12) = 0.9738. (calculator or computer)</p> <div id="fs-idm73072400" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="element-224">Go into 2<sup>nd</sup> DISTR. The syntax for the instructions is as follows:</p> <p id="element-853"><strong>To calculate <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = value): binompdf(<em data-effect="italics">n</em>, <em data-effect="italics">p</em>, number).</strong> If &#8220;number&#8221; is left out, the result is the binomial probability table. <span data-type="newline"><br /> </span><strong>To calculate <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≤ value): binomcdf(<em data-effect="italics">n</em>, <em data-effect="italics">p</em>, number).</strong> If &#8220;number&#8221; is left out, the result is the cumulative binomial probability table. <span data-type="newline"><br /> </span><strong>For this problem: After you are in 2<sup>nd</sup> DISTR, arrow down to binomcdf. Press ENTER. Enter 20, 0.41, 12). 
The result is <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≤ 12) = 0.9738.</strong></p> </div> <div id="id5442988" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idm93661872">If you want to find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 12), use the pdf (binompdf). If you want to find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 12), use 1 &#8211; binomcdf(20,0.41,12).</p> </div> <p>The probability that at most 12 workers have a high school diploma but do not pursue any further education is 0.9738.</p> <p id="element-586">The graph of <em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(20, 0.41) is as follows:</p> <div id="fs-idm42104448" class="bc-figure figure"><span id="id5403497" data-type="media" data-display="block" data-alt="This histogram shows a binomial probability distribution. It is made up of bars that are fairly normally distributed. The x-axis shows values from 0 to 20. The y-axis shows values from 0 to 0.2 in increments of 0.05."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch04_05_01N-1.jpg" alt="This histogram shows a binomial probability distribution. It is made up of bars that are fairly normally distributed. The x-axis shows values from 0 to 20. The y-axis shows values from 0 to 0.2 in increments of 0.05." 
width="450" data-media-type="image/jpg" /></span></div> <p id="element-313">The <em data-effect="italics">y</em>-axis contains the probability of <em data-effect="italics">x</em>, where <em data-effect="italics">X</em> = the number of workers who have only a high school diploma.</p> <p>The number of adult workers that you expect to have a high school diploma but not pursue any further education is the mean, <em data-effect="italics">μ</em> = <em data-effect="italics">np</em> = (20)(0.41) = 8.2.</p> <p>The formula for the variance is σ<sup>2</sup> = <em data-effect="italics">npq</em>. The standard deviation is <em data-effect="italics">σ</em> = \(\sqrt{npq}\). <span data-type="newline" data-count="1"><br /> </span><em data-effect="italics">σ</em> = \(\sqrt{\left(20\right)\left(0.41\right)\left(0.59\right)}\) = 2.20.</p> </div> <div id="fs-idm83776992" class="statistics try finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm41424160" data-type="exercise"><div id="fs-idp5897264" data-type="problem"><p id="fs-idm41014848">About 32% of students participate in a community volunteer program outside of school. If 30 students are selected at random, find the probability that at most 14 of them participate in a community volunteer program outside of school. Use the TI-83+ or TI-84 calculator to find the answer.</p> </div> </div> </div> <div id="fs-idp7239760" class="textbox textbox--examples" data-type="example"><div id="fs-idm18828000" data-type="exercise"><div id="fs-idm39970896" data-type="problem"><p id="fs-idm52160320">In the 2013 <em data-effect="italics">Jerry’s Artarama</em> art supplies catalog, there are 560 pages. Eight of the pages feature signature artists. Suppose we randomly sample 100 pages. 
Let <em data-effect="italics">X</em> = the number of pages that feature signature artists.</p> <ol id="fs-idm60857344" type="a"><li>What values does <em data-effect="italics">x</em> take on?</li> <li>What is the probability distribution? Find the following probabilities: <ol id="fs-idm43999408" type="i"><li>the probability that two pages feature signature artists</li> <li>the probability that at most six pages feature signature artists</li> <li>the probability that more than three pages feature signature artists.</li> </ol> </li> <li>Using the formulas, calculate the (i) mean and (ii) standard deviation.</li> </ol> </div> <div id="fs-idm14196128" data-type="solution"><ol id="fs-idm1096144" type="a"><li><em data-effect="italics">x</em> = 0, 1, 2, 3, 4, 5, 6, 7, 8</li> <li><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>\(\left(100,\frac{8}{560}\right)\) <ol id="fs-idm2065488" type="i"><li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 2) = binompdf\(\left(100,\frac{8}{560},2\right)\) = 0.2466</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≤ 6) = binomcdf\(\left(100,\frac{8}{560},6\right)\) = 0.9994</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 3) = 1 – <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≤ 3) = 1 – binomcdf\(\left(100,\frac{8}{560},3\right)\) = 1 – 0.9443 = 0.0557</li> </ol> </li> <li><ol id="fs-idp38091616" type="i"><li>Mean = <em data-effect="italics">np</em> = (100)\(\left(\frac{8}{560}\right)\) = \(\frac{800}{560}\) ≈ 1.4286</li> <li>Standard Deviation = \(\sqrt{npq}\) = \(\sqrt{\left(100\right)\left(\frac{8}{560}\right)\left(\frac{552}{560}\right)}\) ≈ 1.1867</li> </ol> </li> </ol> </div> </div> </div> <div id="fs-idm76778272" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div id="eip-idm151880496" data-type="problem"><p>According to a 
Gallup poll, 60% of American adults prefer saving over spending. Let <em data-effect="italics">X</em> = the number of American adults out of a random sample of 50 who prefer saving to spending.</p> <ol id="eip-idm151880240" type="a"><li>What is the probability distribution for <em data-effect="italics">X</em>?</li> <li>Use your calculator to find the following probabilities: <ol id="eip-idm65992864" type="i"><li>the probability that 25 adults in the sample prefer saving over spending</li> <li>the probability that at most 20 adults prefer saving</li> <li>the probability that more than 30 adults prefer saving</li> </ol> </li> <li>Using the formulas, calculate the (i) mean and (ii) standard deviation of <em data-effect="italics">X</em>.</li> </ol> </div> </div> </div> <div id="fs-idm51513664" class="textbox textbox--examples" data-type="example"><p id="fs-idm124136240">The lifetime risk of developing pancreatic cancer is about one in 78 (1.28%). Suppose we randomly sample 200 people. Let <em data-effect="italics">X</em> = the number of people who will develop pancreatic cancer.</p> <div id="fs-idm129798576" data-type="exercise"><div id="fs-idm14459520" data-type="problem"><ol id="fs-idm74773968" type="a"><li>What is the probability distribution for <em data-effect="italics">X</em>?</li> <li>Using the formulas, calculate the (i) mean and (ii) standard deviation of <em data-effect="italics">X</em>.</li> <li class="finger">Use your calculator to find the probability that at most eight people develop pancreatic cancer</li> <li>Is it more likely that five or six people will develop pancreatic cancer? 
Justify your answer numerically.</li> </ol> </div> </div> </div> <div id="fs-idm113648656" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp4381328" data-type="exercise"><div id="fs-idm2089568" data-type="problem"><p id="fs-idm52769008">During the 2013 regular NBA season, DeAndre Jordan of the Los Angeles Clippers had the highest field goal completion rate in the league. DeAndre scored with 61.3% of his shots. Suppose you choose a random sample of 80 shots made by DeAndre during the 2013 season. Let <em data-effect="italics">X</em> = the number of shots that scored points.</p> <ol id="fs-idm69954960" type="a"><li>What is the probability distribution for <em data-effect="italics">X</em>?</li> <li>Using the formulas, calculate the (i) mean and (ii) standard deviation of <em data-effect="italics">X</em>.</li> <li class="finger">Use your calculator to find the probability that DeAndre scored with 60 of these shots.</li> <li>Find the probability that DeAndre scored with more than 50 of these shots.</li> </ol> </div> </div> </div> <div id="element-678" class="textbox textbox--examples" data-type="example"><p>The following example illustrates a problem that is <strong>not</strong> binomial. It violates the condition of independence. ABC College has a student advisory committee made up of ten staff members and six students. The committee wishes to choose a chairperson and a recorder. What is the probability that the chairperson and recorder are both students? The names of all committee members are put into a box, and two names are drawn <strong>without replacement</strong>. The first name drawn determines the chairperson and the second name the recorder. There are two trials. However, the trials are not independent because the outcome of the first trial affects the outcome of the second trial. The probability of a student on the first draw is \(\frac{6}{16}\). 
The probability of a student on the second draw is \(\frac{5}{15}\), when the first draw selects a student. The probability is \(\frac{6}{15}\), when the first draw selects a staff member. The probability of drawing a student&#8217;s name changes for each of the trials and, therefore, violates the condition of independence.</p> </div> <div id="fs-idm127539840" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm74755088" data-type="exercise"><div id="fs-idp7533680" data-type="problem"><p id="fs-idm63476304">A lacrosse team is selecting a captain. The names of all the seniors are put into a hat, and the first three that are drawn will be the captains. The names are not replaced once they are drawn (one person cannot be two captains). You want to see if the captains all play the same position. State whether this is binomial or not and state why.</p> </div> </div> </div> <div id="fs-idm225879408" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idm41404544">“Access to electricity (% of population),” The World Bank, 2013. Available online at http://data.worldbank.org/indicator/EG.ELC.ACCS.ZS?order=wbapi_data_value_2009%20wbapi_data_value%20wbapi_data_value-first&amp;sort=asc (accessed May 15, 2015).</p> <p id="eip-idp13740384">“Distance Education.” Wikipedia. Available online at http://en.wikipedia.org/wiki/Distance_education (accessed May 15, 2013).</p> <p id="fs-idm119442592">“NBA Statistics – 2013,” ESPN NBA, 2013. Available online at http://espn.go.com/nba/statistics/_/seasontype/2 (accessed May 15, 2013).</p> <p id="fs-idm15300352">Newport, Frank. “Americans Still Enjoy Saving Rather than Spending: Few demographic differences seen in these views other than by income,” GALLUP® Economy, 2013. 
Available online at http://www.gallup.com/poll/162368/americans-enjoy-saving-rather-spending.aspx (accessed May 15, 2013).</p> <p id="eip-idm5042896">Pryor, John H., Linda DeAngelo, Laura Palucki Blake, Sylvia Hurtado, Serge Tran. <em data-effect="italics">The American Freshman: National Norms Fall 2011</em>. Los Angeles: Cooperative Institutional Research Program at the Higher Education Research Institute at UCLA, 2011. Also available online at http://heri.ucla.edu/PDFs/pubs/TFS/Norms/Monographs/TheAmericanFreshman2011.pdf (accessed May 15, 2013).</p> <p id="fs-idm165639552">“The World FactBook,” Central Intelligence Agency. Available online at https://www.cia.gov/library/publications/the-world-factbook/geos/af.html (accessed May 15, 2013).</p> <p id="fs-idm147318720">“What are the key statistics about pancreatic cancer?” American Cancer Society, 2013. Available online at http://www.cancer.org/cancer/pancreaticcancer/detailedguide/pancreatic-cancer-key-statistics (accessed May 15, 2013).</p> </div> <div id="fs-idm59813888" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp48275616">A statistical experiment can be classified as a binomial experiment if the following conditions are met:</p> <ol id="fs-idm159853952"><li>There is a fixed number of trials, <em data-effect="italics">n</em>.</li> <li>There are only two possible outcomes, called &#8220;success&#8221; and &#8220;failure,&#8221; for each trial. The letter <em data-effect="italics">p</em> denotes the probability of a success on one trial and <em data-effect="italics">q</em> denotes the probability of a failure on one trial.</li> <li>The <em data-effect="italics">n</em> trials are independent and are repeated using identical conditions.</li> </ol> <p id="fs-idp5771264">The outcomes of a binomial experiment fit a binomial probability distribution. 
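</p> <p>The three conditions above can be sketched as a small simulation. This is an illustration only, assuming Python 3 and its standard <code>random</code> module; the function name <code>run_binomial_experiment</code> and the seed value are hypothetical choices that do not come from the text. The sketch reuses the homework example from earlier in this section (<em data-effect="italics">n</em> = 50, <em data-effect="italics">p</em> = 0.70).</p> <pre>
```python
import random


def run_binomial_experiment(n, p, rng):
    """Run n independent trials, each a success with probability p, and count successes."""
    successes = 0
    for _ in range(n):          # condition 1: a fixed number of trials
        if p > rng.random():    # condition 2: each trial is either a success or a failure
            successes += 1      # condition 3: every trial uses the same p, independently
    return successes


rng = random.Random(2022)       # hypothetical seed, used only so the run is repeatable
n, p = 50, 0.70                 # homework example: 50 students, p = 0.70
counts = [run_binomial_experiment(n, p, rng) for _ in range(10_000)]
average = sum(counts) / len(counts)

print(f"one experiment: X = {counts[0]}")
print(f"average X over 10,000 experiments: {average:.2f} (np = {n * p:.1f})")
```
</pre> <p>Each call produces one value of the count of successes; averaged over many repetitions, that count should settle near <em data-effect="italics">np</em>.</p> <p>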
The random variable <em data-effect="italics">X</em> = the number of successes obtained in the <em data-effect="italics">n</em> independent trials. The mean of <em data-effect="italics">X</em> can be calculated using the formula <em data-effect="italics">μ</em> = <em data-effect="italics">np</em>, and the standard deviation is given by the formula <em data-effect="italics">σ</em> = \(\sqrt{npq}\).</p> </div> <div id="fs-idm5605264" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idm115441184"><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(<em data-effect="italics">n</em>, <em data-effect="italics">p</em>) means that the discrete random variable <em data-effect="italics">X</em> has a binomial probability distribution with <em data-effect="italics">n</em> trials and probability of success <em data-effect="italics">p</em>.</p> <p id="fs-idm36014560"><em data-effect="italics">X</em> = the number of successes in <em data-effect="italics">n</em> independent trials</p> <p id="fs-idm13182048"><em data-effect="italics">n</em> = the number of independent trials</p> <p id="fs-idp63510496"><em data-effect="italics">X</em> takes on the values <em data-effect="italics">x</em> = 0, 1, 2, 3, &#8230;, <em data-effect="italics">n</em></p> <p id="fs-idp14543168"><em data-effect="italics">p</em> = the probability of a success for any trial</p> <p id="fs-idm14043088"><em data-effect="italics">q</em> = the probability of a failure for any trial</p> <p id="eip-707"><em data-effect="italics">p</em> + <em data-effect="italics">q</em> = 1</p> <p id="fs-idm79078144"><em data-effect="italics">q</em> = 1 – <em data-effect="italics">p</em></p> <p id="fs-idp23879248">The mean of <em data-effect="italics">X</em> is <em data-effect="italics">μ</em> = <em data-effect="italics">np</em>. 
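</p> <p>As a rough cross-check of the mean formula, the sketch below sums <em data-effect="italics">x</em> · <em data-effect="italics">P</em>(<em data-effect="italics">x</em>) over all values of <em data-effect="italics">x</em> and compares the total with <em data-effect="italics">np</em>. It assumes Python 3.8 or later (for <code>math.comb</code>) and uses the standard binomial probability formula; the helper name <code>binomial_pmf</code> is a hypothetical choice, and the numbers come from the adult-workers example, <em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(20, 0.41).</p> <pre>
```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)


n, p = 20, 0.41                          # adult-workers example, X ~ B(20, 0.41)
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                         # probabilities over x = 0, 1, ..., n
mean_from_sum = sum(x * pmf[x] for x in range(n + 1))
mean_from_formula = n * p

print(f"sum of P(x): {total:.6f}")
print(f"mean by summation: {mean_from_sum:.4f}")
print(f"mean by formula np: {mean_from_formula:.4f}")
```
</pre> <p>The two means should agree up to floating-point rounding, which is why the shortcut <em data-effect="italics">μ</em> = <em data-effect="italics">np</em> can stand in for the full summation.</p> <p>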
The standard deviation of <em data-effect="italics">X</em> is <em data-effect="italics">σ</em> = \(\sqrt{npq}\).</p> </div> <div id="fs-idp75382928" class="practice" data-depth="1"><p id="fs-idm98093824"><em data-effect="italics">Use the following information to answer the next eight exercises:</em> The Higher Education Research Institute at UCLA collected data from 203,967 incoming first-time, full-time freshmen from 270 four-year colleges and universities in the U.S. Of those students, 71.3% replied that, yes, they believe that same-sex couples should have the right to legal marital status. Suppose that you randomly pick eight first-time, full-time freshmen from the survey. You are interested in the number who believe that same-sex couples should have the right to legal marital status.</p> <div data-type="exercise"><div id="id8424885" data-type="problem"><p>In words, define the random variable <em data-effect="italics">X</em>.</p> </div> <div id="id8424902" data-type="solution"><p id="element-88758234"><em data-effect="italics">X</em> = the number of freshmen who reply “yes”</p> </div> </div> <div data-type="exercise"><div id="id8424932" data-type="problem"><p><em data-effect="italics">X</em> ~ _____(_____,_____)</p> </div> <div data-type="solution"><p><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(8, 0.713)</p> </div> </div> <div data-type="exercise"><div id="id7617633" data-type="problem"><p>What values does the random variable <em data-effect="italics">X</em> take on?</p> </div> <div id="id7617656" data-type="solution"><p id="element-88758223434">0, 1, 2, 3, 4, 5, 6, 7, 8</p> </div> </div> <div data-type="exercise"><div id="id7617683" data-type="problem"><p>Construct the probability distribution function (PDF).</p> <table id="id7383862" summary="PDF table with the number of full-time freshmen who believe that same-sex couples should have the right to legal marital status and the corresponding probabilities."><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> 
<tbody><tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> <tr><td></td> <td></td> </tr> </tbody> </table> </div> <p>solution</p> <table id="fs-idm129196976" summary=""><thead></thead> </table> <p>&nbsp;</p> </div> <div data-type="exercise"><div id="id2585221" data-type="problem"><p>On average (<em data-effect="italics">μ</em>), how many would you expect to answer yes?</p> </div> <div id="id2585250" data-type="solution"><p id="element-88758233254">5.7</p> </div> </div> <div data-type="exercise"><div id="id8562214" data-type="problem"><p>What is the standard deviation (<em data-effect="italics">σ</em>)?</p> </div> <div data-type="solution"><p>1.2795</p> </div> </div> <div id="exerciseseven" data-type="exercise"><div id="id8562271" data-type="problem"><p>What is the probability that at most five of the freshmen reply “yes”?</p> </div> <div id="id8562288" data-type="solution"><p id="element-88758232344">0.4151</p> </div> </div> <div data-type="exercise"><div id="id8562314" data-type="problem"><p>What is the probability that at least two of the freshmen reply “yes”?</p> </div> <div data-type="solution"><p>0.9990</p> </div> </div> </div> <div id="fs-idp12065056" class="free-response" data-depth="1"><h3 data-type="title">HOMEWORK</h3> <div id="eip-idm67916816" data-type="exercise"><div id="eip-idm67916560" data-type="problem"><p id="eip-idm67916304">According to a recent article, the average number of babies born with significant hearing loss (deafness) is approximately two per 1,000 babies in a healthy baby nursery. The number climbs to an average of 30 per 1,000 babies in an intensive care nursery.</p> <p id="eip-idm92091200">Suppose that 1,000 babies from healthy baby nurseries were randomly surveyed. 
Find the probability that exactly two babies were born deaf.</p> </div> <div data-type="solution"><p>0.2709</p> </div> </div> <p><em data-effect="italics">Use the following information to answer the next four exercises:</em> Recently, a nurse commented that when a patient calls the medical advice line claiming to have the flu, the chance that he or she truly has the flu (and not just a nasty cold) is only about 4%. Of the next 25 patients calling in claiming to have the flu, we are interested in how many actually have the flu.</p> <div id="eip-idm71407792" data-type="exercise"><div id="eip-idm71407536" data-type="problem"><p id="eip-idm71406576">Define the random variable and list its possible values.</p> </div> <div id="eip-idm72550784" data-type="solution"><p id="eip-idm72550528"><em data-effect="italics">X</em> = the number of patients calling in claiming to have the flu who actually have the flu.</p> <p id="eip-idm72550032"><em data-effect="italics">X</em> = 0, 1, 2, &#8230;, 25</p> </div> </div> <div id="eip-idm5233360" data-type="exercise"><div id="eip-idm5233104" data-type="problem"><p id="eip-idm5232848">State the distribution of <em data-effect="italics">X</em>.</p> </div> <div data-type="solution"><p><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(25, 0.04)</p> </div> </div> <div id="eip-idm142417552" data-type="exercise"><div id="eip-idm142417296" data-type="problem"><p id="eip-idm142417040">Find the probability that at least four of the 25 patients actually have the flu.</p> </div> <div id="eip-idm188198112" data-type="solution"><p id="eip-idm188197856">0.0165</p> </div> </div> <div id="eip-idm188197216" data-type="exercise"><div id="eip-idm188196960" data-type="problem"><p id="eip-idm188196704">On average, for every 25 patients calling in, how many do you expect to have the flu?</p> </div> <div data-type="solution"><p>one</p> </div> </div> <div id="fs-idp24682912" data-type="exercise"><div id="fs-idp24683040" data-type="problem"><p id="fs-idm1731712">People visiting video rental stores often rent more than one DVD at a time. 
The probability distribution for DVD rentals per customer at Video To Go is given in <a class="autogenerated-content" href="#M04_Ch04_tbl005">(Figure)</a>. There is a five-video limit per customer at this store, so nobody ever rents more than five DVDs.</p> <table id="M04_Ch04_tbl005" summary="The table lists the number of video rentals (0, 1, 2, 3, 4, 5) and the respective probabilities."><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td>0.03</td> </tr> <tr><td>1</td> <td>0.50</td> </tr> <tr><td>2</td> <td>0.24</td> </tr> <tr><td>3</td> <td></td> </tr> <tr><td>4</td> <td>0.07</td> </tr> <tr><td>5</td> <td>0.04</td> </tr> </tbody> </table> <ol id="eip-id1169667979628" type="a" data-mark-suffix="."><li>Describe the random variable <em data-effect="italics">X</em> in words.</li> <li>Find the probability that a customer rents three DVDs.</li> <li>Find the probability that a customer rents at least four DVDs.</li> <li>Find the probability that a customer rents at most two DVDs.</li> </ol> </div> <div id="fs-idm53665904" data-type="solution"><ol id="eip-idm83204976" type="a"><li><em data-effect="italics">X</em> = the number of DVDs a Video To Go customer rents</li> <li>0.12</li> <li>0.11</li> <li>0.77</li> </ol> </div> </div> <div id="element-885" data-type="exercise"><div id="id18885508" data-type="problem"><p>A school newspaper reporter decides to randomly survey 12 students to see if they will attend Tet (Vietnamese New Year) festivities this year. Based on past years, she knows that 18% of students attend Tet festivities. We are interested in the number of students who will attend the festivities.</p> <ol id="element-942" type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li>List the values that <em data-effect="italics">X</em> may take on.</li> <li>Give the distribution of <em data-effect="italics">X</em>. 
<em data-effect="italics">X ~ _____(_____,_____)</em></li> <li>How many of the 12 students do we expect to attend the festivities?</li> <li>Find the probability that at most four students will attend.</li> <li>Find the probability that more than two students will attend.</li> </ol> </div> <div data-type="solution"><ol type="a"><li><em data-effect="italics">X</em> = the number of students who will attend Tet</li> <li>0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12</li> <li><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(12, 0.18)</li> <li>2.16</li> <li>0.9511</li> <li>0.3702</li> </ol> </div> </div> <p id="fs-idm84170848"><em data-effect="italics">Use the following information to answer the next two exercises:</em> The probability that the San Jose Sharks will win any given game is 0.3694 based on a 13-year win history of 382 wins out of 1,034 games played (as of a certain date). An upcoming monthly schedule contains 12 games.</p> <div id="eip-514" data-type="exercise"><div data-type="problem"><p>The expected number of wins for that upcoming month is:</p> <ol id="eip-idm77891776" type="a"><li>1.67</li> <li>12</li> <li>\(\frac{382}{1043}\)</li> <li>4.43</li> </ol> </div> <div data-type="solution"><p>d. 
4.43</p> </div> </div> <p id="eip-490">Let <em data-effect="italics">X</em> = the number of games won in that upcoming month.</p> <div data-type="exercise"><div id="id18490769" data-type="problem"><p id="element-374">What is the probability that the San Jose Sharks win six games in that upcoming month?</p> <ol type="a"><li>0.1476</li> <li>0.2336</li> <li>0.7664</li> <li>0.8903</li> </ol> </div> <div data-type="solution"><p>a</p> </div> </div> <div data-type="exercise"><div id="id13466783" data-type="problem"><p id="element-757">What is the probability that the San Jose Sharks win at least five games in that upcoming month?</p> <ol type="a"><li>0.3694</li> <li>0.5266</li> <li>0.4734</li> <li>0.2305</li> </ol> </div> <div id="id19072980" data-type="solution"><p>c</p> </div> </div> <div id="eip-122" data-type="exercise"><div id="eip-id1472188" data-type="problem"><p id="eip-id1171734673192">A student takes a ten-question true-false quiz, but did not study and randomly guesses each answer. Find the probability that the student passes the quiz with a grade of at least 70% of the questions correct.</p> </div> <div data-type="solution"><ul><li><em data-effect="italics">X</em> = number of questions answered correctly</li> <li><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(10, 0.5)</li> <li>We are interested in AT LEAST 70% of ten questions correct. 70% of ten is seven. We want to find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≥ 7). The event &#8220;at least seven&#8221; is the complement of &#8220;less than or equal to six.&#8221;</li> <li>Using your calculator&#8217;s distribution menu: 1 – binomcdf(10, 0.5, 6) gives 0.171875.</li> <li>The probability of getting at least 70% of the ten questions correct when randomly guessing is approximately 0.172.</li> </ul> </div> </div> <div id="eip-421" data-type="exercise"><div id="eip-id1172769915831" data-type="problem"><p id="eip-id1172772751393">A student takes a 32-question multiple-choice exam, but did not study and randomly guesses each answer. Each question has three possible choices for the answer. 
Find the probability that the student guesses <strong>more than</strong> 75% of the questions correctly.</p> </div> <div id="eip-id1172771134070" data-type="solution"><ul id="eip-id1172768406348"><li><em data-effect="italics">X</em> = number of questions answered correctly</li> <li><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>\(\left(\text{32, }\frac{\text{1}}{\text{3}}\right)\)</li> <li>We are interested in MORE THAN 75% of 32 questions correct. 75% of 32 is 24. We want to find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 24). The event &#8220;more than 24&#8221; is the complement of &#8220;less than or equal to 24.&#8221;</li> <li>Using your calculator&#8217;s distribution menu: 1 – binomcdf\(\left(\text{32, }\frac{\text{1}}{\text{3}},\text{ 24}\right)\)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 24) = 0</li> <li>The probability of getting more than 75% of the 32 questions correct when randomly guessing is very small and practically zero.</li> </ul> </div> </div> <div id="element-300" data-type="exercise"><div id="id18835726" data-type="problem"><p>Six different colored dice are rolled. Of interest is the number of dice that show a one.</p> <ol id="element-722" type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li>List the values that <em data-effect="italics">X</em> may take on.</li> <li>Give the distribution of <em data-effect="italics">X</em>. <em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>On average, how many dice would you expect to show a one?</li> <li>Find the probability that all six dice show a one.</li> <li>Is it more likely that three or that four dice will show a one? 
Justify your answer numerically.</li> </ol> </div> <div data-type="solution"><ol type="a"><li><em data-effect="italics">X</em> = the number of dice that show a one</li> <li>0, 1, 2, 3, 4, 5, 6</li> <li><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>\(\left(6,\frac{1}{6}\right)\)</li> <li>1</li> <li>0.00002</li> <li>three dice</li> </ol> </div> </div> <div data-type="exercise"><div id="id19066240" data-type="problem"><p id="element-439">More than 96 percent of the very largest colleges and universities (more than 15,000 total enrollments) have some online offerings. Suppose you randomly pick 13 such institutions. We are interested in the number that offer distance learning courses.</p> <ol id="fs-idm8844720" type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li>List the values that <em data-effect="italics">X</em> may take on.</li> <li>Give the distribution of <em data-effect="italics">X</em>. <em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>On average, how many schools would you expect to offer such courses?</li> <li>Find the probability that at most ten offer such courses.</li> <li>Is it more likely that 12 or that 13 will offer such courses? Justify your answer numerically and answer in a complete sentence.</li> </ol> </div> <div id="fs-idm24618656" data-type="solution"><ol id="fs-idm63350320" type="a"><li><em data-effect="italics">X</em> = the number of colleges and universities that have online offerings.</li> <li>0, 1, 2, …, 13</li> <li><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(13, 0.96)</li> <li>12.48</li> <li>0.0135</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 12) = 0.3186 and <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 13) = 0.5882, so it is more likely that all 13 offer such courses.</li> </ol> </div> </div> <div data-type="exercise"><div id="id15690689" data-type="problem"><p>Suppose that about 85% of graduating students attend their graduation. 
A group of 22 graduating students is randomly chosen.</p> <ol id="fs-idm129637152" type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li>List the values that <em data-effect="italics">X</em> may take on.</li> <li>Give the distribution of <em data-effect="italics">X</em>. <em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>How many are expected to attend their graduation?</li> <li>Find the probability that 17 or 18 attend.</li> <li>Based on numerical values, would you be surprised if all 22 attended graduation? Justify your answer numerically.</li> </ol> </div> <div data-type="solution"><ol type="a"><li><em data-effect="italics">X</em> = the number of students who attend their graduation</li> <li>0, 1, 2, …, 22</li> <li><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(22, 0.85)</li> <li>18.7</li> <li>0.3249</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 22) = 0.0280 (less than 3%), which is unusual</li> </ol> </div> </div> <div data-type="exercise"><div id="id15848913" data-type="problem"><p id="element-174">At The Fencing Center, 60% of the fencers use the foil as their main weapon. We randomly survey 25 fencers at The Fencing Center. We are interested in the number of fencers who do <strong>not</strong> use the foil as their main weapon.</p> <ol id="element-79" type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li>List the values that <em data-effect="italics">X</em> may take on.</li> <li>Give the distribution of <em data-effect="italics">X</em>. <em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>How many are expected <strong>not</strong> to use the foil as their main weapon?</li> <li>Find the probability that six do <strong>not</strong> use the foil as their main weapon.</li> <li>Based on numerical values, would you be surprised if all 25 did <strong>not</strong> use foil as their main weapon? 
Justify your answer numerically.</li> </ol> </div> <div id="id9368624" data-type="solution"><ol id="element-578" type="a"><li><em data-effect="italics">X</em> = the number of fencers who do <strong>not</strong> use the foil as their main weapon</li> <li>0, 1, 2, 3,&#8230; 25</li> <li><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(25,0.40)</li> <li>10</li> <li>0.0442</li> <li>The probability that all 25 do not use the foil is almost zero. Therefore, it would be very surprising.</li> </ol> </div> </div> <div id="element-816" data-type="exercise"><div id="id3189118" data-type="problem"><p id="element-741">Approximately 8% of students at a local high school participate in after-school sports all four years of high school. A group of 60 seniors is randomly chosen. Of interest is the number who participated in after-school sports all four years of high school.</p> <ol type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li>List the values that <em data-effect="italics">X</em> may take on.</li> <li>Give the distribution of <em data-effect="italics">X</em>. <em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>How many seniors are expected to have participated in after-school sports all four years of high school?</li> <li>Based on numerical values, would you be surprised if none of the seniors participated in after-school sports all four years of high school? Justify your answer numerically.</li> <li>Based upon numerical values, is it more likely that four or that five of the seniors participated in after-school sports all four years of high school? Justify your answer numerically.</li> </ol> </div> <p>Solution: <em data-effect="italics">X</em> = the number of high school students who participate in after-school sports all four years of high school; 0, 1, 2, …, 60; <em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(60, 0.08); 4.8; Yes, <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 0) = 0.0067, which is a small probability; <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 4) = 0.1873 and <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 5) = 0.1824, so it is more likely to get four.</p>
</div> <div data-type="exercise"><div id="id15835890" data-type="problem"><p id="element-334">The chance of an IRS audit for a tax return with over $25,000 in income is about 2% per year. We are interested in the expected number of audits a person with that income has in a 20-year period. Assume each year is independent.</p> <ol type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li>List the values that <em data-effect="italics">X</em> may take on.</li> <li>Give the distribution of <em data-effect="italics">X</em>. <em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>How many audits are expected in a 20-year period?</li> <li>Find the probability that a person is not audited at all.</li> <li>Find the probability that a person is audited more than twice.</li> </ol> </div> <div id="fs-idm54319888" data-type="solution"><ol id="fs-idm54319632" type="a"><li><em data-effect="italics">X</em> = the number of audits in a 20-year period</li> <li>0, 1, 2, …, 20</li> <li><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(20, 0.02)</li> <li>0.4</li> <li>0.6676</li> <li>0.0071</li> </ol> </div> </div> <div id="element-962" data-type="exercise"><div id="id18991543" data-type="problem"><p id="element-458">It has been estimated that only about 30% of California residents have adequate earthquake supplies. Suppose you randomly survey 11 California residents. We are interested in the number who have adequate earthquake supplies.</p> <ol type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li>List the values that <em data-effect="italics">X</em> may take on.</li> <li>Give the distribution of <em data-effect="italics">X</em>. 
<em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>What is the probability that at least eight have adequate earthquake supplies?</li> <li>Is it more likely that none or that all of the residents surveyed will have adequate earthquake supplies? Why?</li> <li>How many residents do you expect will have adequate earthquake supplies?</li> </ol> </div> <p>Solution: <em data-effect="italics">X</em> = the number of California residents who do have adequate earthquake supplies; 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11; <em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(11, 0.30); 0.0043; <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 0) = 0.0198 and <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 11) ≈ 0, so none is more likely; 3.3</p> </div> <div data-type="exercise"><div id="id16155444" data-type="problem"><p>There are two similar games played for Chinese New Year and Vietnamese New Year. In the Chinese version, fair dice with numbers 1, 2, 3, 4, 5, and 6 are used, along with a board with those numbers. In the Vietnamese version, fair dice with pictures of a gourd, fish, rooster, crab, crayfish, and deer are used. The board has those six objects on it, also. We will play with bets being $1. The player places a bet on a number or object. The “house” rolls three dice. If none of the dice show the number or object that was bet, the house keeps the $1 bet. If one of the dice shows the number or object bet (and the other two do not show it), the player gets back his or her $1 bet, plus $1 profit. If two of the dice show the number or object bet (and the third die does not show it), the player gets back his or her $1 bet, plus $2 profit. If all three dice show the number or object bet, the player gets back his or her $1 bet, plus $3 profit. Let <em data-effect="italics">X</em> = number of matches and <em data-effect="italics">Y</em> = profit per game.</p> <ol id="element-82" type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li>List the values that <em data-effect="italics">X</em> may take on.</li> <li>Give the distribution of <em data-effect="italics">X</em>. 
<em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>List the values that <em data-effect="italics">Y</em> may take on. Then, construct one PDF table that includes both <em data-effect="italics">X</em> and <em data-effect="italics">Y</em> and their probabilities.</li> <li>Calculate the average expected matches over the long run of playing this game for the player.</li> <li>Calculate the average expected earnings over the long run of playing this game for the player.</li> <li>Determine who has the advantage, the player or the house.</li> </ol> </div> <div id="eip-idm75680048" data-type="solution"><ol id="eip-idm75679792"><li><em data-effect="italics">X</em> = the number of matches</li> <li>0, 1, 2, 3</li> <li><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>\(\left(3,\frac{1}{6}\right)\)</li> <li>In dollars: −1, 1, 2, 3</li> <li>\(\frac{1}{2}\)</li> <li>Multiply each <em data-effect="italics">Y</em> value by the corresponding <em data-effect="italics">X</em> probability from the PDF table, then add the products. The answer is −0.0787. You lose about eight cents, on average, per game.</li> <li>The house has the advantage.</li> </ol> </div> </div> <div id="fs-idm40828128" data-type="exercise"><div id="fs-idm51462656" data-type="problem"><p id="fs-idp72203552">According to The World Bank, only 9% of the population of Uganda had access to electricity as of 2009. Suppose we randomly sample 150 people in Uganda. 
Let <em data-effect="italics">X</em> = the number of people who have access to electricity.</p> <ol id="fs-idm81777984" type="a"><li>What is the probability distribution for <em data-effect="italics">X</em>?</li> <li>Using the formulas, calculate the mean and standard deviation of <em data-effect="italics">X</em>.</li> <li>Use your calculator to find the probability that 15 people in the sample have access to electricity.</li> <li>Find the probability that at most ten people in the sample have access to electricity.</li> <li>Find the probability that more than 25 people in the sample have access to electricity.</li> </ol> </div> <p>Solution: <em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(150, 0.09); Mean = <em data-effect="italics">np</em> = 150(0.09) = 13.5; Standard Deviation = \(\sqrt{npq}\) = \(\sqrt{150\left(0.09\right)\left(0.91\right)}\) ≈ 3.5050; <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 15) = binompdf(150, 0.09, 15) = 0.0988; <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≤ 10) = binomcdf(150, 0.09, 10) = 0.1987; <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 25) = 1 – <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≤ 25) = 1 – binomcdf(150, 0.09, 25) = 1 – 0.9991 = 0.0009</p> </div> <div id="fs-idm104093808" data-type="exercise"><div id="fs-idp20161024" data-type="problem"><p id="fs-idp99891040">The literacy rate for a nation measures the proportion of people age 15 and over that can read and write. The literacy rate in Afghanistan is 28.1%. Suppose you choose 15 people in Afghanistan at random. Let <em data-effect="italics">X</em> = the number of people who are literate.</p> <ol id="fs-idm37302880" type="a"><li>Sketch a graph of the probability distribution of <em data-effect="italics">X</em>.</li> <li>Using the formulas, calculate the (i) mean and (ii) standard deviation of <em data-effect="italics">X</em>.</li> <li>Find the probability that more than five people in the sample are literate. 
Is it more likely that three people or four people are literate?</li> </ol> </div> <div id="fs-idp49095408" data-type="solution"><ol id="fs-idm15010256" type="a"><li><em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(15, 0.281) <div id="fs-idm99183632" class="bc-figure figure"><span id="fs-idm76247600" data-type="media" data-alt="This histogram shows a binomial probability distribution. It is made up of bars that are fairly normally distributed. The x-axis shows values from 0 to 15, with bars from 0 to 9. The y-axis shows values from 0 to 0.25 in increments of 0.05." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C04_M05_001anno-1.jpg" alt="This histogram shows a binomial probability distribution. It is made up of bars that are fairly normally distributed. The x-axis shows values from 0 to 15, with bars from 0 to 9. The y-axis shows values from 0 to 0.25 in increments of 0.05." width="450" data-media-type="image/png" /></span></div> </li> <li><ol id="fs-idm155337312" type="i"><li>Mean = <em data-effect="italics">μ</em> = <em data-effect="italics">np</em> = 15(0.281) = 4.215</li> <li>Standard Deviation = <em data-effect="italics">σ</em> = \(\sqrt{npq}\) = \(\sqrt{15\left(0.281\right)\left(0.719\right)}\) = 1.7409</li> </ol> </li> <li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 5) = 1 – <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≤ 5) = 1 – binomcdf(15, 0.281, 5) = 1 – 0.7754 = 0.2246 <span data-type="newline"><br /> </span><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 3) = binompdf(15, 0.281, 3) = 0.1927 <span data-type="newline"><br /> </span><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 4) = binompdf(15, 0.281, 4) = 0.2259 <span data-type="newline"><br /> </span>It is more likely that four people are literate than that three people are.</li> </ol> </div> </div> </div> <div 
class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="fs-idm523840"><dt>Binomial Experiment</dt> <dd id="fs-idp17997392">a statistical experiment that satisfies the following three conditions: <ol id="fs-idp38764304"><li>There are a fixed number of trials, <em data-effect="italics">n</em>.</li> <li>There are only two possible outcomes, called &#8220;success&#8221; and &#8220;failure,&#8221; for each trial. The letter <em data-effect="italics">p</em> denotes the probability of a success on one trial, and <em data-effect="italics">q</em> denotes the probability of a failure on one trial.</li> <li>The <em data-effect="italics">n</em> trials are independent and are repeated using identical conditions.</li> </ol> </dd> </dl> <dl id="bernoullitr"><dt>Bernoulli Trials</dt> <dd id="id5444014">an experiment with the following characteristics: <ol id="gloslst1"><li>There are only two possible outcomes called “success” and “failure” for each trial.</li> <li>The probability <em data-effect="italics">p</em> of a success is the same for any trial (so the probability <em data-effect="italics">q</em> = 1 − <em data-effect="italics">p</em> of a failure is the same for any trial).</li> </ol> </dd> </dl> <dl id="bidist"><dt>Binomial Probability Distribution</dt> <dd id="id8181257">a discrete random variable (RV) that arises from Bernoulli trials; there are a fixed number, <em data-effect="italics">n</em>, of independent trials. “Independent” means that the result of any trial (for example, trial one) does not affect the results of the following trials, and all trials are conducted under the same conditions. Under these circumstances the binomial RV <em data-effect="italics">X</em> is defined as the number of successes in <em data-effect="italics">n</em> trials. The notation is: <em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(<em data-effect="italics">n</em>, <em data-effect="italics">p</em>). 
The mean is <em data-effect="italics">μ</em> = <em data-effect="italics">np</em> and the standard deviation is <em data-effect="italics">σ</em> = \(\sqrt{npq}\). The probability of exactly <em data-effect="italics">x</em> successes in <em data-effect="italics">n</em> trials is <span data-type="newline"><br /> </span><em data-effect="italics">P</em>(<em data-effect="italics">X</em> = <em data-effect="italics">x</em>) = \(\left(\begin{array}{l}n\\ x\end{array}\right)\)<em data-effect="italics">p</em><sup>x</sup><em data-effect="italics">q</em><sup>n − x</sup>.</dd> </dl> </div> </div></div>
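<p>The glossary formulas for the binomial distribution can be checked numerically. The following is a minimal Python sketch and is not part of the original text: the function names <code>binomial_pmf</code> and <code>binomial_cdf</code> are illustrative choices that play the role of the calculator commands binompdf and binomcdf used in the solutions, and the values <em data-effect="italics">n</em> = 15 and <em data-effect="italics">p</em> = 0.281 are taken from the literacy exercise.</p>

```python
from math import comb, sqrt


def binomial_pmf(n, p, x):
    """P(X = x) for X ~ B(n, p), using the glossary formula C(n, x) p^x q^(n - x)."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)


def binomial_cdf(n, p, x):
    """P(X <= x): add the probabilities of 0, 1, ..., x successes."""
    return sum(binomial_pmf(n, p, k) for k in range(x + 1))


# Values from the literacy exercise (assumed here only as an illustration).
n, p = 15, 0.281

mean = n * p                 # mu = np
sd = sqrt(n * p * (1 - p))   # sigma = sqrt(npq)

print(f"mean = {mean:.3f}, standard deviation = {sd:.4f}")
print(f"P(x = 3) = {binomial_pmf(n, p, 3):.4f}")
print(f"P(x = 4) = {binomial_pmf(n, p, 4):.4f}")
print(f"P(x > 5) = {1 - binomial_cdf(n, p, 5):.4f}")
```

<p>Under these assumptions the printed values should agree, up to rounding, with the solution to the literacy exercise.</p>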
<div class="chapter standard" id="chapter-discrete-distribution-playing-card-experiment" title="Activity 5.5: Discrete Distribution (Playing Card Experiment)"><div class="chapter-title-wrap"><h3 class="chapter-number">35</h3><h2 class="chapter-title"><span class="display-none">Activity 5.5: Discrete Distribution (Playing Card Experiment)</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1167902738757" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Discrete Distribution (Playing Card Experiment)</div> <p id="id6097603">Class Time:</p> <p id="id3675915">Names:</p> <div id="id11235020" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will compare empirical data and a theoretical distribution to determine if an everyday experiment fits a discrete distribution.</li> <li>The student will compare technology-generated simulation and a theoretical distribution.</li> <li>The student will demonstrate an understanding of long-term probabilities.</li> </ul> </div> <div id="id3383537" data-type="list"><div data-type="title">Supplies</div> <ul><li>One full deck of playing cards</li> <li>One programming calculator</li> </ul> </div> <p id="proceduresec"><span data-type="title">Procedure</span>The experimental procedure for empirical data is to pick one card from a deck of shuffled cards.</p> <ol id="list-982763895"><li>The theoretical probability of picking a diamond from a deck is _________.</li> <li>Shuffle a deck of cards.</li> <li>Pick one card from it.</li> <li>Record whether it was a diamond or not a diamond.</li> <li>Put the card back and reshuffle.</li> <li>Do this a total of ten times.</li> <li>Record the number of diamonds picked.</li> <li>Let <em data-effect="italics">X</em> = number of diamonds. 
Theoretically, <em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(_____,_____)</li> </ol> <div id="list-23562621" data-type="list"><div id="OrgData" data-type="title">Organize the Data</div> <ol><li>Record the number of diamonds picked for your class with playing cards in <a class="autogenerated-content" href="#lab1_tbl001">(Figure)</a>. Then calculate the relative frequency.<br /> <table id="lab1_tbl001" summary="Table for recording values with. The first column is for number of diamonds picked (x) from 0-10, blank second column is for frequency, and the blank third column is for relative frequency."><thead><tr><th><em data-effect="italics">x</em></th> <th>Frequency</th> <th>Relative Frequency</th> </tr> </thead> <tbody><tr><td>0</td> <td>__________</td> <td data-align="center">__________</td> </tr> <tr><td>1</td> <td>__________</td> <td data-align="center">__________</td> </tr> <tr><td>2</td> <td>__________</td> <td data-align="center">__________</td> </tr> <tr><td>3</td> <td>__________</td> <td data-align="center">__________</td> </tr> <tr><td>4</td> <td>__________</td> <td data-align="center">__________</td> </tr> <tr><td>5</td> <td>__________</td> <td data-align="center">__________</td> </tr> <tr><td>6</td> <td>__________</td> <td data-align="center">__________</td> </tr> <tr><td>7</td> <td>__________</td> <td data-align="center">__________</td> </tr> <tr><td>8</td> <td>__________</td> <td data-align="center">__________</td> </tr> <tr><td>9</td> <td>__________</td> <td data-align="center">__________</td> </tr> <tr><td>10</td> <td>__________</td> <td data-align="center">__________</td> </tr> </tbody> </table> </li> <li>Calculate the following: <ol id="list-2356" type="a"><li>\(\overline{x}\) = ________</li> <li><em data-effect="italics">s</em> = ________</li> </ol> </li> <li>Construct a histogram of the empirical data. <div id="figreld" class="bc-figure figure"><span id="id5973001" data-type="media" data-alt="This is a blank graph template. 
The x-axis is labeled Number of diamonds. The y-axis is labeled Relative frequency."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch04_17_01-1.png" alt="This is a blank graph template. The x-axis is labeled Number of diamonds. The y-axis is labeled Relative frequency." width="400" data-media-type="image/png" data-print-width="4in" /></span></div> </li> </ol> </div> <div id="list-742309876" data-type="list"><div id="TheoDist" data-type="title">Theoretical Distribution</div> <ol type="a"><li>Build the theoretical PDF chart based on the distribution in the <a href="#proceduresec">Procedure</a> section.<br /> <table id="lab1_tbl002" class="unnumbered" summary="The experiment is for each class member or group to shuffle a 52-card deck of cards, pick one card and record if it is a diamond. This is done 10 times. The table shows the results of the number of diamonds picked and the relative frequency." data-label=""><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td></td> </tr> <tr><td>1</td> <td></td> </tr> <tr><td>2</td> <td></td> </tr> <tr><td>3</td> <td></td> </tr> <tr><td>4</td> <td></td> </tr> <tr><td>5</td> <td></td> </tr> <tr><td>6</td> <td></td> </tr> <tr><td>7</td> <td></td> </tr> <tr><td>8</td> <td></td> </tr> <tr><td>9</td> <td></td> </tr> <tr><td>10</td> <td></td> </tr> </tbody> </table> </li> <li>Calculate the following: <ol id="element-11885" type="a"><li><em data-effect="italics">μ</em> = ____________</li> <li><em data-effect="italics">σ</em> = ____________</li> </ol> </li> <li>Construct a histogram of the theoretical distribution. <div id="id5962835" class="bc-figure figure"><span id="id5962839" data-type="media" data-alt="This is a blank graph template. The x-axis is labeled Number of diamonds. 
The y-axis is labeled Probability."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch04_17_02-1.jpg" alt="This is a blank graph template. The x-axis is labeled Number of diamonds. The y-axis is labeled Probability." width="400" data-media-type="image/png" /></span></div> </li> </ol> </div> <p id="fs-idp120856768"><span data-type="title">Using the Data</span></p> <div id="id5378437" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idp140551535389872"><em data-effect="italics">RF</em> = relative frequency</p> </div> <p id="element-982735">Use the table from the <a href="#TheoDist">Theoretical Distribution</a> section to calculate the following answers. Round your answers to four decimal places.</p> <ul id="list-928735"><li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 3) = _______________________</li> <li><em data-effect="italics">P</em>(1 &lt; <em data-effect="italics">x</em> &lt; 4) = _______________________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≥ 8) = _______________________</li> </ul> <p>Use the data from the <a href="#OrgData">Organize the Data</a> section to calculate the following answers. 
Round your answers to four decimal places.</p> <ul id="list-92763594"><li><em data-effect="italics">RF</em>(<em data-effect="italics">x</em> = 3) = _______________________</li> <li><em data-effect="italics">RF</em>(1 &lt; <em data-effect="italics">x</em> &lt; 4) = _______________________</li> <li><em data-effect="italics">RF</em>(<em data-effect="italics">x</em> ≥ 8) = _______________________</li> </ul> <p id="element-420"><span data-type="title">Discussion Questions</span>For questions 1 and 2, think about the shapes of the two graphs, the probabilities, the relative frequencies, the means, and the standard deviations.</p> <ol id="list-23526"><li id="q1">Knowing that data vary, describe three similarities between the graphs and distributions of the theoretical, empirical, and simulation distributions. Use complete sentences.</li> <li id="q2">Describe the three most significant differences between the graphs or distributions of the theoretical, empirical, and simulation distributions.</li> <li id="q3">Using your answers from questions 1 and 2, does it appear that the two sets of data fit the theoretical distribution? In complete sentences, explain why or why not.</li> <li id="q4">Suppose that the experiment had been repeated 500 times. Would you expect <a class="autogenerated-content" href="#lab1_tbl001">(Figure)</a> or <a class="autogenerated-content" href="#lab1_tbl002">(Figure)</a> to change, and how would it change? Why? Why wouldn’t the other table(s) change?</li> </ol> </div> </div></div>
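<p>The comparison between the empirical and theoretical distributions in this activity can also be explored with a technology-generated simulation. The following is a minimal Python sketch under stated assumptions: each pick is modeled as an independent draw with probability 13/52 of a diamond (standing in for shuffling and replacing the card), and the function names, the seed, and the choice of 500 repetitions are hypothetical rather than taken from the activity.</p>

```python
import random
from math import comb


def binomial_pmf(n, p, x):
    """Theoretical P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


def count_diamonds(rng, draws=10, p_diamond=13 / 52):
    """Simulate `draws` picks with replacement and count the diamonds."""
    return sum(rng.random() < p_diamond for _ in range(draws))


def relative_frequencies(repeats, seed=0):
    """Repeat the ten-pick experiment and return the relative frequency of each count 0..10."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = [0] * 11
    for _ in range(repeats):
        counts[count_diamonds(rng)] += 1
    return [c / repeats for c in counts]


theoretical = [binomial_pmf(10, 0.25, x) for x in range(11)]
simulated = relative_frequencies(repeats=500)

# Side-by-side table: x, theoretical probability, simulated relative frequency.
for x in range(11):
    print(f"{x:2d}  P(x) = {theoretical[x]:.4f}  RF = {simulated[x]:.4f}")
```

<p>Increasing the number of repetitions changes only the simulated column; the theoretical column depends only on <em data-effect="italics">n</em> and <em data-effect="italics">p</em>.</p>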
<div class="chapter standard" id="chapter-discrete-distribution-lucky-dice-experiment" title="Activity 5.6: Discrete Distribution (Lucky Dice Experiment)"><div class="chapter-title-wrap"><h3 class="chapter-number">36</h3><h2 class="chapter-title"><span class="display-none">Activity 5.6: Discrete Distribution (Lucky Dice Experiment)</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1172782738808" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Discrete Distribution (Lucky Dice Experiment)</div> <p id="id8143321">Class Time:</p> <p id="id8332839">Names:</p> <div id="id8107935" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will compare empirical data and a theoretical distribution to determine if a Tet gambling game fits a discrete distribution.</li> <li>The student will demonstrate an understanding of long-term probabilities.</li> </ul> </div> <div id="element-254" data-type="list"><div data-type="title">Supplies</div> <ul><li>one “Lucky Dice” game or three regular dice</li> </ul> </div> <p id="fs-idm6846032"><span id="procedure" data-type="title">Procedure</span><span data-type="newline"><br /> </span>Round answers to relative frequency and probability problems to four decimal places.</p> <ol id="list-2397564"><li>The experimental procedure is to bet on one object. Then, roll three Lucky Dice and count the number of matches. The number of matches will decide your profit.</li> <li>What is the theoretical probability of one die matching the object?</li> <li>Choose one object to place a bet on. Roll the three Lucky Dice. Count the number of matches.</li> <li>Let <em data-effect="italics">X</em> = number of matches. 
Theoretically, <em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(______,______)</li> <li>Let <em data-effect="italics">Y</em> = profit per game.</li> </ol> <p id="element-234597"><span id="OrgData2" data-type="title">Organize the Data</span>In <a class="autogenerated-content" href="#lab2_tbl001">(Figure)</a>, fill in the <em data-effect="italics">y</em> value that corresponds to each <em data-effect="italics">x</em> value. Next, record the number of matches picked for your class. Then, calculate the relative frequency.</p> <ol id="list-23985798265" data-mark-suffix="."><li>Complete the table.<br /> <table id="lab2_tbl001" summary=""><thead><tr><th>x</th> <th>y</th> <th>Frequency</th> <th>Relative Frequency</th> </tr> </thead> <tbody><tr><td>0</td> <td></td> <td></td> <td></td> </tr> <tr><td>1</td> <td></td> <td></td> <td></td> </tr> <tr><td>2</td> <td></td> <td></td> <td></td> </tr> <tr><td>3</td> <td></td> <td></td> <td></td> </tr> </tbody> </table> </li> <li>Calculate the following: <ol type="a"><li>\(\overline{x}\) = _______</li> <li><em data-effect="italics">s<sub>x</sub></em> = ________</li> <li>\(\overline{y}\) = _______</li> <li><em data-effect="italics">s<sub>y</sub></em> = _______</li> </ol> </li> <li>Explain what \(\overline{x}\) represents.</li> <li>Explain what \(\overline{y}\) represents.</li> <li>Based upon the experiment: <ol id="list-9876876586" type="a"><li>What was the average profit per game?</li> <li>Did this represent an average win or loss per game?</li> <li>How do you know? Answer in complete sentences.</li> </ol> </li> <li>Construct a histogram of the empirical data. <div id="id18572910" class="bc-figure figure"><span id="id18596063" data-type="media" data-alt="This is a blank graph template. The x-axis is labeled Number of matches. 
The y-axis is labeled Relative frequency."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch04_18_01-1.png" alt="This is a blank graph template. The x-axis is labeled Number of matches. The y-axis is labeled Relative frequency." width="400" data-media-type="image/png" /></span></div> </li> </ol> <p id="element-89757864587624"><span id="TheoDist2" data-type="title">Theoretical Distribution</span>Build the theoretical PDF chart for <em data-effect="italics">x</em> and <em data-effect="italics">y</em> based on the distribution from the <a href="#procedure">Procedure</a> section.</p> <ol id="list-529786925"><li><table id="lab2_tbl002" summary="This table is similar to the previous table except it only has three columns. The first column has the values of X, 0-3, the blank second column is for values of Y to be entered, and the blank third column is for values of P(X=x)=P(Y=y)."><thead><tr><th><em data-effect="italics">x</em></th> <th><em data-effect="italics">y</em></th> <th><em data-effect="italics">P</em>(<em data-effect="italics">x</em>) = <em data-effect="italics">P</em>(<em data-effect="italics">y</em>)</th> </tr> </thead> <tbody><tr><td>0</td> <td></td> <td></td> </tr> <tr><td>1</td> <td></td> <td></td> </tr> <tr><td>2</td> <td></td> <td></td> </tr> <tr><td>3</td> <td></td> <td></td> </tr> </tbody> </table> </li> <li>Calculate the following: <ol id="list-82735687268357" type="a"><li><em data-effect="italics">μ<sub>x</sub></em> = _______</li> <li><em data-effect="italics">σ<sub>x</sub></em> = _______</li> <li><em data-effect="italics">μ<sub>y</sub></em> = _______</li> </ol> </li> <li>Explain what <em data-effect="italics">μ<sub>x</sub></em> represents.</li> <li>Explain what <em data-effect="italics">μ<sub>y</sub></em> represents.</li> <li>Based upon theory: <ol type="a"><li>What was the expected profit per game?</li> <li>Did the expected profit represent an average win or loss per game?</li> <li>How do you know? 
Answer in complete sentences.</li> </ol> </li> <li>Construct a histogram of the theoretical distribution. <div id="id18402808" class="bc-figure figure"><span id="id18810469" data-type="media" data-alt="This is a blank graph template. The x-axis is labeled Number of diamonds. The y-axis is labeled Probability."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch04_18_02-1.png" alt="This is a blank graph template. The x-axis is labeled Number of diamonds. The y-axis is labeled Probability." width="400" data-media-type="image/png" data-print-width="4in" /></span></div> </li> </ol> <p><span data-type="title">Use the Data</span></p> <div data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idp94792224"><em data-effect="italics">RF</em> = relative frequency</p> </div> <p id="element-23525">Use the data from the <a href="#TheoDist2">Theoretical Distribution</a> section to calculate the following answers. Round your answers to four decimal places.</p> <ol id="listular"><li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 3) = _________________</li> <li><em data-effect="italics">P</em>(0 &lt; <em data-effect="italics">x</em> &lt; 3) = _________________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≥ 2) = _________________</li> </ol> <p id="element-872568354">Use the data from the <a href="#OrgData2">Organize the Data</a> section to calculate the following answers. 
Round your answers to four decimal places.</p> <ol id="element-997"><li><em data-effect="italics">RF</em>(<em data-effect="italics">x</em> = 3) = _________________</li> <li><em data-effect="italics">RF</em>(0 &lt; <em data-effect="italics">x</em> &lt; 3) = _________________</li> <li><em data-effect="italics">RF</em>(<em data-effect="italics">x</em> ≥ 2) = _________________</li> </ol> <p><span data-type="title">Discussion Questions</span>For questions 1 and 2, consider the graphs, the probabilities, the relative frequencies, the means, and the standard deviations.</p> <ol id="list-2359768725"><li id="q01">Knowing that data vary, describe three similarities between the graphs and distributions of the theoretical and empirical distributions. Use complete sentences.</li> <li id="q02">Describe the three most significant differences between the graphs or distributions of the theoretical and empirical distributions.</li> <li id="q03">Thinking about your answers to questions 1 and 2, does it appear that the data fit the theoretical distribution? In complete sentences, explain why or why not.</li> <li id="q04">Suppose that the experiment had been repeated 500 times. Would you expect <a class="autogenerated-content" href="#lab2_tbl001">(Figure)</a> or <a class="autogenerated-content" href="#lab2_tbl002">(Figure)</a> to change, and how would it change? Why? Why wouldn’t the other table change?</li> </ol> </div> </div></div>
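<p>The theoretical distribution for the Lucky Dice game can be worked out exactly. The following is a minimal Python sketch, assuming three fair dice, so that <em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>\(\left(3,\frac{1}{6}\right)\), and the payout rule described in the earlier exercise on this game (lose the $1 bet on no matches, otherwise win $1 per match). The variable names are illustrative, and exact fractions are used so that no rounding enters the calculation.</p>

```python
from fractions import Fraction
from math import comb

# Assumption: three fair dice, one chosen object, so p = 1/6 per die.
n = 3
p = Fraction(1, 6)

# Profit y (in dollars) for each number of matches x.
profit = {0: -1, 1: 1, 2: 2, 3: 3}

# Theoretical PDF: P(X = x) = C(n, x) p^x q^(n - x).
pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

# Long-run averages: multiply each value by its probability and add the products.
expected_matches = sum(x * pmf[x] for x in pmf)
expected_profit = sum(profit[x] * pmf[x] for x in pmf)

for x in pmf:
    print(f"x = {x}, y = {profit[x]:2d}, P(x) = {pmf[x]} = {float(pmf[x]):.4f}")
print(f"mu_x = {expected_matches} matches per game")
print(f"mu_y = {expected_profit} = {float(expected_profit):.4f} dollars per game")
```

<p>A negative value of <em data-effect="italics">μ<sub>y</sub></em> indicates an average loss per game for the player, which is consistent with the house having the advantage.</p>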
<div class="part " id="part-continuous-random-variables"><div class="part-title-wrap"><h3 class="part-number">VI</h3><h1 class="part-title">Chapter 6: Continuous Random Variables</h1></div><div class="ugc part-ugc"></div></div>
<div class="chapter standard" id="chapter-introduction-17" title="Chapter 6.1: Introduction"><div class="chapter-title-wrap"><h3 class="chapter-number">37</h3><h2 class="chapter-title"><span class="display-none">Chapter 6.1: Introduction</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-idp107728704" class="splash"><div class="bc-figcaption figcaption">The heights of these radish plants are continuous random variables. (Credit: Rev Stan)</div> <p><span id="fs-idp88279424" data-type="media" data-alt="The image shows radish plants of various heights sprouting out of dirt."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C05_CO-1.jpg" alt="The image shows radish plants of various heights sprouting out of dirt." width="380" data-media-type="image/jpeg" /></span></p> </div> <div id="fs-idp13104688" class="chapter-objectives" data-type="note" data-has-label="true" data-label=""><div data-type="title">Chapter Objectives</div> <p id="element-103">By the end of this chapter, the student should be able to:</p> <ul><li>Recognize and understand continuous probability density functions in general.</li> <li>Recognize the uniform probability distribution and apply it appropriately.</li> <li>Recognize the exponential probability distribution and apply it appropriately.</li> </ul> </div> <p id="eip-idp101194160">Continuous random variables have many applications. Baseball batting averages, IQ scores, the length of time a long distance telephone call lasts, the amount of money a person carries, the length of time a computer chip lasts, and SAT scores are just a few. The field of reliability depends on a variety of continuous random variables.</p> <div id="fs-idp89655056" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idm83037888">The values of discrete and continuous random variables can be ambiguous. 
For example, if <em data-effect="italics">X</em> is equal to the number of miles (to the nearest mile) you drive to work, then <em data-effect="italics">X</em> is a discrete random variable. You count the miles. If <em data-effect="italics">X</em> is the distance you drive to work, then you measure values of <em data-effect="italics">X</em> and <em data-effect="italics">X</em> is a continuous random variable. For a second example, if <em data-effect="italics">X</em> is equal to the number of books in a backpack, then <em data-effect="italics">X</em> is a discrete random variable. If <em data-effect="italics">X</em> is the weight of a book, then <em data-effect="italics">X</em> is a continuous random variable because weights are measured. How the random variable is defined is very important.</p> </div> <div id="eip-606" class="bc-section section" data-depth="1"><h3 data-type="title">Properties of Continuous Probability Distributions</h3> <p>The graph of a continuous probability distribution is a curve. Probability is represented by area under the curve.</p> <p>The curve is called the <span data-type="term">probability density function</span> (abbreviated as <strong>pdf</strong>). We use the symbol <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) to represent the curve. <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) is the function that corresponds to the graph; we use the density function <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) to draw the graph of the probability distribution.</p> <p><strong>Area under the curve</strong> is given by a different function called the <strong>cumulative distribution function </strong> (abbreviated as <strong>cdf</strong>). 
The cumulative distribution function is used to evaluate probability as area.</p> <ul id="eip-id1170587055152"><li>The outcomes are measured, not counted.</li> <li>The entire area under the curve and above the x-axis is equal to one.</li> <li>Probability is found for intervals of <em data-effect="italics">x</em> values rather than for individual <em data-effect="italics">x</em> values.</li> <li><em data-effect="italics">P(c &lt; x &lt; d)</em> is the probability that the random variable <em data-effect="italics">X</em> is in the interval between the values <em data-effect="italics">c</em> and <em data-effect="italics">d</em>. <em data-effect="italics">P(c &lt; x &lt; d)</em> is the area under the curve, above the <em data-effect="italics">x</em>-axis, to the right of <em data-effect="italics">c</em> and to the left of <em data-effect="italics">d</em>.</li> <li><em data-effect="italics">P(x = c) =</em> 0. The probability that <em data-effect="italics">x</em> takes on any single individual value is zero. The area below the curve, above the <em data-effect="italics">x</em>-axis, and between <em data-effect="italics">x</em> = <em data-effect="italics">c</em> and <em data-effect="italics">x</em> = <em data-effect="italics">c</em> has no width, and therefore no area (area = 0). Since the probability is equal to the area, the probability is also zero.</li> <li><em data-effect="italics">P(c &lt; x &lt; d)</em> is the same as <em data-effect="italics">P(c ≤ x ≤ d)</em> because probability is equal to area, and the endpoints <em data-effect="italics">x</em> = <em data-effect="italics">c</em> and <em data-effect="italics">x</em> = <em data-effect="italics">d</em> contribute no area.</li> </ul> <p>We will find the area that represents probability by using geometry, formulas, technology, or probability tables. In general, calculus is needed to find the area under the curve for many probability density functions. The formulas we use to find area in this textbook were found by using the techniques of integral calculus. 
However, because most students taking this course have not studied calculus, we will not be using calculus in this textbook.</p> <p id="eip-335">There are many continuous probability distributions. When using a continuous probability distribution to model probability, the distribution used is selected to model and fit the particular situation in the best way.</p> <p>In this chapter and the next, we will study the uniform distribution, the exponential distribution, and the normal distribution. The following graphs illustrate these distributions.</p> <div id="id4131243" class="bc-figure figure"><div class="bc-figcaption figcaption">The graph shows a Uniform Distribution with the area between <em data-effect="italics">x</em> = 3 and <em data-effect="italics">x</em> = 6 shaded to represent the probability that the value of the random variable <em data-effect="italics">X</em> is in the interval between three and six.</div> <div class="wp-caption alignnone" style="width: 487px"><img class="size-medium" src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_01_01-1.jpg" alt="Uniform Distribution" width="487" height="240" /><div class="wp-caption-text">This graph shows a uniform distribution. The horizontal axis ranges from 0 to 10. The distribution is modeled by a rectangle extending from x = 2 to x = 8.8. A region from x = 3 to x = 6 is shaded inside the rectangle. 
The shaded area represents P(3 &lt; x &lt; 6).</div></div> <p>&nbsp;</p> </div> <div id="id3243513" class="bc-figure figure"><div class="bc-figcaption figcaption">The graph shows an Exponential Distribution with the area between <em data-effect="italics">x</em> = 2 and <em data-effect="italics">x</em> = 4 shaded to represent the probability that the value of the random variable <em data-effect="italics">X</em> is in the interval between two and four.</div> <p><span id="id1164323972015" data-type="media" data-alt="This graph shows an exponential distribution. The graph slopes downward. It begins at a point on the y-axis and approaches the x-axis at the right edge of the graph. The region under the graph from x = 2 to x = 4 is shaded."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_01_02-1.jpg" alt="This graph shows an exponential distribution. The graph slopes downward. It begins at a point on the y-axis and approaches the x-axis at the right edge of the graph. The region under the graph from x = 2 to x = 4 is shaded." width="380" data-media-type="image/jpg" /></span></p> </div> <div id="eip-id1164898429541" class="bc-figure figure"><div class="bc-figcaption figcaption">The graph shows the Standard Normal Distribution with the area between <em data-effect="italics">x</em> = 1 and <em data-effect="italics">x</em> = 2 shaded to represent the probability that the value of the random variable <em data-effect="italics">X</em> is in the interval between one and two.</div> <div class="wp-caption alignnone" style="width: 487px"><img class="size-medium" src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_01_03-1.jpg" alt="The Normal Distribution" width="487" height="162" /><div class="wp-caption-text">This graph shows the standard normal distribution, a symmetric bell-shaped curve. 
The region under the curve from x = 1 to x = 2 is shaded to represent P(1 &lt; x &lt; 2).</div></div> <p>&nbsp;</p> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="fs-idm138335264"><dt>Uniform Distribution</dt> <dd id="fs-idm102638224">a continuous random variable (RV) that has equally likely outcomes over the domain, <em data-effect="italics">a</em> &lt; <em data-effect="italics">x</em> &lt; <em data-effect="italics">b</em>. Notation: <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(<em data-effect="italics">a</em>,<em data-effect="italics">b</em>). The mean is <em data-effect="italics">μ</em> = \(\frac{a+b}{2}\) and the standard deviation is \(\sigma =\sqrt{\frac{{\left(b-a\right)}^{2}}{12}}\). The probability density function is <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{b-a}\) for <em data-effect="italics">a</em> &lt; <em data-effect="italics">x</em> &lt; <em data-effect="italics">b</em> or <em data-effect="italics">a</em> ≤ <em data-effect="italics">x</em> ≤ <em data-effect="italics">b</em>. The cumulative distribution function is <em data-effect="italics">P</em>(<em data-effect="italics">X</em> ≤ <em data-effect="italics">x</em>) = \(\frac{x-a}{b-a}\).</dd> </dl> <dl id="fs-idm29977280"><dt>Exponential Distribution</dt> <dd id="fs-idm115662720">a continuous random variable (RV) that appears when we are interested in the intervals of time between some random events, for example, the length of time between emergency arrivals at a hospital; the notation is <em data-effect="italics">X</em> ~ <em data-effect="italics">Exp</em>(<em data-effect="italics">m</em>). The mean is <em data-effect="italics">μ</em> = \(\frac{1}{m}\) and the standard deviation is σ = \(\frac{1}{m}\). 
The probability density function is <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = <em data-effect="italics">me<sup>−mx</sup></em>, <em data-effect="italics">x</em> ≥ 0 and the cumulative distribution function is <em data-effect="italics">P</em>(<em data-effect="italics">X</em> ≤ <em data-effect="italics">x</em>) = 1 − <em data-effect="italics">e<sup>−mx</sup></em>.</dd> </dl> </div> </div></div>
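<div class="textbox"><p>The cumulative distribution functions listed in the glossary can be turned into a short calculation. The following is a minimal Python sketch, not part of the original text. It evaluates the shaded area in the uniform figure by subtracting the area to the left of 3 from the area to the left of 6, and does the same for the exponential figure. The function names are made up for illustration, and the decay rate <em data-effect="italics">m</em> = 0.25 is an assumed value, since the figure does not state one.</p>

```python
import math


def uniform_cdf(x, a, b):
    """Area to the left of x for X ~ U(a, b): (x - a) / (b - a), clamped to [0, 1]."""
    return min(1.0, max(0.0, (x - a) / (b - a)))


def exponential_cdf(x, m):
    """Area to the left of x for X ~ Exp(m): 1 - e^(-m x), and 0 for negative x."""
    return 1.0 - math.exp(-m * max(0.0, x))


def interval_probability(cdf, c, d, *params):
    """Probability that X lies between c and d: area left of d minus area left of c."""
    return cdf(d, *params) - cdf(c, *params)


# Uniform figure: the rectangle runs from x = 2 to x = 8.8, shaded from 3 to 6.
p_uniform = interval_probability(uniform_cdf, 3, 6, 2, 8.8)

# Exponential figure: shaded from 2 to 4. The rate m = 0.25 is an assumption.
p_exponential = interval_probability(exponential_cdf, 2, 4, 0.25)

print(round(p_uniform, 4))  # 0.4412
print(round(p_exponential, 4))
```

<p>For the uniform case the result agrees with the rectangle method: the base is 6 − 3 = 3 and the height is 1/(8.8 − 2), so the area is 3/6.8.</p></div>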
<div class="chapter standard" id="chapter-continuous-probability-functions" title="Chapter 6.2: Continuous Probability Functions"><div class="chapter-title-wrap"><h3 class="chapter-number">38</h3><h2 class="chapter-title"><span class="display-none">Chapter 6.2: Continuous Probability Functions</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>We begin by defining a continuous probability density function. We use the function notation <em data-effect="italics">f</em>(<em data-effect="italics">x</em>). Intermediate algebra may have been your first formal introduction to functions. In the study of probability, the functions we study are special. We define the function <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) so that the area between it and the x-axis is equal to a probability. Since the maximum probability is one, the maximum area is also one. <strong>For continuous probability distributions, PROBABILITY = AREA.</strong></p> <div class="textbox textbox--examples" data-type="example"><p id="element-630">Consider the function <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{20}\) for 0 ≤ <em data-effect="italics">x</em> ≤ 20, where <em data-effect="italics">x</em> can be any real number in that interval. The graph of <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{20}\) is a horizontal line. However, since 0 ≤ <em data-effect="italics">x</em> ≤ 20, <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) is restricted to the portion between <em data-effect="italics">x</em> = 0 and <em data-effect="italics">x</em> = 20, inclusive.</p> <div id="fs-idm76319168" class="bc-figure figure"><span id="id39758796" data-type="media" data-alt="This shows the graph of the function f(x) = 1/20. A horizontal line ranges from the point (0, 1/20) to the point (20, 1/20). 
A vertical line extends from the x-axis to the end of the line at point (20, 1/20) creating a rectangle."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch05_02_01-1.jpg" alt="This shows the graph of the function f(x) = 1/20. A horizontal line ranges from the point (0, 1/20) to the point (20, 1/20). A vertical line extends from the x-axis to the end of the line at point (20, 1/20) creating a rectangle." width="380" data-media-type="image/jpg" /></span></div> <p id="element-37"><em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{20}\) <strong>for</strong> 0 ≤ <em data-effect="italics">x</em> ≤ 20.</p> <p>The graph of <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{20}\) is a horizontal line segment when 0 ≤ <em data-effect="italics">x</em> ≤ 20.</p> <p>The area between <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{20}\) where 0 ≤ <em data-effect="italics">x</em> ≤ 20 and the <em data-effect="italics">x</em>-axis is the area of a rectangle with base = 20 and height = \(\frac{1}{20}\).</p> <div data-type="equation">\(\text{AREA}=20\left(\frac{1}{20}\right)=1\)</div> <p><strong>Suppose we want to find the area between <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{20}\) and the <em data-effect="italics">x</em>-axis where 0 &lt; <em data-effect="italics">x</em> &lt; 2.</strong></p> <div id="fs-idp91224976" class="bc-figure figure"><span id="id40073479" data-type="media" data-alt="This shows the graph of the function f(x) = 1/20. A horizontal line ranges from the point (0, 1/20) to the point (20, 1/20). A vertical line extends from the x-axis to the end of the line at point (20, 1/20) creating a rectangle. 
A region is shaded inside the rectangle from x = 0 to x = 2."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_02_02-1.jpg" alt="This shows the graph of the function f(x) = 1/20. A horizontal line ranges from the point (0, 1/20) to the point (20, 1/20). A vertical line extends from the x-axis to the end of the line at point (20, 1/20) creating a rectangle. A region is shaded inside the rectangle from x = 0 to x = 2." width="380" data-media-type="image/jpg" /></span></div> <p>\(\text{AREA}=\left(2-0\right)\left(\frac{1}{20}\right)=0.1\)</p> <p>\(\left(2-0\right)=2=\text{base of a rectangle}\)</p> <div data-type="note" data-has-label="true" data-label="" data-element-type="Reminder"><div data-type="title">Reminder</div> <p id="eip-idp74119024">Area of a rectangle = (base)(height).</p> </div> <p>The area corresponds to a probability. The probability that <em data-effect="italics">x</em> is between zero and two is 0.1, which can be written mathematically as <em data-effect="italics">P</em>(0 &lt; <em data-effect="italics">x</em> &lt; 2) = <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 2) = 0.1.</p> <p id="eip-553"><strong>Suppose we want to find the area between <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{20}\) and the <em data-effect="italics">x</em>-axis where 4 &lt; <em data-effect="italics">x</em> &lt; 15.</strong></p> <div id="fs-idm39582736" class="bc-figure figure"><span id="id40137735" data-type="media" data-alt="This shows the graph of the function f(x) = 1/20. A horizontal line ranges from the point (0, 1/20) to the point (20, 1/20). A vertical line extends from the x-axis to the end of the line at point (20, 1/20) creating a rectangle. 
A region is shaded inside the rectangle from x = 4 to x = 15."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_02_03-1.jpg" alt="This shows the graph of the function f(x) = 1/20. A horizontal line ranges from the point (0, 1/20) to the point (20, 1/20). A vertical line extends from the x-axis to the end of the line at point (20, 1/20) creating a rectangle. A region is shaded inside the rectangle from x = 4 to x = 15." width="380" data-media-type="image/jpg" /></span></div> <p id="fs-idm75475104">\(\text{AREA}=\left(15-4\right)\left(\frac{1}{20}\right)=0.55\)</p> <p id="element-376">\(\left(15-4\right)=11=\text{base of a rectangle}\)</p> <p>The area corresponds to the probability <em data-effect="italics">P</em>(4 &lt; <em data-effect="italics">x</em> &lt; 15) = 0.55.</p> <p>Suppose we want to find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 15). On an x-y graph, <em data-effect="italics">x</em> = 15 is a vertical line. A vertical line has no width (or zero width). Therefore, <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 15) = (base)(height) = (0)\(\left(\frac{1}{20}\right)\) = 0.</p> <div id="fs-idm37512432" class="bc-figure figure"><span id="id40076640" data-type="media" data-alt="This shows the graph of the function f(x) = 1/20. A horizontal line ranges from the point (0, 1/20) to the point (20, 1/20). A vertical line extends from the x-axis to the end of the line at point (20, 1/20) creating a rectangle. A vertical line extends from the horizontal axis to the graph at x = 15."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_02_04-1.jpg" alt="This shows the graph of the function f(x) = 1/20. A horizontal line ranges from the point (0, 1/20) to the point (20, 1/20). 
A vertical line extends from the x-axis to the end of the line at point (20, 1/20) creating a rectangle. A vertical line extends from the horizontal axis to the graph at x = 15." width="380" data-media-type="image/jpg" /></span></div> <p><em data-effect="italics">P</em>(<em data-effect="italics">X</em> ≤ <em data-effect="italics">x</em>), which can also be written as <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">x</em>) for continuous distributions, is called the cumulative distribution function or CDF. Notice the &#8220;less than or equal to&#8221; symbol. We can also use the CDF to calculate <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &gt; <em data-effect="italics">x</em>). The CDF gives &#8220;area to the left&#8221; and <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &gt; <em data-effect="italics">x</em>) gives &#8220;area to the right.&#8221; Because the total area is one, we calculate <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &gt; <em data-effect="italics">x</em>) for continuous distributions as follows: <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &gt; <em data-effect="italics">x</em>) = 1 − <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">x</em>).</p> <div id="fs-idp12815264" class="bc-figure figure"><span id="id39508301" data-type="media" data-alt="This shows the graph of the function f(x) = 1/20. A horizontal line ranges from the point (0, 1/20) to the point (20, 1/20). A vertical line extends from the x-axis to the end of the line at point (20, 1/20) creating a rectangle. The area to the left of a value, x, is shaded."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_02_05-1.jpg" alt="This shows the graph of the function f(x) = 1/20. A horizontal line ranges from the point (0, 1/20) to the point (20, 1/20). 
A vertical line extends from the x-axis to the end of the line at point (20, 1/20) creating a rectangle. The area to the left of a value, x, is shaded." width="380" data-media-type="image/jpg" /></span></div> <p id="element-473">Label the graph with <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) and <em data-effect="italics">x</em>. Scale the <em data-effect="italics">x</em> and <em data-effect="italics">y</em> axes with the maximum <em data-effect="italics">x</em> and <em data-effect="italics">y</em> values. <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{20}\), 0 ≤ <em data-effect="italics">x</em> ≤ 20.</p> <p id="fs-idp36326688">To calculate the probability that <em data-effect="italics">x</em> is between two values, look at the following graph. Shade the region between <em data-effect="italics">x</em> = 2.3 and <em data-effect="italics">x</em> = 12.7. Then calculate the shaded area of a rectangle.</p> <div id="fs-idp96750640" class="bc-figure figure"><span id="id40140418" data-type="media" data-alt="This shows the graph of the function f(x) = 1/20. A horizontal line ranges from the point (0, 1/20) to the point (20, 1/20). A vertical line extends from the x-axis to the end of the line at point (20, 1/20) creating a rectangle. A region is shaded inside the rectangle from x = 2.3 to x = 12.7"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_02_06-1.jpg" alt="This shows the graph of the function f(x) = 1/20. A horizontal line ranges from the point (0, 1/20) to the point (20, 1/20). A vertical line extends from the x-axis to the end of the line at point (20, 1/20) creating a rectangle. 
A region is shaded inside the rectangle from x = 2.3 to x = 12.7" width="380" data-media-type="image/jpg" /></span></div> <p id="element-979">\(P\left(2.3&lt;x&lt;12.7\right)=\left(\text{base}\right)\left(\text{height}\right)=\left(12.7-2.3\right)\left(\frac{1}{20}\right)=0.52\)</p> </div> <div id="fs-idm47598048" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm81426464" data-type="exercise"><div id="fs-idm96151408" data-type="problem"><p id="fs-idp147477712">Consider the function <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{\text{1}}{8}\) for 0 ≤ <em data-effect="italics">x</em> ≤ 8. Draw the graph of <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) and find <em data-effect="italics">P</em>(2.5 &lt; <em data-effect="italics">x</em> &lt; 7.5).</p> </div> </div> </div> <div id="fs-idp147304960" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp147305600">The probability density function (pdf) is used to describe probabilities for continuous random variables. The area under the density curve between two points corresponds to the probability that the variable falls between those two values. In other words, the area under the density curve between points <em data-effect="italics">a</em> and <em data-effect="italics">b</em> is equal to <em data-effect="italics">P</em>(<em data-effect="italics">a</em> &lt; <em data-effect="italics">x</em> &lt; <em data-effect="italics">b</em>). The cumulative distribution function (cdf) gives the probability as an area. If <em data-effect="italics">X</em> is a continuous random variable, the probability density function (pdf), <em data-effect="italics">f</em>(<em data-effect="italics">x</em>), is used to draw the graph of the probability distribution. The total area under the graph of <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) is one. 
The area under the graph of <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) and between values <em data-effect="italics">a</em> and <em data-effect="italics">b</em> gives the probability <em data-effect="italics">P</em>(<em data-effect="italics">a</em> &lt; <em data-effect="italics">x</em> &lt; <em data-effect="italics">b</em>).</p> <div id="fs-idm60245824" class="bc-figure figure"><span id="fs-idp150656848" data-type="media" data-alt="The graph on the left shows a general density curve, y = f(x). The region under the curve and above the x-axis is shaded. The area of the shaded region is equal to 1. This shows that all possible outcomes are represented by the curve. The graph on the right shows the same density curve. Vertical lines x = a and x = b extend from the axis to the curve, and the area between the lines is shaded. The area of the shaded region represents the probability that a value x falls between a and b."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C05_M02_001-1.jpg" alt="The graph on the left shows a general density curve, y = f(x). The region under the curve and above the x-axis is shaded. The area of the shaded region is equal to 1. This shows that all possible outcomes are represented by the curve. The graph on the right shows the same density curve. Vertical lines x = a and x = b extend from the axis to the curve, and the area between the lines is shaded. The area of the shaded region represents the probability that a value x falls between a and b." width="380" data-media-type="image/jpg" /></span></div> <p id="fs-idp108663296">The cumulative distribution function (cdf) of <em data-effect="italics">X</em> is defined by <em data-effect="italics">P</em>(<em data-effect="italics">X</em> ≤ <em data-effect="italics">x</em>). 
It is a function of <em data-effect="italics">x</em> that gives the probability that the random variable is less than or equal to <em data-effect="italics">x</em>.</p> </div> <div id="fs-idp108664192" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idp125293440">Probability density function (pdf) <em data-effect="italics">f</em>(<em data-effect="italics">x</em>):</p> <ul id="fs-idp125293824"><li><em data-effect="italics">f</em>(<em data-effect="italics">x</em>) ≥ 0</li> <li>The total area under the curve <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) is one.</li> </ul> <p id="fs-idp131806432">Cumulative distribution function (cdf): <em data-effect="italics">P</em>(<em data-effect="italics">X</em> ≤ <em data-effect="italics">x</em>)</p> </div> <div id="fs-idm30315072" class="practice" data-depth="1"><div id="eip-idm97087120" data-type="exercise"><div id="eip-idm97086864" data-type="problem"><p id="eip-idm97086608">Which type of distribution does the graph illustrate?</p> <div id="eip-idm123385824" class="bc-figure figure"><span id="eip-idm123385568" data-type="media" data-alt="The horizontal axis ranges from 0 to 10. The distribution is modeled by a rectangle extending from x = 3 to x = 8."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C05_M01_item001-1.jpg" alt="The horizontal axis ranges from 0 to 10. The distribution is modeled by a rectangle extending from x = 3 to x = 8." 
width="380" data-media-type="image/jpg" /></span></div> </div> <div id="eip-idm180418672" data-type="solution"><p id="eip-idm180418416">Uniform Distribution</p> </div> </div> <div id="eip-idm18256352" data-type="exercise"><div id="eip-idm18256096" data-type="problem"><p id="eip-idm199212800">Which type of distribution does the graph illustrate?</p> <div id="eip-idm199212416" class="bc-figure figure"><span id="eip-idm199212160" data-type="media" data-alt="This graph slopes downward. It begins at a point on the y-axis and approaches the x-axis at the right edge of the graph."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C05_M01_item002-1.jpg" alt="This graph slopes downward. It begins at a point on the y-axis and approaches the x-axis at the right edge of the graph." width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div id="eip-idm137825120" data-type="exercise"><div id="eip-idm143442320" data-type="problem"><p id="eip-idm143442064">Which type of distribution does the graph illustrate?</p> <div id="eip-idm143441680" class="bc-figure figure"><span id="eip-idp45343472" data-type="media" data-alt="This graph shows a bell-shaped graph. The symmetric graph reaches maximum height at x = 0 and slopes downward gradually to the x-axis on each side of the peak."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C05_M01_item003-1.jpg" alt="This graph shows a bell-shaped graph. The symmetric graph reaches maximum height at x = 0 and slopes downward gradually to the x-axis on each side of the peak." width="380" data-media-type="image/jpg" /></span></div> </div> <div id="eip-idm191766752" data-type="solution"><p id="eip-idm70312624">Normal Distribution</p> </div> </div> <div id="eip-idm170433264" data-type="exercise"><div id="eip-idm170433008" data-type="problem"><p id="eip-idm170432752">What does the shaded area represent? 
<em data-effect="italics">P</em>(___&lt; <em data-effect="italics">x</em> &lt; ___)</p> <div id="eip-idp7874608" class="bc-figure figure"><span id="eip-idm79769968" data-type="media" data-alt="This graph shows a uniform distribution. The horizontal axis ranges from 0 to 10. The distribution is modeled by a rectangle extending from x = 1 to x = 8. A region from x = 2 to x = 5 is shaded inside the rectangle."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C05_M01_item004-1.jpg" alt="This graph shows a uniform distribution. The horizontal axis ranges from 0 to 10. The distribution is modeled by a rectangle extending from x = 1 to x = 8. A region from x = 2 to x = 5 is shaded inside the rectangle." width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div id="eip-idm160222016" data-type="exercise"><div id="eip-idm40131600" data-type="problem"><p id="eip-idm40131344">What does the shaded area represent? <em data-effect="italics">P</em>(___&lt; <em data-effect="italics">x</em> &lt; ___)</p> <div id="eip-idm126372688" class="bc-figure figure"><span id="eip-idm126372432" data-type="media" data-alt="This graph shows an exponential distribution. The graph slopes downward. It begins at a point on the y-axis and approaches the x-axis at the right edge of the graph. The region under the graph from x = 6 to x = 7 is shaded."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C05_M01_item005-1.jpg" alt="This graph shows an exponential distribution. The graph slopes downward. It begins at a point on the y-axis and approaches the x-axis at the right edge of the graph. The region under the graph from x = 6 to x = 7 is shaded." 
width="380" data-media-type="image/jpg" /></span></div> </div> <div id="eip-idm164759504" data-type="solution"><p id="eip-idm59247088"><em data-effect="italics">P</em>(6 &lt; <em data-effect="italics">x</em> &lt; 7)</p> </div> </div> <div id="fs-idm118866416" data-type="exercise"><div id="fs-idm94113008" data-type="problem"><p id="fs-idm186932672">For a continuous probability distribution, 0 ≤ <em data-effect="italics">x</em> ≤ 15. What is <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 15)?</p> </div> </div> <div id="fs-idm154014752" data-type="exercise"><div id="fs-idm158384896" data-type="problem"><p id="fs-idm174567168">What is the area under <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) if the function is a continuous probability density function?</p> </div> <div id="fs-idm96621808" data-type="solution"><p id="fs-idm101963728">one</p> </div> </div> <div id="fs-idm159828512" data-type="exercise"><div id="fs-idm28635392" data-type="problem"><p id="fs-idm44266992">For a continuous probability distribution, 0 ≤ <em data-effect="italics">x</em> ≤ 10. What is <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 7)?</p> </div> </div> <div id="fs-idm89415120" data-type="exercise"><div id="fs-idm119661456" data-type="problem"><p id="fs-idm114444880">A <strong>continuous</strong> probability function is restricted to the portion between <em data-effect="italics">x</em> = 0 and <em data-effect="italics">x</em> = 7. What is <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 10)?</p> </div> <div id="fs-idm29287760" data-type="solution"><p id="fs-idm160336672">zero</p> </div> </div> <div id="fs-idm185391792" data-type="exercise"><div id="fs-idm70578720" data-type="problem"><p id="fs-idm83804704"><em data-effect="italics">f</em>(<em data-effect="italics">x</em>) for a continuous probability function is \(\frac{1}{5}\), and the function is restricted to 0 ≤ <em data-effect="italics">x</em> ≤ 5. 
What is <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 0)?</p> </div> </div> <div id="fs-idm78277152" data-type="exercise"><div id="fs-idm119655680" data-type="problem"><p id="fs-idm107695264"><em data-effect="italics">f</em>(<em data-effect="italics">x</em>), a continuous probability function, is equal to \(\frac{1}{12}\), and the function is restricted to 0 ≤ <em data-effect="italics">x</em> ≤ 12. What is <em data-effect="italics">P</em>(0 &lt; <em data-effect="italics">x</em> &lt; 12)?</p> </div> <div id="fs-idm109882608" data-type="solution"><p id="fs-idm185543600">one</p> </div> </div> <div id="fs-idm31115936" data-type="exercise"><div id="fs-idm126442096" data-type="problem"><p id="fs-idm79659008">Find the probability that <em data-effect="italics">x</em> falls in the shaded area.</p> <div id="fs-idm29576816" class="bc-figure figure"><span id="fs-idm168532640" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C05_M02_item001-1.jpg" alt="" width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div id="fs-idm29257104" data-type="exercise"><div id="fs-idm70564992" data-type="problem"><p id="fs-idm5797792">Find the probability that <em data-effect="italics">x</em> falls in the shaded area.</p> <div id="fs-idm120132416" class="bc-figure figure"><span id="fs-idm17039824" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C05_M02_item002-1.jpg" alt="" width="380" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idm192509264" data-type="solution"><p id="fs-idm100002320">0.625</p> </div> </div> <div id="fs-idm144329792" data-type="exercise"><div id="fs-idm109804912" data-type="problem"><p id="fs-idp1777728">Find the probability that <em data-effect="italics">x</em> falls in the shaded area.</p> <div id="fs-idm167436128" class="bc-figure 
figure"><span id="fs-idm163441712" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C05_M02_item003-1.jpg" alt="" width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div id="fs-idm170928704" data-type="exercise"><div id="fs-idm76313888" data-type="problem"><p id="fs-idm73539776"><em data-effect="italics">f</em>(<em data-effect="italics">x</em>), a continuous probability function, is equal to \(\frac{1}{3}\) and the function is restricted to 1 ≤ <em data-effect="italics">x</em> ≤ 4. Describe \(P\left(x&gt;\frac{3}{2}\right).\)</p> </div> <div id="fs-idm63500944" data-type="solution"><p id="fs-idm21180864">The probability is equal to the area from <em data-effect="italics">x</em> = \(\frac{3}{2}\) to <em data-effect="italics">x</em> = 4 above the x-axis and up to <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{3}\).</p> </div> </div> </div> <div id="fs-idp5416000" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <p id="eip-idm57931200"><em data-effect="italics">For each probability and percentile problem, draw the picture.</em></p> <div id="eip-idp84147744" data-type="exercise"><div id="eip-idp84148000" data-type="problem"><p id="eip-idp36661712">1) <span style="font-size: 1em">When age is rounded to the nearest year, do the data stay continuous, or do they become discrete?  Why?</span></p> </div> </div> <div id="eip-idm16922624" data-type="exercise"><div id="eip-idm19380464" data-type="solution"><p>2) Consider the following experiment. You are one of 100 people enlisted to take part in a study to determine the percent of nurses in America with an R.N. (registered nurse) degree.  You ask nurses if they have an R.N. degree.  The nurses answer “yes” or “no.”  You then calculate the percentage of nurses with an R.N. degree.  
You give that percentage to your supervisor.</p> <ol id="eip-idp36662096" type="a"><li>What part of the experiment will yield discrete data?</li> <li>What part of the experiment will yield continuous data?</li> </ol> <p><strong>Answers to odd questions</strong></p> <p>1) Age is a measurement, regardless of the accuracy used.</p> </div> </div> </div> </div></div>
<div class="chapter standard" id="chapter-the-uniform-distribution" title="Chapter 6.3: The Uniform Distribution"><div class="chapter-title-wrap"><h3 class="chapter-number">39</h3><h2 class="chapter-title"><span class="display-none">Chapter 6.3: The Uniform Distribution</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="eip-957">The uniform distribution is a continuous probability distribution and is concerned with events that are equally likely to occur. When working out problems that have a uniform distribution, be careful to note if the data is inclusive or exclusive of endpoints.</p> <div class="textbox textbox--examples" data-type="example"><p>The data in <a class="autogenerated-content" href="#element-41">(Figure)</a> are 55 smiling times, in seconds, of an eight-week-old baby.</p> <table summary=""><tbody><tr><td>10.4</td> <td>19.6</td> <td>18.8</td> <td>13.9</td> <td>17.8</td> <td>16.8</td> <td>21.6</td> <td>17.9</td> <td>12.5</td> <td>11.1</td> <td>4.9</td> </tr> <tr><td>12.8</td> <td>14.8</td> <td>22.8</td> <td>20.0</td> <td>15.9</td> <td>16.3</td> <td>13.4</td> <td>17.1</td> <td>14.5</td> <td>19.0</td> <td>22.8</td> </tr> <tr><td>1.3</td> <td>0.7</td> <td>8.9</td> <td>11.9</td> <td>10.9</td> <td>7.3</td> <td>5.9</td> <td>3.7</td> <td>17.9</td> <td>19.2</td> <td>9.8</td> </tr> <tr><td>5.8</td> <td>6.9</td> <td>2.6</td> <td>5.8</td> <td>21.7</td> <td>11.8</td> <td>3.4</td> <td>2.1</td> <td>4.5</td> <td>6.3</td> <td>10.7</td> </tr> <tr><td>8.9</td> <td>9.4</td> <td>9.4</td> <td>7.6</td> <td>10.0</td> <td>3.3</td> <td>6.7</td> <td>7.8</td> <td>11.6</td> <td>13.8</td> <td>18.6</td> </tr> </tbody> </table> <p>The sample mean = 11.49 and the sample standard deviation = 6.23.</p> <p id="element-60">We will assume that the smiling times, in seconds, follow a uniform distribution between zero and 23 seconds, inclusive. This means that any smiling time from zero to and including 23 seconds is <span data-type="term">equally likely</span>. 
The histogram that could be constructed from the sample is an empirical distribution that closely matches the theoretical uniform distribution.</p> <p>Let <em data-effect="italics">X</em> = length, in seconds, of an eight-week-old baby&#8217;s smile.</p> <p>The notation for the uniform distribution is</p> <p><em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(<em data-effect="italics">a</em>, <em data-effect="italics">b</em>) where <em data-effect="italics">a</em> = the lowest value of <em data-effect="italics">x</em> and <em data-effect="italics">b</em> = the highest value of <em data-effect="italics">x</em>.</p> <p>The probability density function is <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{b-a}\) for <em data-effect="italics">a</em> ≤ <em data-effect="italics">x</em> ≤ <em data-effect="italics">b</em>.</p> <p>For this example, <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(0, 23) and <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{23-0}\) for 0 ≤ <em data-effect="italics">X</em> ≤ 23.</p> <p id="element-771">Formulas for the theoretical mean and standard deviation are</p> <p>\(\mu =\frac{a+b}{2}\) and \(\sigma =\sqrt{\frac{{\left(b-a\right)}^{2}}{12}}\)</p> <p id="element-729">For this problem, the theoretical mean and standard deviation are</p> <p><em data-effect="italics">μ</em> = \(\frac{0\text{ }+\text{ }23}{2}\) = 11.50 seconds and <em data-effect="italics">σ</em> = \(\sqrt{\frac{{\left(23\text{ }-\text{ }0\right)}^{2}}{12}}\) = 6.64 seconds.</p> <p>Notice that the theoretical mean and standard deviation are close to the sample mean and standard deviation in this example.</p> </div> <div id="fs-idp70845248" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp158465216" data-type="exercise"><div id="fs-idp127611104" data-type="problem"><p id="fs-idp116051680">The data that 
follow are the number of passengers on 35 different charter fishing boats. The sample mean = 7.9 and the sample standard deviation = 4.33. The data follow a uniform distribution where all values between and including zero and 14 are equally likely. State the values of <em data-effect="italics">a</em> and <em data-effect="italics">b</em>. Write the distribution in proper notation, and calculate the theoretical mean and standard deviation.</p> <table id="fs-idp22107440" summary=""><colgroup><col data-width="1*" /> <col data-width="1*" /> <col data-width="1*" /> <col data-width="1*" /> <col data-width="1*" /> <col data-width="1*" /> <col data-width="1*" /></colgroup> <tbody><tr><td data-align="center">1</td> <td data-align="center">12</td> <td data-align="center">4</td> <td data-align="center">10</td> <td data-align="center">4</td> <td data-align="center">14</td> <td data-align="center">11</td> </tr> <tr><td data-align="center">7</td> <td data-align="center">11</td> <td data-align="center">4</td> <td data-align="center">13</td> <td data-align="center">2</td> <td data-align="center">4</td> <td data-align="center">6</td> </tr> <tr><td data-align="center">3</td> <td data-align="center">10</td> <td data-align="center">0</td> <td data-align="center">12</td> <td data-align="center">6</td> <td data-align="center">9</td> <td data-align="center">10</td> </tr> <tr><td data-align="center">5</td> <td data-align="center">13</td> <td data-align="center">4</td> <td data-align="center">10</td> <td data-align="center">14</td> <td data-align="center">12</td> <td data-align="center">11</td> </tr> <tr><td data-align="center">6</td> <td data-align="center">10</td> <td data-align="center">11</td> <td data-align="center">0</td> <td data-align="center">11</td> <td data-align="center">13</td> <td data-align="center">2</td> </tr> </tbody> </table> </div> </div> </div> <div id="example-170" class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div id="id10265850" 
data-type="problem"><p>a. Refer to <a class="autogenerated-content" href="#element-229">(Figure)</a>. What is the probability that a randomly chosen eight-week-old baby smiles between two and 18 seconds?</p> </div> <div id="id10265871" data-type="solution"><p id="element-178"><em data-effect="italics">P</em>(2 &lt; <em data-effect="italics">x</em> &lt; 18) = (base)(height) = (18 – 2)\(\left(\frac{1}{23}\right)\) = \(\frac{16}{23}\).</p> <div id="eip-idp133938240" class="bc-figure figure"><span id="id15176560" data-type="media" data-alt="This graph shows a uniform distribution. The horizontal axis ranges from 0 to 23. The distribution is modeled by a rectangle extending from x = 0 to x = 23. A region from x = 2 to x = 18 is shaded inside the rectangle."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch05_03_01N-1.jpg" alt="This graph shows a uniform distribution. The horizontal axis ranges from 0 to 23. The distribution is modeled by a rectangle extending from x = 0 to x = 23. A region from x = 2 to x = 18 is shaded inside the rectangle." width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div id="element-329" data-type="exercise"><div id="id14797719" data-type="problem"><p>b. Find the 90<sup>th</sup> percentile for an eight-week-old baby&#8217;s smiling time.</p> </div> <div id="id14797739" data-type="solution"><p>b. 
Ninety percent of the smiling times fall below the 90<sup>th</sup> percentile, <em data-effect="italics">k</em>, so <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; <em data-effect="italics">k</em>) = 0.90.</p> <p>\(P\left(x&lt;k\right)=0.90\)</p> <p id="element-182">\(\left(\text{base}\right)\left(\text{height}\right)=0.90\)</p> <p>\(\left(k-0\right)\left(\frac{1}{23}\right)=0.90\)</p> <p>\(k=\left(23\right)\left(0.90\right)=20.7\)</p> <div class="wp-caption alignnone" style="width: 487px"><img class="size-medium" src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_03_02N-1.jpg" alt="Shaded area represents" width="487" height="240" /><div class="wp-caption-text">This shows the graph of the function f(x) = 1/23. A horizontal line ranges from the point (0, 1/23) to the point (23, 1/23). A vertical line extends from the x-axis to the end of the line at point (23, 1/23) creating a rectangle. A region is shaded inside the rectangle from x = 0 to x = k. The shaded area represents P(x &lt; k).</div></div> </div> </div> <div id="element-412" data-type="exercise"><div id="id9694925" data-type="problem"><p>c. Find the probability that a random eight-week-old baby smiles more than 12 seconds <strong>KNOWING</strong> that the baby smiles <strong>MORE THAN EIGHT SECONDS</strong>.</p> </div> <div id="id15390803" data-type="solution"><p id="fs-idp106518000">c. This probability question is a <strong>conditional</strong>. You are asked to find the probability that an eight-week-old baby smiles more than 12 seconds when you <strong>already know</strong> the baby has smiled for more than eight seconds.</p> <p id="element-836">Find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 12|<em data-effect="italics">x</em> &gt; 8). There are two ways to do the problem. <strong>For the first way</strong>, use the fact that this is a <strong>conditional</strong>, which changes the sample space. 
The graph illustrates the new sample space. You already know the baby smiled more than eight seconds.</p> <p id="element-837"><strong>Write a new</strong> <em data-effect="italics">f</em>(<em data-effect="italics">x</em>): <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{23\text{ }-\text{ 8}}\) = \(\frac{1}{15}\) for 8 &lt; <em data-effect="italics">x</em> &lt; 23</p> <p><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 12|<em data-effect="italics">x</em> &gt; 8) = (23 − 12)\(\left(\frac{1}{15}\right)\) = \(\frac{11}{15}\)</p> <div id="eip-idm134042368" class="bc-figure figure"><span id="id15318622" data-type="media" data-alt="f(X)=1/15 graph displaying a boxed region consisting of a horizontal line extending to the right from point 1/15 on the y-axis, a vertical upward line from points 8 and 23 on the x-axis, and the x-axis. A shaded region from points 12-23 occurs within this area."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_03_03N-1.jpg" alt="f(X)=1/15 graph displaying a boxed region consisting of a horizontal line extending to the right from point 1/15 on the y-axis, a vertical upward line from points 8 and 23 on the x-axis, and the x-axis. A shaded region from points 12-23 occurs within this area." 
width="380" data-media-type="image/jpg" /></span></div> <p><strong>For the second way</strong>, use the conditional formula from <a href="/contents/326ee2e0-0ccd-46ae-a776-f8857a5dad4c">Probability Topics</a> with the original distribution <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em> (0, 23):</p> <p><em data-effect="italics">P</em>(<em data-effect="italics">A</em>|<em data-effect="italics">B</em>) = \(\frac{P\left(A\text{ AND }B\right)}{P\left(B\right)}\)</p> <p id="fs-idp12091856">For this problem, <em data-effect="italics">A</em> is (<em data-effect="italics">x</em> &gt; 12) and <em data-effect="italics">B</em> is (<em data-effect="italics">x</em> &gt; 8).</p> <p>So, <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 12|<em data-effect="italics">x</em> &gt; 8) = \(\frac{P\left(x&gt;12\text{ AND }x&gt;8\right)}{P\left(x&gt;8\right)}=\frac{P\left(x&gt;12\right)}{P\left(x&gt;8\right)}=\frac{\frac{11}{23}}{\frac{15}{23}}=\frac{11}{15}\)</p> <div id="eip-idm5414416" class="bc-figure figure"><span id="id15928918" data-type="media" data-alt="This diagram shows a horizontal X axis that intersects a vertical F of x axis at the origin. The X axis runs from 0 to 24 while the Y axis only has the fraction one twenty third located about two thirds of the way to the top. A rectangular box extends horizontally from 0 to about 23.7 on the X axis. The box extends vertically up to the fraction one twenty third on the F of x axis. The area of the box between 8 and 12 on the X axis is shaded."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch_05_03_04-1.jpg" alt="This diagram shows a horizontal X axis that intersects a vertical F of x axis at the origin. The X axis runs from 0 to 24 while the Y axis only has the fraction one twenty third located about two thirds of the way to the top. A rectangular box extends horizontally from 0 to about 23.7 on the X axis. 
The box extends vertically up to the fraction one twenty third on the F of x axis. The area of the box between 8 and 12 on the X axis is shaded." width="380" data-media-type="image/jpg" /></span></div> </div> </div> </div> <div id="fs-idp61080304" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp178484384" data-type="exercise"><div id="fs-idp81255648" data-type="problem"><p id="fs-idp10982640">A distribution is given as <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em> (0, 20). What is <em data-effect="italics">P</em>(2 &lt; <em data-effect="italics">x</em> &lt; 18)? Find the 90<sup>th</sup> percentile.</p> </div> </div> </div> <div id="element-158" class="textbox textbox--examples" data-type="example"><p>The amount of time, in minutes, that a person must wait for a bus is uniformly distributed between zero and 15 minutes, inclusive.<span data-type="newline"><br /> </span></p> <div id="element-351" data-type="exercise"><div id="id12074907" data-type="problem"><p id="element-447">a. What is the probability that a person waits fewer than 12.5 minutes?</p> </div> <div id="id12074926" data-type="solution"><p>a. Let <em data-effect="italics">X</em> = the number of minutes a person must wait for a bus. <em data-effect="italics">a</em> = 0 and <em data-effect="italics">b</em> = 15. <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(0, 15). Write the probability density function. <em data-effect="italics">f</em> (<em data-effect="italics">x</em>) = \(\frac{1}{15\text{ }-\text{ }0}\) = \(\frac{1}{15}\) for 0 ≤ <em data-effect="italics">x</em> ≤ 15.</p> <p>Find <em data-effect="italics">P</em> (<em data-effect="italics">x</em> &lt; 12.5). 
Draw a graph.</p> <p>\(P\left(x&lt;12.5\right)=\left(\text{base}\right)\left(\text{height}\right)=\left(12.5-0\right)\left(\frac{1}{15}\right)=0.8333\)</p> <p id="element-748">The probability a person waits less than 12.5 minutes is 0.8333.</p> <div id="eip-idp99340768" class="bc-figure figure"><span id="id17238530" data-type="media" data-alt="This shows the graph of the function f(x) = 1/15. A horizontal line ranges from the point (0, 1/15) to the point (15, 1/15). A vertical line extends from the x-axis to the end of the line at point (15, 1/15) creating a rectangle. A region is shaded inside the rectangle from x = 0 to x = 12.5."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_03_05N-1.jpg" alt="This shows the graph of the function f(x) = 1/15. A horizontal line ranges from the point (0, 1/15) to the point (15, 1/15). A vertical line extends from the x-axis to the end of the line at point (15, 1/15) creating a rectangle. A region is shaded inside the rectangle from x = 0 to x = 12.5." width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div id="element-786" data-type="exercise"><div id="id15267471" data-type="problem"><p id="element-1323">b. On the average, how long must a person wait? Find the mean, <em data-effect="italics">μ</em>, and the standard deviation, <em data-effect="italics">σ</em>.</p> </div> <div id="id15267511" data-type="solution"><p id="element-275">b. <em data-effect="italics">μ</em> = \(\frac{a\text{ }+\text{ }b}{2}\) = \(\frac{15\text{ }+\text{ }0}{2}\) = 7.5. On the average, a person must wait 7.5 minutes. <span data-type="newline"><br /> </span> <span data-type="newline"><br /> </span> <em data-effect="italics">σ</em> = \(\sqrt{\frac{\left(b-a{\right)}^{2}}{12}}=\sqrt{\frac{\left(\mathrm{15}-0{\right)}^{2}}{12}}\) = 4.3. The standard deviation is 4.3 minutes. 
<span data-type="newline"><br /> </span></p> </div> </div> <div data-type="exercise"><div id="id14859813" data-type="problem"><p>c. Ninety percent of the time, the time a person must wait falls below what value?</p> <div id="eip-idm560268464" data-type="note">This asks for the 90<sup>th</sup> percentile.</div> </div> <div id="id15337330" data-type="solution"><p>c. Find the 90<sup>th</sup> percentile. Draw a graph. Let <em data-effect="italics">k</em> = the 90<sup>th</sup> percentile. <span data-type="newline"><br /> </span> <span data-type="newline"><br /> </span>\(P\left(x&lt;k\right)=\left(\text{base}\right)\left(\text{height}\right)=\left(k-0\right)\left(\frac{1}{15}\right)\) <span data-type="newline"><br /> </span> <span data-type="newline"><br /> </span> \(0.90=\left(k\right)\left(\frac{1}{15}\right)\) <span data-type="newline"><br /> </span> <span data-type="newline"><br /> </span>\(k=\left(0.90\right)\left(15\right)=13.5\) <span data-type="newline"><br /> </span> <span data-type="newline"><br /> </span> <em data-effect="italics">k</em> is sometimes called a critical value. <span data-type="newline"><br /> </span> <span data-type="newline"><br /> </span>The 90<sup>th</sup> percentile is 13.5 minutes. Ninety percent of the time, a person must wait at most 13.5 minutes.</p> <div id="eip-idp75195312" class="bc-figure figure"><span id="id16334803" data-type="media" data-alt="f(X)=1/15 graph displaying a boxed region consisting of a horizontal line extending to the right from point 1/15 on the y-axis, a vertical upward line from an arbitrary point on the x-axis, and the x and y-axes. A shaded region from points 0-k occurs within this area. 
The area of this probability region is equal to 0.90."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_03_06N-1.jpg" alt="f(X)=1/15 graph displaying a boxed region consisting of a horizontal line extending to the right from point 1/15 on the y-axis, a vertical upward line from an arbitrary point on the x-axis, and the x and y-axes. A shaded region from points 0-k occurs within this area. The area of this probability region is equal to 0.90." width="380" data-media-type="image/jpg" /></span></div> </div> </div> </div> <div id="fs-idp13010016" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp51205632" data-type="exercise"><div id="fs-idp103521216" data-type="problem"><p id="fs-idp80584752">The total duration of baseball games in the major league in the 2011 season is uniformly distributed between 447 hours and 521 hours inclusive.</p> <ol id="fs-idp80942784" type="a"><li>Find <em data-effect="italics">a</em> and <em data-effect="italics">b</em> and describe what they represent.</li> <li>Write the distribution.</li> <li>Find the mean and the standard deviation.</li> <li>What is the probability that the duration of games for a team for the 2011 season is between 480 and 500 hours?</li> <li>What is the 65<sup>th</sup> percentile for the duration of games for a team for the 2011 season?</li> </ol> </div> </div> </div> <div id="element-321" class="textbox textbox--examples" data-type="example"><p id="element-637">Suppose the time it takes a nine-year old to eat a donut is between 0.5 and 4 minutes, inclusive. Let <em data-effect="italics">X</em> = the time, in minutes, it takes a nine-year old child to eat a donut. 
Then <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em> (0.5, 4).<span data-type="newline"><br /> </span></p> <div id="eip-idp102263568" data-type="exercise" data-label=""><div id="eip-idp95256048" data-type="problem" data-label=""><p id="eip-idp95256304">a. The probability that a randomly selected nine-year old child eats a donut in at least two minutes is _______.</p> </div> <div id="eip-idp95256944" data-type="solution"><p id="eip-idp95257200">a. 0.5714<span data-type="newline"><br /> </span></p> </div> </div> <div id="eip-idp95257840" data-type="exercise"><div id="eip-idp95258096" data-type="problem"><p>b. Find the probability that a different nine-year old child eats a donut in more than two minutes given that the child has already been eating the donut for more than 1.5 minutes.</p> <p id="element-938">The second question has a <span data-type="term">conditional probability</span>. You are asked to find the probability that a nine-year old child eats a donut in more than two minutes given that the child has already been eating the donut for more than 1.5 minutes. Solve the problem two different ways (see <a class="autogenerated-content" href="#element-156">(Figure)</a>). You must reduce the sample space. <strong>First way</strong>: Since you know the child has already been eating the donut for more than 1.5 minutes, you are no longer starting at <em data-effect="italics">a</em> = 0.5 minutes. Your starting point is 1.5 minutes.</p> <p id="element-69"><strong>Write a new <em data-effect="italics">f</em>(<em data-effect="italics">x</em>):</strong></p> <p id="element-269"><em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{4-1.5}\) = \(\frac{2}{5}\) for 1.5 ≤ <em data-effect="italics">x</em> ≤ 4.</p> <p id="eip-idm60988032">Find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 2|<em data-effect="italics">x</em> &gt; 1.5). 
Draw a graph.</p> <div id="eip-idp101580400" class="bc-figure figure"><span id="id13790135" data-type="media" data-alt="f(X)=2/5 graph displaying a boxed region consisting of a horizontal line extending to the right from point 2/5 on the y-axis, a vertical upward line from points 1.5 and 4 on the x-axis, and the x-axis. A shaded region from points 2-4 occurs within this area."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_03_07N-1.jpg" alt="f(X)=2/5 graph displaying a boxed region consisting of a horizontal line extending to the right from point 2/5 on the y-axis, a vertical upward line from points 1.5 and 4 on the x-axis, and the x-axis. A shaded region from points 2-4 occurs within this area." width="380" data-media-type="image/jpg" /></span></div> <p id="eip-idm48182144"><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; <em data-effect="italics">2</em>|<em data-effect="italics">x</em> &gt; 1.5) = (base)(new height) = (4 − 2)\(\left(\frac{2}{5}\right)=\frac{4}{5}\)</p> </div> <div id="eip-idp15119952" data-type="solution"><p id="eip-idp15120208">b. \(\frac{4}{5}\)</p> </div> </div> <p id="eip-idm70960944">The probability that a nine-year old child eats a donut in more than two minutes given that the child has already been eating the donut for more than 1.5 minutes is \(\frac{4}{5}\).</p> <p id="eip-idm39713200"><strong>Second way:</strong> Draw the original graph for <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em> (0.5, 4). 
Use the conditional formula</p> <p id="eip-idm39711312"><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 2|<em data-effect="italics">x</em> &gt; 1.5) = \( \frac{P\left(x&gt;2\text{ AND }x&gt;1.5\right)}{P\left(x&gt;\text{1}\text{.5}\right)}=\frac{P\left(x&gt;2\right)}{P\left(x&gt;1.5\right)}=\frac{\frac{2}{3.5}}{\frac{2.5}{3.5}}=\text{0}\text{.8}=\frac{4}{5}\)</p> </div> <div id="fs-idp42235264" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp177663840" data-type="exercise"><div id="fs-idp7455808" data-type="problem"><p id="fs-idp125513712">Suppose the time it takes a student to finish a quiz is uniformly distributed between six and 15 minutes, inclusive. Let <em data-effect="italics">X</em> = the time, in minutes, it takes a student to finish a quiz. Then <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em> (6, 15).</p> <p id="fs-idp133097024">Find the probability that a randomly selected student needs at least eight minutes to complete the quiz. Then find the probability that a different student needs at least eight minutes to finish the quiz given that she has already taken more than seven minutes.</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p id="eip-id1170213489898">Ace Heating and Air Conditioning Service finds that the amount of time a repairman needs to fix a furnace is uniformly distributed between 1.5 and four hours. Let <em data-effect="italics">x</em> = the time needed to fix a furnace. 
Then <em data-effect="italics">x</em> ~ <em data-effect="italics">U</em> (1.5, 4).</p> <div id="eip-idm8252240" data-type="exercise" data-label=""><div id="eip-idp126490560" data-type="problem" data-label=""><ol id="eip-id1170199222063" type="a"><li>Find the probability that a randomly selected furnace repair requires more than two hours.</li> <li>Find the probability that a randomly selected furnace repair requires less than three hours.</li> <li>Find the 30<sup>th</sup> percentile of furnace repair times.</li> <li>The longest 25% of furnace repair times take at least how long? (In other words: find the minimum time for the longest 25% of repair times.) What percentile does this represent?</li> <li>Find the mean and standard deviation</li> </ol> </div> <div id="eip-idp14636112" data-type="solution"><p id="fs-idm59847232">a. To find <em data-effect="italics">f</em>(<em data-effect="italics">x</em>): <em data-effect="italics">f</em> (<em data-effect="italics">x</em>) = \(\frac{1}{4\text{ }-\text{ }1.5}\) = \(\frac{1}{2.5}\) so <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = 0.4</p> <p id="fs-idm19787408"><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 2) = (base)(height) = (4 – 2)(0.4) = 0.8</p> <div id="fs-idp54235200" class="bc-figure figure"><div class="bc-figcaption figcaption">Uniform Distribution between 1.5 and four with shaded area between two and four representing the probability that the repair time <em data-effect="italics">x</em> is greater than two</div> <p><span id="fs-idm38845872" data-type="media" data-alt="This shows the graph of the function f(x) = 0.4. A horiztonal line ranges from the point (1.5, 0.4) to the point (4, 0.4). Vertical lines extend from the x-axis to the graph at x = 1.5 and x = 4 creating a rectangle. 
A region is shaded inside the rectangle from x = 2 to x = 4."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_03_08N-1.jpg" alt="This shows the graph of the function f(x) = 0.4. A horiztonal line ranges from the point (1.5, 0.4) to the point (4, 0.4). Vertical lines extend from the x-axis to the graph at x = 1.5 and x = 4 creating a rectangle. A region is shaded inside the rectangle from x = 2 to x = 4." width="380" data-media-type="image/jpeg" /></span></p> </div> </div> <div id="eip-idm39955136" data-type="solution"><p id="fs-idp16376400">b. <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 3) = (base)(height) = (3 – 1.5)(0.4) = 0.6</p> <p id="fs-idm70722128">The graph of the rectangle showing the entire distribution would remain the same. However the graph should be shaded between <em data-effect="italics">x</em> = 1.5 and <em data-effect="italics">x</em> = 3. Note that the shaded area starts at <em data-effect="italics">x</em> = 1.5 rather than at <em data-effect="italics">x</em> = 0; since <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em> (1.5, 4), <em data-effect="italics">x</em> can not be less than 1.5.</p> <div id="fs-idm70679104" class="bc-figure figure"><div class="bc-figcaption figcaption">Uniform Distribution between 1.5 and four with shaded area between 1.5 and three representing the probability that the repair time <em data-effect="italics">x</em> is less than three</div> <p><span id="fs-idm6304128" data-type="media" data-alt="This shows the graph of the function f(x) = 0.4. A horiztonal line ranges from the point (1.5, 0.4) to the point (4, 0.4). Vertical lines extend from the x-axis to the graph at x = 1.5 and x = 4 creating a rectangle. 
A region is shaded inside the rectangle from x = 1.5 to x = 3."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_03_09N-1.jpg" alt="This shows the graph of the function f(x) = 0.4. A horiztonal line ranges from the point (1.5, 0.4) to the point (4, 0.4). Vertical lines extend from the x-axis to the graph at x = 1.5 and x = 4 creating a rectangle. A region is shaded inside the rectangle from x = 1.5 to x = 3." width="380" data-media-type="image/jpeg" /></span></p> </div> </div> <div id="eip-idm19525808" data-type="solution"><p id="fs-idp138350704">c.</p> <div id="figure-03" class="bc-figure figure"><div class="bc-figcaption figcaption">Uniform Distribution between 1.5 and 4 with an area of 0.30 shaded to the left, representing the shortest 30% of repair times.</div> <div class="wp-caption alignnone" style="width: 487px"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_03_10N-1.jpg" alt="Shaded Area P(x&lt;k)" width="487" height="240" /><div class="wp-caption-text">This shows the graph of the function f(x) = 0.4. A horiztonal line ranges from the point (1.5, 0.4) to the point (4, 0.4). Vertical lines extend from the x-axis to the graph at x = 1.5 and x = 4 creating a rectangle. A region is shaded inside the rectangle from x = 1.5 to x = k. 
The shaded area represents P(x &lt; k)</div></div> </div> <p id="fs-idm20420640"><span data-type="newline"><br /> </span><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; <em data-effect="italics">k</em>) = 0.30 <span data-type="newline"><br /> </span> <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; <em data-effect="italics">k</em>) = (base)(height) = (<em data-effect="italics">k</em> – 1.5)(0.4) <span data-type="newline"><br /> </span><strong>0.3 = (<em data-effect="italics">k</em> – 1.5)(0.4)</strong>; solve to find <em data-effect="italics">k</em>: <span data-type="newline"><br /> </span>0.75 = <em data-effect="italics">k</em> – 1.5, obtained by dividing both sides by 0.4 <span data-type="newline"><br /> </span><strong><em data-effect="italics">k</em> = 2.25</strong>, obtained by adding 1.5 to both sides <span data-type="newline"><br /> </span>The 30<sup>th</sup> percentile of repair times is 2.25 hours. 30% of repair times are 2.25 hours or less.</p> </div> <div id="eip-idp7633120" data-type="solution"><p id="fs-idm25221984">d.</p> <div id="fs-idm51742384" class="bc-figure figure"><div class="bc-figcaption figcaption">Uniform Distribution between 1.5 and 4 with an area of 0.25 shaded to the right representing the longest 25% of repair times.</div> <p><span id="fs-idp77501376" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_03_11N-1.jpg" alt="" width="380" data-media-type="image/jpeg" /></span></p> </div> <p id="fs-idp80199408"><span data-type="newline"><br /> </span><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; <em data-effect="italics">k</em>) = 0.25 <span data-type="newline"><br /> </span> <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; <em data-effect="italics">k</em>) = (base)(height) = (4 – <em data-effect="italics">k</em>)(0.4) <span data-type="newline"><br /> 
</span><strong>0.25 = (4 – <em data-effect="italics">k</em>)(0.4)</strong>; Solve for <em data-effect="italics">k</em>: <span data-type="newline"><br /> </span>0.625 = 4 − <em data-effect="italics">k</em>, <span data-type="newline"><br /> </span>obtained by dividing both sides by 0.4 <span data-type="newline"><br /> </span>−3.375 = −<em data-effect="italics">k</em>, <span data-type="newline"><br /> </span>obtained by subtracting four from both sides: <strong><em data-effect="italics">k</em> = 3.375</strong> <span data-type="newline"><br /> </span>The longest 25% of furnace repairs take at least 3.375 hours (3.375 hours or longer). <span data-type="newline"><br /> </span><strong>Note:</strong> Since 25% of repair times are 3.375 hours or longer, that means that 75% of repair times are 3.375 hours or less. 3.375 hours is the <strong>75<sup>th</sup> percentile</strong> of furnace repair times.</p> </div> <div id="eip-idm5972976" data-type="solution"><p id="fs-idp136143312">e. \(\mu =\frac{a+b}{2}\) and \(\sigma =\sqrt{\frac{{\left(b-a\right)}^{2}}{12}}\) <span data-type="newline"><br /> </span>\(\mu =\frac{1.5+4}{2}=2.75\) hours and \(\sigma =\sqrt{\frac{{\left(4-1.5\right)}^{2}}{12}}=0.7217\) hours</p> </div> </div> </div> <div id="fs-idm9097392" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp8673008" data-type="exercise"><div id="fs-idp161676800" data-type="problem"><p id="fs-idp28489760">The amount of time a service technician needs to change the oil in a car is uniformly distributed between 11 and 21 minutes. Let <em data-effect="italics">X</em> = the time needed to change the oil on a car.</p> <ol id="fs-idp35516576" type="a"><li>Write the random variable <em data-effect="italics">X</em> in words. 
<em data-effect="italics">X</em> = __________________.</li> <li>Write the distribution.</li> <li>Graph the distribution.</li> <li>Find <em data-effect="italics">P</em> (<em data-effect="italics">x</em> &gt; 19).</li> <li>Find the 50<sup>th</sup> percentile.</li> </ol> </div> </div> </div> <div id="fs-idp87344528" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp24264368">If <em data-effect="italics">X</em> has a uniform distribution where <em data-effect="italics">a</em> &lt; <em data-effect="italics">x</em> &lt; <em data-effect="italics">b</em> or <em data-effect="italics">a</em> ≤ <em data-effect="italics">x</em> ≤ <em data-effect="italics">b</em>, then <em data-effect="italics">X</em> takes on values between <em data-effect="italics">a</em> and <em data-effect="italics">b</em> (may include <em data-effect="italics">a</em> and <em data-effect="italics">b</em>). All values <em data-effect="italics">x</em> are equally likely. We write <em data-effect="italics">X</em> ∼ <em data-effect="italics">U</em>(<em data-effect="italics">a</em>, <em data-effect="italics">b</em>). The mean of <em data-effect="italics">X</em> is \(\mu =\frac{a+b}{2}\). The standard deviation of <em data-effect="italics">X</em> is \(\sigma =\sqrt{\frac{{\left(b-a\right)}^{2}}{12}}\). The probability density function of <em data-effect="italics">X</em> is \(f\left(x\right)=\frac{1}{b-a}\) for <em data-effect="italics">a</em> ≤ <em data-effect="italics">x</em> ≤ <em data-effect="italics">b</em>. The cumulative distribution function of <em data-effect="italics">X</em> is <em data-effect="italics">P</em>(<em data-effect="italics">X</em> ≤ <em data-effect="italics">x</em>) = \(\frac{x-a}{b-a}\). <em data-effect="italics">X</em> is continuous.</p> <div id="fs-idp30139216" class="bc-figure figure"><span id="fs-idp10170496" data-type="media" data-alt="The graph shows a rectangle with total area equal to 1. 
The rectangle extends from x = a to x = b on the x-axis and has a height of 1/(b-a)." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C05_M03_001N-1.jpg" alt="The graph shows a rectangle with total area equal to 1. The rectangle extends from x = a to x = b on the x-axis and has a height of 1/(b-a)." width="380" data-media-type="image/jpg" /></span></div> <p id="fs-idm34434992">The probability <em data-effect="italics">P</em>(<em data-effect="italics">c</em> &lt; <em data-effect="italics">X</em> &lt; <em data-effect="italics">d</em>) may be found by computing the area under <em data-effect="italics">f</em>(<em data-effect="italics">x</em>), between <em data-effect="italics">c</em> and <em data-effect="italics">d</em>. Since the corresponding area is a rectangle, the area may be found simply by multiplying the width and the height.</p> </div> <div id="fs-idp29641328" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p><em data-effect="italics">X</em> = a real number between <em data-effect="italics">a</em> and <em data-effect="italics">b</em> (in some instances, <em data-effect="italics">X</em> can take on the values <em data-effect="italics">a</em> and <em data-effect="italics">b</em>). 
<em data-effect="italics">a</em> = smallest <em data-effect="italics">X</em>; <em data-effect="italics">b</em> = largest <em data-effect="italics">X</em></p> <p id="element-912"><em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(<em data-effect="italics">a</em>, <em data-effect="italics">b</em>)</p> <p>The mean is \(\mu =\frac{a+b}{2}\)</p> <p>The standard deviation is \(\sigma =\sqrt{\frac{{\left(b-a\right)}^{2}}{12}}\)</p> <p><strong>Probability density function:</strong> \(f\left(x\right)=\frac{1}{b-a}\) for \(a\le x\le b\)</p> <p><strong>Area to the Left of <em data-effect="italics">x</em>:</strong> <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">x</em>) = (<em data-effect="italics">x</em> – <em data-effect="italics">a</em>)\(\left(\frac{1}{b-a}\right)\)</p> <p><strong>Area to the Right of <em data-effect="italics">x</em>:</strong> <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &gt; <em data-effect="italics">x</em>) = (<em data-effect="italics">b</em> – <em data-effect="italics">x</em>)\(\left(\frac{1}{b-a}\right)\)</p> <p id="element-315"><strong>Area Between <em data-effect="italics">c</em> and <em data-effect="italics">d</em>:</strong> <em data-effect="italics">P</em>(<em data-effect="italics">c</em> &lt; <em data-effect="italics">x</em> &lt; <em data-effect="italics">d</em>) = (base)(height) = (<em data-effect="italics">d</em> – <em data-effect="italics">c</em>)\(\left(\frac{1}{b-a}\right)\)</p> <p id="fs-idp10903536">Uniform: <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(<em data-effect="italics">a</em>, <em data-effect="italics">b</em>) where <em data-effect="italics">a</em> &lt; <em data-effect="italics">x</em> &lt; <em data-effect="italics">b</em></p> <ul id="fs-idm1322784"><li>pdf: \(f\left(x\right)=\frac{1}{b-a}\) for <em data-effect="italics">a ≤ x ≤ b</em></li> <li>cdf: <em data-effect="italics">P</em>(<em data-effect="italics">X</em> ≤ <em data-effect="italics">x</em>) = \(\frac{x-a}{b-a}\)</li> 
<li>mean <em data-effect="italics">µ</em> = \(\frac{a+b}{2}\)</li> <li>standard deviation <em data-effect="italics">σ</em> \(=\sqrt{\frac{{\left(b-a\right)}^{2}}{12}}\)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">c</em> &lt; <em data-effect="italics">X</em> &lt; <em data-effect="italics">d</em>) = (<em data-effect="italics">d</em> – <em data-effect="italics">c</em>)\(\left(\frac{1}{b-a}\right)\)</li> </ul> </div> <div id="eip-534" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p>McDougall, John A. The McDougall Program for Maximum Weight Loss. Plume, 1995.</p> </div> <div id="fs-idp174729104" class="practice" data-depth="1"><p id="fs-idp8065584"><em data-effect="italics">Use the following information to answer the next ten questions.</em> The data that follow are the square footage (in thousands of square feet) of 28 homes.</p> <table id="fs-idp58203216" summary=""><tbody><tr><td>1.5</td> <td>2.4</td> <td>3.6</td> <td>2.6</td> <td>1.6</td> <td>2.4</td> <td>2.0</td> </tr> <tr><td>3.5</td> <td>2.5</td> <td>1.8</td> <td>2.4</td> <td>2.5</td> <td>3.5</td> <td>4.0</td> </tr> <tr><td>2.6</td> <td>1.6</td> <td>2.2</td> <td>1.8</td> <td>3.8</td> <td>2.5</td> <td>1.5</td> </tr> <tr><td>2.8</td> <td>1.8</td> <td>4.5</td> <td>1.9</td> <td>1.9</td> <td>3.1</td> <td>1.6</td> </tr> </tbody> </table> <p id="fs-idp126502144">The sample mean = 2.50 and the sample standard deviation = 0.8302.</p> <p id="fs-idp1739824">The distribution can be written as <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(1.5, 4.5).</p> <div id="fs-idp4769248" data-type="exercise"><div id="fs-idp79412928" data-type="problem"><p id="fs-idp146904144">What type of distribution is this?</p> </div> </div> <div id="fs-idm13169200" data-type="exercise"><div id="fs-idp172811088" data-type="problem"><p id="fs-idp55674304">In this distribution, outcomes are equally likely. 
What does this mean?</p> </div> <div id="fs-idm10294640" data-type="solution"><p id="fs-idp50465824">It means that the value of <em data-effect="italics">x</em> is just as likely to be any number between 1.5 and 4.5.</p> </div> </div> <div id="fs-idp93850064" data-type="exercise"><div id="fs-idp82135232" data-type="problem"><p id="fs-idp115922160">What is the height of <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) for the continuous probability distribution?</p> </div> </div> <div id="fs-idm8575520" data-type="exercise"><div id="fs-idp9993056" data-type="problem"><p id="fs-idp31246736">What are the constraints for the values of <em data-effect="italics">x</em>?</p> </div> <div id="fs-idp4394640" data-type="solution"><p id="fs-idp85381264">1.5 ≤ <em data-effect="italics">x</em> ≤ 4.5</p> </div> </div> <div id="fs-idp126140448" data-type="exercise"><div id="fs-idp57142400" data-type="problem"><p id="fs-idp47973200">Graph <em data-effect="italics">P</em>(2 &lt; <em data-effect="italics">x</em> &lt; 3).</p> </div> </div> <div id="fs-idp86093264" data-type="exercise"><div id="fs-idm3718720" data-type="problem"><p id="fs-idp73360384">What is <em data-effect="italics">P</em>(2 &lt; <em data-effect="italics">x</em> &lt; 3)?</p> </div> <div id="fs-idm62060864" data-type="solution"><p id="fs-idp178430944">0.3333</p> </div> </div> <div id="fs-idm8617328" data-type="exercise"><div id="fs-idp51075952" data-type="problem"><p id="fs-idp7137536">What is <em data-effect="italics">P</em>(x &lt; 3.5| <em data-effect="italics">x</em> &lt; 4)?</p> </div> </div> <div id="fs-idp159814864" data-type="exercise"><div id="fs-idm12204192" data-type="problem"><p id="fs-idp14507456">What is <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 1.5)?</p> </div> <div id="fs-idp157025920" data-type="solution"><p id="fs-idp60507184">zero</p> </div> </div> <div id="fs-idp82566640" data-type="exercise"><div id="fs-idm63308704" data-type="problem"><p 
id="fs-idp130119312">What is the 90<sup>th</sup> percentile of square footage for homes?</p> </div> </div> <div id="fs-idp64873328" data-type="exercise"><div id="fs-idm2051616" data-type="problem"><p id="fs-idp53438752">Find the probability that a randomly selected home has more than 3,000 square feet given that you already know the house has more than 2,000 square feet.</p> </div> <div id="fs-idp127353472" data-type="solution"><p id="fs-idp131464880">0.6</p> </div> </div> <p id="eip-358"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next eight exercises.</em> A distribution is given as <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(0, 12).</p> <div id="fs-idm886512" data-type="exercise"><div id="fs-idp22582240" data-type="problem"><p id="fs-idp64911856">What is <em data-effect="italics">a</em>? What does it represent?</p> </div> </div> <div id="fs-idp67116592" data-type="exercise"><div id="fs-idp4083728" data-type="problem"><p id="fs-idp169265984">What is <em data-effect="italics">b</em>? 
What does it represent?</p> </div> <div id="fs-idm9759664" data-type="solution"><p id="fs-idp91297792"><em data-effect="italics">b</em> is 12, and it represents the highest value of <em data-effect="italics">x</em>.</p> </div> </div> <div id="fs-idp93994000" data-type="exercise"><div id="fs-idp50844848" data-type="problem"><p id="fs-idp1142320">What is the probability density function?</p> </div> </div> <div id="fs-idp104926176" data-type="exercise"><div id="fs-idp1313680" data-type="problem"><p id="fs-idp57939600">What is the theoretical mean?</p> </div> <div id="fs-idp80682848" data-type="solution"><p id="fs-idp5499456">six</p> </div> </div> <div id="fs-idp97956800" data-type="exercise"><div id="fs-idp31731472" data-type="problem"><p id="fs-idp90613376">What is the theoretical standard deviation?</p> </div> </div> <div id="fs-idp172321904" data-type="exercise"><div id="fs-idp7677536" data-type="problem"><p id="fs-idp81592784">Draw the graph of the distribution for <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 9).</p> </div> <div id="fs-idp67560352" data-type="solution"><div id="fs-idp140107840" class="bc-figure figure"><span id="fs-idp124624384" data-type="media" data-alt="This graph shows a uniform distribution. The horizontal axis ranges from 0 to 12. The distribution is modeled by a rectangle extending from x = 0 to x = 12. A region from x = 9 to x = 12 is shaded inside the rectangle."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C05_M03_item002annoN-1.jpg" alt="This graph shows a uniform distribution. The horizontal axis ranges from 0 to 12. The distribution is modeled by a rectangle extending from x = 0 to x = 12. A region from x = 9 to x = 12 is shaded inside the rectangle." 
width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div id="fs-idp49233616" data-type="exercise"><div id="fs-idp8048544" data-type="problem"><p id="fs-idp16092816">Find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 9).</p> </div> </div> <div id="fs-idp16246272" data-type="exercise"><div id="fs-idp5135424" data-type="problem"><p id="fs-idp66487744">Find the 40<sup>th</sup> percentile.</p> </div> <div id="fs-idp7986272" data-type="solution"><p id="fs-idp127620736">4.8</p> </div> </div> <p><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next eleven exercises.</em> The age of cars in the staff parking lot of a suburban college is uniformly distributed from six months (0.5 years) to 9.5 years.</p> <div id="fs-idp73318288" data-type="exercise"><div id="fs-idp14811216" data-type="problem"><p id="fs-idp52562736">What is being measured here?</p> </div> </div> <div id="fs-idp109888" data-type="exercise"><div id="fs-idm2561792" data-type="problem"><p id="fs-idp100044096">In words, define the random variable <em data-effect="italics">X</em>.</p> </div> <div id="fs-idp170212480" data-type="solution"><p id="fs-idp168011792"><em data-effect="italics">X</em> = The age (in years) of cars in the staff parking lot</p> </div> </div> <div id="fs-idp80642336" data-type="exercise"><div id="fs-idp48682736" data-type="problem"><p id="fs-idp72776400">Are the data discrete or continuous?</p> </div> </div> <div id="fs-idp74722352" data-type="exercise"><div id="fs-idp95338368" data-type="problem"><p id="fs-idp80394720">The interval of values for <em data-effect="italics">x</em> is ______.</p> </div> <div id="fs-idp91820176" data-type="solution"><p id="fs-idp51959344">0.5 to 9.5</p> </div> </div> <div id="fs-idp64667664" data-type="exercise"><div id="fs-idp29461104" data-type="problem"><p id="fs-idp10189264">The distribution for <em data-effect="italics">X</em> is ______.</p> </div> </div> 
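<p>The area formulas from the Formula Review can be checked numerically. The following is a minimal sketch, not part of the original text, written in Python with only the standard library. The function names <code>uniform_summary</code>, <code>prob_between</code>, and <code>percentile</code> are illustrative assumptions. It reuses the furnace-repair example <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(1.5, 4), whose answers appear in the worked solutions earlier in this section, so the printed values can be compared with them.</p>

```python
import math


def uniform_summary(a, b):
    """Return (height, mean, standard deviation) for X ~ U(a, b)."""
    if not b > a:
        raise ValueError("b must be greater than a")
    width = b - a
    return 1 / width, (a + b) / 2, width / math.sqrt(12)


def prob_between(a, b, c, d):
    """Probability that X ~ U(a, b) lies between c and d, as (base)(height).

    c and d are clipped to the interval from a to b, because the density
    is zero outside that interval.
    """
    lo = max(a, min(c, b))
    hi = max(a, min(d, b))
    return max(hi - lo, 0) / (b - a)


def percentile(a, b, p):
    """Value k such that the area to the left of k equals p (p from 0 to 1)."""
    return a + p * (b - a)


# Furnace-repair example from the text: X ~ U(1.5, 4)
height, mu, sigma = uniform_summary(1.5, 4)
print(round(height, 4), round(mu, 4), round(sigma, 4))
print(round(prob_between(1.5, 4, 1.5, 3), 4))  # probability that x is less than 3
print(round(percentile(1.5, 4, 0.30), 4))      # 30th percentile
print(round(percentile(1.5, 4, 0.75), 4))      # 75th percentile
```

<p>A conditional probability such as "given that <em data-effect="italics">x</em> is greater than some value" can be checked with the same sketch by calling <code>prob_between</code> with <em data-effect="italics">a</em> replaced by that value, which mirrors redrawing the rectangle on the shortened interval.</p>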
<div id="fs-idp54832288" data-type="exercise"><div id="fs-idp25932272" data-type="problem"><p id="fs-idp25932528">Write the probability density function.</p> </div> <div id="fs-idp25933040" data-type="solution"><p id="fs-idp25933296"><em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{9}\) where <em data-effect="italics">x</em> is between 0.5 and 9.5, inclusive.</p> </div> </div> <div id="fs-idp50471968" data-type="exercise"><div id="fs-idp138157168" data-type="problem"><p id="fs-idp58117856">Graph the probability distribution.</p> <ol type="a"><li>Sketch the graph of the probability distribution. <div id="element-123987" class="bc-figure figure"><span id="id7261010" data-type="media" data-alt="This is a blank graph template. The vertical and horizontal axes are unlabeled."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_06_01N-1.jpg" alt="This is a blank graph template. The vertical and horizontal axes are unlabeled." 
width="380" data-media-type="jpg/png" /></span></div> </li> <li>Identify the following values: <ol id="element-123321" type="i"><li>Lowest value for <em data-effect="italics">x</em>: _______</li> <li>Highest value for <em data-effect="italics">x</em>: _______</li> <li>Height of the rectangle: _______</li> <li>Label for <em data-effect="italics">x</em>-axis (words): _______</li> <li>Label for <em data-effect="italics">y</em>-axis (words): _______</li> </ol> </li> </ol> </div> </div> <div id="eip-97" data-type="exercise"><div id="fs-idp112317152" data-type="problem"><p id="fs-idp112317408">Find the average age of the cars in the lot.</p> </div> <div id="fs-idp92109904" data-type="solution"><p id="fs-idp92110160"><em data-effect="italics">μ</em> = 5</p> </div> </div> <div id="fs-idp4577984" data-type="exercise"><div id="fs-idp129225056" data-type="problem"><p id="fs-idp155781392">Find the probability that a randomly chosen car in the lot was less than four years old.</p> <ol id="element-42230" type="a"><li>Sketch the graph, and shade the area of interest. <div id="element-12987" class="bc-figure figure"><span id="id13706334" data-type="media" data-alt="Blank graph with vertical and horizontal axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_06_02-1.png" alt="Blank graph with vertical and horizontal axes." width="380" data-media-type="png/jpg" /></span></div> </li> <li>Find the probability. <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 4) = _______</li> </ol> </div> </div> <div id="eip-709" data-type="exercise"><div id="fs-idp138916000" data-type="problem"><p id="fs-idp87748944">Considering only the cars less than 7.5 years old, find the probability that a randomly chosen car in the lot was less than four years old.</p> <ol id="element-40030" type="a"><li>Sketch the graph, shade the area of interest. 
<div id="element-10987" class="bc-figure figure"><span id="id15551200" data-type="media" data-alt="This is a blank graph template. The vertical and horizontal axes are unlabeled."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_06_02-1.png" alt="This is a blank graph template. The vertical and horizontal axes are unlabeled." width="380" data-media-type="png/jpg" /></span></div> </li> <li>Find the probability. <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 4|<em data-effect="italics">x</em> &lt; 7.5) = _______</li> </ol> </div> <div id="fs-idp55294432" data-type="solution"><ol id="element-200212" type="a"><li>Check student’s solution.</li> <li>\(\frac{3.5}{7}\)</li> </ol> </div> </div> <div id="eip-174" data-type="exercise"><div id="fs-idp171499040" data-type="problem"><p>What has changed in the previous two problems that made the solutions different?</p> </div> </div> <div data-type="exercise"><div id="fs-idp152836592" data-type="problem"><p id="fs-idp23265776">Find the third quartile of ages of cars in the lot. This means you will have to find the value such that \(\frac{3}{4}\), or 75%, of the cars are at most (less than or equal to) that age.</p> <ol id="element-09898" type="a"><li>Sketch the graph, and shade the area of interest. <div id="element-101987" class="bc-figure figure"><span id="id11206333" data-type="media" data-alt="Blank graph with vertical and horizontal axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_06_02-1.png" alt="Blank graph with vertical and horizontal axes." 
width="380" data-media-type="png/jpg" /></span></div> </li> <li>Find the value <em data-effect="italics">k</em> such that <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; <em data-effect="italics">k</em>) = 0.75.</li> <li>The third quartile is _______</li> </ol> </div> <div id="fs-idp167681280" data-type="solution"><ol id="element-12398" type="a"><li>Check student&#8217;s solution.</li> <li><em data-effect="italics">k</em> = 7.25</li> <li>7.25</li> </ol> </div> </div> </div> <div id="fs-idp64788688" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <p><em data-effect="italics">For each probability and percentile problem, draw the picture.</em></p> <div id="fs-idp169878048" data-type="exercise"><div id="fs-idp81982224" data-type="problem"><p id="fs-idp81982480">1) A random number generator picks a number from one to nine in a uniform manner.</p> <ol id="fs-idp176704368" type="a"><li><em data-effect="italics">X</em> ~ _________</li> <li>Graph the probability distribution.</li> <li><em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = _________</li> <li><em data-effect="italics">μ</em> = _________</li> <li><em data-effect="italics">σ</em> = _________</li> <li><em data-effect="italics">P</em>(3.5 &lt; <em data-effect="italics">x</em> &lt; 7.25) = _________</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 5.67)</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 5|<em data-effect="italics">x</em> &gt; 3) = _________</li> <li>Find the 90<sup>th</sup> percentile.</li> </ol> </div> <div id="eip-idp101962176" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="fs-idp69073824" data-type="problem"><p id="fs-idp126144656">2) According to a study by Dr. John McDougall of his live-in weight loss program, the people who follow his program lose between six and 15 pounds a month until they approach trim body weight. 
Let’s suppose that the weight loss is uniformly distributed. We are interested in the weight loss of a randomly selected individual following the program for one month.</p> <ol id="fs-idp80387648" type="a"><li>Define the random variable. <em data-effect="italics">X</em> = _________</li> <li><em data-effect="italics">X</em> ~ _________</li> <li>Graph the probability distribution.</li> <li><em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = _________</li> <li><em data-effect="italics">μ</em> = _________</li> <li><em data-effect="italics">σ</em> = _________</li> <li>Find the probability that the individual lost more than ten pounds in a month.</li> <li>Suppose it is known that the individual lost more than ten pounds in a month. Find the probability that he lost less than 12 pounds in the month.</li> <li><em data-effect="italics">P</em>(7 &lt; <em data-effect="italics">x</em> &lt; 13|<em data-effect="italics">x</em> &gt; 9) = __________. State this in a probability question, similarly to parts g and h, draw the picture, and find the probability.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="eip-106" data-type="exercise"><div id="fs-idp116073984" data-type="problem"><p id="fs-idp116074240">3) A subway train arrives every eight minutes during rush hour. We are interested in the length of time a commuter must wait for a train to arrive. The time follows a uniform distribution.</p> <ol id="fs-idp85264" type="a"><li>Define the random variable. 
<em data-effect="italics">X</em> = _______</li> <li><em data-effect="italics">X</em> ~ _______</li> <li>Graph the probability distribution.</li> <li><em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = _______</li> <li><em data-effect="italics">μ</em> = _______</li> <li><em data-effect="italics">σ</em> = _______</li> <li>Find the probability that the commuter waits less than one minute.</li> <li>Find the probability that the commuter waits between three and four minutes.</li> <li>Sixty percent of commuters wait more than how long for the train? State this in a probability question, similarly to parts g and h, draw the picture, and find the probability.</li> </ol> <p>&nbsp;</p> </div> <div id="fs-idp53345328" data-type="solution"></div> </div> <div id="eip-819" data-type="exercise"><div id="fs-idp174184320" data-type="problem"><p id="fs-idp125382704">4) The age of a first grader on September 1 at Garden Elementary School is uniformly distributed from 5.8 to 6.8 years. We randomly select one first grader from the class.</p> <ol id="fs-idp137837776" type="a"><li>Define the random variable. <em data-effect="italics">X</em> = _________</li> <li><em data-effect="italics">X</em> ~ _________</li> <li>Graph the probability distribution.</li> <li><em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = _________</li> <li><em data-effect="italics">μ</em> = _________</li> <li><em data-effect="italics">σ</em> = _________</li> <li>Find the probability that she is over 6.5 years old.</li> <li>Find the probability that she is between four and six years old.</li> <li>Find the 70<sup>th</sup> percentile for the age of first graders on September 1 at Garden Elementary School.</li> </ol> <p>&nbsp;</p> </div> </div> <p><em data-effect="italics">5) Use the following information to answer the next three exercises.</em> The Sky Train from the terminal to the rental–car and long–term parking center is supposed to arrive every eight minutes. 
The waiting times for the train are known to follow a uniform distribution.</p> <div id="eip-518" data-type="exercise"><div id="fs-idp172910688" data-type="problem"><p id="fs-idp114358400">What is the average waiting time (in minutes)?</p> <ol id="fs-idp178644288" type="a"><li>zero</li> <li>two</li> <li>three</li> <li>four</li> </ol> </div> <div id="fs-idp175490192" data-type="solution"><p id="fs-idp31654800"></p></div> </div> <div data-type="exercise"><div id="eip-idp37322992" data-type="problem"><p id="eip-idp37323248">6) Find the 30<sup>th</sup> percentile for the waiting times (in minutes).</p> <ol id="eip-idp82285888" type="a" data-mark-suffix="."><li>two</li> <li>2.4</li> <li>2.75</li> <li>three</li> </ol> <p>&nbsp;</p> </div> </div> <div id="eip-412" data-type="exercise"><div id="fs-idp2555712" data-type="problem"><p id="fs-idp2555968">7) The probability of waiting more than seven minutes given a person has waited more than four minutes is?</p> <ol id="fs-idp93854368" type="a"><li>0.125</li> <li>0.25</li> <li>0.5</li> <li>0.75</li> </ol> </div> <div id="fs-idp94217472" data-type="solution"><p id="fs-idp94217728"></p></div> </div> <div id="fs-idp57726432" data-type="exercise"><div id="fs-idm12724496" data-type="problem"><p id="fs-idm12724240">8) The time (in minutes) until the next bus departs a major bus depot follows a distribution with <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{20}\) where <em data-effect="italics">x</em> goes from 25 to 45 minutes.</p> <ol id="fs-idm12672944" type="a"><li>Define the random variable. <em data-effect="italics">X</em> = ________</li> <li><em data-effect="italics">X</em> ~ ________</li> <li>Graph the probability distribution.</li> <li>The distribution is ______________ (name of distribution). 
It is _____________ (discrete or continuous).</li> <li><em data-effect="italics">μ</em> = ________</li> <li><em data-effect="italics">σ</em> = ________</li> <li>Find the probability that the time is at most 30 minutes. Sketch and label a graph of the distribution. Shade the area of interest. Write the answer in a probability statement.</li> <li>Find the probability that the time is between 30 and 40 minutes. Sketch and label a graph of the distribution. Shade the area of interest. Write the answer in a probability statement.</li> <li><em data-effect="italics">P</em>(25 &lt; <em data-effect="italics">x</em> &lt; 55) = _________. State this in a probability statement, similarly to parts g and h, draw the picture, and find the probability.</li> <li>Find the 90<sup>th</sup> percentile. This means that 90% of the time, the time is less than _____ minutes.</li> <li>Find the 75<sup>th</sup> percentile. In a complete sentence, state what this means. (See part j.)</li> <li>Find the probability that the time is more than 40 minutes given (or knowing that) it is at least 30 minutes.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="fs-idm19928432" data-type="exercise"><div id="fs-idm19928176" data-type="problem"><p id="fs-idm19928048">9) Suppose that the value of a stock varies each day from \$16 to \$25 with a uniform distribution.</p> <ol id="fs-idp52534224" type="a"><li>Find the probability that the value of the stock is more than \$19.</li> <li>Find the probability that the value of the stock is between \$19 and \$22.</li> <li>Find the upper quartile &#8211; 25% of all days the stock is above what value? Draw the graph.</li> <li>Given that the stock is greater than \$18, find the probability that the stock is more than \$21. 
<ul id="fs-idp37941792"></ul> </li> </ol> <p>&nbsp;</p> </div> <div id="fs-idp30733472" data-type="solution"></div> </div> <div id="fs-idp90929648" data-type="exercise"><div id="fs-idp90929904" data-type="problem"><p id="fs-idp100719504">10) A fireworks show is designed so that the time between fireworks is between one and five seconds, and follows a uniform distribution.</p> <ol id="fs-idp100720032" type="a"><li>Find the average time between fireworks.</li> <li>Find the probability that the time between fireworks is greater than four seconds.</li> </ol> </div> </div> <div id="fs-idp96470224" data-type="exercise"><div id="fs-idp96470480" data-type="problem"><p id="fs-idp182386192">11) The number of miles driven by a truck driver in a day falls between 300 and 700, and follows a uniform distribution.</p> <ol id="fs-idp182386704" type="a"><li>Find the probability that the truck driver goes more than 650 miles in a day.</li> <li>Find the probability that the truck driver goes between 400 and 650 miles in a day.</li> <li>At least how many miles does the truck driver travel on the furthest 10% of days?</li> </ol> <p>&nbsp;</p> </div> <div id="fs-idp168190128" data-type="solution"><ol id="fs-idp14788896" type="a"></ol> <p id="fs-idp77684160">12) Births are approximately uniformly distributed between the 52 weeks of the year. They can be said to follow a uniform distribution from one to 53 (spread of 52 weeks).</p> <ol id="fs-idp77684416" type="a"><li><em data-effect="italics">X</em> ~ _________</li> <li>Graph the probability distribution.</li> <li><em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = _________</li> <li><em data-effect="italics">μ</em> = _________</li> <li><em data-effect="italics">σ</em> = _________</li> <li>Find the probability that a person is born at the exact moment week 19 starts. 
That is, find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 19) = _________</li> <li><em data-effect="italics">P</em>(2 &lt; <em data-effect="italics">x</em> &lt; 31) = _________</li> <li>Find the probability that a person is born after week 40.</li> <li><em data-effect="italics">P</em>(12 &lt; <em data-effect="italics">x</em>|<em data-effect="italics">x</em> &lt; 28) = _________</li> <li>Find the 70<sup>th</sup> percentile.</li> <li>Find the minimum for the upper quarter.</li> </ol> <p>&nbsp;</p> <p><strong>Answers to odd questions</strong></p> <p>1)</p> <ol id="eip-idp101962432" type="a"><li><em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(1, 9)</li> <li>Check student’s solution.</li> <li>\(f\left(x\right)=\frac{1}{8}\) where \(1\le x\le 9\)</li> <li>five</li> <li>2.3</li> <li>\(\frac{15}{32}\)</li> <li>\(\frac{333}{800}\)</li> <li>\(\frac{2}{3}\)</li> <li>8.2</li> </ol> <p>3)</p> <ol id="fs-idp164234448" type="a"><li><em data-effect="italics">X</em> represents the length of time a commuter must wait for a train to arrive on the Red Line.</li> <li><em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(0, 8)</li> <li>Graph the probability distribution.</li> <li>\(f\left(x\right)=\frac{1}{8}\) where \(0\le x\le 8\)</li> <li>four</li> <li>2.31</li> <li>\(\frac{1}{8}\)</li> <li>\(\frac{1}{8}\)</li> <li>3.2</li> </ol> <p>5) d</p> <p>7) b</p> <p>9)</p> <ol id="fs-idp30733728" type="a"><li>The probability density function of <em data-effect="italics">X</em> is \(\frac{1}{25-16}=\frac{1}{9}\). <span data-type="newline"><br /> </span><em data-effect="italics">P</em>(<em data-effect="italics">X</em> &gt; 19) = (25 – 19) \(\left(\frac{1}{9}\right)\) = \(\frac{6}{9}\) = \(\frac{2}{3}\). 
<div id="fs-idp56920896" class="bc-figure figure"><span id="fs-idp56921152" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/soln_01aF-1.jpg" alt="" width="380" data-media-type="png/jpg" /></span></div> </li> <li><em data-effect="italics">P</em>(19 &lt; <em data-effect="italics">X</em> &lt; 22) = (22 – 19) \(\left(\frac{1}{9}\right)\) = \(\frac{3}{9}\) = \(\frac{1}{3}\). <div id="fs-idp47544128" class="bc-figure figure"><span id="fs-idp47544384" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/soln_01bF-1.jpg" alt="" width="380" data-media-type="png/jpg" /></span></div> </li> <li>The area must be 0.25, and 0.25 = (width)\(\left(\frac{1}{9}\right)\), so width = (0.25)(9) = 2.25. Thus, the value is 25 – 2.25 = 22.75.</li> <li>This is a conditional probability question. P(x &gt; 21| x &gt; 18). You can do this two ways: <ul><li>Draw the graph where a is now 18 and b is still 25. 
The height is \(\frac{1}{\left(25-18\right)}\) = \(\frac{1}{7}\)<span data-type="newline"><br /> </span>So, <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 21|<em data-effect="italics">x</em> &gt; 18) = (25 – 21)\(\left(\frac{1}{7}\right)\) = 4/7.</li> <li>Use the formula: <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 21|<em data-effect="italics">x</em> &gt; 18) = \(\frac{P\left(x&gt;21\text{ AND }x&gt;18\right)}{P\left(x&gt;18\right)}\) <span data-type="newline"><br /> </span>= \(\frac{P\left(x&gt;21\right)}{P\left(x&gt;18\right)}\) = \(\frac{\left(25-21\right)}{\left(25-18\right)}\) = \(\frac{4}{7}\).</li> </ul> </li> </ol> <p>11)</p> <ol type="a"><li><em data-effect="italics">P</em>(<em data-effect="italics">X</em> &gt; 650) = \(\frac{700-650}{700-300}=\frac{50}{400}=\frac{1}{8}\) = 0.125.</li> <li><em data-effect="italics">P</em>(400 &lt; <em data-effect="italics">X</em> &lt; 650) = \(\frac{650-400}{700-300}=\frac{250}{400}\) = 0.625</li> <li>0.10 = \(\frac{\text{width}}{\text{700}-\text{300}}\), so width = 400(0.10) = 40. Since 700 – 40 = 660, the drivers travel at least 660 miles on the furthest 10% of days.</li> </ol> <p>&nbsp;</p> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl><dt>Conditional Probability</dt> <dd id="id13790404">the likelihood that an event will occur given that another event has already occurred.</dd> </dl> </div> </div></div>
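<div class="textbox textbox--examples" data-type="example"><p>The reasoning in the answer to 9(d) works for any uniform distribution. As a sketch, assume <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(<em data-effect="italics">a</em>, <em data-effect="italics">b</em>) and two cutoff values <em data-effect="italics">c</em> and <em data-effect="italics">d</em> with <em data-effect="italics">a</em> ≤ <em data-effect="italics">c</em> ≤ <em data-effect="italics">d</em> ≤ <em data-effect="italics">b</em> and <em data-effect="italics">c</em> &lt; <em data-effect="italics">b</em>. The letters <em data-effect="italics">c</em> and <em data-effect="italics">d</em> are labels introduced here for illustration; they do not appear in the exercises. Because every value above <em data-effect="italics">d</em> is also above <em data-effect="italics">c</em>, the event "<em data-effect="italics">x</em> &gt; <em data-effect="italics">d</em> AND <em data-effect="italics">x</em> &gt; <em data-effect="italics">c</em>" is the same as "<em data-effect="italics">x</em> &gt; <em data-effect="italics">d</em>", so the common height \(\frac{1}{b-a}\) cancels:</p> <p>\(P\left(x&gt;d|x&gt;c\right)=\frac{P\left(x&gt;d\right)}{P\left(x&gt;c\right)}=\frac{\left(b-d\right)\left(\frac{1}{b-a}\right)}{\left(b-c\right)\left(\frac{1}{b-a}\right)}=\frac{b-d}{b-c}\)</p> <p>With <em data-effect="italics">a</em> = 16, <em data-effect="italics">b</em> = 25, <em data-effect="italics">c</em> = 18, and <em data-effect="italics">d</em> = 21, this gives \(\frac{25-21}{25-18}=\frac{4}{7}\), which agrees with both methods shown in 9(d).</p> </div>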
<div class="chapter standard" id="chapter-continuous-distribution" title="Activity 6.4: Continuous Distribution"><div class="chapter-title-wrap"><h3 class="chapter-number">40</h3><h2 class="chapter-title"><span class="display-none">Activity 6.4: Continuous Distribution</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1167914364668" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Continuous Distribution</div> <p id="fs-idm14104368">Class Time:</p> <p id="fs-idm14103984">Names:</p> <div id="fs-idm52234672" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will compare and contrast empirical data from a random number generator with the uniform distribution.</li> </ul> </div> <p id="fs-idp95550544"><span data-type="title">Collect the Data</span>Use a random number generator to generate 50 values between zero and one (inclusive). List them in <a class="autogenerated-content" href="#fs-idm51701040">(Figure)</a>. 
Round the numbers to four decimal places or set the calculator MODE to four places.</p> <ol id="fs-idp1588960" data-mark-suffix="."><li>Complete the table.<br /> <table id="fs-idm51701040" summary="Blank table with 50 cells for entering in data."><tbody><tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> </tbody> </table> </li> <li>Calculate the following: <ol id="fs-idp99174544" type="a"><li>\(\overline{x}=\) _______</li> <li><em data-effect="italics">s</em> = _______</li> <li>first quartile = _______</li> <li>third quartile = _______</li> <li>median = _______</li> </ol> </li> </ol> <div id="fs-idp37245568" data-type="list"><div data-type="title">Organize the Data</div> <ol><li>Construct a histogram of the empirical data. Make eight bars. 
<div id="fs-idm34978704" class="bc-figure figure"><span id="fs-idm34978448" data-type="media" data-alt="Blank graph with relative frequency on the vertical axis and X on the horizontal axis."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_06_01-1.png" alt="Blank graph with relative frequency on the vertical axis and X on the horizontal axis." width="380" data-media-type="image/png" /></span></div> </li> <li>Construct a histogram of the empirical data. Make five bars. <div id="fs-idp120220128" class="bc-figure figure"><span id="fs-idp120220384" data-type="media" data-alt="Blank graph with relative frequency on the vertical axis and X on the horizontal axis."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch05_06_01-1.png" alt="Blank graph with relative frequency on the vertical axis and X on the horizontal axis." width="380" data-media-type="image/png" /></span></div> </li> </ol> </div> <div id="fs-idm6061312" data-type="list"><div data-type="title">Describe the Data</div> <ol><li>In two to three complete sentences, describe the shape of each graph. (Keep it simple. Does the graph go straight across, does it have a V shape, does it have a hump in the middle or at either end, and so on. 
One way to help you determine a shape is to draw a smooth curve roughly through the top of the bars.)</li> <li>Describe how changing the number of bars might change the shape.</li> </ol> </div> <div id="fs-idm19839648" data-type="list"><div data-type="title">Theoretical Distribution</div> <ol data-mark-suffix="."><li>In words, <em data-effect="italics">X</em> = _____________________________________.</li> <li>The theoretical distribution of <em data-effect="italics">X</em> is <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(0,1).</li> <li>In theory, based upon the distribution <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em>(0,1), complete the following. <ol id="fs-idp172122752" type="a"><li><em data-effect="italics">μ</em> = ______</li> <li><em data-effect="italics">σ</em> = ______</li> <li>first quartile = ______</li> <li>third quartile = ______</li> <li>median = __________</li> </ol> </li> <li>Are the empirical values (the data) in the section titled <a href="#fs-idp95550544">Collect the Data</a> close to the corresponding theoretical values? Why or why not?</li> </ol> </div> <div id="fs-idp170724832" data-type="list"><div data-type="title">Plot the Data</div> <ol><li>Construct a box plot of the data. Be sure to use a ruler to scale accurately and draw straight edges.</li> <li>Do you notice any potential outliers? If so, which values are they? Either way, justify your answer numerically. (Recall that any DATA that are less than <em data-effect="italics">Q</em><sub>1</sub> – 1.5(<em data-effect="italics">IQR</em>) or more than <em data-effect="italics">Q</em><sub>3</sub> + 1.5(<em data-effect="italics">IQR</em>) are potential outliers. 
<em data-effect="italics">IQR</em> means interquartile range.)</li> </ol> </div> <div id="fs-idm63566320" data-type="list"><div data-type="title">Compare the Data</div> <ol><li>For each of the following parts, use a complete sentence to comment on how the value obtained from the data compares to the theoretical value you expected from the distribution in the section titled <a href="#fs-idm19839648">Theoretical Distribution</a>. <ol id="fs-idp53285248" type="a"><li>minimum value: _______</li> <li>first quartile: _______</li> <li>median: _______</li> <li>third quartile: _______</li> <li>maximum value: _______</li> <li>width of <em data-effect="italics">IQR</em>: _______</li> <li>overall shape: _______</li> </ol> </li> <li>Based on your comments in the section titled <a href="#fs-idp95550544">Collect the Data</a>, how does the box plot fit or not fit what you would expect of the distribution in the section titled <a href="#fs-idm19839648">Theoretical Distribution</a>?</li> </ol> </div> <div id="fs-idm74366864" data-type="list"><div data-type="title">Discussion Question</div> <ol><li>Suppose that the number of values generated was 500, not 50. How would that affect what you would expect the empirical data to be and the shape of its graph to look like?</li> </ol> </div> </div> </div></div>
<div class="part " id="part-the-normal-distribution"><div class="part-title-wrap"><h3 class="part-number">VII</h3><h1 class="part-title">Chapter 7: The Normal Distribution</h1></div><div class="ugc part-ugc"></div></div>
<div class="chapter standard" id="chapter-introduction-18" title="Chapter 7.1: Introduction"><div class="chapter-title-wrap"><h3 class="chapter-number">41</h3><h2 class="chapter-title"><span class="display-none">Chapter 7.1: Introduction</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-idp3364000" class="splash"><div class="bc-figcaption figcaption">If you ask enough people about their shoe size, you will find that your graphed data is shaped like a bell curve and can be described as normally distributed. (credit: Ömer Ünlϋ)</div> <p><span id="fs-idp63701744" data-type="media" data-alt="This photo shows many different pairs of shoes in various colors. The shoes appear to be hanging from a wall by cords."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C06_CON-1.jpg" alt="This photo shows many different pairs of shoes in various colors. The shoes appear to be hanging from a wall by cords." width="500" data-media-type="image/jpg" /></span></p> </div> <div id="fs-idp2636240" class="chapter-objectives" data-type="note" data-has-label="true" data-label=""><div data-type="title">Chapter Objectives</div> <p>By the end of this chapter, the student should be able to:</p> <ul id="list4253"><li>Recognize the normal probability distribution and apply it appropriately.</li> <li>Recognize the standard normal probability distribution and apply it appropriately.</li> <li>Compare normal probabilities by converting to the standard normal distribution.</li> </ul> </div> <p id="intro001">The normal, a continuous distribution, is the most important of all the distributions. It is widely used and even more widely abused. Its graph is bell-shaped. You see the bell curve in almost all disciplines. Some of these include psychology, business, economics, the sciences, nursing, and, of course, mathematics. Some of your instructors may use the normal distribution to help determine your grade. 
Most IQ scores are normally distributed. Often real-estate prices fit a normal distribution. The normal distribution is extremely important, but it cannot be applied to everything in the real world.</p> <p id="element-299">In this chapter, you will study the normal distribution, the standard normal distribution, and applications associated with them.</p> <p id="element-915">The normal distribution has two parameters (two numerical descriptive measures): the mean (<em data-effect="italics">μ</em>) and the standard deviation (<em data-effect="italics">σ</em>). If <em data-effect="italics">X</em> is a quantity to be measured that has a normal distribution with mean (<em data-effect="italics">μ</em>) and standard deviation (<em data-effect="italics">σ</em>), we designate this by writing</p> <div id="fs-idm13606720" class="bc-figure figure"><span id="id42732641" data-type="media" data-alt="This diagram shows a bell-shaped curve with the lower case Greek letter mu at the center of the x axis. It has the label Normal: uppercase X is similar to N (μ, σ)"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch_06_01_01-1.jpg" alt="This diagram shows a bell-shaped curve with the lower case Greek letter mu at the center of the x axis. It has the label Normal: uppercase X is similar to N (μ, σ)" width="450" data-media-type="image/jpg" /></span></div> <p>The probability density function is a rather complicated function. <strong>Do not memorize it</strong>. It is not necessary.</p> <p><em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{\sigma \cdot \sqrt{2\cdot \pi }} \cdot {\text{ e}}^{-\frac{1}{2}\cdot {\left(\frac{x-\mu }{\sigma }\right)}^{2}}\)</p> <p>The cumulative distribution function is <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">x</em>). It is calculated either by a calculator or a computer, or it is looked up in a table. 
Technology has made the tables virtually obsolete. For that reason, as well as the fact that there are various table formats, we are not including table instructions.</p> <p id="element-396">The curve is symmetric about a vertical line drawn through the mean, <em data-effect="italics">μ</em>. In theory, the mean is the same as the median, because the graph is symmetric about <em data-effect="italics">μ</em>. As the notation indicates, the normal distribution depends only on the mean and the standard deviation. Since the area under the curve must equal one, a change in the standard deviation, <em data-effect="italics">σ</em>, causes a change in the shape of the curve; the curve becomes fatter or skinnier depending on <em data-effect="italics">σ</em>. A change in <em data-effect="italics">μ</em> causes the graph to shift to the left or right. This means there are an infinite number of normal probability distributions. One of special interest is called the <strong>standard normal distribution</strong>.</p> <div id="fs-idp30947824" class="statistics collab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Classroom Activity</div> <p>Your instructor will record the heights of both men and women in your class, separately. Draw histograms of your data. Then draw a smooth curve through each histogram. Is each curve somewhat bell-shaped? Do you think that if you had recorded 200 data values for men and 200 for women that the curves would look bell-shaped? Calculate the mean for each data set. Write the means on the <em data-effect="italics">x</em>-axis of the appropriate graph below the peak. Shade the approximate area that represents the probability that one randomly chosen male is taller than 72 inches. Shade the approximate area that represents the probability that one randomly chosen female is shorter than 60 inches. 
If the total area under each curve is one, does either probability appear to be more than 0.5?</p> </div> <div id="fs-idm30369536" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idp6996576"><em data-effect="italics">X</em> ∼ <em data-effect="italics">N</em>(<em data-effect="italics">μ</em>, <em data-effect="italics">σ</em>)</p> <p id="fs-idp6873424"><span data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item"><em data-effect="italics">μ</em> = the mean</span><span data-type="item"><em data-effect="italics">σ</em> = the standard deviation</span></span></p> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="normdist"><dt>Normal Distribution</dt> <dd id="id42733014">a continuous random variable (RV) with pdf <em data-effect="italics">f</em>(<em data-effect="italics">x</em>) = \(\frac{1}{\sigma \sqrt{2\pi }}{\text{ e}}^{-\frac{{\left(x-\mu \right)}^{2}}{2{\sigma }^{2}}}\), where <em data-effect="italics">μ</em> is the mean of the distribution and <em data-effect="italics">σ</em> is the standard deviation; notation: <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(<em data-effect="italics">μ</em>, <em data-effect="italics">σ</em>). If <em data-effect="italics">μ</em> = 0 and <em data-effect="italics">σ</em> = 1, the distribution is called the <strong>standard normal distribution</strong>.</dd> </dl> </div> </div></div>
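<div class="textbox textbox--examples" data-type="example"><p>The density formula does not need to be memorized, but substituting numbers into it once shows how the two parameters act. The following is a worked sketch that uses only the formula given in this chapter, with <em data-effect="italics">μ</em> = 0 and <em data-effect="italics">σ</em> = 1 chosen for illustration:</p> <p>\(f\left(x\right)=\frac{1}{1\cdot \sqrt{2\cdot \pi }}\cdot {\text{ e}}^{-\frac{1}{2}\cdot {\left(\frac{x-0}{1}\right)}^{2}}=\frac{1}{\sqrt{2\pi }}{\text{ e}}^{-\frac{{x}^{2}}{2}}\)</p> <p>At the mean, <em data-effect="italics">x</em> = <em data-effect="italics">μ</em>, the exponent is zero, so the peak height of any normal curve is \(f\left(\mu \right)=\frac{1}{\sigma \sqrt{2\pi }}\). For <em data-effect="italics">σ</em> = 1 this is approximately 0.3989. Doubling <em data-effect="italics">σ</em> halves the peak height, and because the total area must remain one, the curve spreads out to compensate.</p> </div>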
<div class="chapter standard" id="chapter-the-standard-normal-distribution" title="Chapter 7.2: The Standard Normal Distribution"><div class="chapter-title-wrap"><h3 class="chapter-number">42</h3><h2 class="chapter-title"><span class="display-none">Chapter 7.2: The Standard Normal Distribution</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="fs-idp71600">The <span data-type="term">standard normal distribution</span> is a normal distribution of <strong>standardized values called</strong> <span data-type="term"><em data-effect="italics">z</em>-scores</span>. <strong>A <em data-effect="italics">z</em>-score is measured in units of the standard deviation.</strong> For example, if the mean of a normal distribution is five and the standard deviation is two, the value 11 is three standard deviations above (or to the right of) the mean. The calculation is as follows:</p> <p id="fs-idp80744096"><em data-effect="italics">x</em> = <em data-effect="italics">μ</em> + (<em data-effect="italics">z</em>)(<em data-effect="italics">σ</em>) = 5 + (3)(2) = 11</p> <p id="fs-idp8223296">The <em data-effect="italics">z</em>-score is three.</p> <p id="fs-idm6684576">The mean for the standard normal distribution is zero, and the standard deviation is one. The transformation <em data-effect="italics">z</em> = \(\frac{x-\mu }{\sigma }\) produces the distribution <em data-effect="italics">Z</em> ~ <em data-effect="italics">N</em>(0, 1). 
The value <em data-effect="italics">x</em> in the given equation comes from a normal distribution with mean <em data-effect="italics">μ</em> and standard deviation <em data-effect="italics">σ</em>.</p> <div id="fs-idp52480624" class="bc-section section" data-depth="1"><h3 data-type="title"><em data-effect="italics">Z</em>-Scores</h3> <p id="fs-idm81546736">If <em data-effect="italics">X</em> is a normally distributed random variable and <em data-effect="italics">X</em> ~ <em data-effect="italics">N(μ, σ)</em>, then the <em data-effect="italics">z</em>-score of a value <em data-effect="italics">x</em> is:</p> <div id="element-521" data-type="equation">\(z=\frac{x-\mu }{\sigma }\)</div> <p><strong>The <em data-effect="italics">z</em>-score tells you how many standard deviations the value <em data-effect="italics">x</em> is above (to the right of) or below (to the left of) the mean, <em data-effect="italics">μ</em>.</strong> Values of <em data-effect="italics">x</em> that are larger than the mean have positive <em data-effect="italics">z</em>-scores, and values of <em data-effect="italics">x</em> that are smaller than the mean have negative <em data-effect="italics">z</em>-scores. If <em data-effect="italics">x</em> equals the mean, then <em data-effect="italics">x</em> has a <em data-effect="italics">z</em>-score of zero.</p> <div class="textbox textbox--examples" data-type="example"><p>Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N(5, 6)</em>. This says that <em data-effect="italics">X</em> is a normally distributed random variable with mean <em data-effect="italics">μ</em> = 5 and standard deviation <em data-effect="italics">σ</em> = 6. Suppose <em data-effect="italics">x</em> = 17.
Then:</p> <div id="element-160" data-type="equation">\(z=\frac{x-\mu }{\sigma }=\frac{17-5}{6}=2\)</div> <p id="fs-idp325632">This means that <em data-effect="italics">x</em> = 17 is <strong>two standard deviations</strong> (2<em data-effect="italics">σ</em>) above or to the right of the mean <em data-effect="italics">μ</em> = 5.</p> <p>Notice that: 5 + (2)(6) = 17. (The pattern is <em data-effect="italics">μ</em> + <em data-effect="italics">zσ</em> = <em data-effect="italics">x</em>.)</p> <p id="element-330">Now suppose <em data-effect="italics">x</em> = 1. Then: <em data-effect="italics">z</em> = \(\frac{x-\mu }{\sigma }\) = \(\frac{1-5}{6}\) = –0.67 (rounded to two decimal places).</p> <p id="element-468"><strong>This means that <em data-effect="italics">x</em> = 1 is 0.67 standard deviations (–0.67<em data-effect="italics">σ</em>) below or to the left of the mean <em data-effect="italics">μ</em> = 5.</strong> Notice that: 5 + (–0.67)(6) is approximately equal to one. (This has the pattern <em data-effect="italics">μ</em> + (–0.67)<em data-effect="italics">σ</em> = 1.)</p> <p>Summarizing, when <em data-effect="italics">z</em> is positive, <em data-effect="italics">x</em> is above or to the right of <em data-effect="italics">μ</em> and when <em data-effect="italics">z</em> is negative, <em data-effect="italics">x</em> is to the left of or below <em data-effect="italics">μ</em>.
Or, when <em data-effect="italics">z</em> is positive, <em data-effect="italics">x</em> is greater than <em data-effect="italics">μ</em>, and when <em data-effect="italics">z</em> is negative <em data-effect="italics">x</em> is less than <em data-effect="italics">μ</em>.</p> </div> <div id="fs-idp69782784" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-949" data-type="exercise"><div data-type="problem"><p>What is the <em data-effect="italics">z</em>-score of <em data-effect="italics">x</em>, when <em data-effect="italics">x</em> = 1 and <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(12,3)?</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p>Some doctors believe that a person can lose five pounds, on the average, in a month by reducing his or her fat intake and by exercising consistently. Suppose weight loss has a normal distribution. Let <em data-effect="italics">X</em> = the amount of weight lost (in pounds) by a person in a month. Use a standard deviation of two pounds. <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(5, 2). Fill in the blanks.</p> <p>&nbsp;</p> <div data-type="exercise"><div id="id1167373890535" data-type="problem"><p id="fs-idm105387200">a. Suppose a person <strong>lost</strong> ten pounds in a month. The <em data-effect="italics">z</em>-score when <em data-effect="italics">x</em> = 10 pounds is <em data-effect="italics">z</em> = 2.5 (verify). This <em data-effect="italics">z</em>-score tells you that <em data-effect="italics">x</em> = 10 is ________ standard deviations to the ________ (right or left) of the mean _____ (What is the mean?).</p> </div> <div id="id1167373888670" data-type="solution"><p>a. 
This <em data-effect="italics">z</em>-score tells you that <em data-effect="italics">x</em> = 10 is <strong>2.5</strong> standard deviations to the <strong>right</strong> of the mean <strong>five</strong>.</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id1167373888705" data-type="problem"><p>b. Suppose a person <strong>gained</strong> three pounds (a negative weight loss). Then <em data-effect="italics">z</em> = __________. This <em data-effect="italics">z</em>-score tells you that <em data-effect="italics">x</em> = –3 is ________ standard deviations to the __________ (right or left) of the mean.</p> </div> <div id="id1167373884020" data-type="solution" data-print-placement="end"><p>b. <em data-effect="italics">z</em> = <strong>–4</strong>. This <em data-effect="italics">z</em>-score tells you that <em data-effect="italics">x</em> = –3 is <strong>four</strong> standard deviations to the <strong>left</strong> of the mean.</p> <p>&nbsp;</p> </div> </div> <div id="eip-681" data-type="exercise"><div data-type="problem"><p>c. Suppose the random variables <em data-effect="italics">X</em> and <em data-effect="italics">Y</em> have the following normal distributions: <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(5, 6) and <em data-effect="italics">Y</em> ~ <em data-effect="italics">N</em>(2, 1). If <em data-effect="italics">x</em> = 17, then <em data-effect="italics">z</em> = 2. (This was previously shown.) If <em data-effect="italics">y</em> = 4, what is <em data-effect="italics">z</em>?</p> </div> <div data-type="solution"><p>c. <em data-effect="italics">z</em> = \(\frac{y-\mu }{\sigma }\) = \(\frac{4-2}{1}\) = 2 where <em data-effect="italics">µ</em> = 2 and <em data-effect="italics">σ</em> = 1.</p> </div> </div> <p>The <em data-effect="italics">z</em>-score for <em data-effect="italics">y</em> = 4 is <em data-effect="italics">z</em> = 2. 
This means that <em data-effect="italics">y</em> = 4 is two standard deviations to the right of its mean. Therefore, <em data-effect="italics">x</em> = 17 and <em data-effect="italics">y</em> = 4 are both two (of <strong>their own</strong>) standard deviations to the right of <strong>their</strong> respective means.</p> <p id="element-735"><strong>The <em data-effect="italics">z</em>-score allows us to compare data that are scaled differently.</strong> To understand the concept, suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(5, 6) represents weight gains for one group of people who are trying to gain weight in a six-week period and <em data-effect="italics">Y</em> ~ <em data-effect="italics">N</em>(2, 1) represents weight gains over the same period for a second group of people. A negative weight gain would be a weight loss. Since <em data-effect="italics">x</em> = 17 and <em data-effect="italics">y</em> = 4 are each two standard deviations to the right of their means, they represent the same standardized weight gain <strong>relative to their means</strong>.</p> </div> <div id="fs-idp126867120" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div id="eip-151" data-type="problem"><p id="eip-idm83673168">Fill in the blanks.</p> <p>Jerome averages 16 points a game with a standard deviation of four points. <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(16, 4). Suppose Jerome scores ten points in a game. The <em data-effect="italics">z</em>-score when <em data-effect="italics">x</em> = 10 is –1.5.
This score tells you that <em data-effect="italics">x</em> = 10 is _____ standard deviations to the ______(right or left) of the mean______(What is the mean?).</p> </div> </div> </div> <p><span data-type="title">The Empirical Rule</span>If <em data-effect="italics">X</em> is a random variable and has a normal distribution with mean <em data-effect="italics">µ</em> and standard deviation <em data-effect="italics">σ</em>, then the <span data-type="term">Empirical Rule</span> states the following:</p> <ul id="fs-idp54135840"><li>About 68% of the <em data-effect="italics">x</em> values lie between –1<em data-effect="italics">σ</em> and +1<em data-effect="italics">σ</em> of the mean <em data-effect="italics">µ</em> (within one standard deviation of the mean).</li> <li>About 95% of the <em data-effect="italics">x</em> values lie between –2<em data-effect="italics">σ</em> and +2<em data-effect="italics">σ</em> of the mean <em data-effect="italics">µ</em> (within two standard deviations of the mean).</li> <li>About 99.7% of the <em data-effect="italics">x</em> values lie between –3<em data-effect="italics">σ</em> and +3<em data-effect="italics">σ</em> of the mean <em data-effect="italics">µ</em> (within three standard deviations of the mean). 
Notice that almost all the <em data-effect="italics">x</em> values lie within three standard deviations of the mean.</li> <li>The <em data-effect="italics">z</em>-scores for +1<em data-effect="italics">σ</em> and –1<em data-effect="italics">σ</em> are +1 and –1, respectively.</li> <li>The <em data-effect="italics">z</em>-scores for +2<em data-effect="italics">σ</em> and –2<em data-effect="italics">σ</em> are +2 and –2, respectively.</li> <li>The <em data-effect="italics">z</em>-scores for +3<em data-effect="italics">σ</em> and –3<em data-effect="italics">σ</em> are +3 and –3 respectively.</li> </ul> <p id="fs-idm22983328">The empirical rule is also known as the 68-95-99.7 rule.</p> <div id="fs-idp56138992" class="bc-figure figure"><span id="empir_rule" data-type="media" data-alt="This frequency curve illustrates the empirical rule. The normal curve is shown over a horizontal axis. The axis is labeled with points -3s, -2s, -1s, m, 1s, 2s, 3s. Vertical lines connect the axis to the curve at each labeled point. The peak of the curve aligns with the point m."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch06_03_01-1.jpg" alt="This frequency curve illustrates the empirical rule. The normal curve is shown over a horizontal axis. The axis is labeled with points -3s, -2s, -1s, m, 1s, 2s, 3s. Vertical lines connect the axis to the curve at each labeled point. The peak of the curve aligns with the point m." data-media-type="image/jpg" data-print-width="3in" /></span></div> <div class="textbox textbox--examples" data-type="example"><p>The mean height of 15 to 18-year-old males from Chile from 2009 to 2010 was 170 cm with a standard deviation of 6.28 cm. Male heights are known to follow a normal distribution. Let <em data-effect="italics">X</em> = the height of a 15 to 18-year-old male from Chile in 2009 to 2010. 
Then <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(170, 6.28).</p> <p>&nbsp;</p> <div id="eip-231" data-type="exercise"><div id="eip-633" data-type="problem"><p id="eip-idp457728">a. Suppose a 15 to 18-year-old male from Chile was 168 cm tall from 2009 to 2010. The <em data-effect="italics">z</em>-score when <em data-effect="italics">x</em> = 168 cm is <em data-effect="italics">z</em> = _______. This <em data-effect="italics">z</em>-score tells you that <em data-effect="italics">x</em> = 168 is ________ standard deviations to the ________ (right or left) of the mean _____ (What is the mean?).</p> </div> <div data-type="solution"><p>a. –0.32, 0.32, left, 170</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>b. Suppose that the height of a 15 to 18-year-old male from Chile from 2009 to 2010 has a <em data-effect="italics">z</em>-score of <em data-effect="italics">z</em> = 1.27. What is the male’s height? The <em data-effect="italics">z</em>-score (<em data-effect="italics">z</em> = 1.27) tells you that the male’s height is ________ standard deviations to the __________ (right or left) of the mean.</p> </div> <div id="eip-205" data-type="solution"><p id="eip-741">b. 177.98 cm, 1.27, right</p> </div> </div> </div> <div id="fs-idp103857616" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p id="fs-idp74289648">Use the information in <a class="autogenerated-content" href="#eip-736">(Figure)</a> to answer the following questions.</p> <ol id="eip-idm24917152" type="a"><li>Suppose a 15 to 18-year-old male from Chile was 176 cm tall from 2009 to 2010. The <em data-effect="italics">z</em>-score when <em data-effect="italics">x</em> = 176 cm is <em data-effect="italics">z</em> = _______. 
This <em data-effect="italics">z</em>-score tells you that <em data-effect="italics">x</em> = 176 cm is ________ standard deviations to the ________ (right or left) of the mean _____ (What is the mean?).</li> <li>Suppose that the height of a 15 to 18-year-old male from Chile from 2009 to 2010 has a <em data-effect="italics">z</em>-score of <em data-effect="italics">z</em> = –2. What is the male’s height? The <em data-effect="italics">z</em>-score (<em data-effect="italics">z</em> = –2) tells you that the male’s height is ________ standard deviations to the __________ (right or left) of the mean.</li> </ol> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><div id="eip-117" data-type="exercise"><div id="eip-552" data-type="problem"><p>From 1984 to 1985, the mean height of 15 to 18-year-old males from Chile was 172.36 cm, and the standard deviation was 6.34 cm. Let <em data-effect="italics">Y</em> = the height of 15 to 18-year-old males from 1984 to 1985. Then <em data-effect="italics">Y</em> ~ <em data-effect="italics">N</em>(172.36, 6.34).</p> <p>The mean height of 15 to 18-year-old males from Chile from 2009 to 2010 was 170 cm with a standard deviation of 6.28 cm. Male heights are known to follow a normal distribution. Let <em data-effect="italics">X</em> = the height of a 15 to 18-year-old male from Chile in 2009 to 2010. Then <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(170, 6.28).</p> <p>Find the <em data-effect="italics">z</em>-scores for <em data-effect="italics">x</em> = 160.58 cm and <em data-effect="italics">y</em> = 162.85 cm. Interpret each <em data-effect="italics">z</em>-score. 
What can you say about <em data-effect="italics">x</em> = 160.58 cm and <em data-effect="italics">y</em> = 162.85 cm as they compare to their respective means and standard deviations?</p> </div> <div data-type="solution"><p id="eip-768">The <em data-effect="italics">z</em>-score for <em data-effect="italics">x</em> = 160.58 is <em data-effect="italics">z</em> = –1.5. <span data-type="newline"><br /> </span>The <em data-effect="italics">z</em>-score for <em data-effect="italics">y</em> = 162.85 is <em data-effect="italics">z</em> = –1.5. <span data-type="newline"><br /> </span>Both <em data-effect="italics">x</em> = 160.58 and <em data-effect="italics">y</em> = 162.85 deviate the same number of standard deviations from their respective means and in the same direction.</p> </div> </div> </div> <div id="fs-idp90060128" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p id="eip-881">In 2012, 1,664,479 students took the SAT exam. The distribution of scores in the verbal section of the SAT had a mean <em data-effect="italics">µ</em> = 496 and a standard deviation <em data-effect="italics">σ</em> = 114. Let <em data-effect="italics">X</em> = a SAT exam verbal section score in 2012. Then <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(496, 114).</p> <p id="eip-idm6224096">Find the <em data-effect="italics">z</em>-scores for <em data-effect="italics">x</em><sub>1</sub> = 325 and <em data-effect="italics">x</em><sub>2</sub> = 366.21. Interpret each <em data-effect="italics">z</em>-score. 
What can you say about <em data-effect="italics">x</em><sub>1</sub> = 325 and <em data-effect="italics">x</em><sub>2</sub> = 366.21 as they compare to their respective means and standard deviations?</p> </div> </div> </div> <div id="eip-980" class="textbox textbox--examples" data-type="example"><p id="eip-692">Suppose <em data-effect="italics">x</em> has a normal distribution with mean 50 and standard deviation 6.</p> <ul id="eip-id1168769509491"><li>About 68% of the <em data-effect="italics">x</em> values lie within one standard deviation of the mean. Therefore, about 68% of the <em data-effect="italics">x</em> values lie between –1<em data-effect="italics">σ</em> = (–1)(6) = –6 and 1<em data-effect="italics">σ</em> = (1)(6) = 6 of the mean 50. The values 50 – 6 = 44 and 50 + 6 = 56 are within one standard deviation from the mean 50. The <em data-effect="italics">z</em>-scores are –1 and +1 for 44 and 56, respectively.</li> <li>About 95% of the <em data-effect="italics">x</em> values lie within two standard deviations of the mean. Therefore, about 95% of the <em data-effect="italics">x</em> values lie between –2<em data-effect="italics">σ</em> = (–2)(6) = –12 and 2<em data-effect="italics">σ</em> = (2)(6) = 12. The values 50 – 12 = 38 and 50 + 12 = 62 are within two standard deviations from the mean 50. The <em data-effect="italics">z</em>-scores are –2 and +2 for 38 and 62, respectively.</li> <li>About 99.7% of the <em data-effect="italics">x</em> values lie within three standard deviations of the mean. Therefore, about 99.7% of the <em data-effect="italics">x</em> values lie between –3<em data-effect="italics">σ</em> = (–3)(6) = –18 and 3<em data-effect="italics">σ</em> = (3)(6) = 18 from the mean 50. The values 50 – 18 = 32 and 50 + 18 = 68 are within three standard deviations of the mean 50. 
The <em data-effect="italics">z</em>-scores are –3 and +3 for 32 and 68, respectively.</li> </ul> </div> <div id="fs-idp63750176" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p>Suppose <em data-effect="italics">X</em> has a normal distribution with mean 25 and standard deviation five. Between what values of <em data-effect="italics">x</em> do 68% of the values lie?</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div data-type="problem"><p>From 1984 to 1985, the mean height of 15 to 18-year-old males from Chile was 172.36 cm, and the standard deviation was 6.34 cm. Let <em data-effect="italics">Y</em> = the height of 15 to 18-year-old males in 1984 to 1985. Then <em data-effect="italics">Y</em> ~ <em data-effect="italics">N</em>(172.36, 6.34).</p> <ol id="eip-idp122625264" type="a"><li>About 68% of the <em data-effect="italics">y</em> values lie between what two values? These values are ________________. The <em data-effect="italics">z</em>-scores are ________________, respectively.</li> <li>About 95% of the <em data-effect="italics">y</em> values lie between what two values? These values are ________________. The <em data-effect="italics">z</em>-scores are ________________ respectively.</li> <li>About 99.7% of the <em data-effect="italics">y</em> values lie between what two values? These values are ________________. The <em data-effect="italics">z</em>-scores are ________________, respectively.</li> </ol> </div> <div id="eip-214" data-type="solution"><ol id="fs-idp84281088" type="a"><li>About 68% of the values lie between 166.02 cm and 178.7 cm. The <em data-effect="italics">z</em>-scores are –1 and 1.</li> <li>About 95% of the values lie between 159.68 cm and 185.04 cm. 
The <em data-effect="italics">z</em>-scores are –2 and 2.</li> <li>About 99.7% of the values lie between 153.34 cm and 191.38 cm. The <em data-effect="italics">z</em>-scores are –3 and 3.</li> </ol> </div> </div> </div> <div id="fs-idp139717168" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p>The scores on a college entrance exam have an approximate normal distribution with mean, <em data-effect="italics">µ</em> = 52 points and a standard deviation, <em data-effect="italics">σ</em> = 11 points.</p> <ol id="eip-idp121002672" type="a"><li>About 68% of the <em data-effect="italics">y</em> values lie between what two values? These values are ________________. The <em data-effect="italics">z</em>-scores are ________________, respectively.</li> <li>About 95% of the <em data-effect="italics">y</em> values lie between what two values? These values are ________________. The <em data-effect="italics">z</em>-scores are ________________, respectively.</li> <li>About 99.7% of the <em data-effect="italics">y</em> values lie between what two values? These values are ________________. The <em data-effect="italics">z</em>-scores are ________________, respectively.</li> </ol> </div> </div> </div> </div> <div id="fs-idm49020816" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idm49032560">“Blood Pressure of Males and Females.” StatCrunch, 2013. Available online at http://www.statcrunch.com/5.0/viewreport.php?reportid=11960 (accessed May 14, 2013).</p> <p id="fs-idm16230832">“The Use of Epidemiological Tools in Conflict-affected populations: Open-access educational resources for policy-makers: Calculation of z-scores.” London School of Hygiene and Tropical Medicine, 2009. 
Available online at http://conflict.lshtm.ac.uk/page_125.htm (accessed May 14, 2013).</p> <p id="fs-idp3576896">“2012 College-Bound Seniors Total Group Profile Report.” CollegeBoard, 2012. Available online at http://media.collegeboard.com/digitalServices/pdf/research/TotalGroup-2012.pdf (accessed May 14, 2013).</p> <p id="fs-idm63215776">“Digest of Education Statistics: ACT score average and standard deviations by sex and race/ethnicity and percentage of ACT test takers, by selected composite score ranges and planned fields of study: Selected years, 1995 through 2009.” National Center for Education Statistics. Available online at http://nces.ed.gov/programs/digest/d09/tables/dt09_147.asp (accessed May 14, 2013).</p> <p id="fs-idm36569680">Data from the <em data-effect="italics">San Jose Mercury News</em>.</p> <p id="fs-idp110586896">Data from <em data-effect="italics">The World Almanac and Book of Facts</em>.</p> <p id="fs-idm16015264">“List of stadiums by capacity.” Wikipedia. Available online at https://en.wikipedia.org/wiki/List_of_stadiums_by_capacity (accessed May 14, 2013).</p> <p id="fs-idp17697408">Data from the National Basketball Association. Available online at www.nba.com (accessed May 14, 2013).</p> </div> <div id="fs-idp33285824" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p>A <em data-effect="italics">z</em>-score is a standardized value. Its distribution is the standard normal, <em data-effect="italics">Z</em> ~ <em data-effect="italics">N</em>(0, 1). The mean of the <em data-effect="italics">z</em>-scores is zero and the standard deviation is one. 
If <em data-effect="italics">z</em> is the <em data-effect="italics">z</em>-score for a value <em data-effect="italics">x</em> from the normal distribution <em data-effect="italics">N</em>(<em data-effect="italics">µ</em>, <em data-effect="italics">σ</em>) then <em data-effect="italics">z</em> tells you how many standard deviations <em data-effect="italics">x</em> is above (greater than) or below (less than) <em data-effect="italics">µ</em>.</p> </div> <div id="eip-676" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p><em data-effect="italics">z</em> = a standardized value (<em data-effect="italics">z</em>-score)</p> <p>mean = 0; standard deviation = 1</p> <p id="fs-idp101194144">To find the <em data-effect="italics">k</em><sup>th</sup> percentile of <em data-effect="italics">X</em> when the <em data-effect="italics">z</em>-score is known:<span data-type="newline"><br /> </span><em data-effect="italics">k</em> = <em data-effect="italics">μ</em> + (<em data-effect="italics">z</em>)<em data-effect="italics">σ</em></p> <p id="fs-idp50699312"><em data-effect="italics">z</em>-score: <em data-effect="italics">z</em> = \(\frac{x - \mu}{\sigma}\)</p> <p><em data-effect="italics">Z</em> = the random variable for <em data-effect="italics">z</em>-scores</p> </div> <div id="fs-idm124208336" class="practice" data-depth="1"><div id="fs-idp27962224" data-type="exercise"><div id="fs-idm130770816" data-type="problem"><p id="fs-idp48843184">A bottle of water contains 12.05 fluid ounces with a standard deviation of 0.01 ounces. Define the random variable <em data-effect="italics">X</em> in words. 
<em data-effect="italics">X</em> = ____________.</p> </div> <div id="fs-idm80802544" data-type="solution"><p id="fs-idm18940704">ounces of water in a bottle</p> </div> </div> <div id="fs-idm114970736" data-type="exercise"><div id="fs-idm51787952" data-type="problem"><p id="fs-idm25786560">A normal distribution has a mean of 61 and a standard deviation of 15. What is the median?</p> </div> <div data-type="solution"><p>61</p> </div> </div> <div id="fs-idm121963344" data-type="exercise"><div id="fs-idp40425504" data-type="problem"><p id="fs-idp20871600"><em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(1, 2)</p> <p id="fs-idm38764432"><em data-effect="italics">σ</em> = _______</p> </div> <div id="fs-idp11170064" data-type="solution"><p id="fs-idp2346224">2</p> </div> </div> <div id="fs-idm142191456" data-type="exercise"><div id="fs-idm64835792" data-type="problem"><p id="fs-idm75859328">A company manufactures rubber balls. The mean diameter of a ball is 12 cm with a standard deviation of 0.2 cm. Define the random variable <em data-effect="italics">X</em> in words. 
<em data-effect="italics">X</em> = ______________.</p> </div> <div data-type="solution"><p>diameter of a rubber ball</p> </div> </div> <div id="fs-idm97876640" data-type="exercise"><div id="fs-idm79412944" data-type="problem"><p id="fs-idm109096880"><em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(–4, 1)</p> <p id="fs-idm13751808">What is the median?</p> </div> <div id="fs-idp14850608" data-type="solution"><p id="fs-idp197152">–4</p> </div> </div> <div id="fs-idm102541136" data-type="exercise"><div id="fs-idp43742064" data-type="problem"><p id="fs-idm72490192"><em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(3, 5)</p> <p id="fs-idm13818352"><em data-effect="italics">σ</em> = _______</p> </div> <div data-type="solution"><p>5</p> </div> </div> <div id="fs-idm19058720" data-type="exercise"><div id="fs-idm48567008" data-type="problem"><p id="fs-idm75169840"><em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(–2, 1)</p> <p id="fs-idm55245024"><em data-effect="italics">μ</em> = _______</p> </div> <div id="fs-idm113945328" data-type="solution"><p id="fs-idm49269392">–2</p> </div> </div> <div id="fs-idm122119648" data-type="exercise"><div id="fs-idp17099760" data-type="problem"><p id="fs-idm57102944">What does a <em data-effect="italics">z</em>-score measure?</p> </div> <div data-type="solution"><p>The number of standard deviations a value is from the mean.</p> </div> </div> <div id="fs-idm63043376" data-type="exercise"><div id="fs-idm153107680" data-type="problem"><p id="fs-idp37549840">What does standardizing a normal distribution do to the mean?</p> </div> <div id="fs-idm61255584" data-type="solution"><p id="fs-idm44615152">The mean becomes zero.</p> </div> </div> <div id="fs-idm126130944" data-type="exercise"><div id="fs-idp37871552" data-type="problem"><p id="fs-idm121847760">Is <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(0, 1) a standardized normal distribution? 
Why or why not?</p> </div> <div data-type="solution"><p>Yes because the mean is zero, and the standard deviation is one.</p> </div> </div> <div id="fs-idm26153296" data-type="exercise"><div id="fs-idp18633664" data-type="problem"><p id="fs-idm97831088">What is the <em data-effect="italics">z</em>-score of <em data-effect="italics">x</em> = 12, if it is two standard deviations to the right of the mean?</p> </div> <div id="fs-idp23039248" data-type="solution"><p id="fs-idm139415456"><em data-effect="italics">z</em> = 2</p> </div> </div> <div id="fs-idm114925552" data-type="exercise"><div id="fs-idm74515008" data-type="problem"><p id="fs-idm58847696">What is the <em data-effect="italics">z</em>-score of <em data-effect="italics">x</em> = 9, if it is 1.5 standard deviations to the left of the mean?</p> </div> <div data-type="solution"><p><em data-effect="italics">z</em> = –1.5</p> </div> </div> <div id="fs-idm64977712" data-type="exercise"><div id="fs-idm20742528" data-type="problem"><p id="fs-idm75597504">What is the <em data-effect="italics">z</em>-score of <em data-effect="italics">x</em> = –2, if it is 2.78 standard deviations to the right of the mean?</p> </div> <div id="fs-idm27536176" data-type="solution"><p id="fs-idm70797520"><em data-effect="italics">z</em> = 2.78</p> </div> </div> <div id="fs-idm140331968" data-type="exercise"><div id="fs-idp9376864" data-type="problem"><p id="fs-idm55422576">What is the <em data-effect="italics">z</em>-score of <em data-effect="italics">x</em> = 7, if it is 0.133 standard deviations to the left of the mean?</p> </div> <div data-type="solution"><p><em data-effect="italics">z</em> = –0.133</p> </div> </div> <div id="fs-idm61157328" data-type="exercise"><div id="fs-idm48171808" data-type="problem"><p id="fs-idp28078544">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(2, 6). 
What value of <em data-effect="italics">x</em> has a <em data-effect="italics">z</em>-score of three?</p> </div> <div id="fs-idm131800576" data-type="solution"><p id="fs-idp1737648"><em data-effect="italics">x</em> = 20</p> </div> </div> <div id="fs-idm18781776" data-type="exercise"><div id="fs-idm63469536" data-type="problem"><p id="fs-idm28953536">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(8, 1). What value of <em data-effect="italics">x</em> has a <em data-effect="italics">z</em>-score of –2.25?</p> </div> <div data-type="solution"><p><em data-effect="italics">x</em> = 5.75</p> </div> </div> <div id="fs-idp23562800" data-type="exercise"><div id="fs-idm71924736" data-type="problem"><p id="fs-idm63493504">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(9, 5). What value of <em data-effect="italics">x</em> has a <em data-effect="italics">z</em>-score of –0.5?</p> </div> <div id="fs-idm133043200" data-type="solution"><p id="fs-idm138188336"><em data-effect="italics">x</em> = 6.5</p> </div> </div> <div id="fs-idm69709296" data-type="exercise"><div id="fs-idm125097440" data-type="problem"><p id="fs-idm114104400">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(2, 3). What value of <em data-effect="italics">x</em> has a <em data-effect="italics">z</em>-score of –0.67?</p> </div> <div data-type="solution"><p><em data-effect="italics">x</em> = –0.01</p> </div> </div> <div id="fs-idm137634928" data-type="exercise"><div id="fs-idm51281184" data-type="problem"><p id="fs-idm58278624">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(4, 2). 
What value of <em data-effect="italics">x</em> is 1.5 standard deviations to the left of the mean?</p> </div> <div id="fs-idm54876272" data-type="solution"><p id="fs-idm98011888"><em data-effect="italics">x</em> = 1</p> </div> </div> <div id="fs-idm63053168" data-type="exercise"><div id="fs-idm99176336" data-type="problem"><p id="fs-idm54724480">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(4, 2). What value of <em data-effect="italics">x</em> is two standard deviations to the right of the mean?</p> </div> <div data-type="solution"><p><em data-effect="italics">x</em> = 8</p> </div> </div> <div id="fs-idm62507744" data-type="exercise"><div id="fs-idm214036384" data-type="problem"><p id="fs-idm132300800">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(8, 9). What value of <em data-effect="italics">x</em> is 0.67 standard deviations to the left of the mean?</p> </div> <div id="fs-idm113788672" data-type="solution"><p id="fs-idm47929600"><em data-effect="italics">x</em> = 1.97</p> </div> </div> <div id="fs-idm78772448" data-type="exercise"><div id="fs-idm35324176" data-type="problem"><p id="fs-idm116824768">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(–1, 2). What is the <em data-effect="italics">z</em>-score of <em data-effect="italics">x</em> = 2?</p> </div> <div data-type="solution"><p><em data-effect="italics">z</em> = 1.5</p> </div> </div> <div id="fs-idp21093808" data-type="exercise"><div id="fs-idm52438864" data-type="problem"><p id="fs-idp11134448">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(12, 6). 
What is the <em data-effect="italics">z</em>-score of <em data-effect="italics">x</em> = 2?</p> </div> <div id="fs-idm77621104" data-type="solution"><p id="fs-idp37206672"><em data-effect="italics">z</em> = –1.67</p> </div> </div> <div id="fs-idp10811840" data-type="exercise"><div id="fs-idp48078624" data-type="problem"><p id="fs-idp44090864">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(9, 3). 
What is the <em data-effect="italics">z</em>-score of <em data-effect="italics">x</em> = 9?</p> </div> <div data-type="solution"><p><em data-effect="italics">z</em> = 0</p> </div> </div> <div id="fs-idm74086800" data-type="exercise"><div id="fs-idm50021040" data-type="problem"><p id="fs-idm119619728">Suppose a normal distribution has a mean of six and a standard deviation of 1.5. What is the <em data-effect="italics">z</em>-score of <em data-effect="italics">x</em> = 5.5?</p> </div> <div id="fs-idm81174560" data-type="solution"><p id="fs-idm98629616"><em data-effect="italics">z</em> ≈ –0.33</p> </div> </div> <div id="fs-idm61163760" data-type="exercise"><div id="fs-idm65054032" data-type="problem"><p id="fs-idm3500320">In a normal distribution, <em data-effect="italics">x</em> = 5 and <em data-effect="italics">z</em> = –1.25. This tells you that <em data-effect="italics">x</em> = 5 is ____ standard deviations to the ____ (right or left) of the mean.</p> </div> <div data-type="solution"><p>1.25, left</p> </div> </div> <div id="fs-idp37768720" data-type="exercise"><div id="fs-idp7198048" data-type="problem"><p id="fs-idp27185904">In a normal distribution, <em data-effect="italics">x</em> = 3 and <em data-effect="italics">z</em> = 0.67. This tells you that <em data-effect="italics">x</em> = 3 is ____ standard deviations to the ____ (right or left) of the mean.</p> </div> <div id="fs-idm124110672" data-type="solution"><p id="fs-idm63010720">0.67, right</p> </div> </div> <div id="fs-idm38070992" data-type="exercise"><div id="fs-idp8300720" data-type="problem"><p id="fs-idm19010400">In a normal distribution, <em data-effect="italics">x</em> = –2 and <em data-effect="italics">z</em> = 6. 
This tells you that <em data-effect="italics">x</em> = –2 is ____ standard deviations to the ____ (right or left) of the mean.</p> </div> <div data-type="solution"><p>six, right</p> </div> </div> <div id="fs-idm122820544" data-type="exercise"><div id="fs-idm113546720" data-type="problem"><p id="fs-idp45897744">In a normal distribution, <em data-effect="italics">x</em> = –5 and <em data-effect="italics">z</em> = –3.14. This tells you that <em data-effect="italics">x</em> = –5 is ____ standard deviations to the ____ (right or left) of the mean.</p> </div> <div id="fs-idm111356256" data-type="solution"><p id="fs-idm50323712">3.14, left</p> </div> </div> <div id="fs-idm49315712" data-type="exercise"><div id="fs-idm115754224" data-type="problem"><p id="fs-idm131763888">In a normal distribution, <em data-effect="italics">x</em> = 6 and <em data-effect="italics">z</em> = –1.7. This tells you that <em data-effect="italics">x</em> = 6 is ____ standard deviations to the ____ (right or left) of the mean.</p> </div> <div data-type="solution"><p>1.7, left</p> </div> </div> <div id="fs-idm144883616" data-type="exercise"><div id="fs-idm58188480" data-type="problem"><p id="fs-idm102634880">About what percent of <em data-effect="italics">x</em> values from a normal distribution lie within one standard deviation (left and right) of the mean of that distribution?</p> </div> <div id="fs-idm48244416" data-type="solution"><p id="fs-idm26112272">about 68%</p> </div> </div> <div id="fs-idm80518368" data-type="exercise"><div id="fs-idp39726992" data-type="problem"><p id="fs-idm123701152">About what percent of the <em data-effect="italics">x</em> values from a normal distribution lie within two standard deviations (left and right) of the mean of that distribution?</p> </div> <div data-type="solution"><p>about 95.45%</p> </div> </div> <div id="fs-idm85847520" data-type="exercise"><div id="fs-idp9454880" data-type="problem"><p id="fs-idm13742624">About what percent of <em data-effect="italics">x</em> values lie between the 
second and third standard deviations (both sides)?</p> </div> <div id="fs-idm100576432" data-type="solution"><p id="fs-idp48492224">about 4%</p> </div> </div> <div id="fs-idm24842144" data-type="exercise"><div id="fs-idm76034736" data-type="problem"><p id="fs-idp9455792">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(15, 3). Between what <em data-effect="italics">x</em> values does 68.27% of the data lie? The range of <em data-effect="italics">x</em> values is centered at the mean of the distribution (i.e., 15).</p> </div> <div data-type="solution"><p>between 12 and 18</p> </div> </div> <div id="fs-idm53172656" data-type="exercise"><div id="fs-idp42925920" data-type="problem"><p id="fs-idm48779424">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(–3, 1). Between what <em data-effect="italics">x</em> values does 95.45% of the data lie? The range of <em data-effect="italics">x</em> values is centered at the mean of the distribution (i.e., –3).</p> </div> <div id="fs-idm77746640" data-type="solution"><p id="fs-idp40669232">between –5 and –1</p> </div> </div> <div id="fs-idm113806912" data-type="exercise"><div id="fs-idm56959552" data-type="problem"><p id="fs-idm42868016">Suppose <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(–3, 1). 
Between what <em data-effect="italics">x</em> values does 34.14% of the data lie?</p> </div> <div data-type="solution"><p>between –4 and –3 or between –3 and –2</p> </div> </div> <div id="fs-idp81343328" data-type="exercise"><div id="fs-idp81343584" data-type="problem"><p id="fs-idp81343712">About what percent of <em data-effect="italics">x</em> values lie between the mean and three standard deviations?</p> </div> <div id="fs-idp105035008" data-type="solution"><p id="fs-idp105035264">about 50%</p> </div> </div> <div id="fs-idp84402208" data-type="exercise"><div id="fs-idp84402464" data-type="problem"><p id="fs-idp3556480">About what percent of <em data-effect="italics">x</em> values lie between the mean and one standard deviation?</p> </div> <div data-type="solution"><p>about 34.14%</p> </div> </div> <div id="fs-idp76784880" data-type="exercise"><div id="fs-idp60510304" data-type="problem"><p id="fs-idp60510560">About what percent of <em data-effect="italics">x</em> values lie between the first and second standard deviations from the mean (both sides)?</p> </div> <div id="fs-idp124684976" data-type="solution"><p id="fs-idp124685232">about 27%</p> </div> </div> <div id="fs-idp72804864" data-type="exercise"><div id="fs-idp72805120" data-type="problem"><p id="fs-idp193808192">About what percent of <em data-effect="italics">x</em> values lie between the first and third standard deviations (both sides)?</p> </div> <div data-type="solution"><p>about 31.46%</p> </div> </div> <p id="fs-idp107376208"><em data-effect="italics">Use the following information to answer the next two exercises:</em> The life of Sunshine CD players is normally distributed with mean of 4.1 years and a standard deviation of 1.3 years. A CD player is guaranteed for three years. 
We are interested in the length of time a CD player lasts.</p> <div id="fs-idm143086848" data-type="exercise"><div id="fs-idm63716512" data-type="problem"><p id="fs-idm74092160">Define the random variable <em data-effect="italics">X</em> in words. 
<em data-effect="italics">X</em> = _______________.</p> </div> <div id="fs-idp101379808" data-type="solution"><p id="fs-idp101380064">The lifetime of a Sunshine CD player measured in years.</p> </div> </div> <div id="fs-idm119601152" data-type="exercise"><div id="fs-idp13620608" data-type="problem"><p id="fs-idm125057376"><em data-effect="italics">X</em> ~ _____(_____,_____)</p> </div> <div data-type="solution"><p><em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(4.1, 1.3)</p> </div> </div> </div> <div id="fs-idm58670224" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <p id="fs-idp104712416"><em data-effect="italics">Use the following information to answer the next two exercises:</em> The patient recovery time from a particular surgical procedure is normally distributed with a mean of 5.3 days and a standard deviation of 2.1 days.</p> <div id="fs-idp12375792" data-type="exercise"><div id="fs-idm51776288" data-type="problem"><p id="fs-idm49796352">1) What is the median recovery time?</p> <ol id="fs-idp45853600" type="a"><li>2.7</li> <li>5.3</li> <li>7.4</li> <li>2.1</li> </ol> </div> <p>&nbsp;</p> </div> <div id="fs-idp45221936" data-type="exercise"><div id="fs-idp45222064" data-type="problem"><p id="fs-idm104057168">2) What is the <em data-effect="italics">z</em>-score for a patient who takes ten days to recover?</p> <ol id="fs-idm58845760" type="a"><li>1.5</li> <li>0.2</li> <li>2.2</li> <li>7.3</li> </ol> </div> <div id="fs-idm114044832" data-type="solution"><p id="fs-idm74915568"></p></div> </div> <div id="fs-idm62385744" data-type="exercise"><div id="fs-idm56430320" data-type="problem"><p>3) The length of time it takes to find a parking space at 9 A.M. follows a normal distribution with a mean of five minutes and a standard deviation of two minutes. 
If the mean is significantly greater than the standard deviation, which of the following statements is true?</p> <ol id="fs-idm68354864" type="I"><li>The data cannot follow the uniform distribution.</li> <li>The data cannot follow the exponential distribution.</li> <li>The data cannot follow the normal distribution.</li> </ol> <ol id="fs-idm18019792" type="a"><li>I only</li> <li>II only</li> <li>III only</li> <li>I, II, and III</li> </ol> </div> <p>&nbsp;</p> </div> <div id="eip-866" data-type="exercise"><div data-type="problem"><p id="eip-idm112969632">4) The heights of the 430 National Basketball Association players were listed on team rosters at the start of the 2005–2006 season. The heights of basketball players have an approximate normal distribution with mean, <em data-effect="italics">µ</em> = 79 inches and a standard deviation, <em data-effect="italics">σ</em> = 3.89 inches. For each of the following heights, calculate the <em data-effect="italics">z</em>-score and interpret it using complete sentences.</p> <ol id="eip-idm156998368" type="a"><li>77 inches</li> <li>85 inches</li> <li>If an NBA player reported his height had a <em data-effect="italics">z</em>-score of 3.5, would you believe him? Explain your answer.</li> </ol> <p>&nbsp;</p> </div> <div id="eip-413" data-type="solution"></div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-51">5) The systolic blood pressure (given in millimeters) of males has an approximately normal distribution with mean <em data-effect="italics">µ</em> = 125 and standard deviation <em data-effect="italics">σ</em> = 14. 
Systolic blood pressure for males follows a normal distribution.</p> <ol type="a"><li>Calculate the <em data-effect="italics">z</em>-scores for the male systolic blood pressures 100 and 150 millimeters.</li> <li>If a male friend of yours said he thought his systolic blood pressure was 2.5 standard deviations below the mean, but that he believed his blood pressure was between 100 and 150 millimeters, what would you say to him?</li> </ol> </div> <p>&nbsp;</p> </div> <div id="eip-808" data-type="exercise"><div data-type="problem"><p id="eip-idp38421088">6) Kyle’s doctor told him that the <em data-effect="italics">z</em>-score for his systolic blood pressure is 1.75. Which of the following is the best interpretation of this standardized score? The systolic blood pressure (given in millimeters) of males has an approximately normal distribution with mean <em data-effect="italics">µ</em> = 125 and standard deviation <em data-effect="italics">σ</em> = 14. If <em data-effect="italics">X</em> = a systolic blood pressure score, then <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(125, 14).</p> <ol id="eip-idm125011184" type="a"><li>Which answer(s) <strong>is/are</strong> correct? <ol id="eip-idp32729008" type="i"><li>Kyle’s systolic blood pressure is 175.</li> <li>Kyle’s systolic blood pressure is 1.75 times the average blood pressure of men his age.</li> <li>Kyle’s systolic blood pressure is 1.75 above the average systolic blood pressure of men his age.</li> <li>Kyle’s systolic blood pressure is 1.75 standard deviations above the average systolic blood pressure for men.</li> </ol> </li> <li>Calculate Kyle’s blood pressure.</li> </ol> </div> <div id="eip-387" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="eip-459" data-type="exercise"><div id="eip-325" data-type="problem"><p id="eip-371">7) Height and weight are two measurements used to track a child’s development. 
The World Health Organization measures child development by comparing the weights of children who are the same height and the same gender. In 2009, weights for all 80 cm girls in the reference population had a mean <em data-effect="italics">µ</em> = 10.2 kg and standard deviation <em data-effect="italics">σ</em> = 0.8 kg. Weights are normally distributed. <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(10.2, 0.8). Calculate the <em data-effect="italics">z</em>-scores that correspond to the following weights and interpret them.</p> <ol id="eip-idp79884128" type="a"><li>11 kg</li> <li>7.9 kg</li> <li>12.2 kg</li> </ol> </div> <p>&nbsp;</p> </div> <div id="eip-802" data-type="exercise"><div id="eip-574" data-type="problem"><p id="eip-766">8) In 2005, 1,475,623 students heading to college took the SAT. The distribution of scores in the math section of the SAT follows a normal distribution with mean <em data-effect="italics">µ</em> = 520 and standard deviation <em data-effect="italics">σ</em> = 115.</p> <ol id="eip-idp101733776" type="a"><li>Calculate the <em data-effect="italics">z</em>-score for an SAT score of 720. Interpret it using a complete sentence.</li> <li>What math SAT score is 1.5 standard deviations above the mean? What can you say about this SAT score?</li> <li>For 2012, the SAT math test had a mean of 514 and standard deviation 117. The ACT math test is an alternate to the SAT and is approximately normally distributed with mean 21 and standard deviation 5.3. If one person took the SAT math test and scored 700 and a second person took the ACT math test and scored 30, who did better with respect to the test they took?</li> </ol> </div> <div id="eip-849" data-type="solution"><ol id="fs-idp68725248" type="a"></ol> <p><strong>Answers to odd questions</strong></p> <p>1) b</p> <p>3) b</p> <p>5) Use the z-score formula. 
(100 – 125) / 14 ≈ –1.8 and (150 – 125) / 14 ≈ 1.8. I would tell him that 2.5 standard deviations below the mean would give him a blood pressure reading of 90, which is below the range of 100 to 150.</p> <p>7) a) (11 – 10.2) / 0.8 = 1. A child who weighs 11 kg is one standard deviation above the mean of 10.2 kg.<br /> b) (7.9 – 10.2) / 0.8 = –2.875. A child who weighs 7.9 kg is 2.875 standard deviations below the mean of 10.2 kg.<br /> c) (12.2 – 10.2) / 0.8 = 2.5. A child who weighs 12.2 kg is 2.5 standard deviations above the mean of 10.2 kg.</p> </div> </div> <p>&nbsp;</p> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="nrmdist"><dt>Standard Normal Distribution</dt> <dd id="id42925156">a continuous random variable (RV) <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(0, 1); when <em data-effect="italics">X</em> follows the standard normal distribution, it is often noted as <em data-effect="italics">Z</em> ~ <em data-effect="italics">N</em>(0, 1).</dd> </dl> <dl id="zscore"><dt>z-score</dt> <dd id="id3154393">the linear transformation of the form <em data-effect="italics">z</em> = \(\frac{x - \mu}{\sigma}\); if this transformation is applied to any normal distribution <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(<em data-effect="italics">μ</em>, <em data-effect="italics">σ</em>), the result is the standard normal distribution <em data-effect="italics">Z</em> ~ <em data-effect="italics">N</em>(0, 1). If this transformation is applied to any specific value <em data-effect="italics">x</em> of the RV with mean <em data-effect="italics">μ</em> and standard deviation <em data-effect="italics">σ</em>, the result is called the <em data-effect="italics">z</em>-score of <em data-effect="italics">x</em>. The <em data-effect="italics">z</em>-score allows us to compare data that are normally distributed but scaled differently.</dd> </dl> </div> </div></div>
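<div class="textbox shaded"><p>The z-score transformation defined in the glossary above can also be checked with a few lines of code. The following is a minimal sketch, not part of the original text: the function name <code>z_score</code> is hypothetical, and the numbers are the mean, standard deviation, and weights from homework exercise 7.</p>

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the mean mu."""
    return (x - mu) / sigma


# Values from homework exercise 7: X ~ N(10.2, 0.8), weights in kg.
# (z_score is an illustrative helper, not a calculator or library function.)
mu, sigma = 10.2, 0.8
for weight in (11, 7.9, 12.2):
    # Round to three decimals to hide floating-point noise.
    print(weight, "kg has z-score", round(z_score(weight, mu, sigma), 3))
```

<p>A positive result means the value lies above the mean and a negative result means it lies below, which matches the interpretations given in the answers to exercise 7.</p></div>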
<div class="chapter standard" id="chapter-using-the-normal-distribution" title="Chapter 7.3: Using the Normal Distribution"><div class="chapter-title-wrap"><h3 class="chapter-number">43</h3><h2 class="chapter-title"><span class="display-none">Chapter 7.3: Using the Normal Distribution</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="fs-idm68176912" class="finger">The shaded area in the following graph indicates the area to the left of <em data-effect="italics">x</em>. This area is represented by the probability <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">x</em>). Normal tables, computers, and calculators provide or calculate the probability <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">x</em>).</p> <div id="fs-idm109348848" class="bc-figure figure"><div class="wp-caption alignnone" style="width: 487px"><img class="size-medium" src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch_06_04_01-1.jpg" alt="Shaded area for P(X &lt; x)" width="487" height="221" /><div class="wp-caption-text">This diagram shows a bell-shaped curve with uppercase X at the extreme right end of the X axis. The X axis also contains a lowercase x about one-quarter of the way across the X axis from the right. The area under the bell curve to the left of the lowercase x is shaded. The label states: shaded area represents probability P(X &lt; x)</div></div> </div> <p id="fs-idm95835008">The area to the right is then <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &gt; <em data-effect="italics">x</em>) = 1 – <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">x</em>). 
Remember, <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">x</em>) = <span data-type="term">Area to the left</span> of the vertical line through <em data-effect="italics">x</em>. <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &gt; <em data-effect="italics">x</em>) = 1 – <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">x</em>) = <span data-type="term">Area to the right</span> of the vertical line through <em data-effect="italics">x</em>. <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">x</em>) is the same as <em data-effect="italics">P</em>(<em data-effect="italics">X</em> ≤ <em data-effect="italics">x</em>) and <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &gt; <em data-effect="italics">x</em>) is the same as <em data-effect="italics">P</em>(<em data-effect="italics">X</em> ≥ <em data-effect="italics">x</em>) for continuous distributions.</p> <div id="fs-idm123993536" class="finger" data-depth="1"><h3 data-type="title">Calculations of Probabilities</h3> <p class="finger">Probabilities are calculated using technology. Instructions are given as necessary for the TI-83+ and TI-84 calculators.</p> <div id="id1168986691838" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idp12614192">To calculate the probability without the use of technology, use the probability tables provided in <a class="autogenerated-content" href="/contents/dcec5517-4e51-42f5-9b11-8cb11c02d8ae">(Figure)</a>. 
The tables include instructions for how to use them.</p> </div> <div class="textbox textbox--examples" data-type="example"><p id="element-437">If the area to the left is 0.0228, then the area to the right is 1 – 0.0228 = 0.9772.</p> </div> <div id="fs-idm96786656" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm159222080" data-type="exercise"><div id="fs-idm32590832" data-type="problem"><p id="fs-idm114482720">If the area to the left of <em data-effect="italics">x</em> is 0.012, then what is the area to the right?</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p>The final exam scores in a statistics class were normally distributed with a mean of 63 and a standard deviation of five.</p> <p>&nbsp;</p> <div data-type="exercise"><div id="id1168986723014" data-type="problem"><p id="element-415">a. Find the probability that a randomly selected student scored more than 65 on the exam.</p> </div> <div id="id1168986723026" data-type="solution"><p>a. Let <em data-effect="italics">X</em> = a score on the final exam. 
<em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(63, 5), where <em data-effect="italics">μ</em> = 63 and <em data-effect="italics">σ</em> = 5.</p> <p id="element-449">Draw a graph.</p> <p id="element-389">Then, find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 65).</p> <p id="element-294"><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 65) = 0.3446</p> <div id="fs-idm5159232" class="bc-figure figure"><span id="id1168988266495" data-type="media"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch_06_05_01-1.jpg" data-media-type="image/png" alt="P(x &gt; 65) = 0.3446" width="380" /></span></div> <p>The probability that any student selected at random scores more than 65 is 0.3446.</p> <div id="fs-idm55987936" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="element-736">Go into <code>2nd DISTR</code>. <span data-type="newline"><br /> </span>After pressing <code>2nd DISTR</code>, press <code>2:normalcdf</code>.</p> <p>The syntax for the instruction is as follows:</p> <p>normalcdf(lower value, upper value, mean, standard deviation). For this problem: normalcdf(65,1E99,63,5) = 0.3446. You get 1E99 (= 10<sup>99</sup>) by pressing <code>1</code>, the <code>EE</code> key (a 2nd key) and then <code>99</code>. Or, you can enter <code>10^99</code> instead. The number 10<sup>99</sup> is way out in the right tail of the normal curve. We are calculating the area between 65 and 10<sup>99</sup>. In some instances, the lower number of the area might be –1E99 (= –10<sup>99</sup>). 
The number –10<sup>99</sup> is way out in the left tail of the normal curve.</p> </div> <div id="id1168986705026" data-type="note" data-has-label="true" data-label=""><div data-type="title">Historical Note</div> <p id="fs-idm194168736">The TI probability program calculates a <em data-effect="italics">z</em>-score and then the probability from the <em data-effect="italics">z</em>-score. Before technology, the <em data-effect="italics">z</em>-score was looked up in a standard normal probability table (because the math involved is too cumbersome) to find the probability. In this example, a standard normal table with area to the left of the <em data-effect="italics">z</em>-score was used. You calculate the <em data-effect="italics">z</em>-score and look up the area to the left. The probability is the area to the right.</p> </div> <p id="element-772"><em data-effect="italics">z</em> = \(\frac{65 - 63}{5}\) = 0.4</p> <p id="fs-idm33523184">Area to the left is 0.6554.</p> <p id="fs-idm125027840"><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 65) = <em data-effect="italics">P</em>(<em data-effect="italics">z</em> &gt; 0.4) = 1 – 0.6554 = 0.3446</p> </div> </div> <div id="fs-idm98690992" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm197852304">Find the probability that a student scores more than 65:</p> <p id="fs-idm20771216">*Press <code>2nd Distr</code> <span data-type="newline"><br /> </span>*Press <code>2:normalcdf</code>( <span data-type="newline"><br /> </span>*Enter lower bound, upper bound, mean, standard deviation followed by ) <span data-type="newline"><br /> </span>*Press <code>ENTER</code>. 
<span data-type="newline"><br /> </span>For this Example, the steps are <span data-type="newline"><br /> </span><code>2nd Distr</code> <span data-type="newline"><br /> </span><code>2:normalcdf</code>(65,1,2nd EE,99,63,5) <code>ENTER</code> <span data-type="newline"><br /> </span>The probability that a selected student scored more than 65 is 0.3446.</p> </div> <div id="element-243" data-type="exercise"><div id="id1168986705169" data-type="problem"><p>b. Find the probability that a randomly selected student scored less than 85.</p> </div> <div id="id1168986705184" data-type="solution"><p id="element-181">b. Draw a graph.</p> <p>Then find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 85), and shade the graph.</p> <p>Using a computer or calculator, find <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 85) = 1.</p> <p>normalcdf(0,85,63,5) = 1 (rounds to one)</p> <p>The probability that one student scores less than 85 is approximately one (or 100%).</p> <p>&nbsp;</p> </div> </div> <div id="element-585" data-type="exercise"><div id="id1168986688411" data-type="problem"><p id="element-922">c. Find the 90<sup>th</sup> percentile (that is, find the score <em data-effect="italics">k</em> that has 90% of the scores below <em data-effect="italics">k</em> and 10% of the scores above <em data-effect="italics">k</em>).</p> </div> <div id="id1168986688427" data-type="solution"><p>c. Find the 90<sup>th</sup> percentile. For each problem or part of a problem, draw a new graph. Draw the <em data-effect="italics">x</em>-axis. Shade the area that corresponds to the 90<sup>th</sup> percentile.</p> <p><strong>Let <em data-effect="italics">k</em> = the 90<sup>th</sup> percentile.</strong> The variable <em data-effect="italics">k</em> is located on the <em data-effect="italics">x</em>-axis. 
<em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; <em data-effect="italics">k</em>) is the area to the left of <em data-effect="italics">k</em>. The 90<sup>th</sup> percentile <em data-effect="italics">k</em> separates the exam scores into those that are the same or lower than <em data-effect="italics">k</em> and those that are the same or higher. Ninety percent of the test scores are the same or lower than <em data-effect="italics">k</em>, and ten percent are the same or higher. The variable <em data-effect="italics">k</em> is often called a <span data-type="term">critical value</span>.</p> <p id="element-58"><em data-effect="italics">k</em> = 69.4</p> <div id="fs-idm38798848" class="bc-figure figure"><div class="wp-caption alignnone" style="width: 487px"><img class="size-medium" src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch06_05_02-1.jpg" alt="Shaded area for P(x &lt; k) = 0.90" width="487" height="250" /><div class="wp-caption-text">This is a normal distribution curve. The peak of the curve coincides with the point 63 on the horizontal axis. A point, k, is labeled to the right of 63. A vertical line extends from k to the curve. The area under the curve to the left of k is shaded. This represents the probability that x is less than k: P(x &lt; k) = 0.90</div></div> </div> <p>The 90<sup>th</sup> percentile is 69.4. This means that 90% of the test scores fall at or below 69.4 and 10% fall at or above. To get this answer on the calculator, follow this step:</p> </div> </div> <div id="fs-idm89013760" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm78070144"><code>invNorm</code> in <code>2nd DISTR</code>. invNorm(area to the left, mean, standard deviation) <span data-type="newline"><br /> </span>For this problem, invNorm(0.90,63,5) = 69.4</p> </div> <div id="element-322" data-type="exercise"><div id="id1168986735971" data-type="problem"><p>d. 
Find the 70<sup>th</sup> percentile (that is, find the score <em data-effect="italics">k</em> such that 70% of scores are below <em data-effect="italics">k</em> and 30% of the scores are above <em data-effect="italics">k</em>).</p> </div> <div id="id1168986735987" data-type="solution"><p>d. Find the 70<sup>th</sup> percentile.</p> <p id="element-339">Draw a new graph and label it appropriately. <em data-effect="italics">k</em> = 65.6</p> <p>The 70<sup>th</sup> percentile is 65.6. This means that 70% of the test scores fall at or below 65.6 and 30% fall at or above.</p> <p>invNorm(0.70,63,5) = 65.6</p> </div> </div> </div> <div id="fs-idm83435856" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm160100784" data-type="exercise"><div id="fs-idm31461824" data-type="problem"><p id="fs-idm14759824">The golf scores for a school team were normally distributed with a mean of 68 and a standard deviation of three.</p> <p id="fs-idm103631648">Find the probability that a randomly selected golfer scored less than 65.</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p>A personal computer is used for office work at home, research, communication, personal finances, education, entertainment, social networking, and a myriad of other things. Suppose that the average number of hours a household personal computer is used for entertainment is two hours per day. Assume the times for entertainment are normally distributed and the standard deviation for the times is half an hour.</p> <p>&nbsp;</p> <div id="element-496" data-type="exercise"><div id="id1168986736072" data-type="problem"><p>a. Find the probability that a household personal computer is used for entertainment between 1.8 and 2.75 hours per day.</p> </div> <div id="id1168986736088" data-type="solution"><p>a. 
Let <em data-effect="italics">X</em> = the amount of time (in hours) a household personal computer is used for entertainment. <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(2, 0.5) where <em data-effect="italics">μ</em> = 2 and <em data-effect="italics">σ</em> = 0.5.</p> <p id="element-923">Find <em data-effect="italics">P</em>(1.8 &lt; <em data-effect="italics">x</em> &lt; 2.75).</p> <p id="element-618">The probability for which you are looking is the area <strong>between</strong> <em data-effect="italics">x</em> = 1.8 and <em data-effect="italics">x</em> = 2.75. <em data-effect="italics">P</em>(1.8 &lt; <em data-effect="italics">x</em> &lt; 2.75) = 0.5886</p> <div id="fs-idm13004032" class="bc-figure figure"><span id="id1168986736293" data-type="media" data-alt="This is a normal distribution curve. The peak of the curve coincides with the point 2 on the horizontal axis. The values 1.8 and 2.75 are also labeled on the x-axis. Vertical lines extend from 1.8 and 2.75 to the curve. The area between the lines is shaded."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch06_05_03-1.jpg" alt="This is a normal distribution curve. The peak of the curve coincides with the point 2 on the horizontal axis. The values 1.8 and 2.75 are also labeled on the x-axis. Vertical lines extend from 1.8 and 2.75 to the curve. The area between the lines is shaded." width="450" data-media-type="image/png" /></span></div> <p>normalcdf(1.8,2.75,2,0.5) = 0.5886</p> <p id="element-292">The probability that a household personal computer is used between 1.8 and 2.75 hours per day for entertainment is 0.5886.</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id1168986736344" data-type="problem"><p>b. Find the maximum number of hours per day that the bottom quartile of households uses a personal computer for entertainment.</p> </div> <div id="id1168986736360" data-type="solution"><p>b. 
To find the maximum number of hours per day that the bottom quartile of households uses a personal computer for entertainment, <strong>find the 25<sup>th</sup> percentile,</strong> <em data-effect="italics">k</em>, where <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; <em data-effect="italics">k</em>) = 0.25.</p> <div id="fs-idm66393568" class="bc-figure figure"><span id="id1168986736428" data-type="media" data-alt="This is a normal distribution curve. The area under the left tail of the curve is shaded. The shaded area shows that the probability that x is less than k is 0.25. It follows that k = 1.66."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch06_05_04N-1.jpg" alt="This is a normal distribution curve. The area under the left tail of the curve is shaded. The shaded area shows that the probability that x is less than k is 0.25. It follows that k = 1.66." width="420" data-media-type="image/jpg" /></span></div> <p>invNorm(0.25,2,0.5) = 1.66</p> <p>The maximum number of hours per day that the bottom quartile of households uses a personal computer for entertainment is 1.66 hours.</p> </div> </div> </div> <div id="fs-idm101393328" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm193708112" data-type="exercise"><div id="fs-idm168992208" data-type="problem"><p id="fs-idm120157248">The golf scores for a school team were normally distributed with a mean of 68 and a standard deviation of three. 
Find the probability that a golfer scored between 66 and 70.</p> </div> </div> </div> <div id="eip-562" class="textbox textbox--examples" data-type="example"><p id="eip-idm115363904">In the United States the ages 13 to 55+ of smartphone users approximately follow a normal distribution with approximate mean and standard deviation of 36.9 years and 13.9 years, respectively.</p> <p>&nbsp;</p> <div id="eip-idm99631296" data-type="exercise"><div id="eip-idm93440352" data-type="problem"><p id="fs-idm218603968">a. Determine the probability that a random smartphone user in the age range 13 to 55+ is between 23 and 64.7 years old.</p> </div> <div id="eip-idm151027840" data-type="solution"><p id="eip-idm169219344">a. normalcdf(23,64.7,36.9,13.9) = 0.8186</p> <p>&nbsp;</p> </div> </div> <div id="eip-idm35763936" data-type="exercise"><div id="eip-idm100019648" data-type="problem"><p id="fs-idm1234448">b. Determine the probability that a randomly selected smartphone user in the age range 13 to 55+ is at most 50.8 years old.</p> </div> <div id="eip-idm92673856" data-type="solution"><p id="eip-idm21130928">b. normalcdf(–10<sup>99</sup>,50.8,36.9,13.9) = 0.8413</p> <p>&nbsp;</p> </div> </div> <div id="eip-idm12998192" data-type="exercise"><div id="eip-idm107995648" data-type="problem"><p id="fs-idm159232112">c. 
Find the 80<sup>th</sup> percentile of this distribution, and interpret it in a complete sentence.</p> </div> <div id="eip-idm20250224" data-type="solution"><p id="fs-idm194380112">c.</p> <ul id="fs-idm90633168" data-labeled-item="true"><li>invNorm(0.80,36.9,13.9) = 48.6</li> <li>The 80<sup>th</sup> percentile is 48.6 years.</li> <li>80% of the smartphone users in the age range 13 to 55+ are 48.6 years old or younger.</li> </ul> </div> </div> </div> <div id="fs-idm126739248" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <p id="eip-idm1584896">Use the information in <a class="autogenerated-content" href="#eip-562">(Figure)</a> to answer the following questions.</p> <div id="eip-idm1584512" data-type="exercise"><div id="eip-idm103606800" data-type="problem"><ol id="eip-idm103606544" type="a"><li>Find the 30<sup>th</sup> percentile, and interpret it in a complete sentence.</li> <li>What is the probability that a randomly selected smartphone user in the age range 13 to 55+ is less than 27 years old?</li> </ol> </div> </div> </div> <div id="fs-idm50441184" class="textbox textbox--examples" data-type="example"><p id="fs-idm59646256">In the United States the ages 13 to 55+ of smartphone users approximately follow a normal distribution with approximate mean and standard deviation of 36.9 years and 13.9 years, respectively. Using this information, answer the following questions (round answers to one decimal place).</p> <p>&nbsp;</p> <div id="fs-idm100093408" data-type="exercise"><div id="fs-idm43971024" data-type="problem"><p id="fs-idm56871120">a. 
Calculate the interquartile range (<em data-effect="italics">IQR</em>).</p> </div> <div id="fs-idm66621456" data-type="solution"><p id="fs-idm197260048">a.</p> <ul id="fs-idm183339056" data-labeled-item="true"><li><em data-effect="italics">IQR</em> = <em data-effect="italics">Q</em><sub>3</sub> – <em data-effect="italics">Q</em><sub>1</sub></li> <li>Calculate <em data-effect="italics">Q</em><sub>3</sub> = 75<sup>th</sup> percentile and <em data-effect="italics">Q</em><sub>1</sub> = 25<sup>th</sup> percentile.</li> <li>invNorm(0.75,36.9,13.9) = <em data-effect="italics">Q</em><sub>3</sub> = 46.2754</li> <li>invNorm(0.25,36.9,13.9) = <em data-effect="italics">Q</em><sub>1</sub> = 27.5246</li> <li><em data-effect="italics">IQR</em> = <em data-effect="italics">Q</em><sub>3</sub> – <em data-effect="italics">Q</em><sub>1</sub> = 18.8</li> </ul> <p>&nbsp;</p> <p>&nbsp;</p> </div> </div> <div id="eip-133" data-type="exercise"><div id="eip-940" data-type="problem"><p id="eip-270">b. Forty percent of the ages that range from 13 to 55+ are at least what age?</p> </div> <div id="eip-304" data-type="solution"><p id="fs-idm196458064">b.</p> <ul id="fs-idm125660800" data-labeled-item="true"><li>Find <em data-effect="italics">k</em> where <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≥ <em data-effect="italics">k</em>) = 0.40 (&#8220;At least&#8221; translates to &#8220;greater than or equal to.&#8221;)</li> <li>0.40 = the area to the right.</li> <li>Area to the left = 1 – 0.40 = 0.60.</li> <li>The area to the left of <em data-effect="italics">k</em> = 0.60.</li> <li>invNorm(0.60,36.9,13.9) = 40.4215.</li> <li><em data-effect="italics">k</em> = 40.4.</li> <li>Forty percent of the ages that range from 13 to 55+ are at least 40.4 years.</li> </ul> </div> </div> </div> <div id="fs-idm81547104" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm64657232" data-type="exercise"><div 
id="fs-idm78477504" data-type="problem"><p id="fs-idm49674336">Two thousand students took an exam. The scores on the exam have an approximate normal distribution with a mean <em data-effect="italics">μ</em> = 81 points and standard deviation <em data-effect="italics">σ</em> = 15 points.</p> <ol id="fs-idm139136192" type="a"><li>Calculate the first- and third-quartile scores for this exam.</li> <li>The middle 50% of the exam scores are between what two values?</li> </ol> </div> </div> </div> <div id="eip-662" class="textbox textbox--examples" data-type="example"><p>A citrus farmer who grows mandarin oranges finds that the diameters of mandarin oranges harvested on his farm follow a normal distribution with a mean diameter of 5.85 cm and a standard deviation of 0.24 cm.</p> <p>&nbsp;</p> <div id="eip-704" data-type="exercise"><div id="eip-586" data-type="problem"><p id="eip-idm49628176">a. Find the probability that a randomly selected mandarin orange from this farm has a diameter larger than 6.0 cm. Sketch the graph.</p> </div> <div id="eip-744" data-type="solution"><p id="eip-809">a. normalcdf(6,10^99,5.85,0.24) = 0.2660</p> <div id="eip-idp150275440" class="bc-figure figure"><span id="eip-idp150275696" data-type="media" data-alt="Normal distribution curve for the mandarin orange diameters. The area under the curve to the right of 6.0 is shaded."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C06_M03_003a-1.jpg" alt="Normal distribution curve for the mandarin orange diameters. The area under the curve to the right of 6.0 is shaded." 
width="420" data-media-type="jpg/png" /></span></div> </div> </div> <div data-type="exercise"><div data-type="problem"><p>b. The middle 20% of mandarin oranges from this farm have diameters between ______ and ______.</p> </div> <div data-type="solution"><p id="fs-idm172823344">b.</p> <ul id="fs-idm131986640" data-labeled-item="true"><li>1 – 0.20 = 0.80</li> <li>The tails of the graph of the normal distribution each have an area of 0.40.</li> <li>Find <em data-effect="italics">k</em><sub>1</sub>, the 40<sup>th</sup> percentile, and <em data-effect="italics">k</em><sub>2</sub>, the 60<sup>th</sup> percentile (0.40 + 0.20 = 0.60).</li> <li><em data-effect="italics">k</em><sub>1</sub> = invNorm(0.40,5.85,0.24) = 5.79 cm</li> <li><em data-effect="italics">k</em><sub>2</sub> = invNorm(0.60,5.85,0.24) = 5.91 cm</li> </ul> <p>&nbsp;</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="eip-91" data-type="problem"><p id="eip-733">c. Find the 90<sup>th</sup> percentile for the diameters of mandarin oranges, and interpret it in a complete sentence.</p> </div> <div id="eip-0" data-type="solution"><p>c. invNorm(0.90,5.85,0.24) = 6.16. Ninety percent of the mandarin oranges have a diameter of at most 6.16 cm.</p> </div> </div> </div> <div id="fs-idm127937776" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-786" data-type="exercise"><div id="eip-686" data-type="problem"><p>Using the information from <a class="autogenerated-content" href="#eip-662">(Figure)</a>, answer the following:</p> <ol id="eip-idm69666000" type="a"><li>The middle 40% of mandarin oranges from this farm are between ______ and ______.</li> <li>Find the 16<sup>th</sup> percentile and interpret it in a complete sentence.</li> </ol> </div> </div> </div> </div> <div id="fs-idm101453488" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idm202298288">“Naegele’s rule.” Wikipedia. 
Available online at http://en.wikipedia.org/wiki/Naegele&#8217;s_rule (accessed May 14, 2013).</p> <p id="fs-idm133443328">“403: NUMMI.” Chicago Public Media &amp; Ira Glass, 2013. Available online at http://www.thisamericanlife.org/radio-archives/episode/403/nummi (accessed May 14, 2013).</p> <p id="fs-idm77712176">“Scratch-Off Lottery Ticket Playing Tips.” WinAtTheLottery.com, 2013. Available online at http://www.winatthelottery.com/public/department40.cfm (accessed May 14, 2013).</p> <p id="fs-idm127381280">“Smart Phone Users, By The Numbers.” Visual.ly, 2013. Available online at http://visual.ly/smart-phone-users-numbers (accessed May 14, 2013).</p> <p id="fs-idm165219840">“Facebook Statistics.” Statistics Brain. Available online at http://www.statisticbrain.com/facebook-statistics/ (accessed May 14, 2013).</p> </div> <div id="eip-424" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="eip-960">The normal distribution, which is continuous, is the most important of all the probability distributions. Its graph is bell-shaped. This bell-shaped curve is used in almost all disciplines. Since it is a continuous distribution, the total area under the curve is one. The parameters of the normal are the mean <em data-effect="italics">µ</em> and the standard deviation <em data-effect="italics">σ</em>. A special normal distribution, called the standard normal distribution, is the distribution of <em data-effect="italics">z</em>-scores. 
Its mean is zero, and its standard deviation is one.</p> </div> <div id="eip-108" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p>Normal Distribution: <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(<em data-effect="italics">µ</em>, <em data-effect="italics">σ</em>) where <em data-effect="italics">µ</em> is the mean and <em data-effect="italics">σ</em> is the standard deviation.</p> <p id="eip-idm32653056">Standard Normal Distribution: <em data-effect="italics">Z</em> ~ <em data-effect="italics">N</em>(0, 1).</p> <p id="eip-30">Calculator function for probability: normalcdf (lower <em data-effect="italics">x</em> value of the area, upper <em data-effect="italics">x</em> value of the area, mean, standard deviation)</p> <p id="fs-idm55516128">Calculator function for the <em data-effect="italics">k</em><sup>th</sup> percentile: <em data-effect="italics">k</em> = invNorm (area to the left of <em data-effect="italics">k</em>, mean, standard deviation)</p> </div> <div id="eip-527" class="practice" data-depth="1"><div id="fs-idp83652864" data-type="exercise"><div id="fs-idp3823216" data-type="problem"><p id="fs-idp81012528">How would you represent the area to the left of one in a probability statement?</p> <div id="eip-idp74682384" class="bc-figure figure"><span id="fs-idp31643648" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C06_M04_item001-1.jpg" alt="" data-media-type="image/jpg" data-print-width="3in" /></span></div> </div> <div id="fs-idp124208464" data-type="solution"><p id="fs-idp142221328"><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 1)</p> </div> </div> <div id="fs-idm18365728" data-type="exercise"><div id="fs-idp30641696" data-type="problem"><p id="fs-idp68913872">What is the area to the right of one?</p> <div id="eip-idm26567824" class="bc-figure figure"><span id="fs-idp76787840" 
data-type="media" data-alt="" data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C06_M04_item001-1.jpg" alt="" data-media-type="image/jpg" data-print-width="3in" /></span></div> </div> <div data-type="solution"><p>1 – <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 1) or <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 1)</p> </div> </div> <div id="fs-idp76730048" data-type="exercise"><div id="fs-idp98636864" data-type="problem"><p id="fs-idm16173600">Is <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 1) equal to <em data-effect="italics">P</em>(<em data-effect="italics">x</em> ≤ 1)? Why?</p> </div> <div id="fs-idp26550912" data-type="solution"><p id="fs-idp101468880">Yes, because they are the same in a continuous distribution: <em data-effect="italics">P</em>(<em data-effect="italics">x</em> = 1) = 0</p> </div> </div> <div id="fs-idp16475344" data-type="exercise"><div id="fs-idm13017216" data-type="problem"><p id="fs-idp83647168">How would you represent the area to the left of three in a probability statement?</p> <div id="eip-idm45345808" class="bc-figure figure"><span id="fs-idp128136336" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C06_M04_item002-1.jpg" alt="" data-media-type="image/jpg" data-print-width="3in" /></span></div> </div> <div data-type="solution"><p><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; 3)</p> </div> </div> <div id="fs-idp103778720" data-type="exercise"><div id="fs-idp147171280" data-type="problem"><p id="fs-idp41528880">What is the area to the right of three?</p> <div id="eip-idp19359872" class="bc-figure figure"><span id="fs-idp2060576" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C06_M04_item002-1.jpg" alt="" data-media-type="image/jpg" data-print-width="3in" /></span></div> </div> <div id="fs-idp5045104" data-type="solution"><p id="fs-idp110734480">1 – <em data-effect="italics">P</em>(<em 
data-effect="italics">x</em> &lt; 3) or <em data-effect="italics">P</em>(<em data-effect="italics">x</em> &gt; 3)</p> </div> </div> <div id="fs-idp880816" data-type="exercise"><div id="fs-idp154035488" data-type="problem"><p id="fs-idm221600">If the area to the left of <em data-effect="italics">x</em> in a normal distribution is 0.123, what is the area to the right of <em data-effect="italics">x</em>?</p> </div> <div data-type="solution"><p>1 – 0.123 = 0.877</p> </div> </div> <div id="fs-idm33682096" data-type="exercise"><div id="fs-idp53588304" data-type="problem"><p id="fs-idp142371360">If the area to the right of <em data-effect="italics">x</em> in a normal distribution is 0.543, what is the area to the left of <em data-effect="italics">x</em>?</p> </div> <div id="fs-idp148923888" data-type="solution"><p id="fs-idp70427440">1 – 0.543 = 0.457</p> </div> </div> <p id="eip-410"><em data-effect="italics">Use the following information to answer the next four exercises:</em></p> <p><em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(54, 8)</p> <div id="fs-idp83610000" data-type="exercise"><div id="fs-idp149250384" data-type="problem"><p id="fs-idp151137280">Find the probability that <em data-effect="italics">x</em> &gt; 56.</p> </div> <div data-type="solution"><p>0.4013</p> </div> </div> <div id="fs-idp105059184" data-type="exercise"><div id="fs-idm2106864" data-type="problem"><p id="fs-idp88634736">Find the probability that <em data-effect="italics">x</em> &lt; 30.</p> </div> <div id="fs-idp52756064" data-type="solution"><p id="fs-idp81762128">0.0013</p> </div> </div> <div id="fs-idp153118944" data-type="exercise"><div id="fs-idp69518640" data-type="problem"><p id="fs-idm17533616">Find the 80<sup>th</sup> percentile.</p> </div> <div data-type="solution"><p>60.73</p> </div> </div> <div id="fs-idp49095264" data-type="exercise"><div id="fs-idp140530032" data-type="problem"><p id="fs-idp130860384">Find the 60<sup>th</sup> percentile.</p> </div> <div id="fs-idp150532688" 
data-type="solution"><p id="fs-idp96792864">56.03</p> </div> </div> <div id="fs-idp65259648" data-type="exercise"><div id="fs-idp122477920" data-type="problem"><p id="fs-idp107088272"><em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(6, 2)</p> <p id="fs-idp45580384">Find the probability that <em data-effect="italics">x</em> is between three and nine.</p> </div> <div data-type="solution"><p>0.8664</p> </div> </div> <div id="fs-idp152542352" data-type="exercise"><div id="fs-idp77204752" data-type="problem"><p id="fs-idp25172544"><em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(–3, 4)</p> <p id="fs-idp101237440">Find the probability that <em data-effect="italics">x</em> is between one and four.</p> </div> <div id="fs-idm11629600" data-type="solution"><p id="fs-idp146932464">0.1186</p> </div> </div> <div id="fs-idp153143568" data-type="exercise"><div id="fs-idp121093952" data-type="problem"><p id="fs-idp152391232"><em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(4, 5)</p> <p id="fs-idm90444112">Find the maximum of <em data-effect="italics">x</em> in the bottom quartile.</p> </div> <div data-type="solution"><p>0.6276</p> </div> </div> <div data-type="exercise"><div id="id44370927" data-type="problem"><p id="element-390"><em data-effect="italics">Use the following information to answer the next three exercises:</em> The life of Sunshine CD players is normally distributed with a mean of 4.1 years and a standard deviation of 1.3 years. A CD player is guaranteed for three years. We are interested in the length of time a CD player lasts. Find the probability that a CD player will break down during the guarantee period.</p> <ol type="a"><li>Sketch the situation. Label and scale the axes. Shade the region corresponding to the probability. 
<div id="fig1" class="bc-figure figure"><span id="id43849343" data-type="media" data-alt="Empty normal distribution curve."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch06_07_01-1.jpg" alt="Empty normal distribution curve." data-media-type="image/jpg" data-print-width="3in" /></span></div> </li> <li><em data-effect="italics">P</em>(0 &lt; <em data-effect="italics">x</em> &lt; ____________) = ___________ (Use zero for the minimum value of <em data-effect="italics">x</em>.)</li> </ol> </div> <div id="id44028319" data-type="solution"><ol type="a" data-mark-suffix="."><li>Check student’s solution.</li> <li>3, 0.1979</li> </ol> </div> </div> <div id="element-893" data-type="exercise"><div id="id44542589" data-type="problem"><p>Find the probability that a CD player will last between 2.8 and six years.</p> <ol id="element-654" type="a" data-mark-suffix="."><li>Sketch the situation. Label and scale the axes. Shade the region corresponding to the probability. <div id="fig-231" class="bc-figure figure"><span id="id44444856" data-type="media" data-alt="Empty normal distribution curve."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch06_07_02-1.jpg" alt="Empty normal distribution curve." data-media-type="image/jpg" data-print-width="3in" /></span></div> </li> <li><em data-effect="italics">P</em>(__________ &lt; <em data-effect="italics">x</em> &lt; __________) = __________</li> </ol> </div> <div data-type="solution"><ol type="a" data-mark-suffix="."><li>Check student’s solution.</li> <li>2.8, 6, 0.7694</li> </ol> </div> </div> <div data-type="exercise"><div id="id44024075" data-type="problem"><p>Find the 70<sup>th</sup> percentile of the distribution for the time a CD player lasts.</p> <ol type="a" data-mark-suffix="."><li>Sketch the situation. Label and scale the axes. Shade the region corresponding to the lower 70%. 
<div id="fig-241552" class="bc-figure figure"><span id="id44403356" data-type="media" data-alt="Empty normal distribution curve."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch06_07_03-1.jpg" alt="Empty normal distribution curve." data-media-type="image/jpg" data-print-width="3in" /></span></div> </li> <li><em data-effect="italics">P</em>(<em data-effect="italics">x</em> &lt; <em data-effect="italics">k</em>) = __________ Therefore, <em data-effect="italics">k</em> = _________</li> </ol> </div> <div id="id44002181" data-type="solution"><ol id="element-494" type="a" data-mark-suffix="."><li>Check student’s solution.</li> <li>0.70, 4.78 years</li> </ol> </div> </div> </div> <div class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <p id="eip-idp18302448"><em data-effect="italics">Use the following information to answer the next two exercises:</em> The patient recovery time from a particular surgical procedure is normally distributed with a mean of 5.3 days and a standard deviation of 2.1 days.</p> <div data-type="exercise"><div id="id3435957" data-type="problem"><p>1) What is the probability of spending more than two days in recovery?</p> <ol id="yep" type="a"><li>0.0580</li> <li>0.8447</li> <li>0.0553</li> <li>0.9420</li> </ol> </div> <p>&nbsp;</p> </div> <div data-type="exercise"><div id="id17871945" data-type="problem"><p>2) The 90<sup>th</sup> percentile for recovery times is?</p> <ol id="yop" type="a"><li>8.89</li> <li>7.07</li> <li>7.99</li> <li>4.32</li> </ol> </div> <div id="id10286094" data-type="solution"><p id="element-553"></p></div> </div> <p><em data-effect="italics">Use the following information to answer the next three exercises:</em> The length of time it takes to find a parking space at 9 A.M. 
follows a normal distribution with a mean of five minutes and a standard deviation of two minutes.</p> <div id="eip-486" data-type="exercise"><div id="id20620841" data-type="problem"><p>3) Based upon the given information and numerically justified, would you be surprised if it took less than one minute to find a parking space?</p> <ol id="yap" type="a" data-mark-suffix="."><li>Yes</li> <li>No</li> <li>Unable to determine</li> </ol> </div> <p>&nbsp;</p> </div> <div data-type="exercise"><div id="id4498978" data-type="problem"><p id="element-560">4) Find the probability that it takes at least eight minutes to find a parking space.</p> <ol id="element-298s" type="a"><li>0.0001</li> <li>0.9270</li> <li>0.1862</li> <li>0.0668</li> </ol> </div> <div id="id9734192" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="eip-295" data-type="exercise"><div id="id25388640" data-type="problem"><p>5) Seventy percent of the time, it takes more than how many minutes to find a parking space?</p> <ol type="a"><li>1.24</li> <li>2.41</li> <li>3.95</li> <li>6.05</li> </ol> </div> <p>&nbsp;</p> </div> <div data-type="exercise"><div id="id20593649" data-type="problem"><p>6) According to a study done by De Anza students, the height for Asian adult males is normally distributed with an average of 66 inches and a standard deviation of 2.5 inches. Suppose one Asian adult male is randomly chosen. Let <em data-effect="italics">X</em> = height of the individual.</p> <ol type="a"><li><em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>Find the probability that the person is between 65 and 69 inches. Include a sketch of the graph, and write a probability statement.</li> <li>Would you expect to meet many Asian adult males over 72 inches? 
Explain why or why not, and justify your answer numerically.</li> </ol> <p>&nbsp;</p> </div> <div id="fs-idp83361824" data-type="solution"></div> </div> <div data-type="exercise"><div id="id21466710" data-type="problem"><p>7) IQ is normally distributed with a mean of 100 and a standard deviation of 15. Suppose one individual is randomly chosen. Let <em data-effect="italics">X</em> = IQ of an individual.</p> <ol type="a"><li><em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>Find the probability that the person has an IQ greater than 120. Include a sketch of the graph, and write a probability statement.</li> <li>MENSA is an organization whose members have the top 2% of all IQs. Find the minimum IQ needed to qualify for the MENSA organization. Sketch the graph, and write the probability statement.</li> <li>The middle 50% of IQs fall between what two values? Sketch the graph and write the probability statement.</li> </ol> </div> <p>&nbsp;</p> </div> <div data-type="exercise"><div id="id25415343" data-type="problem"><p id="element-811">8) The percent of fat calories that a person in America consumes each day is normally distributed with a mean of about 36 and a standard deviation of 10. Suppose that one individual is randomly chosen. Let <em data-effect="italics">X</em> = percent of fat calories.</p> <ol type="a"><li><em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>Find the probability that the percent of fat calories a person consumes is more than 40. Graph the situation. Shade in the area to be determined.</li> <li>Find the maximum number for the lower quarter of percent of fat calories. 
Sketch the graph and write the probability statement.</li> </ol> </div> <div id="eip-idp2232784" data-type="solution"><ol id="element-993s" type="a"><li><em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(36, 10)</li> <li>The probability that a person consumes more than 40% of their calories as fat is 0.3446.</li> <li>Approximately 25% of people consume less than 29.26% of their calories as fat.</li> </ol> </div> </div> <div id="eip-967" data-type="exercise"><div id="id18000241" data-type="problem"><p>9) Suppose that the distance of fly balls hit to the outfield (in baseball) is normally distributed with a mean of 250 feet and a standard deviation of 50 feet.</p> <ol id="element-107" type="a"><li>If <em data-effect="italics">X</em> = distance in feet for a fly ball, then <em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>If one fly ball is randomly chosen from this distribution, what is the probability that this ball traveled fewer than 220 feet? Sketch the graph. Scale the horizontal axis <em data-effect="italics">X</em>. Shade the region corresponding to the probability. Find the probability.</li> <li>Find the 80<sup>th</sup> percentile of the distribution of fly balls. Sketch the graph, and write the probability statement.</li> </ol> </div> <p>&nbsp;</p> </div> <div data-type="exercise"><div id="id16003321" data-type="problem"><p>10) In China, four-year-olds average three hours a day unsupervised. Most of the unsupervised children live in rural areas, considered safe. Suppose that the standard deviation is 1.5 hours and the amount of time spent alone is normally distributed. We randomly select one Chinese four-year-old living in a rural area. 
We are interested in the amount of time the child spends alone per day.</p> <ol id="element-265" type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li><em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>Find the probability that the child spends less than one hour per day unsupervised. Sketch the graph, and write the probability statement.</li> <li>What percent of the children spend over ten hours per day unsupervised?</li> <li>Seventy percent of the children spend at least how long per day unsupervised?</li> </ol> </div> <div id="fs-idp38795616" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="eip-698" data-type="exercise"><div id="id19993022" data-type="problem"><p>11) In the 1992 presidential election, Alaska’s 40 election districts averaged 1,956.8 votes per district for President Clinton. The standard deviation was 572.3. (There are only 40 election districts in Alaska.) The distribution of the votes per district for President Clinton was bell-shaped. Let <em data-effect="italics">X</em> = number of votes for President Clinton for an election district.</p> <ol type="a"><li>State the approximate distribution of <em data-effect="italics">X</em>.</li> <li>Is 1,956.8 a population mean or a sample mean? How do you know?</li> <li>Find the probability that a randomly selected district had fewer than 1,600 votes for President Clinton. 
Sketch the graph and write the probability statement.</li> <li>Find the probability that a randomly selected district had between 1,800 and 2,000 votes for President Clinton.</li> <li>Find the third quartile for votes for President Clinton.</li> </ol> </div> <p>&nbsp;</p> </div> <div id="eip-838" data-type="exercise"><div id="id18019640" data-type="problem"><p>12) Suppose that the duration of a particular type of criminal trial is known to be normally distributed with a mean of 21 days and a standard deviation of seven days.</p> <ol type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li><em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>If one of the trials is randomly chosen, find the probability that it lasted at least 24 days. Sketch the graph and write the probability statement.</li> <li>Sixty percent of all trials of this type are completed within how many days?</li> </ol> <p>&nbsp;</p> </div> <div id="id20218475" data-type="solution"></div> </div> <div data-type="exercise"><div id="id24511113" data-type="problem"><p>13) Terri Vogel, an amateur motorcycle racer, averages 129.71 seconds per 2.5 mile lap (in a seven-lap race) with a standard deviation of 2.28 seconds. The distribution of her race times is normally distributed. We are interested in one of her randomly selected laps.</p> <ol type="a"><li>In words, define the random variable <em data-effect="italics">X</em>.</li> <li><em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>Find the percent of her laps that are completed in less than 130 seconds.</li> <li>The fastest 3% of her laps are under _____.</li> <li>The middle 80% of her laps are from _______ seconds to _______ seconds.</li> </ol> </div> <p>&nbsp;</p> </div> <div data-type="exercise"><div id="id3458007" data-type="problem"><p>14) Thuy Dau, Ngoc Bui, Sam Su, and Lan Voung conducted a survey as to how long customers at Lucky claimed to wait in the checkout line until their turn. 
Let <em data-effect="italics">X</em> = time in line. <a class="autogenerated-content" href="#element-694">(Figure)</a> displays the ordered real data (in minutes):</p> <table summary="This table presents raw data in 50 cells."><tbody><tr><td>0.50</td> <td>4.25</td> <td>5</td> <td>6</td> <td>7.25</td> </tr> <tr><td>1.75</td> <td>4.25</td> <td>5.25</td> <td>6</td> <td>7.25</td> </tr> <tr><td>2</td> <td>4.25</td> <td>5.25</td> <td>6.25</td> <td>7.25</td> </tr> <tr><td>2.25</td> <td>4.25</td> <td>5.5</td> <td>6.25</td> <td>7.75</td> </tr> <tr><td>2.25</td> <td>4.5</td> <td>5.5</td> <td>6.5</td> <td>8</td> </tr> <tr><td>2.5</td> <td>4.75</td> <td>5.5</td> <td>6.5</td> <td>8.25</td> </tr> <tr><td>2.75</td> <td>4.75</td> <td>5.75</td> <td>6.5</td> <td>9.5</td> </tr> <tr><td>3.25</td> <td>4.75</td> <td>5.75</td> <td>6.75</td> <td>9.5</td> </tr> <tr><td>3.75</td> <td>5</td> <td>6</td> <td>6.75</td> <td>9.75</td> </tr> <tr><td>3.75</td> <td>5</td> <td>6</td> <td>6.75</td> <td>10.75</td> </tr> </tbody> </table> <ol type="a"><li>Calculate the sample mean and the sample standard deviation.</li> <li>Construct a histogram.</li> <li>Draw a smooth curve through the midpoints of the tops of the bars.</li> <li>In words, describe the shape of your histogram and smooth curve.</li> <li>Let the sample mean approximate <em data-effect="italics">μ</em> and the sample standard deviation approximate <em data-effect="italics">σ</em>. 
The distribution of <em data-effect="italics">X</em> can then be approximated by <em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>Use the distribution in part e to calculate the probability that a person will wait fewer than 6.1 minutes.</li> <li>Determine the cumulative relative frequency for waiting less than 6.1 minutes.</li> <li>Why aren’t the answers to part f and part g exactly the same?</li> <li>Why are the answers to part f and part g as close as they are?</li> <li>If only ten customers had been surveyed rather than 50, do you think the answers to part f and part g would have been closer together or farther apart? Explain your conclusion.</li> </ol> </div> <div id="id23136376" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id14371352" data-type="problem"><p>15) Suppose that Ricardo and Anita attend different colleges. Ricardo’s GPA is the same as the average GPA at his school. Anita’s GPA is 0.70 standard deviations above her school average. In complete sentences, explain why each of the following statements may be false.</p> <ol id="element-264" type="a"><li>Ricardo’s actual GPA is lower than Anita’s actual GPA.</li> <li>Ricardo is not passing because his <em data-effect="italics">z</em>-score is zero.</li> <li>Anita is in the 70<sup>th</sup> percentile of students at her college.</li> </ol> </div> <p>&nbsp;</p> </div> <div id="eip-67" data-type="exercise"><div id="id20765694" data-type="problem"><p>16) <a class="autogenerated-content" href="#element-666">(Figure)</a> shows a sample of the maximum capacity (maximum number of spectators) of sports stadiums. The table does not include horse-racing or motor-racing stadiums.</p> <table summary="The table is a sample of the capacity of 60 sports stadiums ordered by the maximum number of spectators. 
Horse racing and motor racing stadiums are not included."><tbody><tr><td>40,000</td> <td>40,000</td> <td>45,050</td> <td>45,500</td> <td>46,249</td> <td>48,134</td> </tr> <tr><td>49,133</td> <td>50,071</td> <td>50,096</td> <td>50,466</td> <td>50,832</td> <td>51,100</td> </tr> <tr><td>51,500</td> <td>51,900</td> <td>52,000</td> <td>52,132</td> <td>52,200</td> <td>52,530</td> </tr> <tr><td>52,692</td> <td>53,864</td> <td>54,000</td> <td>55,000</td> <td>55,000</td> <td>55,000</td> </tr> <tr><td>55,000</td> <td>55,000</td> <td>55,000</td> <td>55,082</td> <td>57,000</td> <td>58,008</td> </tr> <tr><td>59,680</td> <td>60,000</td> <td>60,000</td> <td>60,492</td> <td>60,580</td> <td>62,380</td> </tr> <tr><td>62,872</td> <td>64,035</td> <td>65,000</td> <td>65,050</td> <td>65,647</td> <td>66,000</td> </tr> <tr><td>66,161</td> <td>67,428</td> <td>68,349</td> <td>68,976</td> <td>69,372</td> <td>70,107</td> </tr> <tr><td>70,585</td> <td>71,594</td> <td>72,000</td> <td>72,922</td> <td>73,379</td> <td>74,500</td> </tr> <tr><td>75,025</td> <td>76,212</td> <td>78,000</td> <td>80,000</td> <td>80,000</td> <td>82,300</td> </tr> </tbody> </table> <ol id="element-422" type="a"><li>Calculate the sample mean and the sample standard deviation for the maximum capacity of sports stadiums (the data).</li> <li>Construct a histogram.</li> <li>Draw a smooth curve through the midpoints of the tops of the bars of the histogram.</li> <li>In words, describe the shape of your histogram and smooth curve.</li> <li>Let the sample mean approximate <em data-effect="italics">μ</em> and the sample standard deviation approximate <em data-effect="italics">σ</em>. 
The distribution of <em data-effect="italics">X</em> can then be approximated by <em data-effect="italics">X</em> ~ _____(_____,_____).</li> <li>Use the distribution in part e to calculate the probability that the maximum capacity of sports stadiums is less than 67,000 spectators.</li> <li>Determine the cumulative relative frequency that the maximum capacity of sports stadiums is less than 67,000 spectators. Hint: Order the data and count the sports stadiums that have a maximum capacity less than 67,000. Divide by the total number of sports stadiums in the sample.</li> <li>Why aren’t the answers to part f and part g exactly the same?</li> </ol> </div> <div id="id15324926" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="eip-406" data-type="exercise"><div id="eip-308" data-type="problem"><p>17) An expert witness for a paternity lawsuit testifies that the length of a pregnancy is normally distributed with a mean of 280 days and a standard deviation of 13 days. An alleged father was out of the country from 240 to 306 days before the birth of the child, so the pregnancy would have been less than 240 days or more than 306 days long if he was the father. The birth was uncomplicated, and the child needed no medical intervention. What is the probability that he was NOT the father? What is the probability that he could be the father? Calculate the <em data-effect="italics">z</em>-scores first, and then use those to calculate the probability.</p> </div> <p>&nbsp;</p> </div> <div id="eip-889" data-type="exercise"><div data-type="problem"><p id="eip-536">18) A NUMMI assembly line, which has been operating since 1984, has built an average of 6,000 cars and trucks a week. Generally, 10% of the cars were defective coming off the assembly line. Suppose we draw a random sample of <em data-effect="italics">n</em> = 100 cars. Let <em data-effect="italics">X</em> represent the number of defective cars in the sample. 
What can we say about <em data-effect="italics">X</em> in regard to the 68-95-99.7 empirical rule (one standard deviation, two standard deviations and three standard deviations from the mean are being referred to)? Assume a normal distribution for the defective cars in the sample.</p> </div> <div id="eip-906" data-type="solution"><ul id="eip-idp126470336" data-labeled-item="true"><li><em data-effect="italics">n</em> = 100; <em data-effect="italics">p</em> = 0.1; <em data-effect="italics">q</em> = 0.9</li> <li><em data-effect="italics">μ</em> = <em data-effect="italics">np</em> = (100)(0.10) = 10</li> <li><em data-effect="italics">σ</em> = \(\sqrt{npq}\) = \(\sqrt{(100)(0.1)(0.9)}\) = 3</li> </ul> <p>&nbsp;</p> <ol id="eip-idm100993808" type="i"></ol> </div> </div> <div id="eip-35" data-type="exercise"><div data-type="problem"><p id="eip-947">19) We flip a coin 100 times (<em data-effect="italics">n</em> = 100) and note that it only comes up heads 20% (<em data-effect="italics">p</em> = 0.20) of the time. The mean and standard deviation for the number of times the coin lands on heads is <em data-effect="italics">µ</em> = 20 and <em data-effect="italics">σ</em> = 4 (verify the mean and standard deviation). Solve the following:</p> <ol id="eip-idp64792544" type="a"><li>There is about a 68% chance that the number of heads will be somewhere between ___ and ___.</li> <li>There is about a ____ chance that the number of heads will be somewhere between 12 and 28.</li> <li>There is about a ____ chance that the number of heads will be somewhere between eight and 32.</li> </ol> </div> <p>&nbsp;</p> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-842">20) A \$1 scratch off lotto ticket will be a winner one out of five times. 
Out of a shipment of <em data-effect="italics">n</em> = 190 lotto tickets, find the probability that there are</p> <ol id="eip-idp139727617865536" type="a"><li>somewhere between 34 and 54 prizes.</li> <li>somewhere between 54 and 64 prizes.</li> <li>more than 64 prizes.</li> </ol> </div> <div data-type="solution"><ul id="xeip" data-labeled-item="true"><li><em data-effect="italics">n</em> = 190; <em data-effect="italics">p</em> = \(\frac{1}{5}\) = 0.2; <em data-effect="italics">q</em> = 0.8</li> <li><em data-effect="italics">μ</em> = <em data-effect="italics">np</em> = (190)(0.2) = 38</li> <li><em data-effect="italics">σ</em> = \(\sqrt{npq}\) = \(\sqrt{(190)(0.2)(0.8)}\) = 5.5136</li> </ul> <p>&nbsp;</p> </div> </div> <div id="eip-291" data-type="exercise"><div data-type="problem"><p>21) Facebook provides a variety of statistics on its Web site that detail the growth and popularity of the site.</p> <p id="eip-idp26662736">On average, 28 percent of 18 to 34 year olds check their Facebook profiles before getting out of bed in the morning. 
Suppose this percentage follows a normal distribution with a standard deviation of five percent.</p> <ol id="eip-idp160307472" type="a"><li>Find the probability that the percent of 18 to 34-year-olds who check Facebook before getting out of bed in the morning is at least 30.</li> <li>Find the 95<sup>th</sup> percentile, and express it in a sentence.</li> </ol> </div> <p>&nbsp;</p> <p><strong>Answers to odd questions</strong></p> <p>1) d</p> <p>3) a</p> <p>5) c</p> <p>7) a) X ~ N(100, 15)<br /> b) The probability that a person has an IQ greater than 120 is 0.0918.<br /> c) A person has to have an IQ over 130 to qualify for MENSA.<br /> d) The middle 50% of IQ scores falls between 89.95 and 110.05.</p> <p>9) a) X ~ N(250, 50)<br /> b) The probability that a fly ball travels less than 220 feet is 0.2743.<br /> c) Eighty percent of the fly balls will travel less than 292 feet.</p> <p>11) a) X ~ N(1956.8, 572.3)<br /> b) This is a population mean, because all election districts are included.<br /> c) The probability that a district had fewer than 1,600 votes for President Clinton is 0.2676.<br /> d) 0.1381<br /> e) Seventy-five percent of the districts had fewer than 2,340 votes for President Clinton.</p> <p>13) a) X = the time, in seconds, that Terri Vogel takes to complete one randomly selected lap<br /> b) X ~ N(129.71, 2.28)<br /> c) Terri completes 55.17% of her laps in less than 130 seconds.<br /> d) The fastest 3% of her laps are under 125.42 seconds.<br /> e) 126.79 and 132.63</p> <p>15) a) If the average GPA is less at Anita’s school than it is at Ricardo’s, then Ricardo’s actual score could be higher.<br /> b)  Passing can be defined differently at different schools. 
Also, since Ricardo’s z-score is 0, his GPA is actually the average for his school, which is typically a passing GPA.<br /> c)  Anita’s percentile is higher than the 70<sup>th</sup> percentile. A z-score of 0.70 is not a percentile; if GPAs at her college are normally distributed, it corresponds to about the 76<sup>th</sup> percentile.</p> <p>17) For x = 240, \(z = \frac{x-\mu}{\sigma} = \frac{240-280}{13} = -3.0769\). For x = 306, \(z = \frac{306-280}{13} = 2\).<br /> P(240 &lt; x &lt; 306) = P(–3.0769 &lt; z &lt; 2) = normalcdf(–3.0769,2,0,1) = 0.9762.<br /> According to the scenario given, this means that there is a 97.62% chance that he is not the father. To answer the second part of the question, there is a 1 – 0.9762 = 0.0238 = 2.38% chance that he is the father.</p> <p>19) a) There is about a 68% chance that the number of heads will be somewhere between 16 and 24. z = ±1: x1 = µ + zσ = 20 + 1(4) = 24 and x2 = µ – zσ = 20 – 1(4) = 16.<br /> b) There is about a 95% chance that the number of heads will be somewhere between 12 and 28. For this problem: normalcdf(12,28,20,4) = 0.9545 = 95.45%.<br /> c) There is about a 99.73% chance that the number of heads will be somewhere between eight and 32. For this problem: normalcdf(8,32,20,4) = 0.9973 = 99.73%.</p> <p>21) X = the percent of 18 to 34-year-olds who check Facebook before getting out of bed in the morning. X ~ N(28, 5)<br /> a) P(x ≥ 30) = normalcdf(30,1E99,28,5) = 0.3446<br /> b) invNorm(0.95,28,5) = 36.22. The 95<sup>th</sup> percentile is 36.22%: with probability 0.95, the percent of 18 to 34-year-olds who check Facebook before getting out of bed in the morning is at most 36.22%.</p> <p>&nbsp;</p> </div> </div> </div></div>
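<div><p>The calculator functions normalcdf and invNorm used throughout this chapter can also be checked on a computer. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library <code>statistics.NormalDist</code> class; the helper names <code>normalcdf</code> and <code>inv_norm</code> are our own stand-ins that mimic the calculator syntax and are not part of any library. As on the calculator, 1E99 stands in for positive infinity.</p>
<pre>

```python
from statistics import NormalDist


def normalcdf(lower, upper, mean, sd):
    """Area under the normal curve between lower and upper (calculator-style)."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)


def inv_norm(area_left, mean, sd):
    """Value k that has the given area to its left (the k-th percentile)."""
    return NormalDist(mean, sd).inv_cdf(area_left)


# Practice exercises with X ~ N(54, 8)
print(round(normalcdf(56, 1e99, 54, 8), 4))  # P(x greater than 56): 0.4013
print(round(inv_norm(0.80, 54, 8), 2))       # 80th percentile: 60.73

# Homework 17: pregnancy lengths with X ~ N(280, 13)
print(round(normalcdf(240, 306, 280, 13), 4))
```

</pre>
<p>Passing the mean and standard deviation directly gives the same area as converting to <em data-effect="italics">z</em>-scores first and calling normalcdf with a mean of 0 and a standard deviation of 1.</p></div>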
<div class="chapter standard" id="chapter-normal-distribution-lap-times" title="Activity 7.4: Normal Distribution (Lap Times)"><div class="chapter-title-wrap"><h3 class="chapter-number">44</h3><h2 class="chapter-title"><span class="display-none">Activity 7.4: Normal Distribution (Lap Times)</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id8352229" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Normal Distribution (Lap Times)</div> <p id="id5906467">Class Time:</p> <p id="id7315926">Names:</p> <div id="id3819098" data-type="list"><div data-type="title">Student Learning Outcome</div> <ul><li>The student will compare and contrast empirical data and a theoretical distribution to determine if Terry Vogel&#8217;s lap times fit a continuous distribution.</li> </ul> </div> <p><span data-type="title">Directions</span> Round the relative frequencies and probabilities to four decimal places. Carry all other decimal answers to two places.</p> <div id="list-23435" data-type="list"><div id="CollectData" data-type="title">Collect the Data</div> <ol><li>Use the data from <a href="/contents/3ef830bc-5247-460a-9007-e3fd762e5e93">Appendix C</a>. Use a stratified sampling method by lap (races 1 to 20) and a random number generator to pick six lap times from each stratum. 
Record the lap times below for laps two to seven.<span data-type="newline"><br /> </span> <table id="element-2352564535" summary="Blank table with 36 empty cells."><tbody><tr><td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> </tr> <tr><td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> </tr> <tr><td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> </tr> <tr><td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> </tr> <tr><td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> </tr> <tr><td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> </tr> </tbody> </table> </li> <li>Construct a histogram. Make five to six intervals. Sketch the graph using a ruler and pencil. Scale the axes. <div id="fig045324" class="bc-figure figure"><span id="id7826845" data-type="media" data-alt="Blank graph with relative frequency on the vertical axis and lap time on the horizontal axis."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch06_10_01-1.png" alt="Blank graph with relative frequency on the vertical axis and lap time on the horizontal axis." width="400" data-media-type="image/png" /></span></div> </li> <li>Calculate the following: <ol id="list-245243533" type="a"><li>\(\overline{x}\) = _______</li> <li><em data-effect="italics">s</em> = _______</li> </ol> </li> <li>Draw a smooth curve through the tops of the bars of the histogram. Write one to two complete sentences to describe the general shape of the curve. (Keep it simple. 
Does the graph go straight across, does it have a v-shape, does it have a hump in the middle or at either end, and so on?)</li> </ol> </div> <p><span id="AnalyzeDist" data-type="title">Analyze the Distribution</span> Using your sample mean, sample standard deviation, and histogram to help, what is the approximate theoretical distribution of the data?</p> <ul id="list-23452"><li><em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>How does the histogram help you arrive at the approximate distribution?</li> </ul> <p><span id="DescData" data-type="title">Describe the Data</span> Use the data you collected to complete the following statements.</p> <ul id="list-2342532"><li>The <em data-effect="italics">IQR</em> goes from __________ to __________.</li> <li><em data-effect="italics">IQR</em> = __________. (<em data-effect="italics">IQR</em> = <em data-effect="italics">Q</em><sub>3</sub> – <em data-effect="italics">Q</em><sub>1</sub>)</li> <li>The 15<sup>th</sup> percentile is _______.</li> <li>The 85<sup>th</sup> percentile is _______.</li> <li>The median is _______.</li> <li>The empirical probability that a randomly chosen lap time is more than 130 seconds is _______.</li> <li>Explain the meaning of the 85<sup>th</sup> percentile of this data.</li> </ul> <p><span id="TheoDist" data-type="title">Theoretical Distribution</span> Using the theoretical distribution, complete the following statements. 
You should use a normal approximation based on your sample data.</p> <ul id="list-23432432"><li>The <em data-effect="italics">IQR</em> goes from __________ to __________.</li> <li><em data-effect="italics">IQR</em> = _______.</li> <li>The 15<sup>th</sup> percentile is _______.</li> <li>The 85<sup>th</sup> percentile is _______.</li> <li>The median is _______.</li> <li>The probability that a randomly chosen lap time is more than 130 seconds is _______.</li> <li>Explain the meaning of the 85<sup>th</sup> percentile of this distribution.</li> </ul> <p id="fs-idm197255200"><span data-type="title">Discussion Questions</span> Do the data from the section titled <a href="#CollectData">Collect the Data</a> give a close approximation to the theoretical distribution in the section titled <a href="#AnalyzeDist">Analyze the Distribution</a>? In complete sentences and comparing the results in the sections titled <a href="#DescData">Describe the Data</a> and <a href="#TheoDist">Theoretical Distribution</a>, explain why or why not.</p> </div> </div></div>
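<div class="textbox"><p>The comparison asked for in this lab can also be laid out in software. The following is a minimal sketch, assuming Python 3.8 or later. The lap times here are stand-in values simulated from N(129.71, 2.28), the model quoted for these lap times in the Chapter 7 answer key, purely so the sketch runs; they are not the Appendix C data, and your recorded times should replace them. The <code>inclusive</code> percentile rule is only one of several conventions, so hand or calculator results may differ slightly.</p>

```python
import random
from statistics import NormalDist, mean, quantiles, stdev

# Assumption: stand-in data so the sketch runs. Replace `laps` with the
# 36 lap times you recorded in Collect the Data.
random.seed(1)
laps = [round(random.gauss(129.71, 2.28), 2) for _ in range(36)]

x_bar = mean(laps)
s = stdev(laps)                         # sample standard deviation
model = NormalDist(mu=x_bar, sigma=s)   # fitted model: X ~ N(x_bar, s)

# Empirical side: quantiles() returns 99 cut points, so index k - 1
# holds the k-th percentile.
cuts = quantiles(laps, n=100, method="inclusive")
levels = {"Q1": 25, "median": 50, "Q3": 75, "P15": 15, "P85": 85}
empirical = {name: cuts[k - 1] for name, k in levels.items()}
empirical_over_130 = sum(1 for t in laps if t > 130) / len(laps)

# Theoretical side: the same quantities read off the fitted normal curve.
theoretical = {name: model.inv_cdf(k / 100) for name, k in levels.items()}
theoretical_over_130 = 1 - model.cdf(130)

for name in levels:
    print(f"{name:6s} empirical {empirical[name]:7.2f}   "
          f"theoretical {theoretical[name]:7.2f}")
print(f"IQR    empirical {empirical['Q3'] - empirical['Q1']:7.2f}   "
      f"theoretical {theoretical['Q3'] - theoretical['Q1']:7.2f}")
print(f"P(X > 130): empirical {empirical_over_130:.4f}, "
      f"theoretical {theoretical_over_130:.4f}")
```

<p>If the two columns are close, the normal model is a reasonable description of the sample; large gaps suggest it is not.</p></div>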
<div class="chapter standard" id="chapter-normal-distribution-pinkie-length" title="Activity 7.5: Normal Distribution (Pinkie Length)"><div class="chapter-title-wrap"><h3 class="chapter-number">45</h3><h2 class="chapter-title"><span class="display-none">Activity 7.5: Normal Distribution (Pinkie Length)</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1164259932581" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Normal Distribution (Pinkie Length)</div> <p id="id10270690">Class Time:</p> <p id="id9706632">Names:</p> <div id="id10998330" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will compare empirical data and a theoretical distribution to determine if data from the experiment follow a continuous distribution.</li> </ul> </div> <p id="element-967"><span data-type="title">Collect the Data</span> Measure the length of your pinky finger (in centimeters).</p> <ol id="list-32487618725"><li>Randomly survey 30 adults for their pinky finger lengths. Round the lengths to the nearest 0.5 cm.<span data-type="newline"><br /> </span> <table id="element-2352564535s" summary="Blank table with 30 blank cells."><tbody><tr><td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> </tr> <tr><td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> </tr> <tr><td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> </tr> <tr><td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> </tr> <tr><td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> </tr> <tr><td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> <td>_______</td> </tr> </tbody> </table> </li> <li>Construct a histogram. Make five to six intervals. Sketch the graph using a ruler and pencil. Scale the axes. 
<div id="eip-idp99193136" class="bc-figure figure"><span id="id5490004" data-type="media" data-alt="Blank graph with frequency on the vertical axis and length of finger on the horizontal axis."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch06_11_01N-1.png" alt="Blank graph with frequency on the vertical axis and length of finger on the horizontal axis." width="380" data-media-type="image/png" /></span></div> </li> <li>Calculate the following. <ol id="list-98752438" type="a"><li>\(\overline{x}\) = _______</li> <li><em data-effect="italics">s</em> = _______</li> </ol> </li> <li>Draw a smooth curve through the top of the bars of the histogram. Write one to two complete sentences to describe the general shape of the curve. (Keep it simple. Does the graph go straight across, does it have a v-shape, does it have a hump in the middle or at either end, and so on?)</li> </ol> <p><span data-type="title">Analyze the Distribution</span> Using your sample mean, sample standard deviation, and histogram, what is the approximate theoretical distribution of the data you collected?</p> <ul id="list-2342345"><li><em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>How does the histogram help you arrive at the approximate distribution?</li> </ul> <p id="element-298"><span id="DescData2" data-type="title">Describe the Data</span> Using the data you collected, complete the following statements. 
(Hint: order the data)</p> <div id="id5390377" data-type="note" data-has-label="true" data-label="" data-element-type="Remember"><div data-type="title">Remember</div> <p id="eip-idp29979984">(<em data-effect="italics">IQR</em> = <em data-effect="italics">Q</em><sub>3</sub> – <em data-effect="italics">Q</em><sub>1</sub>)</p> </div> <ul id="list-98723985"><li><em data-effect="italics">IQR</em> = _______</li> <li>The 15<sup>th</sup> percentile is _______.</li> <li>The 85<sup>th</sup> percentile is _______.</li> <li>Median is _______.</li> <li>What is the empirical probability that a randomly chosen pinky length is more than 6.5 cm?</li> <li>Explain the meaning of the 85<sup>th</sup> percentile of this data.</li> </ul> <p id="element-935"><span id="TheoDist2" data-type="title">Theoretical Distribution</span> Using the theoretical distribution, complete the following statements. Use a normal approximation based on the sample mean and standard deviation.</p> <ul id="list-98436723985"><li><em data-effect="italics">IQR</em> = _______</li> <li>The 15<sup>th</sup> percentile is _______.</li> <li>The 85<sup>th</sup> percentile is _______.</li> <li>Median is _______.</li> <li>What is the theoretical probability that a randomly chosen pinky length is more than 6.5 cm?</li> <li>Explain the meaning of the 85<sup>th</sup> percentile of this distribution.</li> </ul> <p id="eip-idp56724736"><span data-type="title">Discussion Questions</span> Do the data you collected give a close approximation to the theoretical distribution? In complete sentences and comparing the results in the sections titled <a href="#DescData2">Describe the Data</a> and <a href="#TheoDist2">Theoretical Distribution</a>, explain why or why not.</p> </div> </div></div>
<div class="part " id="part-the-central-limit-theorem"><div class="part-title-wrap"><h3 class="part-number">VIII</h3><h1 class="part-title">Chapter 8: The Central Limit Theorem</h1></div><div class="ugc part-ugc"></div></div>
<div class="chapter standard" id="chapter-introduction-19" title="Chapter 8.1: Introduction"><div class="chapter-title-wrap"><h3 class="chapter-number">46</h3><h2 class="chapter-title"><span class="display-none">Chapter 8.1: Introduction</span></h2></div><div class="ugc chapter-ugc"><p>[latexpage]</p> <div id="fs-idm16905488" class="splash"><div class="bc-figcaption figcaption">If you want to study the mean amount of change people carry in their pockets, the central limit theorem says that, provided your samples are large enough, the distribution of the sample means is approximately normal and bell-shaped. (credit: John Lodder)</div> <p><span id="fs-idm70194688" data-type="media" data-alt="This is a photo of change and a set of keys in a pile. There appear to be five pennies, three quarters, four dimes, and two nickels. The key ring has a bronze whale on it and holds eleven keys."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C07_CO-1.jpg" alt="This is a photo of change and a set of keys in a pile. There appear to be five pennies, three quarters, four dimes, and two nickels. The key ring has a bronze whale on it and holds eleven keys." width="500" data-media-type="image/jpg" /></span></p> </div> <div id="fs-idp26371136" class="chapter-objectives" data-type="note" data-has-label="true" data-label=""><div data-type="title">Chapter Objectives</div> <p>By the end of this chapter, the student should be able to:</p> <ul id="list6234"><li>Recognize central limit theorem problems.</li> <li>Classify continuous word problems by their distributions.</li> <li>Apply and interpret the central limit theorem for means.</li> <li>Apply and interpret the central limit theorem for sums.</li> </ul> </div> <p>Why are we so concerned with means? Two reasons are that they give us a middle ground for comparison and that they are easy to calculate. 
In this chapter, you will study means and the <strong>central limit theorem</strong>.</p> <p>The <span data-type="term">central limit theorem</span> (CLT for short) is one of the most powerful and useful ideas in all of statistics. There are two alternative forms of the theorem, and both are concerned with drawing finite samples of size <em data-effect="italics">n</em> from a population with a known mean, <em data-effect="italics">μ</em>, and a known standard deviation, <em data-effect="italics">σ</em>. The first alternative says that if we collect samples of size <em data-effect="italics">n</em> with a &#8220;large enough <em data-effect="italics">n</em>,&#8221; calculate each sample&#8217;s mean, and create a histogram of those means, then the resulting histogram will tend to have an approximately normal bell shape. The second alternative says that if we again collect samples of size <em data-effect="italics">n</em> that are &#8220;large enough,&#8221; calculate the sum of each sample, and create a histogram of those sums, then the resulting histogram will again tend to have an approximately normal bell shape.</p> <p>The size of the sample, <em data-effect="italics">n</em>, that is required in order to be &#8220;large enough&#8221; depends on the original population from which the samples are drawn (as a rule of thumb, the sample size should be at least 30, or the data should come from a normal distribution). If the original population is far from normal, then more observations are needed for the sample means or sums to be approximately normal. <strong>Sampling is done with replacement.</strong></p> <p>It would be difficult to overstate the importance of the central limit theorem in statistical theory. 
Knowing that data, even if its distribution is not normal, behaves in a predictable way is a powerful tool.</p> <div id="fs-idp64638656" class="statistics collab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Classroom Activity</div> <p id="element-332">Suppose eight of you roll one fair die ten times, seven of you roll two fair dice ten times, nine of you roll five fair dice ten times, and 11 of you roll ten fair dice ten times.</p> <p>Each time a person rolls more than one die, he or she calculates the sample <span data-type="term">mean</span> of the faces showing. For example, one person might roll five fair dice and get 2, 2, 3, 4, 6 on one roll.</p> <p>The mean is \(\frac{2+2+3+4+6}{5}\) = 3.4. The 3.4 is one mean when five fair dice are rolled. This same person would roll the five dice nine more times and calculate nine more means for a total of ten means.</p> <p>Your instructor will pass out the dice to several people. Roll your dice ten times. For each roll, record the faces, and find the mean. Round to the nearest 0.5.</p> <p id="element-73">Your instructor (and possibly you) will produce one graph (it might be a histogram) for one die, one graph for two dice, one graph for five dice, and one graph for ten dice. Since the &#8220;mean&#8221; when you roll one die is just the face on the die, what distribution do these <strong>means</strong> appear to be representing?</p> <p><strong>Draw the graph for the means using two dice.</strong> Do the sample means show any kind of pattern?</p> <p><strong>Draw the graph for the means using five dice.</strong> Do you see any pattern emerging?</p> <p><strong>Finally, draw the graph for the means using ten dice.</strong> Do you see any pattern to the graph? 
What can you conclude as you increase the number of dice?</p> <p>As the number of dice rolled increases from one to two to five to ten, the following is happening:</p> <ol id="list-1"><li>The mean of the sample means remains approximately the same.</li> <li>The spread of the sample means (the standard deviation of the sample means) gets smaller.</li> <li>The graph appears steeper and thinner.</li> </ol> <p>You have just demonstrated the central limit theorem (CLT).</p> <p>The central limit theorem tells you that as you increase the number of dice, <strong>the sample means tend toward a normal distribution (the sampling distribution).</strong></p> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="fs-idm28209360"><dt>Sampling Distribution</dt> <dd id="fs-idm33714048">Given simple random samples of size <em data-effect="italics">n</em> from a given population with a measured characteristic such as mean, proportion, or standard deviation for each sample, the probability distribution of all the measured characteristics is called a sampling distribution.</dd> </dl> </div> </div></div>
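<div class="textbox"><p>The dice activity above can also be imitated on a computer. The following is a minimal sketch, assuming Python 3.8 or later; the function name <code>sample_means</code>, the fixed seed, and the choice of 10,000 rolls per group (rather than the ten rolls used in class) are all assumptions made so that the pattern is easy to see.</p>

```python
import random
from statistics import mean, stdev

random.seed(2022)  # fixed seed so the run is repeatable


def sample_means(num_dice, num_rolls):
    """Roll `num_dice` fair dice `num_rolls` times; return the mean of each roll."""
    return [mean(random.randint(1, 6) for _ in range(num_dice))
            for _ in range(num_rolls)]


results = {}
for num_dice in (1, 2, 5, 10):
    means = sample_means(num_dice, num_rolls=10_000)
    results[num_dice] = (mean(means), stdev(means))
    print(f"{num_dice:2d} dice: mean of means = {results[num_dice][0]:.3f}, "
          f"SD of means = {results[num_dice][1]:.3f}")
```

<p>In a run of this sketch, the mean of the sample means should stay near 3.5 for every group, while the standard deviation of the sample means should shrink as the number of dice grows, which matches observations 1 and 2 in the list above.</p></div>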
<div class="chapter standard" id="chapter-the-central-limit-theorem-for-sample-means-averages" title="Chapter 8.2: The Central Limit Theorem for Sample Means (Averages)"><div class="chapter-title-wrap"><h3 class="chapter-number">47</h3><h2 class="chapter-title"><span class="display-none">Chapter 8.2: The Central Limit Theorem for Sample Means (Averages)</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="fs-idm47553248">Suppose <em data-effect="italics">X</em> is a random variable with a distribution that may be known or unknown (it can be any distribution). Using a subscript that matches the random variable, suppose:</p> <ol type="a"><li><em data-effect="italics">μ</em><em data-effect="italics"><sub>X</sub></em> = the mean of <em data-effect="italics">X</em></li> <li><em data-effect="italics">σ</em><em data-effect="italics"><sub>X</sub></em> = the standard deviation of <em data-effect="italics">X</em></li> </ol> <p id="fs-idm4523440">If you draw random samples of size <em data-effect="italics">n</em>, then as <em data-effect="italics">n</em> increases, the random variable \(\overline{X}\), which consists of sample means, tends to be <span data-type="term">normally distributed</span> and</p> <p>\(\overline{X}\) ~ <em data-effect="italics">N</em>\(\left({\mu }_{x}\text{, }\frac{{\sigma }_{x}}{\sqrt{n}}\right)\).</p> <p>The <span data-type="term">central limit theorem</span> for sample means says that if you keep drawing larger and larger samples (such as rolling one, two, five, and finally, ten dice) and <strong>calculating their means,</strong> the sample means form their own <strong>normal distribution</strong> (the sampling distribution). The normal distribution has the same mean as the original distribution and a variance that equals the original variance divided by the sample size. 
Standard deviation is the square root of variance, so the standard deviation of the sampling distribution is the standard deviation of the original distribution divided by the square root of <em data-effect="italics">n</em>. The variable <em data-effect="italics">n</em> is the number of values that are averaged together, not the number of times the experiment is done.</p> <p id="fs-idm26555216">To put it more formally, if you draw random samples of size <em data-effect="italics">n</em>, the distribution of the random variable \(\overline{X}\), which consists of sample means, is called the <strong>sampling distribution of the mean</strong>. The sampling distribution of the mean approaches a normal distribution as <em data-effect="italics">n</em>, the <span data-type="term">sample size</span>, increases.</p> <p>The random variable \(\overline{X}\) has a different <em data-effect="italics">z</em>-score associated with it from that of the random variable <em data-effect="italics">X</em>. The mean \(\overline{x}\) is the value of \(\overline{X}\) in one sample.</p> <div data-type="equation">\(z=\frac{\overline{x}-{\mu }_{x}}{\left(\frac{{\sigma }_{x}}{\sqrt{n}}\right)}\)</div> <p><em data-effect="italics">μ</em><sub><em data-effect="italics">X</em></sub> is the average of both <em data-effect="italics">X</em> and \(\overline{X}\).</p> <p>\({\sigma }_{\overline{x}}\text{ = }\frac{{\sigma }_{x}}{\sqrt{n}}\) = standard deviation of \(\overline{X}\) and is called the <span data-type="term">standard error of the mean.</span></p> <div id="fs-idm74236704" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm164143744">To find probabilities for means on the calculator, follow these steps.</p> <p id="fs-idm81814896">2nd DISTR<span data-type="newline"><br /> </span>2:normalcdf</p> <p id="fs-idm32771680">\(normalcdf\left(lower\text{ }value\text{ }of\text{ }the\text{ }area,\text{ }upper\text{ }value\text{ }of\text{ }the\text{ }area,\text{ 
}mean,\text{ }\frac{\text{standard deviation}}{\sqrt{\text{sample size}}}\right)\)</p> <p id="fs-idm176650400">where:</p> <ul id="fs-idm130027040"><li><em data-effect="italics">mean</em> is the mean of the original distribution</li> <li><em data-effect="italics">standard deviation</em> is the standard deviation of the original distribution</li> <li><em data-effect="italics">sample size</em> = <em data-effect="italics">n</em></li> </ul> </div> <div class="textbox textbox--examples" data-type="example"><p>An unknown distribution has a mean of 90 and a standard deviation of 15. Samples of size <em data-effect="italics">n</em> = 25 are drawn randomly from the population.</p> <div data-type="exercise"><div id="id5873867" data-type="problem"><p><span data-type="newline"><br /> </span>a. Find the probability that the <span data-type="term">sample mean</span> is between 85 and 92.</p> </div> <div id="id5873893" data-type="solution"><p id="fs-idp14520352">a. Let <em data-effect="italics">X</em> = one value from the original unknown population. The probability question asks you to find a probability for the <strong>sample mean</strong>.</p> <p>Let \(\overline{X}\) = the mean of a sample of size 25. Since <em data-effect="italics">μ</em><sub><em data-effect="italics">X</em></sub> = 90, <em data-effect="italics">σ<sub>X</sub></em> = 15, and <em data-effect="italics">n</em> = 25,</p> <p>\(\overline{X}\) ~ <em data-effect="italics">N</em>\(\left(90\text{, }\frac{15}{\sqrt{25}}\right)\).</p> <p>Find <em data-effect="italics">P</em>(85 &lt; \(\overline{x}\) &lt; 92). 
Draw a graph.</p> <p><em data-effect="italics">P</em>(85 &lt; \(\overline{x}\) &lt; 92) = 0.6997</p> <p id="element-594">The probability that the sample mean is between 85 and 92 is 0.6997.</p> </div> </div> <div id="fs-idm143949248" class="bc-figure figure"><div class="wp-caption alignnone" style="width: 490px"><img class="size-medium" src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch07_02_01-1.jpg" alt="Shaded area for P(85 &lt; x bar &lt; 92)" width="490" height="229" /><div class="wp-caption-text">This is a normal distribution curve. The peak of the curve coincides with the point 90 on the horizontal axis. The points 85 and 92 are labeled on the axis. Vertical lines are drawn from these points to the curve and the area between the lines is shaded. The shaded region represents the probability that 85 &lt; \(\overline{x}\) &lt; 92.</div></div> </div> <div id="fs-idm146534448" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="element-868"><code>normalcdf</code>(lower value, upper value, mean, standard error of the mean)</p> <p>The parameter list is abbreviated (lower value, upper value, <em data-effect="italics">μ</em>, \(\frac{\sigma }{\sqrt{n}}\))</p> <p><code>normalcdf</code>(85,92,90,\(\frac{15}{\sqrt{25}}\)) = 0.6997</p> </div> <div id="element-808" data-type="exercise"><div id="id5864943" data-type="problem"><p id="fs-idm58973456">b. Find the value that is two standard deviations above the expected value, 90, of the sample mean.</p> </div> <div id="id5864961" data-type="solution"><p>b. 
To find the value that is two standard deviations above the expected value 90, use the formula:</p> <p id="eip-idm63061120">value = <em data-effect="italics">μ</em><sub>x</sub> + (# of STDEVs)\(\left(\frac{{\sigma }_{x}}{\sqrt{n}}\right)\)</p> <p id="eip-idp63452384">value = 90 + 2 \(\left(\frac{15}{\sqrt{25}}\right)\) = 96</p> <p>The value that is two standard deviations above the expected value is 96.</p> <p id="fs-idp29133808">The standard error of the mean is \(\frac{{\sigma }_{x}}{\sqrt{n}}\) = \(\frac{15}{\sqrt{25}}\) = 3. Recall that the standard error of the mean is a description of how far (on average) the sample mean will be from the population mean in repeated simple random samples of size <em data-effect="italics">n</em>.</p> </div> </div> </div> <div id="fs-idm83395104" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm18823760" data-type="exercise"><div id="fs-idp5202224" data-type="problem"><p id="fs-idp2240544">An unknown distribution has a mean of 45 and a standard deviation of eight. Samples of size <em data-effect="italics">n</em> = 30 are drawn randomly from the population. Find the probability that the sample mean is between 42 and 50.</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><div id="element-794" data-type="exercise"><div id="id5865146" data-type="problem"><p>The length of time, in hours, it takes an &#8220;over 40&#8221; group of people to play one soccer match is normally distributed with a <strong>mean of two hours</strong> and a <strong>standard deviation of 0.5 hours</strong>. A <strong>sample of size <em data-effect="italics">n</em> = 50</strong> is drawn randomly from the population. 
Find the probability that the <strong>sample mean</strong> is between 1.8 hours and 2.3 hours.</p> </div> <div id="id5865170" data-type="solution"><p>Let <em data-effect="italics">X</em> = the time, in hours, it takes to play one soccer match.</p> <p>The probability question asks you to find a probability for the <strong>sample mean time, in hours</strong>, it takes to play one soccer match.</p> <p>Let \(\overline{X}\) = the <span data-type="term">mean</span> time, in hours, it takes to play one soccer match.</p> <p id="fs-idm28701424">If <em data-effect="italics">μ<sub>X</sub></em> = _________, <em data-effect="italics">σ<sub>X</sub></em> = __________, and <em data-effect="italics">n</em> = ___________, then \(\overline{X}\) ~ <em data-effect="italics">N</em>(______, ______) by the <span data-type="term">central limit theorem for means</span>.</p> <p id="fs-idm68643920"><em data-effect="italics">μ<sub>X</sub></em> = 2, <em data-effect="italics">σ<sub>X</sub></em> = 0.5, <em data-effect="italics">n</em> = 50, and \(\overline{X}\) ~ <em data-effect="italics">N</em>\(\left(\text{2, }\frac{0.5}{\sqrt{50}}\right)\)</p> <p id="fs-idm39125184">Find <em data-effect="italics">P</em>(1.8 &lt; \(\overline{x}\) &lt; 2.3). 
Draw a graph.</p> <p id="fs-idp90367824"><em data-effect="italics">P</em>(1.8 &lt; \(\overline{x}\) &lt; 2.3) = 0.9977</p> <p class="finger"><code>normalcdf</code>\(\left(1.8\text{, }2.3\text{, }2\text{, }\frac{0.5}{\sqrt{50}}\right)\) = 0.9977</p> <p>The probability that the mean time is between 1.8 hours and 2.3 hours is 0.9977.</p> </div> </div> </div> <div id="fs-idm109373536" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp55734224" data-type="exercise"><div id="fs-idp43500160" data-type="problem"><p id="fs-idp24603664">The length of time taken on the SAT for a group of students is normally distributed with a mean of 2.5 hours and a standard deviation of 0.25 hours. A sample of size <em data-effect="italics">n</em> = 60 is drawn randomly from the population. Find the probability that the sample mean is between two hours and three hours.</p> </div> </div> </div> <div id="fs-idm13085488" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm146458576">To find percentiles for means on the calculator, follow these steps.</p> <p id="fs-idm39446144">2<sup>nd</sup> DISTR <span data-type="newline"><br /> </span>3:invNorm</p> <p id="fs-idm50856880"><em data-effect="italics">k</em> = invNorm\(\left(\text{area to the left of }k\text{, mean, }\frac{\text{standard deviation}}{\sqrt{\text{sample size}}}\right)\)</p> <p id="fs-idm74111040">where:</p> <ul id="fs-idm74110784"><li><em data-effect="italics">k</em> = the <em data-effect="italics">k</em><sup>th</sup> percentile</li> <li><em data-effect="italics">mean</em> is the mean of the original distribution</li> <li><em data-effect="italics">standard deviation</em> is the standard deviation of the original distribution</li> <li><em data-effect="italics">sample size</em> = <em data-effect="italics">n</em></li> </ul> </div> <div id="fs-idp22561888" class="textbox textbox--examples" data-type="example"><div id="fs-idp8724208" 
data-type="exercise"><div id="fs-idp17530752" data-type="problem"><p id="fs-idp27120736">In a recent study reported Oct. 29, 2012 on the Flurry Blog, the mean age of tablet users is 34 years. Suppose the standard deviation is 15 years. Take a sample of size <em data-effect="italics">n</em> = 100.</p> <ol id="fs-idm47147072" type="a"><li>What are the mean and standard deviation for the sample mean ages of tablet users?</li> <li>What does the distribution look like?</li> <li>Find the probability that the sample mean age is more than 30 years (the reported mean age of tablet users in this particular study).</li> <li>Find the 95<sup>th</sup> percentile for the sample mean age (to one decimal place).</li> </ol> </div> <div id="fs-idp42522512" data-type="solution"><ol id="fs-idm50017008" type="a"><li>Since the sample mean tends to target the population mean, we have \({\mu }_{\overline{x}}\) = <em data-effect="italics">μ</em> = 34. The standard deviation of the sample mean is given by \({\sigma }_{\overline{x}}\) = \(\frac{\sigma }{\sqrt{n}}\) = \(\frac{15}{\sqrt{100}}\) = \(\frac{15}{10}\) = 1.5</li> <li>The central limit theorem states that for large sample sizes (<em data-effect="italics">n</em>), the sampling distribution will be approximately normal.</li> <li>The probability that the sample mean age is more than 30 is given by <em data-effect="italics">P</em>(\(\overline{x}\) &gt; 30) = <code>normalcdf</code>(30,E99,34,1.5) = 0.9962</li> <li>Let <em data-effect="italics">k</em> = the 95<sup>th</sup> percentile. 
<span data-type="newline" data-count="1"><br /> </span><em data-effect="italics">k</em> = invNorm\(\left(0.95\text{, }34\text{, }\frac{15}{\sqrt{100}}\right)\) = 36.5</li> </ol> </div> </div> </div> <div id="fs-idm63776496" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm62512800" data-type="exercise"><div id="fs-idm44878080" data-type="problem"><p id="fs-idm48406832">In an article on Flurry Blog, a gaming marketing gap for men between the ages of 30 and 40 is identified. You are researching a startup game targeted at the 35-year-old demographic. Your idea is to develop a strategy game that can be played by men from their late 20s through their late 30s. Based on the article’s data, industry research shows that the average strategy player is 28 years old with a standard deviation of 4.8 years. You take a sample of 100 randomly selected gamers. If your target market is 29- to 35-year-olds, should you continue with your development strategy?</p> </div> </div> </div> <div id="fs-idm59036512" class="textbox textbox--examples" data-type="example"><div id="fs-idp15422768" data-type="exercise"><div id="fs-idp210784" data-type="problem"><p id="fs-idm45834688">The mean number of minutes for app engagement by a tablet user is 8.2 minutes. Suppose the standard deviation is one minute. Take a sample of 60.</p> <ol id="fs-idm51964960" type="a"><li>What are the mean and standard deviation for the sample mean number of minutes of app engagement by a tablet user?</li> <li>What is the standard error of the mean?</li> <li>Find the 90<sup>th</sup> percentile for the sample mean time for app engagement for a tablet user. 
Interpret this value in a complete sentence.</li> <li>Find the probability that the sample mean is between eight minutes and 8.5 minutes.</li> </ol> </div> </div> </div> <div id="fs-idm70207568" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp8586512" data-type="exercise"><div id="fs-idm57780000" data-type="problem"><p id="fs-idm57779872">Cans of a cola beverage claim to contain 16 ounces. The amounts in a sample are measured and the statistics are <em data-effect="italics">n</em> = 34, \(\overline{x}\) = 16.01 ounces. If the cans are filled so that <em data-effect="italics">μ</em> = 16.00 ounces (as labeled) and <em data-effect="italics">σ</em> = 0.143 ounces, find the probability that a sample of 34 cans will have an average amount greater than 16.01 ounces. Do the results suggest that cans are filled with an amount greater than 16 ounces?</p> </div> </div> </div> <div id="fs-idp10661120" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idp64632544">Baran, Daya. “20 Percent of Americans Have Never Used Email.” WebGuild, 2010. Available online at http://www.webguild.org/20080519/20-percent-of-americans-have-never-used-email (accessed May 17, 2013).</p> <p id="fs-idp57996496">Data from The Flurry Blog, 2013. Available online at http://blog.flurry.com (accessed May 17, 2013).</p> <p id="fs-idp112347536">Data from the United States Department of Agriculture.</p> </div> <div id="fs-idm33599328" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm66102448">In a population whose distribution may be known or unknown, if the size (<em data-effect="italics">n</em>) of the samples is sufficiently large, the distribution of the sample means will be approximately normal. The mean of the sample means will equal the population mean. 
The standard deviation of the distribution of the sample means, called the standard error of the mean, is equal to the population standard deviation divided by the square root of the sample size (<em data-effect="italics">n</em>).</p> </div> <div id="fs-idm104045584" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idm43969232">The Central Limit Theorem for Sample Means: \(\overline{X}\) ~ <em data-effect="italics">N</em>\(\left({\mu }_{x}\text{, }\frac{{\sigma }_{x}}{\sqrt{n}}\right)\)</p> <p id="fs-idm14943744">The Mean of \(\overline{X}\): <em data-effect="italics">μ<sub>x</sub></em></p> <p id="fs-idp59416400">Central Limit Theorem for Sample Means z-score and standard error of the mean: \(z=\frac{\overline{x}-{\mu }_{x}}{\left(\frac{{\sigma }_{x}}{\sqrt{n}}\right)}\)</p> <p id="fs-idp66400464">Standard Error of the Mean (Standard Deviation of \(\overline{X}\)): \(\frac{{\sigma }_{x}}{\sqrt{n}}\)</p> </div> <div id="fs-idm23437024" class="practice" data-depth="1"><p id="fs-idm21552528"><em data-effect="italics">Use the following information to answer the next six exercises:</em> Yoonie is a personnel manager in a large corporation. Each month she must review 16 of the employees. From past experience, she has found that the reviews take her approximately four hours each to do with a population standard deviation of 1.2 hours. Let <em data-effect="italics">Χ</em> be the random variable representing the time it takes her to complete one review. Assume <em data-effect="italics">Χ</em> is normally distributed. Let \(\overline{X}\) be the random variable representing the mean time to complete the 16 reviews. 
Assume that the 16 reviews represent a random set of reviews.</p> <div id="fs-idm15565424" data-type="exercise"><div id="fs-idm37731072" data-type="problem"><p id="fs-idm37730816">What are the mean, standard deviation, and sample size?</p> </div> <div id="eip-id1165897421719" data-type="solution"><p id="eip-id1165895503793">mean = 4 hours; standard deviation = 1.2 hours; sample size = 16</p> </div> </div> <div id="fs-idp17869408" data-type="exercise"><div id="fs-idm62741104" data-type="problem"><p id="fs-idm25207024">Complete the distributions.</p> <ol id="fs-idm16577088" type="a"><li><em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>\(\overline{X}\) ~ _____(_____,_____)</li> </ol> </div> <p>Solution: a. <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(4, 1.2)<span data-type="newline"><br /> </span> b. \(\overline{X}\) ~ <em data-effect="italics">N</em>\(\left(4, \frac{1.2}{\sqrt{16}}\right)\)</p> </div> <div data-type="exercise"><div id="id16709593" data-type="problem"><p>Find the probability that <strong>one</strong> review will take Yoonie from 3.5 to 4.25 hours. Sketch the graph, labeling and scaling the horizontal axis. Shade the region corresponding to the probability.</p> <ol id="sublist1346264" type="a"><li><div id="fs-idm61024352" class="bc-figure figure"><span id="id14288416" data-type="media" data-alt="This is a frequency curve for a normal distribution. It shows a single peak in the center with the curve tapering down to the horizontal axis on each side. The distribution is symmetrical. The horizontal axis represents the random variable X."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch07_06_01-1.jpg" alt="This is a frequency curve for a normal distribution. It shows a single peak in the center with the curve tapering down to the horizontal axis on each side. The distribution is symmetrical. The horizontal axis represents the random variable X." 
width="380" data-media-type="image/jpg" /></span></div> </li> <li><em data-effect="italics">P</em>(________ &lt; <em data-effect="italics">x</em> &lt; ________) = _______</li> </ol> </div> <div id="id14567929" data-type="solution"><p>a. Check student&#8217;s solution.<span data-type="newline"><br /> </span> b. 3.5, 4.25, 0.2441</p> </div> </div> <div data-type="exercise"><div id="id14777355" data-type="problem"><p id="element-914">Find the probability that the <strong>mean</strong> of a month’s reviews will take Yoonie from 3.5 to 4.25 hours. Sketch the graph, labeling and scaling the horizontal axis. Shade the region corresponding to the probability.</p> <ol id="sublist27646547" type="a"><li><div id="fs-idm31915744" class="bc-figure figure"><span id="fs-idm62152464" data-type="media" data-alt="This is a frequency curve for a normal distribution. It shows a single peak in the center with the curve tapering down to the horizontal axis on each side. The distribution is symmetrical. The horizontal axis represents the random variable X."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch07_06_02-1.jpg" alt="This is a frequency curve for a normal distribution. It shows a single peak in the center with the curve tapering down to the horizontal axis on each side. The distribution is symmetrical. The horizontal axis represents the random variable X." width="380" data-media-type="image/jpg" /></span></div> </li> <li><em data-effect="italics">P</em>(________________) = _______</li> </ol> </div> <p>Solution: b. <em data-effect="italics">P</em>(3.5 &lt; \(\overline{x}\) &lt; 4.25) =
0.7499</p> </div> <div id="fs-idm42157984" data-type="exercise"><div id="fs-idm42157728" data-type="problem"><p id="fs-idm42157472">What causes the probabilities in <a class="autogenerated-content" href="#element-446">(Figure)</a> and <a class="autogenerated-content" href="#element-322">(Figure)</a> to be different?</p> </div> <div id="fs-idm43625488" data-type="solution"><p id="fs-idm67842816">The two probabilities come from different distributions: the time for one review has a standard deviation of 1.2 hours, while the mean time for 16 reviews has a standard deviation of \(\frac{1.2}{\sqrt{16}}\) = 0.3 hours.</p> </div> </div> <div id="fs-idm70816816" data-type="exercise"><div id="fs-idm70816560" data-type="problem"><p id="fs-idm70816304">Find the 95<sup>th</sup> percentile for the mean time to complete one month&#8217;s reviews. Sketch the graph.</p> <ol id="fs-idm62305696" type="a"><li><div id="fs-idm79810480" class="bc-figure figure"><span id="fs-idm101235264" data-type="media" data-alt="This is a frequency curve for a normal distribution. It shows a single peak in the center with the curve tapering down to the horizontal axis on each side. The distribution is symmetrical. The horizontal axis represents the random variable X."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch07_06_02-1.jpg" alt="This is a frequency curve for a normal distribution. It shows a single peak in the center with the curve tapering down to the horizontal axis on each side. The distribution is symmetrical. The horizontal axis represents the random variable X." 
width="380" data-media-type="image/jpg" /></span></div> </li> <li>The 95<sup>th</sup> percentile = ____________</li> </ol> </div> <p>Solution: b. The 95<sup>th</sup> percentile = <code>invNorm</code>\(\left(0.95, 4, \frac{1.2}{\sqrt{16}}\right)\) = 4.49 hours</p> </div> </div> <div id="fs-idp66376176" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div id="element-475" data-type="exercise"><div id="id6273705" data-type="problem"><p id="element-850">1) Previously, De Anza statistics students estimated that the amount of change daytime statistics students carry is exponentially distributed with a mean of \$0.88. Suppose that we randomly pick 25 daytime statistics students.</p> <ol type="a"><li>In words, <em data-effect="italics">Χ</em> = ____________</li> <li><em data-effect="italics">Χ</em> ~ _____(_____,_____)</li> <li>In words, \(\overline{X}\) = ____________</li> <li>\(\overline{X}\) ~ ______ (______, ______)</li> <li>Find the probability that an individual had between \$0.80 and \$1.00. Graph the situation, and shade in the area to be determined.</li> <li>Find the probability that the average of the 25 students was between \$0.80 and \$1.00. Graph the situation, and shade in the area to be determined.</li> <li>Explain why there is a difference in part e and part f.</li> </ol> </div> <div id="fs-idp10454080" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="element-51" data-type="exercise"><div id="id6273949" data-type="problem"><p id="element-365">2) Suppose that the distance of fly balls hit to the outfield (in baseball) is normally distributed with a mean of 250 feet and a standard deviation of 50 feet. We randomly sample 49 fly balls.</p> <ol type="a"><li>If \(\overline{X}\) = average distance in feet for 49 fly balls, then \(\overline{X}\) ~ _______(_______,_______)</li> <li>What is the probability that the 49 balls traveled an average of less than 240 feet? Sketch the graph. Scale the horizontal axis for \(\overline{X}\). 
Shade the region corresponding to the probability. Find the probability.</li> <li>Find the 80<sup>th</sup> percentile of the distribution of the average of 49 fly balls.</li> </ol> </div> <p>&nbsp;</p> </div> <div id="element-820" data-type="exercise"><div id="id6533450" data-type="problem"><p id="element-518">3) According to the Internal Revenue Service, the average length of time for an individual to complete (keep records for, learn, prepare, copy, assemble, and send) IRS Form 1040 is 10.53 hours (without any attached schedules). The distribution is unknown. Let us assume that the standard deviation is two hours. Suppose we randomly sample 36 taxpayers.</p> <ol id="element-685" type="a"><li>In words, <em data-effect="italics">Χ</em> = _____________</li> <li>In words, \(\overline{X}\) = _____________</li> <li>\(\overline{X}\) ~ _____(_____,_____)</li> <li>Would you be surprised if the 36 taxpayers finished their Form 1040s in an average of more than 12 hours? Explain why or why not in complete sentences.</li> <li>Would you be surprised if one taxpayer finished his or her Form 1040 in more than 12 hours? In a complete sentence, explain why.</li> </ol> </div> <div id="fs-idm166717360" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="element-133" data-type="exercise"><div id="id6533652" data-type="problem"><p>4) Suppose that a category of world-class runners are known to run a marathon (26 miles) in an average of 145 minutes with a standard deviation of 14 minutes. Consider 49 of the races. 
Let \(\overline{X}\) = the average of the 49 races.</p> <ol type="a"><li>\(\overline{X}\) ~ _____(_____,_____)</li> <li>Find the probability that the runner will average between 142 and 146 minutes in these 49 marathons.</li> <li>Find the 80<sup>th</sup> percentile for the average of these 49 marathons.</li> <li>Find the median of the average running times.</li> </ol> </div> <p>Solution: a. <em data-effect="italics">N</em>\(\left(145, \frac{14}{\sqrt{49}}\right)\)<span data-type="newline"><br /> </span> b. 0.6247<span data-type="newline"><br /> </span> c. 146.68<span data-type="newline"><br /> </span> d. 145 minutes</p> </div> <div data-type="exercise"><div id="id6534960" data-type="problem"><p>5) The length of songs in a collector’s iTunes album collection is uniformly distributed from two to 3.5 minutes. Suppose we randomly pick five albums from the collection. There are a total of 43 songs on the five albums.</p> <ol type="a"><li>In words, <em data-effect="italics">Χ</em> = _________</li> <li><em data-effect="italics">Χ</em> ~ _____________</li> <li>In words, \(\overline{X}\) = _____________</li> <li>\(\overline{X}\) ~ _____(_____,_____)</li> <li>Find the first quartile for the average song length, \(\overline{X}\).</li> <li>The IQR (interquartile range) for the average song length, \(\overline{X}\), is from ___ &#8211; ___.</li> </ol> </div> <div id="fs-idm135637280" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id6536525" data-type="problem"><p>6) In 1940 the average size of a U.S. farm was 174 acres. Let’s say that the standard deviation was 55 acres. Suppose we randomly survey 38 farmers from 1940.</p> <ol id="madeup2" type="a"><li>In words, <em data-effect="italics">Χ</em> = _____________</li> <li>In words, \(\overline{X}\) = _____________</li> <li>\(\overline{X}\) ~ _____(_____,_____)</li> <li>The IQR for \(\overline{X}\) is from _______ acres to _______ acres.</li> </ol> </div> <p>&nbsp;</p> </div> <div id="element-371" data-type="exercise"><div id="id6264723" data-type="problem"><p id="fs-idm19354672">7) Determine which of the following are true and which are false. 
Then, in complete sentences, justify your answers.</p> <ol id="fs-idm39736064" type="a"><li>When the sample size is large, the mean of \(\overline{X}\) is approximately equal to the mean of <em data-effect="italics">Χ</em>.</li> <li>When the sample size is large, \(\overline{X}\) is approximately normally distributed.</li> <li>When the sample size is large, the standard deviation of \(\overline{X}\) is approximately the same as the standard deviation of <em data-effect="italics">Χ</em>.</li> </ol> </div> <div id="fs-idp58516336" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="element-169" data-type="exercise"><div id="id6272391" data-type="problem"><p>8) The percent of fat calories that a person in America consumes each day is normally distributed with a mean of about 36 and a standard deviation of about ten. Suppose that 16 individuals are randomly chosen. Let \(\overline{X}\) = average percent of fat calories.</p> <ol id="eip-idm27753728" type="a"><li>\(\overline{X}\) ~ ______(______, ______)</li> <li>For the group of 16, find the probability that the average percent of fat calories consumed is more than five. Graph the situation and shade in the area to be determined.</li> <li>Find the first quartile for the average percent of fat calories.</li> </ol> </div> <p>&nbsp;</p> </div> <div data-type="exercise"><div id="id6535840" data-type="problem"><p id="fs-idm32345632">9) The distribution of income in some Third World countries is considered wedge shaped (many very poor people, very few middle income people, and even fewer wealthy people). Suppose we pick a country with a wedge shaped distribution. Let the average salary be \$2,000 per year with a standard deviation of \$8,000. 
We randomly survey 1,000 residents of that country.</p> <ol type="a"><li>In words, <em data-effect="italics">Χ</em> = _____________</li> <li>In words, \(\overline{X}\) = _____________</li> <li>\(\overline{X}\) ~ _____(_____,_____)</li> <li>How is it possible for the standard deviation to be greater than the average?</li> <li>Why is it more likely that the average of the 1,000 residents will be from \$2,000 to \$2,100 than from \$2,100 to \$2,200?</li> </ol> </div> <div id="fs-idm50399184" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id6538396" data-type="problem"><p>10) Which of the following is NOT TRUE about the distribution for averages?</p> <ol type="a"><li>The mean, median, and mode are equal.</li> <li>The area under the curve is one.</li> <li>The curve never touches the <em data-effect="italics">x</em>-axis.</li> <li>The curve is skewed to the right.</li> </ol> </div> <p>&nbsp;</p> </div> <div data-type="exercise"><div id="id6538522" data-type="problem"><p id="fs-idp43800992">11) The cost of unleaded gasoline in the Bay Area once followed an unknown distribution with a mean of \$4.59 and a standard deviation of \$0.10. Sixteen gas stations from the Bay Area are randomly chosen. We are interested in the average cost of gasoline for the 16 gas stations. 
The distribution to use for the average cost of gasoline for the 16 gas stations is:</p> <ol type="a"><li>\(\overline{X}\) ~ <em data-effect="italics">N</em>(4.59, 0.10)</li> <li>\(\overline{X}\) ~ <em data-effect="italics">N</em>\(\left(4.59, \frac{0.10}{\sqrt{16}}\right)\)</li> <li>\(\overline{X}\) ~ <em data-effect="italics">N</em>\(\left(4.59, \frac{16}{0.10}\right)\)</li> <li>\(\overline{X}\) ~ <em data-effect="italics">N</em>\(\left(4.59, \frac{\sqrt{16}}{0.10}\right)\)</li> </ol> </div> <div id="id6538759" data-type="solution"><p>&nbsp;</p> <p><strong>Answers to odd questions</strong></p> <p>1)</p> <ol id="fs-idm137132816" type="a"><li><em data-effect="italics">Χ</em> = amount of change students carry</li> <li><em data-effect="italics">Χ</em> ~ <em data-effect="italics">E</em>(0.88, 0.88)</li> <li>\(\overline{X}\) = average amount of change carried by a sample of 25 students.</li> <li>\(\overline{X}\) ~ <em data-effect="italics">N</em>(0.88, 0.176)</li> <li>0.0819</li> <li>0.4276</li> <li>The distributions are different. Part e uses the exponential distribution of individual amounts, and part f uses the approximately normal distribution of the sample mean.</li> </ol> <p>3)</p> <ol id="fs-idm37918272" type="a"><li>length of time for an individual to complete IRS form 1040, in hours.</li> <li>mean length of time for a sample of 36 taxpayers to complete IRS form 1040, in hours.</li> <li><em data-effect="italics">N</em>\(\left(10.53, \frac{1}{3}\right)\)</li> <li>Yes. I would be surprised, because the probability is almost 0.</li> <li>No. 
I would not be totally surprised because the probability is 0.2312.</li> </ol> <p>5)</p> <ol id="fs-idm135637024" type="a"><li>the length of a song, in minutes, in the collection</li> <li><em data-effect="italics">U</em>(2, 3.5)</li> <li>the average length, in minutes, of the songs from a sample of five albums from the collection</li> <li><em data-effect="italics">N</em>(2.75, 0.0660)</li> <li>2.71 minutes</li> <li>0.09 minutes</li> </ol> <p>7)</p> <ol id="fs-idp58516592" type="a"><li>True. The mean of a sampling distribution of the means is approximately the mean of the data distribution.</li> <li>True. According to the Central Limit Theorem, the larger the sample, the closer the sampling distribution of the means comes to a normal distribution.</li> <li>False. The standard deviation of the sampling distribution of the means is \(\frac{\sigma }{\sqrt{n}}\), so it decreases as the sample size increases and becomes much smaller than the standard deviation of <em data-effect="italics">X</em>.</li> </ol> <p>9)</p> <ol id="fs-idp55008880" type="a"><li><em data-effect="italics">X</em> = the yearly income of someone in a Third World country</li> <li>the average salary from samples of 1,000 residents of a Third World country</li> <li>\(\overline{X}\) ∼ <em data-effect="italics">N</em>\(\left(2000, \frac{8000}{\sqrt{1000}}\right)\)</li> <li>When data values differ very widely, as with a few very large incomes here, the standard deviation can be larger than the average.</li> <li>The distribution of the sample mean will have higher probabilities closer to the population mean. 
<span data-type="newline"><br /> </span><em data-effect="italics">P</em>(2000 &lt; \(\overline{X}\) &lt; 2100) = 0.1537 <span data-type="newline"><br /> </span><em data-effect="italics">P</em>(2100 &lt; \(\overline{X}\) &lt; 2200) = 0.1317</li> </ol> <p>11) b</p> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl><dt>Average</dt> <dd>a number that describes the central tendency of the data; there are a number of specialized averages, including the arithmetic mean, weighted mean, median, mode, and geometric mean.</dd> </dl> <dl id="centlimit"><dt>Central Limit Theorem</dt> <dd id="id43867050">Given a random variable (RV) with known mean <em data-effect="italics">μ</em> and known standard deviation, <em data-effect="italics">σ</em>, we are sampling with size <em data-effect="italics">n</em>, and we are interested in two new RVs: the sample mean, \(\overline{X}\), and the sample sum, <em data-effect="italics">ΣΧ</em>. If the size (<em data-effect="italics">n</em>) of the sample is sufficiently large, then \(\overline{X}\) ~ <em data-effect="italics">N</em>(<em data-effect="italics">μ</em>, \(\frac{\sigma }{\sqrt{n}}\)) and <em data-effect="italics">ΣΧ</em> ~ <em data-effect="italics">N</em>(<em data-effect="italics">nμ</em>, (\(\sqrt{n}\))(<em data-effect="italics">σ</em>)). If the size (<em data-effect="italics">n</em>) of the sample is sufficiently large, then the distribution of the sample means and the distribution of the sample sums will approximate a normal distribution regardless of the shape of the population. The mean of the sample means will equal the population mean, and the mean of the sample sums will equal <em data-effect="italics">n</em> times the population mean. 
The standard deviation of the distribution of the sample means, \(\frac{\sigma }{\sqrt{n}}\), is called the standard error of the mean.</dd> </dl> <dl><dt>Normal Distribution</dt> <dd>a continuous random variable (RV) with pdf \(f\left(x\right) = \frac{1}{\sigma \sqrt{2\pi }}{e}^{-\frac{{\left(x - \mu \right)}^{2}}{2{\sigma }^{2}}}\), where <em data-effect="italics">μ</em> is the mean of the distribution and <em data-effect="italics">σ</em> is the standard deviation; notation: <em data-effect="italics">Χ</em> ~ <em data-effect="italics">N</em>(<em data-effect="italics">μ</em>, <em data-effect="italics">σ</em>). If <em data-effect="italics">μ</em> = 0 and <em data-effect="italics">σ</em> = 1, the RV is called a <strong>standard normal distribution</strong>.</dd> </dl> <dl id="stdmean"><dt>Standard Error of the Mean</dt> <dd id="id6201946">the standard deviation of the distribution of the sample means, or \(\frac{\sigma }{\sqrt{n}}\).</dd> </dl> </div> </div></div>
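<div class="textbox shaded"><p>As a worked illustration of the Formula Review for this chapter, the tablet-user example (<em data-effect="italics">μ</em> = 34, <em data-effect="italics">σ</em> = 15, <em data-effect="italics">n</em> = 100) can be followed by hand instead of with <code>normalcdf</code> and <code>invNorm</code>. This is only a sketch: it assumes two standard normal table values that do not appear in the text, namely that the area to the left of <em data-effect="italics">z</em> = 2.67 is about 0.9962 and that the 95<sup>th</sup> percentile of <em data-effect="italics">Z</em> is about 1.645.</p> <p>Standard error: \(\frac{\sigma }{\sqrt{n}} = \frac{15}{\sqrt{100}} = 1.5\)<span data-type="newline"><br /> </span>z-score for a sample mean of 30: \(z = \frac{30 - 34}{1.5} \approx -2.67\), so \(P\left(\overline{X} &gt; 30\right) = P\left(Z &gt; -2.67\right) \approx 0.9962\)<span data-type="newline"><br /> </span>95<sup>th</sup> percentile: \(k = 34 + \left(1.645\right)\left(1.5\right) \approx 36.5\)</p> <p>Both results agree with the calculator values given in the example, which suggests the z-score formula and the calculator functions describe the same normal model for \(\overline{X}\).</p></div>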
<div class="chapter standard" id="chapter-central-limit-theorem-pocket-change" title="Activity 8.3: Central Limit Theorem (Pocket Change)"><div class="chapter-title-wrap"><h3 class="chapter-number">48</h3><h2 class="chapter-title"><span class="display-none">Activity 8.3: Central Limit Theorem (Pocket Change)</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1172804210713" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Central Limit Theorem (Pocket Change)</div> <p id="id48439801">Class Time:</p> <p id="id48439812">Names:</p> <div id="id48439824" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will demonstrate and compare properties of the central limit theorem.</li> </ul> </div> <div id="id17413334" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idm68262064">This lab works best when sampling from several classes and combining data.</p> </div> <div id="list-9099999888" data-type="list"><div data-type="title">Collect the Data</div> <ol><li>Count the change in your pocket. (Do not include bills.)</li> <li>Randomly survey 30 classmates. 
Record the values of the change in <a class="autogenerated-content" href="#Ch07_lab1_tbl001">(Figure)</a>.<br /> <table id="Ch07_lab1_tbl001" summary="Blank table with 30 empty cells."><tbody><tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> </tbody> </table> </li> <li>Construct a histogram. Make five to six intervals. Sketch the graph using a ruler and pencil. Scale the axes.<span data-type="newline"><br /> </span> <div id="id17573354" class="bc-figure figure"><span id="id17573359" data-type="media" data-alt="Blank graph template. The horizontal axis is labeled Value of the change and the vertical axis is labeled Frequency."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch07_09_01-1.png" alt="Blank graph template. The horizontal axis is labeled Value of the change and the vertical axis is labeled Frequency." width="380" data-media-type="image/png" /></span></div> </li> <li>Calculate the following (<em data-effect="italics">n</em> = 1; surveying one person at a time): <ol id="list-975947" type="a"><li>\(\overline{x}\) = _______</li> <li><em data-effect="italics">s</em> = _______</li> </ol> </li> <li>Draw a smooth curve through the tops of the bars of the histogram. 
Use one to two complete sentences to describe the general shape of the curve.</li> </ol> </div> <p id="element-509"><span data-type="title">Collecting Averages of Pairs</span> Repeat steps one through five of the section titled <a href="#CollectData">Collect the Data</a> with one exception. Instead of recording the change of 30 classmates, record the average change of 30 pairs.</p> <ol id="list-9759443"><li>Randomly survey 30 <strong>pairs</strong> of classmates.</li> <li>Record the values of the average of their change in <a class="autogenerated-content" href="#Ch07_lab1_tbl002">(Figure)</a>.<br /> <table id="Ch07_lab1_tbl002" summary="Blank table containing 30 blank cells."><tbody><tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> </tbody> </table> </li> <li>Construct a histogram. Scale the axes using the same scaling you used for the section titled <a href="#CollectData">Collect the Data</a>. Sketch the graph using a ruler and a pencil.<span data-type="newline"><br /> </span> <div id="id17561753" class="bc-figure figure"><span id="id17561758" data-type="media" data-alt="This is a blank graph template. The horizontal axis is labeled Value of the change and the vertical axis is labeled Frequency."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch07_09_02-1.png" alt="This is a blank graph template. 
The horizontal axis is labeled Value of the change and the vertical axis is labeled Frequency." width="380" data-media-type="image/png" /></span></div> </li> <li>Calculate the following (<em data-effect="italics">n</em> = 2; surveying two people at a time): <ol id="list-975875947" type="a"><li>\(\overline{x}\) = _______</li> <li><em data-effect="italics">s</em> = _______</li> </ol> </li> <li>Draw a smooth curve through the tops of the bars of the histogram. Use one to two complete sentences to describe the general shape of the curve.</li> </ol> <p id="element-505436759"><span data-type="title">Collecting Averages of Groups of Five</span> Repeat steps one through five of the section titled <a href="#CollectData">Collect the Data</a> with one exception. Instead of recording the change of 30 classmates, record the average change of 30 groups of five.</p> <ol id="list-9735659975974443" type="1"><li>Randomly survey 30 <strong>groups of five</strong> classmates.</li> <li>Record the values of the average of their change.<br /> <table id="Ch07_lab1_tbl003" summary="Blank table containing 30 blank cells."><tbody><tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> </tbody> </table> </li> <li>Construct a histogram. Scale the axes using the same scaling you used for the section titled <a href="#CollectData">Collect the Data</a>. Sketch the graph using a ruler and a pencil. 
<div id="id17563881" class="bc-figure figure"><span id="id17563886" data-type="media" data-alt="This is a blank graph template. The horizontal axis is labeled Value of the change and the vertical axis is labeled Frequency."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch07_09_03-1.png" alt="This is a blank graph template. The horizontal axis is labeled Value of the change and the vertical axis is labeled Frequency." width="380" data-media-type="image/png" /></span></div> </li> <li>Calculate the following (<em data-effect="italics">n</em> = 5; surveying five people at a time): <ol id="list-97587555947" type="a"><li>\(\overline{x}\) = _______</li> <li><em data-effect="italics">s</em> = _______</li> </ol> </li> <li>Draw a smooth curve through the tops of the bars of the histogram. Use one to two complete sentences to describe the general shape of the curve.</li> </ol> <div data-type="list"><div data-type="title">Discussion Questions</div> <ol><li>Why did the shape of the distribution of the data change as <em data-effect="italics">n</em> changed? Use one to two complete sentences to explain what happened.</li> <li>In the section titled <a href="#CollectData">Collect the Data</a>, what was the approximate distribution of the data? <em data-effect="italics">X</em> ~ _____(_____,_____)</li> <li>In the section titled <a href="#element-505436759">Collecting Averages of Groups of Five</a>, what was the approximate distribution of the averages? \(\overline{X}\) ~ _____(_____,_____)</li> <li>In one to two complete sentences, explain any differences in your answers to the previous two questions.</li> </ol> </div> </div> </div></div>
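<div class="textbox shaded"><p>As a rough guide to what the three histograms in the Pocket Change activity may show, the standard error formula \(\frac{\sigma }{\sqrt{n}}\) can be evaluated for the three group sizes used in the lab. This is only a sketch under stated assumptions: it supposes that the amounts of change carried by different classmates are independent and come from one population with standard deviation <em data-effect="italics">σ</em>. The lab does not supply <em data-effect="italics">σ</em>; each class estimates it with its own value of <em data-effect="italics">s</em> from the <em data-effect="italics">n</em> = 1 data.</p> <p>Individuals (<em data-effect="italics">n</em> = 1): \(\frac{\sigma }{\sqrt{1}} = \sigma \)<span data-type="newline"><br /> </span>Pairs (<em data-effect="italics">n</em> = 2): \(\frac{\sigma }{\sqrt{2}} \approx 0.71\sigma \)<span data-type="newline"><br /> </span>Groups of five (<em data-effect="italics">n</em> = 5): \(\frac{\sigma }{\sqrt{5}} \approx 0.45\sigma \)</p> <p>Under these assumptions, the values of <em data-effect="italics">s</em> computed for the pair averages and the group-of-five averages would be expected to fall near 71% and 45% of the value of <em data-effect="italics">s</em> for individuals. With only 30 observations per histogram, class results may differ noticeably from these figures.</p></div>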
<div class="chapter standard" id="chapter-central-limit-theorem-cookie-recipes" title="Activity 8.4: Central Limit Theorem (Cookie Recipes)"><div class="chapter-title-wrap"><h3 class="chapter-number">49</h3><h2 class="chapter-title"><span class="display-none">Activity 8.4: Central Limit Theorem (Cookie Recipes)</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1171897709330" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Central Limit Theorem (Cookie Recipes)</div> <p id="id5609695">Class Time:</p> <p id="id5609702">Names:</p> <div id="list-2397864897" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will demonstrate and compare properties of the central limit theorem.</li> </ul> </div> <p><span data-type="title">Given</span><em data-effect="italics">X</em> = length of time (in days) that a cookie recipe lasted at the Olmstead Homestead. (Assume that each of the different recipes makes the same quantity of cookies.)</p> <table id="tableones23" summary="This table presents recipe data in the first column and X length of days data in the second column of 15 rows. 
These two columns repeat three more times in the same table."><thead><tr><th data-align="center">Recipe #</th> <th data-align="center"><em data-effect="italics">X</em></th> <th></th> <th data-align="center">Recipe #</th> <th data-align="center"><em data-effect="italics">X</em></th> <th></th> <th data-align="center">Recipe #</th> <th data-align="center"><em data-effect="italics">X</em></th> <th></th> <th data-align="center">Recipe #</th> <th data-align="center"><em data-effect="italics">X</em></th> </tr> </thead> <tbody><tr><td data-align="center">1</td> <td data-align="center">1</td> <td></td> <td data-align="center">16</td> <td data-align="center">2</td> <td></td> <td data-align="center">31</td> <td data-align="center">3</td> <td></td> <td data-align="center">46</td> <td data-align="center">2</td> </tr> <tr><td data-align="center">2</td> <td data-align="center">5</td> <td></td> <td data-align="center">17</td> <td data-align="center">2</td> <td></td> <td data-align="center">32</td> <td data-align="center">4</td> <td></td> <td data-align="center">47</td> <td data-align="center">2</td> </tr> <tr><td data-align="center">3</td> <td data-align="center">2</td> <td></td> <td data-align="center">18</td> <td data-align="center">4</td> <td></td> <td data-align="center">33</td> <td data-align="center">5</td> <td></td> <td data-align="center">48</td> <td data-align="center">11</td> </tr> <tr><td data-align="center">4</td> <td data-align="center">5</td> <td></td> <td data-align="center">19</td> <td data-align="center">6</td> <td></td> <td data-align="center">34</td> <td data-align="center">6</td> <td></td> <td data-align="center">49</td> <td data-align="center">5</td> </tr> <tr><td data-align="center">5</td> <td data-align="center">6</td> <td></td> <td data-align="center">20</td> <td data-align="center">1</td> <td></td> <td data-align="center">35</td> <td data-align="center">6</td> <td></td> <td data-align="center">50</td> <td data-align="center">5</td> </tr> <tr><td 
data-align="center">6</td> <td data-align="center">1</td> <td></td> <td data-align="center">21</td> <td data-align="center">6</td> <td></td> <td data-align="center">36</td> <td data-align="center">1</td> <td></td> <td data-align="center">51</td> <td data-align="center">4</td> </tr> <tr><td data-align="center">7</td> <td data-align="center">2</td> <td></td> <td data-align="center">22</td> <td data-align="center">5</td> <td></td> <td data-align="center">37</td> <td data-align="center">1</td> <td></td> <td data-align="center">52</td> <td data-align="center">6</td> </tr> <tr><td data-align="center">8</td> <td data-align="center">6</td> <td></td> <td data-align="center">23</td> <td data-align="center">2</td> <td></td> <td data-align="center">38</td> <td data-align="center">2</td> <td></td> <td data-align="center">53</td> <td data-align="center">5</td> </tr> <tr><td data-align="center">9</td> <td data-align="center">5</td> <td></td> <td data-align="center">24</td> <td data-align="center">5</td> <td></td> <td data-align="center">39</td> <td data-align="center">1</td> <td></td> <td data-align="center">54</td> <td data-align="center">1</td> </tr> <tr><td data-align="center">10</td> <td data-align="center">2</td> <td></td> <td data-align="center">25</td> <td data-align="center">1</td> <td></td> <td data-align="center">40</td> <td data-align="center">6</td> <td></td> <td data-align="center">55</td> <td data-align="center">1</td> </tr> <tr><td data-align="center">11</td> <td data-align="center">5</td> <td></td> <td data-align="center">26</td> <td data-align="center">6</td> <td></td> <td data-align="center">41</td> <td data-align="center">1</td> <td></td> <td data-align="center">56</td> <td data-align="center">2</td> </tr> <tr><td data-align="center">12</td> <td data-align="center">1</td> <td></td> <td data-align="center">27</td> <td data-align="center">4</td> <td></td> <td data-align="center">42</td> <td data-align="center">6</td> <td></td> <td data-align="center">57</td> <td 
data-align="center">4</td> </tr> <tr><td data-align="center">13</td> <td data-align="center">1</td> <td></td> <td data-align="center">28</td> <td data-align="center">1</td> <td></td> <td data-align="center">43</td> <td data-align="center">2</td> <td></td> <td data-align="center">58</td> <td data-align="center">3</td> </tr> <tr><td data-align="center">14</td> <td data-align="center">3</td> <td></td> <td data-align="center">29</td> <td data-align="center">6</td> <td></td> <td data-align="center">44</td> <td data-align="center">6</td> <td></td> <td data-align="center">59</td> <td data-align="center">6</td> </tr> <tr><td data-align="center">15</td> <td data-align="center">2</td> <td></td> <td data-align="center">30</td> <td data-align="center">2</td> <td></td> <td data-align="center">45</td> <td data-align="center">2</td> <td></td> <td data-align="center">60</td> <td data-align="center">5</td> </tr> </tbody> </table> <p id="element-538">Calculate the following:</p> <ol id="list-2396953" type="a"><li><em data-effect="italics">μ<sub>x</sub></em> = _______</li> <li><em data-effect="italics">σ<sub>x</sub></em> = _______</li> </ol> <p id="element-453" class="finger"><span data-type="title">Collect the Data</span>Use a random number generator to randomly select four samples of size <em data-effect="italics">n</em> = 5 from the given population. Record your samples in <a class="autogenerated-content" href="#Ch07_lab2_tbl002">(Figure)</a>. Then, for each sample, calculate the mean to the nearest tenth. Record them in the spaces provided. 
Record the sample means for the rest of the class.</p> <ol id="list-98698565"><li>Complete the table:<br /> <table id="Ch07_lab2_tbl002" summary="Partially filled table with samples (4) in columns and 5 blank rows plus a blank Means row."><thead><tr><th></th> <th>Sample 1</th> <th>Sample 2</th> <th>Sample 3</th> <th>Sample 4</th> <th>Sample means from other groups:</th> </tr> </thead> <tbody><tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td>Means:</td> <td>\(\overline{x}\) = ____</td> <td>\(\overline{x}\) = ____</td> <td>\(\overline{x}\) = ____</td> <td>\(\overline{x}\) = ____</td> <td></td> </tr> </tbody> </table> </li> <li>Calculate the following: <ol id="eip-idp136101696" type="a"><li>\(\overline{x}\) = _______</li> <li><em data-effect="italics">s</em><sub>\(\overline{x}\)</sub> = _______</li> </ol> </li> <li class="finger">Again, use a random number generator to randomly select four samples from the population. This time, make the samples of size <em data-effect="italics">n</em> = 10. Record the samples in <a class="autogenerated-content" href="#Ch07_lab2_tbl003">(Figure)</a>. As before, for each sample, calculate the mean to the nearest tenth. Record them in the spaces provided. 
Record the sample means for the rest of the class.<br /> <table id="Ch07_lab2_tbl003" summary="Same as the above table except with 10 blank rows and a means row."><thead><tr><th></th> <th>Sample 1</th> <th>Sample 2</th> <th>Sample 3</th> <th>Sample 4</th> <th>Sample means from other groups</th> </tr> </thead> <tbody><tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td></td> <td></td> <td></td> <td></td> <td></td> <td></td> </tr> <tr><td>Means:</td> <td>\(\overline{x}\) = ____</td> <td>\(\overline{x}\) = ____</td> <td>\(\overline{x}\) = ____</td> <td>\(\overline{x}\) = ____</td> <td></td> </tr> </tbody> </table> </li> <li>Calculate the following: <ol id="element-8475643" type="a"><li>\(\overline{x}\) = ______</li> <li><em data-effect="italics">s</em><sub>\(\overline{x}\)</sub> = ______</li> </ol> </li> <li>For the original population, construct a histogram. Make intervals with a bar width of one day. Sketch the graph using a ruler and pencil. Scale the axes.<span data-type="newline"><br /> </span> <div id="id6597795" class="bc-figure figure"><span id="id6597799" data-type="media" data-alt="This is a blank graph template. The horizontal axis is labeled Time (days) and the vertical axis is labeled Frequency."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch07_10_01-1.png" alt="This is a blank graph template. 
The horizontal axis is labeled Time (days) and the vertical axis is labeled Frequency." width="380" data-media-type="image/png" /></span></div> </li> <li>Draw a smooth curve through the tops of the bars of the histogram. Use one to two complete sentences to describe the general shape of the curve.</li> </ol> <div id="list-2646" data-type="list"><div data-type="title">Repeat the Procedure for <em data-effect="italics">n</em> = 5</div> <ol><li>For the sample of <em data-effect="italics">n</em> = 5 days averaged together, construct a histogram of the averages (your means together with the means of the other groups). Make intervals with bar widths of \(\frac{1}{2}\) a day. Sketch the graph using a ruler and pencil. Scale the axes. <div id="id6597892" class="bc-figure figure"><span id="id6597897" data-type="media" data-alt="This is a blank graph template. The horizontal axis is labeled Time (days) and the vertical axis is labeled Frequency."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch07_10_02-1.png" alt="This is a blank graph template. The horizontal axis is labeled Time (days) and the vertical axis is labeled Frequency." width="380" data-media-type="image/png" /></span></div> </li> <li>Draw a smooth curve through the tops of the bars of the histogram. Use one to two complete sentences to describe the general shape of the curve.</li> </ol> </div> <div id="list-2764646" data-type="list"><div data-type="title">Repeat the Procedure for <em data-effect="italics">n</em> = 10</div> <ol><li>For the sample of <em data-effect="italics">n</em> = 10 days averaged together, construct a histogram of the averages (your means together with the means of the other groups). Make intervals with bar widths of \(\frac{1}{2}\) a day. Sketch the graph using a ruler and pencil. Scale the axes. <div id="id6597991" class="bc-figure figure"><span id="id6597997" data-type="media" data-alt="This is a blank graph template. 
The horizontal axis is labeled Time (days) and the vertical axis is labeled Frequency."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch07_10_03-1.png" alt="This is a blank graph template. The horizontal axis is labeled Time (days) and the vertical axis is labeled Frequency." width="380" data-media-type="image/png" /></span></div> </li> <li>Draw a smooth curve through the tops of the bars of the histogram. Use one to two complete sentences to describe the general shape of the curve.</li> </ol> </div> <div id="list-2937597532" data-type="list"><div data-type="title">Discussion Questions</div> <ol data-mark-suffix="."><li>Compare the three histograms you have made: the one for the population and the two for the sample means. In three to five sentences, describe the similarities and differences.</li> <li>State the theoretical distributions for the sample means, according to the central limit theorem. <ol id="list-239875697325" type="a"><li><em data-effect="italics">n</em> = 5: \(\overline{X}\) ~ _____(_____,_____)</li> <li><em data-effect="italics">n</em> = 10: \(\overline{X}\) ~ _____(_____,_____)</li> </ol> </li> <li>Are the sample means for <em data-effect="italics">n</em> = 5 and <em data-effect="italics">n</em> = 10 “close” to the theoretical mean, <em data-effect="italics">μ<sub>x</sub></em>? Explain why or why not.</li> <li>Which of the two distributions of sample means has the smaller standard deviation? Why?</li> <li>As <em data-effect="italics">n</em> changed, why did the shape of the distribution of the data change? Use one to two complete sentences to explain what happened.</li> </ol> </div> </div> </div></div>
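<div class="textbox">
<p>The sampling procedure in Activity 8.4 (Cookie Recipes) can also be carried out by computer. The following is a minimal sketch, not part of the original activity: it uses the 60 values of <em data-effect="italics">X</em> from the recipe table, draws many simple random samples of size 5 and size 10, and compares the spread of the sample means with \(\frac{\sigma }{\sqrt{n}}\). The function name <code>sample_means</code>, the seed, and the choice of 2000 samples per size are assumptions made only for illustration.</p>

```python
import math
import random
import statistics

# Days each of the 60 recipes lasted, in recipe order (copied from the table).
population = [
    1, 5, 2, 5, 6, 1, 2, 6, 5, 2, 5, 1, 1, 3, 2,
    2, 2, 4, 6, 1, 6, 5, 2, 5, 1, 6, 4, 1, 6, 2,
    3, 4, 5, 6, 6, 1, 1, 2, 1, 6, 1, 6, 2, 6, 2,
    2, 2, 11, 5, 5, 4, 6, 5, 1, 1, 2, 4, 3, 6, 5,
]


def sample_means(data, n, num_samples, rng):
    """Draw num_samples simple random samples of size n and return their means."""
    return [statistics.mean(rng.sample(data, n)) for _ in range(num_samples)]


rng = random.Random(2022)  # hypothetical seed, fixed so the run is repeatable

mu = statistics.mean(population)       # population mean
sigma = statistics.pstdev(population)  # population standard deviation

means_5 = sample_means(population, 5, 2000, rng)
means_10 = sample_means(population, 10, 2000, rng)

print(f"population: mean = {mu:.2f}, sd = {sigma:.2f}")
for n, means in ((5, means_5), (10, means_10)):
    print(
        f"n = {n}: mean of sample means = {statistics.mean(means):.2f}, "
        f"sd of sample means = {statistics.stdev(means):.2f}, "
        f"sigma / sqrt(n) = {sigma / math.sqrt(n):.2f}"
    )
```

<p>Because each sample is drawn without replacement from only 60 recipes, the simulated standard deviation of the sample means should come out somewhat below \(\frac{\sigma }{\sqrt{n}}\); the comparison between <em data-effect="italics">n</em> = 5 and <em data-effect="italics">n</em> = 10 is the point of the sketch.</p>
</div>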
<div class="part " id="part-confidence-intervals"><div class="part-title-wrap"><h3 class="part-number">IX</h3><h1 class="part-title">Chapter 9: Confidence Intervals</h1></div><div class="ugc part-ugc"></div></div>
<div class="chapter standard" id="chapter-introduction-20" title="Chapter 9.1: Introduction"><div class="chapter-title-wrap"><h3 class="chapter-number">50</h3><h2 class="chapter-title"><span class="display-none">Chapter 9.1: Introduction</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-idp38862000" class="splash"><div class="bc-figcaption figcaption">Have you ever wondered what the average number of M&amp;Ms in a bag at the grocery store is? You can use confidence intervals to answer this question. (credit: comedy_nose/flickr)</div> <p><span id="fs-idm31074784" data-type="media" data-alt="This is a photo of M&amp;Ms piled together. The M&amp;Ms are red, blue, green, yellow, orange and brown."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C08_CO-1.jpg" alt="This is a photo of M&amp;Ms piled together. The M&amp;Ms are red, blue, green, yellow, orange and brown." width="500" data-media-type="image/jpeg" /></span></p> </div> <div id="fs-idp115342032" class="chapter-objectives" data-type="note" data-has-label="true" data-label=""><div data-type="title">Chapter Objectives</div> <p>By the end of this chapter, the student should be able to:</p> <ul id="list12315"><li>Calculate and interpret confidence intervals for estimating a population mean and a population proportion.</li> <li>Interpret the Student&#8217;s <em data-effect="italics">t</em> probability distribution as the sample size changes.</li> <li>Discriminate between problems applying the normal and the Student&#8217;s <em data-effect="italics">t</em> distributions.</li> <li>Calculate the sample size required to estimate a population mean and a population proportion given a desired confidence level and margin of error.</li> </ul> </div> <p>Suppose you were trying to determine the mean rent of a two-bedroom apartment in your town. You might look in the classified section of the newspaper, write down several of the listed rents, and average them together. 
You would have obtained a point estimate of the true mean. If you are trying to determine the percentage of times you make a basket when shooting a basketball, you might count the number of shots you make and divide that by the number of shots you attempted. In this case, you would have obtained a point estimate for the true proportion.</p> <p id="element-550">We use sample data to make generalizations about an unknown population. This part of statistics is called <span data-type="term">inferential statistics</span>. <strong>The sample data help us to make an estimate of a population <span data-type="term">parameter</span></strong>. We realize that the point estimate is most likely not the exact value of the population parameter, but it should be close to it. After calculating point estimates, we construct interval estimates, called confidence intervals.</p> <p id="element-667">In this chapter, you will learn to construct and interpret confidence intervals. You will also learn a new distribution, the Student&#8217;s <em data-effect="italics">t</em>, and how it is used with these intervals. Throughout the chapter, it is important to keep in mind that the confidence interval is a random variable: its endpoints change from sample to sample. It is the population parameter that is fixed.</p> <p>If you worked in the marketing department of an entertainment company, you might be interested in the mean number of songs a consumer downloads a month from iTunes. If so, you could conduct a survey and calculate the sample mean, \(\overline{x}\), and the sample standard deviation, <em data-effect="italics">s</em>. You would use \(\overline{x}\) to estimate the population mean and <em data-effect="italics">s</em> to estimate the population standard deviation. The sample mean, \(\overline{x}\), is the <span data-type="term">point estimate</span> for the population mean, <em data-effect="italics">μ</em>. 
The sample standard deviation, <em data-effect="italics">s</em>, is the point estimate for the population standard deviation, <em data-effect="italics">σ</em>.</p> <p id="eip-65">Each of \(\overline{x}\) and <em data-effect="italics">s</em> is called a statistic.</p> <p>A <span data-type="term">confidence interval</span> is another type of estimate but, instead of being just one number, it is an interval of numbers. It provides a range of reasonable values in which we expect the population parameter to fall. There is no guarantee that a given confidence interval captures the parameter, but there is a predictable probability of success.</p> <p id="element-53">Suppose, for the iTunes example, we do not know the population mean <em data-effect="italics">μ</em>, but we do know that the population standard deviation is <em data-effect="italics">σ</em> = 1 and our sample size is 100. Then, by the central limit theorem, the standard deviation for the sample mean is</p> <p id="element-624">\(\frac{\sigma }{\sqrt{n}}=\frac{1}{\sqrt{100}}=0.1\).</p> <p>The <span data-type="term">empirical rule</span>, which applies to bell-shaped distributions, says that in approximately 95% of the samples, the sample mean, \(\overline{x}\), will be within two standard deviations of the population mean <em data-effect="italics">μ</em>. For our iTunes example, two standard deviations is (2)(0.1) = 0.2. The sample mean \(\overline{x}\) is likely to be within 0.2 units of <em data-effect="italics">μ</em>.</p> <p id="element-457">Because \(\overline{x}\) is within 0.2 units of <em data-effect="italics">μ</em> in approximately 95% of the samples, the unknown <em data-effect="italics">μ</em> is within 0.2 units of \(\overline{x}\) in approximately 95% of the samples. 
The population mean <em data-effect="italics">μ</em> is contained in an interval whose lower number is calculated by taking the sample mean and subtracting two standard deviations (2)(0.1) and whose upper number is calculated by taking the sample mean and adding two standard deviations. In other words, <em data-effect="italics">μ</em> is between \(\overline{x}-0.2\) and \(\overline{x}+0.2\) in 95% of all the samples.</p> <p>For the iTunes example, suppose that a sample produced a sample mean \(\overline{x}=2\). Then the unknown population mean <em data-effect="italics">μ</em> is between</p> <p>\(\overline{x}-0.2=2-0.2=1.8\) and \(\overline{x}+0.2=2+0.2=2.2\)</p> <p>We say that we are <strong>95% confident</strong> that the unknown population mean number of songs downloaded from iTunes per month is between 1.8 and 2.2. <strong>The 95% confidence interval is (1.8, 2.2).</strong></p> <p>The 95% confidence interval implies two possibilities. Either the interval (1.8, 2.2) contains the true mean <em data-effect="italics">μ</em> or our sample produced an \(\overline{x}\) that is not within 0.2 units of the true mean <em data-effect="italics">μ</em>. The second possibility happens for only 5% of all the samples (100% – 95% = 5%).</p> <p id="element-39">Remember that a confidence interval is created for an unknown population parameter like the population mean, <em data-effect="italics">μ</em>. 
Confidence intervals for some parameters have the form:</p> <p id="element-75"><strong>(point estimate – margin of error, point estimate + </strong><span data-type="term">margin of error</span><strong>)</strong></p> <p>The margin of error depends on the confidence level or percentage of confidence and the standard error of the mean.</p> <p>When you read newspapers and journals, some reports will use the phrase &#8220;margin of error.&#8221; Other reports will not use that phrase, but include a confidence interval as the point estimate plus or minus the margin of error. These are two ways of expressing the same concept.</p> <div id="eip-882" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idm127604736">Although the text only covers symmetrical confidence intervals, there are non-symmetrical confidence intervals (for example, a confidence interval for the standard deviation).</p> </div> <div id="fs-idp47490208" class="statistics collab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Exercise</div> <p>Have your instructor record the number of meals each student in your class eats out in a week. Assume that the standard deviation is known to be three meals. 
Construct an approximate 95% confidence interval for the true mean number of meals students eat out each week.</p> <ol><li>Calculate the sample mean.</li> <li>Let <em data-effect="italics">σ</em> = 3 and <em data-effect="italics">n</em> = the number of students surveyed.</li> <li>Construct the interval \(\left(\overline{x}-2\cdot \frac{\sigma }{\sqrt{n}},\ \overline{x}+2\cdot \frac{\sigma }{\sqrt{n}}\right)\).</li> </ol> <p>We say we are approximately 95% confident that the true mean number of meals that students eat out in a week is between __________ and ___________.</p> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="coninter"><dt>Confidence Interval (CI)</dt> <dd id="id20495395">an interval estimate for an unknown population parameter. This depends on: <ul id="confint1"><li>the desired confidence level,</li> <li>information that is known about the distribution (for example, known standard deviation),</li> <li>the sample and its size.</li> </ul> </dd> </dl> <dl id="infrstats"><dt>Inferential Statistics</dt> <dd id="id20359958">also called statistical inference or inductive statistics; this facet of statistics deals with estimating a population parameter based on a sample statistic. For example, if four out of the 100 calculators sampled are defective we might infer that four percent of the production is defective.</dd> </dl> <dl id="parameter"><dt>Parameter</dt> <dd id="id18035004">a numerical characteristic of a population</dd> </dl> <dl id="pointest"><dt>Point Estimate</dt> <dd id="id20057179">a single number computed from a sample and used to estimate a population parameter</dd> </dl> </div> </div></div>
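<div class="textbox">
<p>The approximate 95% interval used in Chapter 9.1, \(\left(\overline{x}-2\cdot \frac{\sigma }{\sqrt{n}},\ \overline{x}+2\cdot \frac{\sigma }{\sqrt{n}}\right)\), is simple enough to compute in a few lines. The sketch below is an illustration only and assumes, as the chapter does, that the population standard deviation is known; the function name <code>approx_95_interval</code> is hypothetical. It is applied to the iTunes example (\(\overline{x}\) = 2, <em data-effect="italics">σ</em> = 1, <em data-effect="italics">n</em> = 100).</p>

```python
import math


def approx_95_interval(sample_mean, sigma, n):
    """Approximate 95% confidence interval for a population mean.

    Uses the empirical-rule multiplier of 2 standard deviations of the
    sample mean, and assumes the population standard deviation sigma is known.
    """
    margin_of_error = 2 * sigma / math.sqrt(n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error


# iTunes example from the chapter: sample mean 2, sigma = 1, n = 100.
low, high = approx_95_interval(2, 1, 100)
print(f"({low:.1f}, {high:.1f})")  # prints (1.8, 2.2)
```

<p>For the Collaborative Exercise, the same function could be called with the class's sample mean, <em data-effect="italics">σ</em> = 3, and <em data-effect="italics">n</em> equal to the number of students surveyed.</p>
</div>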
<div class="chapter standard" id="chapter-a-population-proportion" title="Chapter 9.2: A Population Proportion"><div class="chapter-title-wrap"><h3 class="chapter-number">51</h3><h2 class="chapter-title"><span class="display-none">Chapter 9.2: A Population Proportion</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>During an election year, we see articles in the newspaper that state <span data-type="term">confidence intervals</span> in terms of proportions or percentages. For example, a poll for a particular candidate running for president might show that the candidate has 40% of the vote within three percentage points (if the sample is large enough). Often, election polls are calculated with 95% confidence, so the pollsters would be 95% confident that the true proportion of voters who favor the candidate is between 0.37 and 0.43: (0.40 – 0.03, 0.40 + 0.03).</p> <p>Investors in the stock market are interested in the true proportion of stocks that go up and down each week. Businesses that sell personal computers are interested in the proportion of households in the United States that own personal computers. Confidence intervals can be calculated for the true proportion of stocks that go up or down each week and for the true proportion of households in the United States that own personal computers.</p> <p>The procedure to find the confidence interval, the sample size, the <span data-type="term">error bound</span>, and the <span data-type="term">confidence level</span> for a proportion is similar to that for the population mean, but the formulas are different.</p> <p id="element-574"><strong>How do you know you are dealing with a proportion problem?</strong> First, the underlying <strong>distribution is a</strong> <span data-type="term">binomial distribution</span>. (There is no mention of a mean or average.) 
If <em data-effect="italics">X</em> is a binomial random variable, then <em data-effect="italics">X</em> ~ <em data-effect="italics">B</em>(<em data-effect="italics">n</em>, <em data-effect="italics">p</em>) where <em data-effect="italics">n</em> is the number of trials and <em data-effect="italics">p</em> is the probability of a success. To form a proportion, take <em data-effect="italics">X</em>, the random variable for the number of successes, and divide it by <em data-effect="italics">n</em>, the number of trials (or the sample size). The random variable <em data-effect="italics">P′</em> (read &#8220;P prime&#8221;) is that proportion,</p> <p>\({P}^{\prime }=\frac{X}{n}\)</p> <p>(Sometimes the random variable is denoted as \(\hat{P}\), read &#8220;P hat&#8221;.)</p> <p>When <em data-effect="italics">n</em> is large and <em data-effect="italics">p</em> is not close to zero or one, we can use the <span data-type="term">normal distribution</span> to approximate the binomial.</p> <p>\(X\sim N\left(np,\sqrt{npq}\right)\), where <em data-effect="italics">q</em> = 1 – <em data-effect="italics">p</em></p> <p>If we divide the random variable, the mean, and the standard deviation by <em data-effect="italics">n</em>, we get a normal distribution of proportions with <em data-effect="italics">P′</em>, called the estimated proportion, as the random variable. (Recall that a proportion is the number of successes divided by <em data-effect="italics">n</em>.)</p> <p id="element-461">\(\frac{X}{n}={P}^{\prime }\sim N\left(\frac{np}{n},\frac{\sqrt{npq}}{n}\right)\)</p> <p>Using algebra to simplify: \(\frac{np}{n}=p\) and \(\frac{\sqrt{npq}}{n}=\sqrt{\frac{pq}{n}}\)</p> <p><strong><em data-effect="italics">P′</em> follows a normal distribution for proportions</strong>: \({P}^{\prime }\sim N\left(p,\sqrt{\frac{pq}{n}}\right)\)</p> <p>The confidence interval has the form (<em data-effect="italics">p′</em> – <em data-effect="italics">EBP</em>, <em data-effect="italics">p′</em> + <em data-effect="italics">EBP</em>). 
<em data-effect="italics">EBP</em> is the error bound for the proportion.</p> <p><em data-effect="italics">p′</em> = \(\frac{x}{n}\)</p> <p><em data-effect="italics">p′</em> = the <strong>estimated proportion</strong> of successes (<em data-effect="italics">p′</em> is a <strong>point estimate</strong> for <em data-effect="italics">p</em>, the true proportion.)</p> <p><em data-effect="italics">x</em> = the <strong>number</strong> of successes</p> <p><em data-effect="italics">n</em> = the size of the sample</p> <p><strong>The error bound for a proportion is</strong></p> <p id="element-249">\(EBP=\left({z}_{\frac{\alpha }{2}}\right)\left(\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}}\right)\) where <em data-effect="italics">q′</em> = 1 – <em data-effect="italics">p′</em></p> <p>This formula is similar to the error bound formula for a mean, except that the &#8220;appropriate standard deviation&#8221; is different. For a mean, when the population standard deviation is known, the appropriate standard deviation that we use is \(\frac{\sigma }{\sqrt{n}}\). For a proportion, the appropriate standard deviation is \(\sqrt{\frac{pq}{n}}\).</p> <p>However, in the error bound formula, we use \(\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}}\) as the standard deviation, instead of \(\sqrt{\frac{pq}{n}}\).</p> <p>In the error bound formula, the <strong>sample proportions <em data-effect="italics">p′</em> and <em data-effect="italics">q′</em> are estimates of the unknown population proportions <em data-effect="italics">p</em> and <em data-effect="italics">q</em></strong>. The estimated proportions <em data-effect="italics">p′</em> and <em data-effect="italics">q′</em> are used because <em data-effect="italics">p</em> and <em data-effect="italics">q</em> are not known. 
The sample proportions <em data-effect="italics">p′</em> and <em data-effect="italics">q′</em> are calculated from the data: <em data-effect="italics">p′</em> is the estimated proportion of successes, and <em data-effect="italics">q′</em> is the estimated proportion of failures.</p> <p>The confidence interval can be used only if the number of successes <em data-effect="italics">np′</em> and the number of failures <em data-effect="italics">nq′</em> are both greater than five.</p> <div id="id24362956" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idp79319136">For the normal distribution of proportions, the <em data-effect="italics">z</em>-score formula is as follows.</p> <p>If \({P}^{\prime }\text{~}N\left(p,\sqrt{\frac{pq}{n}}\right)\) then the <em data-effect="italics">z</em>-score formula is \(z=\frac{{p}^{\prime }-p}{\sqrt{\frac{pq}{n}}}\)</p> </div> <div class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div id="id24363103" data-type="problem"><p>Suppose that a market research firm is hired to estimate the percent of adults living in a large city who have cell phones. Five hundred randomly selected adult residents in this city are surveyed to determine whether they have cell phones. Of the 500 people surveyed, 421 responded yes &#8211; they own cell phones. Using a 95% confidence level, compute a confidence interval estimate for the true proportion of adult residents of this city who have cell phones.</p> </div> <div id="id24363164" data-type="solution"><div data-type="list"><div data-type="title">Solution A</div> <ul><li>The first solution is step-by-step (Solution A).</li> <li>The second solution uses a function of the TI-83, 83+ or 84 calculators (Solution B).</li> </ul> </div> <p>Let <em data-effect="italics">X</em> = the number of people in the sample who have cell phones. <em data-effect="italics">X</em> is binomial. 
\(X\sim B\left(500,\frac{421}{500}\right)\).</p> <p id="element-985">To calculate the confidence interval, you must find <em data-effect="italics">p′</em>, <em data-effect="italics">q′</em>, and <em data-effect="italics">EBP</em>.</p> <p><em data-effect="italics">n</em> = 500</p> <p id="eip-idm67964976"><em data-effect="italics">x</em> = the number of successes = 421</p> <p>\({p}^{\prime }=\frac{x}{n}=\frac{421}{500}=0.842\)</p> <p id="fs-idp54910160"><em data-effect="italics">p′</em> = 0.842 is the sample proportion; this is the point estimate of the population proportion.</p> <p><em data-effect="italics">q′</em> = 1 – <em data-effect="italics">p′</em> = 1 – 0.842 = 0.158</p> <p>Since <em data-effect="italics">CL</em> = 0.95, then <em data-effect="italics">α</em> = 1 – <em data-effect="italics">CL</em> = 1 – 0.95 = 0.05, so \(\frac{\alpha }{2}\) = 0.025.</p> <p id="element-838a">Then \({z}_{\frac{\alpha }{2}}={z}_{0.025}=1.96\)</p> <p class="finger">Use the TI-83, 83+, or 84+ calculator command invNorm(0.975,0,1) to find <em data-effect="italics">z<sub>0.025</sub></em>. Remember that the area to the right of <em data-effect="italics">z<sub>0.025</sub></em> is 0.025 and the area to the left of <em data-effect="italics">z<sub>0.025</sub></em> is 0.975. 
This can also be found using appropriate commands on other calculators, using a computer, or using a Standard Normal probability table.</p> <p>\(EBP=\left({z}_{\frac{\alpha }{2}}\right)\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}}=\left(1.96\right)\sqrt{\frac{\left(0.842\right)\left(0.158\right)}{500}}=0.032\)</p> <p>\({p}^{\prime }-EBP=0.842-0.032=0.810\)</p> <p>\({p}^{\prime }+EBP=0.842+0.032=0.874\)</p> <p id="element-144">The confidence interval for the true binomial population proportion is (<em data-effect="italics">p′</em> – <em data-effect="italics">EBP</em>, <em data-effect="italics">p′</em> + <em data-effect="italics">EBP</em>) = (0.810, 0.874).</p> <p><span data-type="title">Interpretation</span>We estimate with 95% confidence that between 81% and 87.4% of all adult residents of this city have cell phones.</p> <p><span data-type="title">Explanation of 95% Confidence Level</span>Ninety-five percent of the confidence intervals constructed in this way would contain the true value for the population proportion of all adult residents of this city who have cell phones.<span data-type="newline"><br /> </span></p> </div> <div id="fs-idm19757088" data-type="solution"><p id="eip-idp6259424"><span data-type="title">Solution B</span></p> <div id="fs-idm86739552" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p>Press <code>STAT</code> and arrow over to <code>TESTS</code>.<span data-type="newline"><br /> </span> Arrow down to <code>A:1-PropZInt</code>. 
Press <code>ENTER</code>.<span data-type="newline"><br /> </span> Arrow down to \(x\) and enter 421.<span data-type="newline"><br /> </span> Arrow down to \(n\) and enter 500.<span data-type="newline"><br /> </span> Arrow down to <code>C-Level</code> and enter .95.<span data-type="newline"><br /> </span> Arrow down to <code>Calculate</code> and press <code>ENTER</code>.<span data-type="newline"><br /> </span> The confidence interval is (0.81003, 0.87397).</p> </div> </div> </div> </div> <div id="fs-idp89365680" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div id="eip-454" data-type="problem"><p id="eip-844">Suppose 250 randomly selected people are surveyed to determine if they own a tablet. Of the 250 surveyed, 98 reported owning a tablet. Using a 95% confidence level, compute a confidence interval estimate for the true proportion of people who own tablets.</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><div id="element-2" data-type="exercise"><div id="id1164699825510" data-type="problem"><p>For a class project, a political science student at a large university wants to estimate the percent of students who are registered voters. He surveys 500 students and finds that 300 are registered voters. 
Compute a 90% confidence interval for the true percent of students who are registered voters, and interpret the confidence interval.</p> </div> <div id="id1164685709005" data-type="solution"><ul id="eip-id1168006547279"><li>The first solution is step-by-step (Solution A).</li> <li>The second solution uses a function of the TI-83, 83+, or 84+ calculators (Solution B).</li> </ul> <p id="element-829"><span data-type="title">Solution A</span><em data-effect="italics">x</em> = 300 and <em data-effect="italics">n</em> = 500</p> <p>\({p}^{\prime }=\frac{x}{n}=\frac{300}{500}=0.600\)</p> <p>\({q}^{\prime }=1-{p}^{\prime }=1-0.600=0.400\)</p> <p>Since <em data-effect="italics">CL</em> = 0.90, then <em data-effect="italics">α</em> = 1 – <em data-effect="italics">CL</em> = 1 – 0.90 = 0.10, so \(\frac{\alpha }{2}\) = 0.05.</p> <p id="eip-idm146395264">\({z}_{\frac{\alpha }{2}}\) = <em data-effect="italics">z</em><sub>0.05</sub> = 1.645</p> <p class="finger">Use the TI-83, 83+, or 84+ calculator command invNorm(0.95,0,1) to find <em data-effect="italics">z<sub>0.05</sub></em>. Remember that the area to the right of <em data-effect="italics">z<sub>0.05</sub></em> is 0.05 and the area to the left of <em data-effect="italics">z<sub>0.05</sub></em> is 0.95. 
This can also be found using appropriate commands on other calculators, using a computer, or using a standard normal probability table.</p> <p>\(EBP=\left({z}_{\frac{\alpha }{2}}\right)\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}}=\left(1.645\right)\sqrt{\frac{\left(0.60\right)\left(0.40\right)}{500}}=0.036\)</p> <p>\({p}^{\prime }–EBP=0.60-0.036=0.564\)</p> <p id="fs-idm110100928">\({p}^{\prime }+EBP=0.60+0.036=0.636\)</p> <p id="element-343">The confidence interval for the true binomial population proportion is (<em data-effect="italics">p′</em> – <em data-effect="italics">EBP</em>, <em data-effect="italics">p′</em> + <em data-effect="italics">EBP</em>) = (0.564,0.636).</p> <div data-type="list"><div data-type="title"><strong>Interpretation</strong></div> <ul><li>We estimate with 90% confidence that the true percent of all students that are registered voters is between 56.4% and 63.6%.</li> <li>Alternate Wording: We estimate with 90% confidence that between 56.4% and 63.6% of ALL students are registered voters.</li> </ul> </div> <p id="element-828"><span data-type="title">Explanation of 90% Confidence Level</span>Ninety percent of all confidence intervals constructed in this way contain the true value for the population percent of students that are registered voters.</p> </div> <div id="fs-idp10495376" data-type="solution"><p id="eip-idm56642544"><span data-type="title">Solution B</span></p> <div id="fs-idm49834128" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p>Press <code>STAT</code> and arrow over to <code>TESTS</code>. <span data-type="newline"><br /> </span>Arrow down to <code>A:1-PropZint</code>. Press <code>ENTER</code>. <span data-type="newline"><br /> </span>Arrow down to \(x\) and enter 300. <span data-type="newline"><br /> </span>Arrow down to \(n\) and enter 500. <span data-type="newline"><br /> </span>Arrow down to <code>C-Level</code> and enter 0.90. 
<span data-type="newline"><br /> </span>Arrow down to <code>Calculate</code> and press <code>ENTER</code>. <span data-type="newline"><br /> </span>The confidence interval is (0.564, 0.636).</p> </div> </div> </div> </div> <div id="fs-idp55027792" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm23330928" data-type="exercise"><div data-type="problem"><p>A student polls her school to see if students in the school district are for or against the new legislation regarding school uniforms. She surveys 600 students and finds that 480 are against the new legislation.</p> <p id="fs-idp17812400">a. Compute a 90% confidence interval for the true percent of students who are against the new legislation, and interpret the confidence interval.</p> </div> </div> <div id="fs-idm29038320" data-type="exercise"><div data-type="problem"><p>b. In a sample of 300 students, 68% said they own an iPod and a smartphone. Compute a 97% confidence interval for the true percent of students who own an iPod and a smartphone.</p> </div> </div> </div> <div id="eip-443" class="bc-section section" data-depth="1"><h3 data-type="title">“Plus Four” Confidence Interval for <em data-effect="italics">p</em></h3> <p id="eip-idm101192016">There is a certain amount of error introduced into the process of calculating a confidence interval for a proportion. Because we do not know the true proportion for the population, we are forced to use point estimates to calculate the appropriate standard deviation of the sampling distribution. Studies have shown that the resulting estimation of the standard deviation can be flawed.</p> <p id="eip-idp9663392">Fortunately, there is a simple adjustment that allows us to produce more accurate confidence intervals. We simply pretend that we have four additional observations. Two of these observations are successes and two are failures. 
The new sample size, then, is <em data-effect="italics">n</em> + 4, and the new count of successes is <em data-effect="italics">x</em> + 2.</p> <p id="eip-idp9663968">Computer studies have demonstrated the effectiveness of this method. It should be used when the confidence level desired is at least 90% and the sample size is at least ten.</p> </div> <div id="eip-982" class="textbox textbox--examples" data-type="example"><div id="eip-152" data-type="exercise"><div data-type="problem"><p>A random sample of 25 statistics students was asked: “Have you smoked a cigarette in the past week?” Six students reported smoking within the past week. Use the “plus-four” method to find a 95% confidence interval for the true proportion of statistics students who smoke.</p> </div> <div id="eip-19" data-type="solution" data-label=""><p id="eip-idp35102016"><span data-type="title">Solution A</span>Six students out of 25 reported smoking within the past week, so <em data-effect="italics">x</em> = 6 and <em data-effect="italics">n</em> = 25. 
Because we are using the “plus-four” method, we will use <em data-effect="italics">x</em> = 6 + 2 = 8 and <em data-effect="italics">n</em> = 25 + 4 = 29.</p> <p id="eip-idm32296464">\({p}^{\prime }=\frac{x}{n}=\frac{8}{29}\approx 0.276\)</p> <p id="eip-idm33200496">\({q}^{\prime }=1-{p}^{\prime }=1-0.276=0.724\)</p> <p id="eip-idm27545488">Since <em data-effect="italics">CL</em> = 0.95, we know <em data-effect="italics">α</em> = 1 – 0.95 = 0.05 and \(\frac{\alpha }{2}\) = 0.025.</p> <p id="eip-idm27544960">\({z}_{0.025}=1.96\)</p> <p id="eip-idm54844960">\(EBP=\left({z}_{\frac{\alpha }{2}}\right)\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}}=\left(1.96\right)\sqrt{\frac{0.276\left(0.724\right)}{29}}\approx 0.163\)</p> <p id="eip-idm47330704"><em data-effect="italics">p′</em> – <em data-effect="italics">EBP</em> = 0.276 – 0.163 = 0.113</p> <p id="eip-idm47330064"><em data-effect="italics">p′</em> + <em data-effect="italics">EBP</em> = 0.276 + 0.163 = 0.439</p> <p id="eip-idm46666976">We are 95% confident that the true proportion of all statistics students who smoke cigarettes is between 0.113 and 0.439.</p> </div> <div id="fs-idp68718096" data-type="solution"><p id="eip-idm50693904"><span data-type="title">Solution B</span></p> <div id="fs-idp48064384" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idp140335584">Press STAT and arrow over to TESTS.<span data-type="newline"><br /> </span> Arrow down to A:1-PropZint. Press ENTER.<span data-type="newline"><br /> </span></p> <div id="fs-idp12207696" data-type="note" data-has-label="true" data-label=""><div data-type="title">Reminder</div> <p id="fs-idp183978608">Remember that the plus-four method assumes an additional four trials: two successes and two failures. 
You do not need to change the process for calculating the confidence interval; simply update the values of <em data-effect="italics">x</em> and <em data-effect="italics">n</em> to reflect these additional trials.</p> </div> <p id="fs-idm64251984">Arrow down to <em data-effect="italics">x</em> and enter 8.<span data-type="newline"><br /> </span> Arrow down to <em data-effect="italics">n</em> and enter 29.<span data-type="newline"><br /> </span> Arrow down to C-Level and enter 0.95.<span data-type="newline"><br /> </span> Arrow down to Calculate and press ENTER.<span data-type="newline"><br /> </span> The confidence interval is (0.113, 0.439).<span data-type="newline"><br /> </span></p> </div> </div> </div> </div> <div id="fs-idp89200688" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p>Out of a random sample of 65 freshmen at State University, 31 students have declared a major. Use the “plus-four” method to find a 96% confidence interval for the true proportion of freshmen at State University who have declared a major.</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div data-type="problem"><p id="eip-idm57290160">The Berkman Center for Internet &amp; Society at Harvard recently conducted a study analyzing the privacy management habits of teen internet users. In a group of 50 teens, 13 reported having more than 500 friends on Facebook. 
Use the “plus-four” method to find a 90% confidence interval for the true proportion of teens who would report having more than 500 Facebook friends.</p> </div> <div data-type="solution"><p id="eip-idp21168352"><span data-type="title">Solution A</span>Using “plus-four,” we have <em data-effect="italics">x</em> = 13 + 2 = 15 and <em data-effect="italics">n</em> = 50 + 4 = 54.</p> <p id="eip-idp92452528">\({p}^{\prime }=\frac{15}{54}\approx 0.278\)</p> <p id="eip-idp92453168">\({q}^{\prime }=1-{p}^{\prime }=1-0.278=0.722\)</p> <p id="eip-idp23600320">Since <em data-effect="italics">CL</em> = 0.90, we know <em data-effect="italics">α</em> = 1 – 0.90 = 0.10 and \(\frac{\alpha }{2}\) = 0.05.</p> <p id="eip-idp84176480">\({z}_{0.05}=1.645\)</p> <p id="eip-idp73704848">\(EBP=\left({z}_{\frac{\alpha }{2}}\right)\left(\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}}\right)=\left(1.645\right)\left(\sqrt{\frac{\left(0.278\right)\left(0.722\right)}{54}}\right)\approx 0.100\)</p> <p id="eip-idp40189152"><em data-effect="italics">p′</em> – <em data-effect="italics">EBP</em> = 0.278 – 0.100 = 0.178</p> <p id="eip-idp61257168"><em data-effect="italics">p′</em> + <em data-effect="italics">EBP</em> = 0.278 + 0.100 = 0.378</p> <p id="eip-idp61146144">We are 90% confident that between 17.8% and 37.8% of all teens would report having more than 500 friends on Facebook.</p> </div> <div id="fs-idp141579680" data-type="solution"><p id="eip-idp20279360"><span data-type="title">Solution B</span></p> <div id="fs-idm93286128" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idp57846688"><span data-type="newline"><br /> </span>Press STAT and arrow over to TESTS. <span data-type="newline"><br /> </span>Arrow down to A:1-PropZint. Press ENTER. <span data-type="newline"><br /> </span>Arrow down to <em data-effect="italics">x</em> and enter 15. 
<span data-type="newline"><br /> </span>Arrow down to <em data-effect="italics">n</em> and enter 54. <span data-type="newline"><br /> </span>Arrow down to C-Level and enter 0.90. <span data-type="newline"><br /> </span>Arrow down to Calculate and press ENTER. <span data-type="newline"><br /> </span>The confidence interval is (0.178, 0.378).</p> </div> </div> </div> </div> <div id="fs-idp15135760" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div id="eip-391" data-type="problem"><p id="eip-879">The Berkman Center Study referenced in <a class="autogenerated-content" href="#eip-251">(Figure)</a> talked to teens in smaller focus groups, but also interviewed additional teens over the phone. When the study was complete, 588 teens had answered the question about their Facebook friends with 159 saying that they have more than 500 friends. Use the “plus-four” method to find a 90% confidence interval for the true proportion of teens that would report having more than 500 Facebook friends based on this larger sample. 
Compare the results to those in <a class="autogenerated-content" href="#eip-251">(Figure)</a>.</p> </div> </div> </div> <div id="fs-idm22068416" class="bc-section section" data-depth="1"><h3 data-type="title">Calculating the Sample Size <em data-effect="italics">n</em></h3> <p>If researchers desire a specific margin of error, then they can use the error bound formula to calculate the required sample size.</p> <p id="element-957">The error bound formula for a population proportion is</p> <ul><li>\(EBP=\left({z}_{\frac{\alpha }{2}}\right)\left(\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}}\right)\)</li> <li>Solving for <em data-effect="italics">n</em> (square both sides, then multiply by <em data-effect="italics">n</em> and divide by <em data-effect="italics">EBP</em><sup>2</sup>) gives you an equation for the sample size.</li> <li>\(n=\frac{{\left({z}_{\frac{\alpha }{2}}\right)}^{2}\left({p}^{\prime }{q}^{\prime }\right)}{EB{P}^{2}}\)</li> </ul> <div class="textbox textbox--examples" data-type="example"><div id="fs-idp171900368" data-type="exercise"><div id="fs-idp171900624" data-type="problem"><p>Suppose a mobile phone company wants to determine the current percentage of customers aged 50+ who use text messaging on their cell phones. How many customers aged 50+ should the company survey in order to be 90% confident that the estimated (sample) proportion is within three percentage points of the true population proportion of customers aged 50+ who use text messaging on their cell phones?</p> </div> <div id="fs-idp53892304" data-type="solution"><p>From the problem, we know that <strong><em data-effect="italics">EBP</em> = 0.03</strong> (3% = 0.03) and \({z}_{\frac{\alpha }{2}}\) = <em data-effect="italics">z</em><sub>0.05</sub> = 1.645 because the confidence level is 90%.</p> <p>However, in order to find <em data-effect="italics">n</em>, we need to know the estimated (sample) proportion <em data-effect="italics">p</em>′. Remember that <em data-effect="italics">q</em>′ = 1 – <em data-effect="italics">p</em>′. But we do not know <em data-effect="italics">p</em>′ yet. 
Since we multiply <em data-effect="italics">p</em>′ and <em data-effect="italics">q</em>′ together, we make them both equal to 0.5 because <em data-effect="italics">p</em>′<em data-effect="italics">q</em>′ = (0.5)(0.5) = 0.25 results in the largest possible product. (Try other products: (0.6)(0.4) = 0.24; (0.3)(0.7) = 0.21; (0.2)(0.8) = 0.16 and so on). The largest possible product gives us the largest <em data-effect="italics">n</em>. This gives us a large enough sample so that we can be 90% confident that we are within three percentage points of the true population proportion. To calculate the sample size <em data-effect="italics">n</em>, use the formula and make the substitutions.</p> <p>\(n=\frac{{z}^{2}{p}^{\prime }{q}^{\prime }}{EB{P}^{2}}\) gives \(n=\frac{{1.645}^{2}\left(0.5\right)\left(0.5\right)}{{0.03}^{2}}=751.7\)</p> <p id="eip-979">Round the answer to the next higher value. The sample size should be 752 cell phone customers aged 50+ in order to be 90% confident that the estimated (sample) proportion is within three percentage points of the true population proportion of all customers aged 50+ who use text messaging on their cell phones.</p> </div> </div> </div> <div id="fs-idp13131184" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p>Suppose an internet marketing company wants to determine the current percentage of customers who click on ads on their smartphones. How many customers should the company survey in order to be 90% confident that the estimated proportion is within five percentage points of the true population proportion of customers who click on ads on their smartphones?</p> </div> </div> </div> </div> <div id="eip-566" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="eip-idp27535536">Jensen, Tom. “Democrats, Republicans Divided on Opinion of Music Icons.” Public Policy Polling. 
Available online at http://www.publicpolicypolling.com/Day2MusicPoll.pdf (accessed July 2, 2013).</p> <p id="eip-idm28118800">Madden, Mary, Amanda Lenhart, Sandra Cortesi, Urs Gasser, Maeve Duggan, Aaron Smith, and Meredith Beaton. “Teens, Social Media, and Privacy.” PewInternet, 2013. Available online at http://www.pewinternet.org/Reports/2013/Teens-Social-Media-And-Privacy.aspx (accessed July 2, 2013).</p> <p id="eip-idm28117680">Princeton Survey Research Associates International. “2013 Teen and Privacy Management Survey.” Pew Research Center: Internet and American Life Project. Available online at http://www.pewinternet.org/~/media//Files/Questionnaire/2013/Methods%20and%20Questions_Teens%20and%20Social%20Media.pdf (accessed July 2, 2013).</p> <p id="eip-idm58659488">Saad, Lydia. “Three in Four U.S. Workers Plan to Work Past Retirement Age: Slightly more say they will do this by choice rather than necessity.” Gallup® Economy, 2013. Available online at http://www.gallup.com/poll/162758/three-four-workers-plan-work-past-retirement-age.aspx (accessed July 2, 2013).</p> <p id="eip-idm58658768">The Field Poll. Available online at http://field.com/fieldpollonline/subscribers/ (accessed July 2, 2013).</p> <p id="eip-idm58658256">Zogby. “New SUNYIT/Zogby Analytics Poll: Few Americans Worry about Emergency Situations Occurring in Their Community; Only one in three have an Emergency Plan; 70% Support Infrastructure ‘Investment’ for National Security.” Zogby Analytics, 2013. Available online at http://www.zogbyanalytics.com/news/299-americans-neither-worried-nor-prepared-in-case-of-a-disaster-sunyit-zogby-analytics-poll (accessed July 2, 2013).</p> <p id="eip-idm58656656">“52% Say Big-Time College Athletics Corrupt Education Process.” Rasmussen Reports, 2013. 
Available online at http://www.rasmussenreports.com/public_content/lifestyle/sports/may_2013/52_say_big_time_college_athletics_corrupt_education_process (accessed July 2, 2013).</p> </div> <div class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="eip-idm15657920">Some statistical measures, like many survey questions, measure qualitative rather than quantitative data. In this case, the population parameter being estimated is a proportion. It is possible to create a confidence interval for the true population proportion following procedures similar to those used in creating confidence intervals for population means. The formulas are slightly different, but they follow the same reasoning.</p> <p id="eip-idm15657536">Let <em data-effect="italics">p′</em> represent the sample proportion, <em data-effect="italics">x/n</em>, where <em data-effect="italics">x</em> represents the number of successes and <em data-effect="italics">n</em> represents the sample size. Let <em data-effect="italics">q′</em> = 1 – <em data-effect="italics">p′</em>. Then the confidence interval for a population proportion is given by the following formula:</p> <p id="eip-idp121742160">(lower bound, upper bound) \(=\left({p}^{\prime }–EBP,{p}^{\prime } +EBP\right)= \left({p}^{\prime }–z\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}},{p}^{\prime }+z\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}}\right)\)</p> <p id="eip-idp146455680">The “plus four” method for calculating confidence intervals is an attempt to balance the error introduced by using estimates of the population proportion when calculating the standard deviation of the sampling distribution. Simply imagine four additional trials in the study; two are successes and two are failures. Calculate \({p}^{\prime }=\frac{x+2}{n+4}\), and proceed to find the confidence interval. 
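To see the effect of the adjustment, consider the smoking example from this section (<em data-effect="italics">x</em> = 6, <em data-effect="italics">n</em> = 25, 95% confidence level). The unadjusted figures here are an illustrative calculation and are not part of the original example: without the adjustment, \({p}^{\prime }=\frac{6}{25}=0.24\) and \(EBP=\left(1.96\right)\sqrt{\frac{\left(0.24\right)\left(0.76\right)}{25}}\approx 0.167\), giving roughly (0.073, 0.407); with plus four, \({p}^{\prime }=\frac{6+2}{25+4}=\frac{8}{29}\approx 0.276\) and \(EBP=\left(1.96\right)\sqrt{\frac{\left(0.276\right)\left(0.724\right)}{29}}\approx 0.163\), giving (0.113, 0.439). The adjusted interval is centered closer to 0.5 and is slightly narrower. 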
When sample sizes are small, this method has been demonstrated to provide more accurate confidence intervals than the standard formula used for larger samples.</p> </div> <div class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="eip-idm44244000"><em data-effect="italics">p′ = x / n</em> where <em data-effect="italics">x</em> represents the number of successes and <em data-effect="italics">n</em> represents the sample size. The variable <em data-effect="italics">p</em>′ is the sample proportion and serves as the point estimate for the true population proportion.</p> <p id="eip-idm61459360"><em data-effect="italics">q</em>′ = 1 – <em data-effect="italics">p</em>′</p> <p id="eip-idm163551632">\({p}^{\prime }\sim N\left(p,\sqrt{\frac{pq}{n}}\right)\) The variable <em data-effect="italics">p′</em> has a binomial distribution that can be approximated with the normal distribution shown here.</p> <p id="eip-idm90817904"><em data-effect="italics">EBP</em> = the error bound for a proportion = \({z}_{\frac{\alpha }{2}}\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}}\)</p> <p id="eip-idm140522368">Confidence interval for a proportion:</p> <p id="eip-idm140521984">\(\text{(lower bound, upper bound)}=\left({p}^{\prime }-EBP,{p}^{\prime }+EBP\right)=\left({p}^{\prime }-z\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}}, {p}^{\prime }+z\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}}\right)\)</p> <p id="eip-idm111094496">\(n= \frac{{\left({z}_{\frac{\alpha }{2}}\right)}^{2}{p}^{\prime }{q}^{\prime }}{EB{P}^{2}}\) provides the number of participants needed to estimate the population proportion with confidence 1 &#8211; <em data-effect="italics">α</em> and margin of error <em data-effect="italics">EBP</em>.</p> <p id="eip-340">Use the normal distribution for a single population proportion \({p}^{\prime }=\frac{x}{n}\)</p> <p>\(EBP=\left({z}_{\frac{\alpha }{2}}\right)\sqrt{\frac{{p}^{\prime }{q}^{\prime }}{n}}\), where \({p}^{\prime }+{q}^{\prime }=1\)</p> <p>The confidence interval has the format (<em 
data-effect="italics">p′</em> – <em data-effect="italics">EBP</em>, <em data-effect="italics">p′</em> + <em data-effect="italics">EBP</em>).</p> <p>\(\overline{x}\) is a point estimate for <em data-effect="italics">μ</em></p> <p id="eip-285"><em data-effect="italics">p′</em> is a point estimate for <em data-effect="italics">p</em></p> <p><em data-effect="italics">s</em> is a point estimate for <em data-effect="italics">σ</em></p> </div> <div class="practice" data-depth="1"><p><em data-effect="italics">Use the following information to answer the next two exercises:</em> Marketing companies are interested in knowing the population percent of women who make the majority of household purchasing decisions.</p> <div data-type="exercise"><div id="eip-9" data-type="problem"><p>When designing a study to determine this population proportion, what is the minimum number you would need to survey to be 90% confident that the population proportion is estimated to within 0.05?</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>If it were later determined that it was important to be more than 90% confident and a new survey were commissioned, how would it affect the minimum number you need to survey? Why?</p> </div> <div data-type="solution"><p>It would increase, because a higher confidence level requires a larger z-score, which increases the numerator of the sample size formula and raises the minimum number you need to survey.</p> </div> </div> <p><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next five exercises:</em> Suppose the marketing company did do a survey. They randomly surveyed 200 households and found that in 120 of them, the woman made the majority of the purchasing decisions. 
We are interested in the population proportion of households where women make the majority of the purchasing decisions.</p> <div data-type="exercise"><div data-type="problem"><p>Identify the following:</p> <ol id="fs-idm88953968" type="a"><li><em data-effect="italics">x</em> = ______</li> <li><em data-effect="italics">n</em> = ______</li> <li><em data-effect="italics">p′</em> = ______</li> </ol> </div> </div> <div id="eip-55" data-type="exercise"><div data-type="problem"><p id="eip-idp4012128">Define the random variables <em data-effect="italics">X</em> and <em data-effect="italics">P′</em> in words.</p> </div> <div data-type="solution"><p id="eip-idm101446224"><em data-effect="italics">X</em> is the number of “successes” where the woman makes the majority of the purchasing decisions for the household. <em data-effect="italics">P</em>′ is the percentage of households sampled where the woman makes the majority of the purchasing decisions for the household.</p> </div> </div> <div data-type="exercise"><div id="eip-758" data-type="problem"><p id="eip-idp9191824">Which distribution should you use for this problem?</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-idp83214256">Construct a 95% confidence interval for the population proportion of households where the women make the majority of the purchasing decisions. State the confidence interval, sketch the graph, and calculate the error bound.</p> </div> <div id="eip-208" data-type="solution"><p id="eip-idp62067840">CI: (0.5321, 0.6679)</p> <div id="fs-idp73910688" class="bc-figure figure"><span id="eip-idp62068224" data-type="media" data-alt="This is a normal distribution curve. The peak of the curve coincides with the point 0.6 on the horizontal axis. A central region is shaded between points 0.5321 and 0.6679."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C08_M04_item002annoN-1.jpg" alt="This is a normal distribution curve. 
The peak of the curve coincides with the point 0.6 on the horizontal axis. A central region is shaded between points 0.5321 and 0.6679." width="380" data-media-type="image/jpeg" /></span></div> <p id="eip-idp97751072"><em data-effect="italics">EBP</em>: 0.0679</p> </div> </div> <div id="eip-839" data-type="exercise"><div data-type="problem"><p id="eip-idp170819776">List two difficulties the company might have in obtaining random results, if this survey were done by email.</p> </div> </div> <p><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next five exercises:</em> Of 1,050 randomly selected adults, 360 identified themselves as manual laborers, 280 identified themselves as non-manual wage earners, 250 identified themselves as mid-level managers, and 160 identified themselves as executives. In the survey, 82% of manual laborers preferred trucks, 62% of non-manual wage earners preferred trucks, 54% of mid-level managers preferred trucks, and 26% of executives preferred trucks.</p> <div data-type="exercise"><div data-type="problem"><p id="eip-idm93454256">We are interested in finding the 95% confidence interval for the percent of executives who prefer trucks. Define random variables <em data-effect="italics">X</em> and <em data-effect="italics">P</em>′ in words.</p> </div> <div id="eip-307" data-type="solution"><p id="eip-idp374416"><em data-effect="italics">X</em> is the number of “successes” where an executive prefers a truck. <em data-effect="italics">P</em>′ is the percentage of executives sampled who prefer a truck.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-idm12643968">Which distribution should you use for this problem?</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-idp15976608">Construct a 95% confidence interval. 
State the confidence interval, sketch the graph, and calculate the error bound.</p> </div> <div id="eip-284" data-type="solution"><p id="eip-idp28777568">CI: (0.19432, 0.33068)</p> <div id="fs-idm39446528" class="bc-figure figure"><span id="eip-idp150796640" data-type="media" data-alt="This is a normal distribution curve. The peak of the curve coincides with the point 0.26 on the horizontal axis. A central region is shaded between points 0.1943 and 0.3307."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C08_M04_item002anno-1.jpg" alt="This is a normal distribution curve. The peak of the curve coincides with the point 0.26 on the horizontal axis. A central region is shaded between points 0.1943 and 0.3307." width="380" data-media-type="image/jpeg" /></span></div> <p id="eip-idp173934176"><em data-effect="italics">EBP</em>: 0.0682 (half the width of the interval)</p> </div> </div> <div id="eip-218" data-type="exercise"><div data-type="problem"><p id="eip-idp111500528">Suppose we want to lower the sampling error. What is one way to accomplish that?</p> </div> </div> <div id="eip-978" data-type="exercise"><div data-type="problem"><p id="eip-idm82424912">The sampling error given in the survey is ±2%. Explain what the ±2% means.</p> </div> <div data-type="solution"><p id="eip-idp36786320">The sampling error means that the true population proportion can be up to 2 percentage points above or below the sample proportion.</p> </div> </div> <p><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next five exercises:</em> A poll of 1,200 voters asked what the most significant issue was in the upcoming election. Sixty-five percent answered the economy. 
We are interested in the population proportion of voters who feel the economy is the most important issue.</p> <div data-type="exercise"><div id="eip-294" data-type="problem"><p id="eip-idp73640048">Define the random variable <em data-effect="italics">X</em> in words.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-idp47473088">Define the random variable <em data-effect="italics">P</em>′ in words.</p> </div> <div data-type="solution"><p id="eip-idp31035264"><em data-effect="italics">P</em>′ is the proportion of voters sampled who said the economy is the most important issue in the upcoming election.</p> </div> </div> <div data-type="exercise"><div id="eip-398" data-type="problem"><p id="eip-idm26324624">Which distribution should you use for this problem?</p> </div> </div> <div id="eip-820" data-type="exercise"><div id="eip-679" data-type="problem"><p id="eip-idp659904">Construct a 90% confidence interval, and state the confidence interval and the error bound.</p> </div> <div id="eip-87" data-type="solution"><p id="eip-idp108579488">CI: (0.62735, 0.67265)</p> <p id="eip-idp108579872"><em data-effect="italics">EBP</em>: 0.02265</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-idp87563392">What would happen to the confidence interval if the level of confidence were 95%?</p> </div> </div> <p id="fs-idp85734416"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next 16 exercises:</em> The Ice Chalet offers dozens of different beginning ice-skating classes. All of the class names are put into a bucket. The 5 P.M., Monday night, ages 8 to 12, beginning ice-skating class was picked. In that class were 64 girls and 16 boys. Suppose that we are interested in the true proportion of girls, ages 8 to 12, in all beginning ice-skating classes at the Ice Chalet. 
Assume that the children in the selected class are a random sample of the population.</p> <div data-type="exercise"><div id="id8180231" data-type="problem"><p>What is being counted?</p> </div> <div id="fs-idp114007216" data-type="solution"><p id="fs-idp114007472">The number of girls, ages 8 to 12, in the 5 P.M. Monday night beginning ice-skating class.</p> </div> </div> <div data-type="exercise"><div id="id3220390" data-type="problem"><p>In words, define the random variable <em data-effect="italics">X</em>.</p> </div> </div> <div data-type="exercise"><div id="id9755908" data-type="problem"><p>Calculate the following:</p> <ol id="fs-idp23913648" type="a"><li><em data-effect="italics">x</em> = _______</li> <li><em data-effect="italics">n</em> = _______</li> <li><em data-effect="italics">p</em>′ = _______</li> </ol> </div> <div id="id3263628" data-type="solution"><ol id="fs-idp45823456" type="a"><li><em data-effect="italics">x</em> = 64</li> <li><em data-effect="italics">n</em> = 80</li> <li><em data-effect="italics">p</em>′ = 0.8</li> </ol> </div> </div> <div id="element-296" data-type="exercise"><div id="id8027106" data-type="problem"><p>State the estimated distribution of <em data-effect="italics">X</em>. <em data-effect="italics">X</em>~________</p> </div> </div> <div data-type="exercise"><div id="id3370663" data-type="problem"><p>Define a new random variable <em data-effect="italics">P</em>′. What is <em data-effect="italics">p</em>′ estimating?</p> </div> <div id="id3301619" data-type="solution"><p id="element-393"><em data-effect="italics">p</em></p> </div> </div> <div data-type="exercise"><div id="id3304622" data-type="problem"><p>In words, define the random variable <em data-effect="italics">P</em>′.</p> </div> </div> <div id="element-759" data-type="exercise"><div id="id9812677" data-type="problem"><p>State the estimated distribution of <em data-effect="italics">P</em>′. 
Construct a 92% confidence interval for the true proportion of girls in the ages 8 to 12 beginning ice-skating classes at the Ice Chalet.</p> </div> <div id="fs-idm95519424" data-type="solution"><p id="fs-idm36051552">\({P}^{\prime }\sim N\left(0.8,\sqrt{\frac{\left(0.8\right)\left(0.2\right)}{80}}\right)\). (0.72171, 0.87829).</p> </div> </div> <div data-type="exercise"><div id="id9812907" data-type="problem"><p>How much area is in both tails (combined)?</p> </div> </div> <div data-type="exercise"><div id="id3366191" data-type="problem"><p>How much area is in each tail?</p> </div> <div id="id3371993" data-type="solution"><p>0.04</p> </div> </div> <div id="element-709" data-type="exercise"><div id="id3722586" data-type="problem"><p id="element-23487325">Calculate the following:</p> <ol id="list-876238746" type="a"><li>lower limit</li> <li>upper limit</li> <li>error bound</li> </ol> </div> </div> <div id="element-846" data-type="exercise"><div id="id8490179" data-type="problem"><p>The 92% confidence interval is _______.</p> </div> <div id="id9759141" data-type="solution"><p>(0.72, 0.88)</p> </div> </div> <div data-type="exercise"><div id="id9759165" data-type="problem"><p>Fill in the blanks on the graph with the areas, upper and lower limits of the confidence interval, and the sample proportion.</p> <div id="five-Ohno" class="bc-figure figure"><span id="id7994111" data-type="media" data-alt="Normal distribution curve with two vertical upward lines from the x-axis to the curve. The confidence interval is between these two lines. The residual areas are on either side."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch08_08_01-1.jpg" alt="Normal distribution curve with two vertical upward lines from the x-axis to the curve. The confidence interval is between these two lines. The residual areas are on either side." 
width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div data-type="exercise"><div id="id9811149" data-type="problem"><p>In one complete sentence, explain what the interval means.</p> </div> <div id="fs-idp27973488" data-type="solution"><p id="fs-idp27973616">With 92% confidence, we estimate the proportion of girls, ages 8 to 12, in a beginning ice-skating class at the Ice Chalet to be between 72% and 88%.</p> </div> </div> <div data-type="exercise"><div id="id5516509" data-type="problem"><p>Using the same <em data-effect="italics">p</em>′ and level of confidence, suppose that <em data-effect="italics">n</em> were increased to 100. Would the error bound become larger or smaller? How do you know?</p> </div> </div> <div id="element-478" data-type="exercise"><div id="id9923031" data-type="problem"><p>Using the same <em data-effect="italics">p</em>′ and <em data-effect="italics">n</em> = 80, how would the error bound change if the confidence level were increased to 98%? Why?</p> </div> <div id="fs-idm19388800" data-type="solution"><p id="fs-idm19388672">The error bound would increase. 
Assuming all other variables are kept constant, as the confidence level increases, the area under the curve corresponding to the confidence level becomes larger, which creates a wider interval and thus a larger error.</p> </div> </div> <div data-type="exercise"><div id="id3186484" data-type="problem"><p>If you decreased the allowable error bound, why would the minimum sample size increase (keeping the same level of confidence)?</p> </div> </div> </div> <div id="fs-idm34727072" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div id="eip-77" data-type="exercise"><div data-type="problem"><p>1) Insurance companies are interested in knowing the population percent of drivers who always buckle up before riding in a car.</p> <ol id="fs-idm156928880" type="a"><li>When designing a study to determine this population proportion, what is the minimum number you would need to survey to be 95% confident that the population proportion is estimated to within 0.03?</li> <li>If it were later determined that it was important to be more than 95% confident and a new survey was commissioned, how would that affect the minimum number you would need to survey? Why?</li> </ol> <p>&nbsp;</p> </div> <div id="fs-idm22600112" data-type="solution"></div> </div> <div id="eip-344" data-type="exercise"><div data-type="problem"><p id="eip-1">2) Suppose that the insurance companies did do a survey. They randomly surveyed 400 drivers and found that 320 claimed they always buckle up. 
We are interested in the population proportion of drivers who claim they always buckle up.</p> <ol id="eip-idp159069360" type="a"><li><ol id="eip-idp141224128" type="i"><li><em data-effect="italics">x</em> = __________</li> <li><em data-effect="italics">n</em> = __________</li> <li><em data-effect="italics">p</em>′ = __________</li> </ol> </li> <li>Define the random variables <em data-effect="italics">X</em> and <em data-effect="italics">P</em>′, in words.</li> <li>Which distribution should you use for this problem? Explain your choice.</li> <li>Construct a 95% confidence interval for the population proportion who claim they always buckle up. <ol id="eip-idp7029968" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> <li>If this survey were done by telephone, list three difficulties the companies might have in obtaining random results.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="exerc4" data-type="exercise"><div id="id5938058" data-type="problem"><p id="para31">3) According to a recent survey of 1,200 people, 61% feel that the president is doing an acceptable job. We are interested in the population proportion of people who feel the president is doing an acceptable job.</p> <ol id="list31" type="a"><li>Define the random variables <em data-effect="italics">X</em> and <em data-effect="italics">P</em>′ in words.</li> <li>Which distribution should you use for this problem? Explain your choice.</li> <li>Construct a 90% confidence interval for the population proportion of people who feel the president is doing an acceptable job. <ol id="list32" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound. 
<ol id="list_solution1" type="i"></ol> </li> </ol> </li> </ol> <p>&nbsp;</p> </div> <div id="id6199005" data-type="solution"></div> </div> <div id="eip-idp49345824" data-type="exercise"><div id="eip-idp49345952" data-type="problem"><p id="eip-idp49346080">4) An article regarding interracial dating and marriage recently appeared in the <span data-type="cite-title">Washington Post</span>. Of the 1,709 randomly selected adults, 315 identified themselves as Latinos, 323 identified themselves as blacks, 254 identified themselves as Asians, and 779 identified themselves as whites. In this survey, 86% of blacks said that they would welcome a white person into their families. Among Asians, 77% would welcome a white person into their families, 71% would welcome a Latino, and 66% would welcome a black person.</p> <ol id="eip-idm65104512" type="a"><li>We are interested in finding the 95% confidence interval for the percent of all black adults who would welcome a white person into their families. Define the random variables <em data-effect="italics">X</em> and <em data-effect="italics">P</em>′, in words.</li> <li>Which distribution should you use for this problem? Explain your choice.</li> <li>Construct a 95% confidence interval. <ol id="eip-idm5084416" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> </ol> <p>&nbsp;</p> </div> </div> <div id="exe1" data-type="exercise"><div id="id5306840" data-type="problem"><p id="para37">5) Refer to the information in <a class="autogenerated-content" href="#eip-idp49345824">Exercise 4</a>.</p> <ol id="list37" type="a"><li>Construct three 95% confidence intervals. 
<ol id="eip-id1164398498310" type="i"><li>percent of all Asians who would welcome a white person into their families.</li> <li>percent of all Asians who would welcome a Latino into their families.</li> <li>percent of all Asians who would welcome a black person into their families.</li> </ol> </li> <li>Even though the three point estimates are different, do any of the confidence intervals overlap? Which?</li> <li>For any intervals that do overlap, in words, what does this imply about the significance of the differences in the true proportions?</li> <li>For any intervals that do not overlap, in words, what does this imply about the significance of the differences in the true proportions? <ol id="fs-idm25152352" type="i"></ol> </li> </ol> </div> <div id="fs-idp39314896" data-type="solution"></div> </div> <div id="exerr1" data-type="exercise"><div id="id5266159" data-type="problem"><p id="para41">6) Stanford University conducted a study of whether running is healthy for men and women over age 50. During the first eight years of the study, 1.5% of the 451 members of the 50-Plus Fitness Association died. We are interested in the proportion of people over 50 who ran and died in the same eight-year period.</p> <ol id="list41_5" type="a"><li>Define the random variables <em data-effect="italics">X</em> and <em data-effect="italics">P</em>′ in words.</li> <li>Which distribution should you use for this problem? Explain your choice.</li> <li>Construct a 97% confidence interval for the population proportion of people over 50 who ran and died in the same eight–year period. 
<ol id="list42" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> <li>Explain what a “97% confidence interval” means for this study.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="exercise111" data-type="exercise"><div id="id5280738" data-type="problem"><p id="para45">7) A telephone poll of 1,000 adult Americans was reported in an issue of <span data-type="cite-title">Time Magazine</span>. One of the questions asked was “What is the main problem facing the country?” Twenty percent answered “crime.” We are interested in the population proportion of adult Americans who feel that crime is the main problem.</p> <ol id="list45" type="a"><li>Define the random variables <em data-effect="italics">X</em> and <em data-effect="italics">P</em>′ in words.</li> <li>Which distribution should you use for this problem? Explain your choice.</li> <li>Construct a 95% confidence interval for the population proportion of adult Americans who feel that crime is the main problem. <ol id="list45_5" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> <li>Suppose we want to lower the sampling error. What is one way to accomplish that?</li> <li>The sampling error given by Yankelovich Partners, Inc. (which conducted the poll) is ±3%. In one to three complete sentences, explain what the ±3% represents.</li> </ol> <p>&nbsp;</p> </div> <div id="fs-idm130896" data-type="solution"></div> </div> <div id="exerrrrcise1" data-type="exercise"><div id="id5247798" data-type="problem"><p id="para47">8) Refer to <a class="autogenerated-content" href="#exercise111">Exercise 7</a>. Another question in the poll was “[How much are] you worried about the quality of education in our schools?” Sixty-three percent responded “a lot.” 
We are interested in the population proportion of adult Americans who are worried a lot about the quality of education in our schools.</p> <ol id="list47" type="a"><li>Define the random variables <em data-effect="italics">X</em> and <em data-effect="italics">P</em>′ in words.</li> <li>Which distribution should you use for this problem? Explain your choice.</li> <li>Construct a 95% confidence interval for the population proportion of adult Americans who are worried a lot about the quality of education in our schools. <ol id="list47_5" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> <li>The sampling error given by Yankelovich Partners, Inc. (which conducted the poll) is ±3%. In one to three complete sentences, explain what the ±3% represents.</li> </ol> </div> </div> <p id="id7660726"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next three exercises:</em> According to a Field Poll, 79% of California adults (actual results are 400 out of 506 surveyed) feel that “education and our schools” is one of the top issues facing California. 
We wish to construct a 90% confidence interval for the true proportion of California adults who feel that education and the schools is one of the top issues facing California.</p> <div id="exercise3289720" data-type="exercise"><div id="id6273772" data-type="problem"><p id="para52">9) A point estimate for the true population proportion is:</p> <ol id="list52" type="a"><li>0.90</li> <li>1.27</li> <li>0.79</li> <li>400</li> </ol> </div> <div id="id5608706" data-type="solution"><p id="sol11"></p></div> </div> <div id="eip-idm27625664" data-type="exercise"><div id="eip-idm27625536" data-type="problem"><p id="eip-idm27625408">10) A 90% confidence interval for the population proportion is _______.</p> <ol id="eip-idm10149840" type="a"><li>(0.761, 0.820)</li> <li>(0.125, 0.188)</li> <li>(0.755, 0.826)</li> <li>(0.130, 0.183)</li> </ol> <p>&nbsp;</p> </div> </div> <div id="eeexxxx" data-type="exercise"><div id="id5222734" data-type="problem"><p id="para56">11) The error bound is approximately _____.</p> <ol id="list56" type="a"><li>1.581</li> <li>0.791</li> <li>0.059</li> <li>0.030</li> </ol> </div> <div id="id6332368" data-type="solution"><p id="sol11aa"></p></div> </div> <p id="id7661014"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next two exercises:</em> Five hundred and eleven (511) homes in a certain southern California community are randomly surveyed to determine if they meet minimal earthquake preparedness recommendations. 
One hundred seventy-three (173) of the homes surveyed met the minimum recommendations for earthquake preparedness, and 338 did not.</p> <div id="exercising1" data-type="exercise"><div id="id6353916" data-type="problem"><p id="para64">12) Find the confidence interval at the 90% Confidence Level for the true population proportion of southern California community homes meeting at least the minimum recommendations for earthquake preparedness.</p> <ol id="list64" type="a"><li>(0.2975, 0.3796)</li> <li>(0.6270, 0.6959)</li> <li>(0.3041, 0.3730)</li> <li>(0.6204, 0.7025)</li> </ol> <p>&nbsp;</p> </div> </div> <div id="exer12345" data-type="exercise"><div id="id5970138" data-type="problem"><p id="para66">13) The point estimate for the population proportion of homes that do not meet the minimum recommendations for earthquake preparedness is ______.</p> <ol id="list66" type="a"><li>0.6614</li> <li>0.3386</li> <li>173</li> <li>338</li> </ol> </div> <div id="id6170513" data-type="solution"><p id="sol11jjkkasdf"></p></div> </div> <div data-type="exercise"><div data-type="problem"><p>14) On May 23, 2013, Gallup reported that of the 1,005 people surveyed, 76% of U.S. workers believe that they will continue working past retirement age. The confidence level for this study was reported at 95% with a ±3% margin of error.</p> <ol id="eip-idp139727584424128" type="a"><li>Determine the estimated proportion from the sample.</li> <li>Determine the sample size.</li> <li>Identify <em data-effect="italics">CL</em> and <em data-effect="italics">α</em>.</li> <li>Calculate the error bound based on the information provided.</li> <li>Compare the error bound in part d to the margin of error reported by Gallup. Explain any differences between the values.</li> <li>Create a confidence interval for the results of this study.</li> <li>A reporter is covering the release of this study for a local news station. 
How should she explain the confidence interval to her audience?</li> </ol> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>15) A national survey of 1,000 adults was conducted on May 13, 2013 by Rasmussen Reports. It concluded with 95% confidence that 49% to 55% of Americans believe that big-time college sports programs corrupt the process of higher education.</p> <ol id="eip-idp36043200" type="a"><li>Find the point estimate and the error bound for this confidence interval.</li> <li>Can we (with 95% confidence) conclude that more than half of all American adults believe this?</li> <li>Use the point estimate from part a and <em data-effect="italics">n</em> = 1,000 to calculate a 75% confidence interval for the proportion of American adults that believe that major college sports programs corrupt higher education.</li> <li>Can we (with 75% confidence) conclude that at least half of all American adults believe this?</li> </ol> <p>&nbsp;</p> </div> <div data-type="solution"></div> </div> <div data-type="exercise"><div id="eip-111" data-type="problem"><p>16) Public Policy Polling recently conducted a survey asking adults across the U.S. about music preferences. When asked, 80 of the 571 participants admitted that they have illegally downloaded music.</p> <ol id="eip-idm613264" type="a"><li>Create a 99% confidence interval for the true proportion of American adults who have illegally downloaded music.</li> <li>This survey was conducted through automated telephone interviews on May 6 and 7, 2013. The error bound of the survey compensates for sampling error, or natural variability among samples. 
List some factors that could affect the survey’s outcome that are not covered by the margin of error.</li> <li>Without performing any calculations, describe how the confidence interval would change if the confidence level changed from 99% to 90%.</li> </ol> </div> </div> <div id="eip-934" data-type="exercise"><div data-type="problem"><p id="eip-754">17) You plan to conduct a survey on your college campus to learn about the political awareness of students. You want to estimate the true proportion of college students on your campus who voted in the 2012 presidential election with 95% confidence and a margin of error no greater than five percent. How many students must you interview?</p> </div> <div data-type="solution"><p id="eip-idp11287232"></p></div> </div> <div id="eip-701" data-type="exercise"><div data-type="problem"><p>18) In a recent Zogby International Poll, nine of 48 respondents rated the likelihood of a terrorist attack in their community as “likely” or “very likely.” Use the “plus four” method to create a 97% confidence interval for the proportion of American adults who believe that a terrorist attack in their community is likely or very likely. 
Explain what this confidence interval means in the context of the problem.</p> </div> <p><strong>Answers to odd questions</strong></p> <p>1)</p> <ol id="fs-idm22599856" type="a"><li>1,068</li> <li>The sample size would need to be increased since the critical value increases as the confidence level increases.</li> </ol> <p>3)</p> <ol id="list_solution" type="a"><li><p id="fs-idm2730480"><em data-effect="italics">X</em> = the number of people who feel that the president is doing an acceptable job;</p> <p id="fs-idm2729840"><em data-effect="italics">P</em>′ = the proportion of people in a sample who feel that the president is doing an acceptable job.</p> </li> <li>\(N\left(0.61,\sqrt{\frac{\left(0.61\right)\left(0.39\right)}{1200}}\right)\)</li> <li><ol type="i"><li>CI: (0.59, 0.63)</li> <li>Check student’s solution</li> <li><em data-effect="italics">EBP</em>: 0.02</li> </ol> </li> </ol> <p>5)</p> <ol id="fs-idm25153232" type="a"><li><ol type="i"><li>(0.72, 0.82)</li> <li>(0.65, 0.76)</li> <li>(0.60, 0.72)</li> </ol> </li> <li>Yes, the intervals (0.72, 0.82) and (0.65, 0.76) overlap, and the intervals (0.65, 0.76) and (0.60, 0.72) overlap.</li> <li>We can say that there does not appear to be a significant difference between the proportion of Asian adults who say that their families would welcome a white person into their families and the proportion of Asian adults who say that their families would welcome a Latino person into their families.</li> <li>We can say that there is a significant difference between the proportion of Asian adults who say that their families would welcome a white person into their families and the proportion of Asian adults who say that their families would welcome a black person into their families.</li> </ol> <p>7)</p> <ol id="fs-idm130640" type="a"><li><em data-effect="italics">X</em> = the number of adult Americans who feel that crime is the main problem; <em data-effect="italics">P′</em> = the proportion of adult Americans who feel that 
crime is the main problem</li> <li>Since we are estimating a proportion, given <em data-effect="italics">p′</em> = 0.2 and <em data-effect="italics">n</em> = 1000, the distribution we should use is \(N\left(0.2,\sqrt{\frac{\left(0.2\right)\left(0.8\right)}{1000}}\right)\).</li> <li><ol id="fs-idm23582576" type="i"><li>CI: (0.18, 0.22)</li> <li>Check student’s solution.</li> <li><em data-effect="italics">EBP</em>: 0.02</li> </ol> </li> <li>One way to lower the sampling error is to increase the sample size.</li> <li>The stated “± 3%” represents the maximum error bound. This means that those doing the study are reporting a maximum error of 3%. Thus, they estimate the percentage of adult Americans who feel that crime is the main problem to be between 17% and 23%.</li> </ol> <p>9) c</p> <p>11) d</p> <p>13) a</p> <p>15)</p> <ol id="eip-idp44228944" type="a"><li><em data-effect="italics">p′</em> = \(\frac{0.55+0.49}{2}\) = 0.52; <em data-effect="italics">EBP</em> = 0.55 &#8211; 0.52 = 0.03</li> <li>No, the confidence interval includes values less than or equal to 0.50. It is possible that less than half of the population believe this.</li> <li><em data-effect="italics">CL</em> = 0.75, so <em data-effect="italics">α</em> = 1 – 0.75 = 0.25, \(\frac{\alpha }{2}=0.125\), and \({z}_{\frac{\alpha }{2}}=1.150\). (The area to the right of this <em data-effect="italics">z</em> is 0.125, so the area to the left is 1 – 0.125 = 0.875.) 
<span data-type="newline"><br /> </span>\(EBP=\left(1.150\right)\sqrt{\frac{0.52\left(0.48\right)}{1,000}}\approx 0.018\) <span data-type="newline"><br /> </span>(<em data-effect="italics">p</em>′ &#8211; <em data-effect="italics">EBP</em>, <em data-effect="italics">p</em>′ + <em data-effect="italics">EBP</em>) = (0.52 – 0.018, 0.52 + 0.018) = (0.502, 0.538) <p id="eip-idp88904352">Alternate Solution</p> <div id="fs-idp94228912" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm22151456">STAT TESTS A: 1-PropZInt with <em data-effect="italics">x</em> = (0.52)(1,000), <em data-effect="italics">n</em> = 1,000, CL = 0.75.</p> <p id="fs-idm20874816">Answer is (0.502, 0.538)</p> </div> </li> <li>Yes. The entire interval lies above 0.50, so we can conclude that at least half of all American adults believe that major sports programs corrupt education, but we do so with only 75% confidence.</li> </ol> <p>17)</p> <p><em data-effect="italics">CL</em> = 0.95, so <em data-effect="italics">α</em> = 1 – 0.95 = 0.05, \(\frac{\alpha }{2}\) = 0.025, and \({z}_{\frac{\alpha }{2}}\) = 1.96. Because no prior estimate of the proportion is given, use <em data-effect="italics">p</em>′ = <em data-effect="italics">q</em>′ = 0.5.</p> <p id="eip-idm23070384">\(n=\frac{ {z}_{\frac{\alpha }{2}}{}^{2}{p}^{\prime }{q}^{\prime }}{EB{P}^{2}}= \frac{{1.96}^{2}\left(0.5\right)\left(0.5\right)}{{0.05}^{2}}=384.16\)</p> <p>You need to interview at least 385 students to estimate the proportion to within 5% at 95% confidence.</p> <p>&nbsp;</p> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl><dt>Binomial Distribution</dt> <dd>a discrete random variable (RV) which arises from Bernoulli trials; there are a fixed number, <em data-effect="italics">n</em>, of independent trials. 
“Independent” means that the result of any trial (for example, trial 1) does not affect the results of the following trials, and all trials are conducted under the same conditions. Under these circumstances the binomial RV <em data-effect="italics">X</em> is defined as the number of successes in <em data-effect="italics">n</em> trials. The notation is: <em data-effect="italics">X</em>~<em data-effect="italics">B</em>(<strong>n</strong>,<strong>p</strong>). The mean is <em data-effect="italics">μ</em> = <em data-effect="italics">np</em> and the standard deviation is <em data-effect="italics">σ</em> = \(\sqrt{npq}\). The probability of exactly <em data-effect="italics">x</em> successes in <em data-effect="italics">n</em> trials is \(P\left(X=x\right)=\left(\genfrac{}{}{0}{}{n}{x}\right){p}^{x}{q}^{n-x}\).</dd> </dl> <dl id="ebpbound"><dt>Error Bound for a Population Proportion (EBP)</dt> <dd id="fs-id1166722361636">the margin of error; depends on the confidence level, the sample size, and the estimated (from the sample) proportion of successes.</dd> </dl> </div> </div></div>
<div class="chapter standard" id="chapter-a-single-population-mean-using-the-student-t-distribution" title="Chapter 9.3: A Single Population Mean using the Student t Distribution"><div class="chapter-title-wrap"><h3 class="chapter-number">52</h3><h2 class="chapter-title"><span class="display-none">Chapter 9.3: A Single Population Mean using the Student t Distribution</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>In practice, we rarely know the population <span data-type="term">standard deviation</span>. In the past, when the sample size was large, this did not present a problem to statisticians. They used the sample standard deviation <em data-effect="italics">s</em> as an estimate for <em data-effect="italics">σ</em> and proceeded as before to calculate a <span data-type="term">confidence interval</span> with close enough results. However, statisticians ran into problems when the sample size was small. A small sample size caused inaccuracies in the confidence interval.</p> <p id="element-643">William S. Goset (1876–1937) of the Guinness brewery in Dublin, Ireland ran into this problem. His experiments with hops and barley produced very few samples. Just replacing <em data-effect="italics">σ</em> with <em data-effect="italics">s</em> did not produce accurate results when he tried to calculate a confidence interval. He realized that he could not use a normal distribution for the calculation; he found that the actual distribution depends on the sample size. This problem led him to &#8220;discover&#8221; what is called the <span data-type="term">Student&#8217;s t-distribution</span>. The name comes from the fact that Gosset wrote under the pen name &#8220;Student.&#8221;</p> <p>Up until the mid-1970s, some statisticians used the <span data-type="term">normal distribution</span> approximation for large sample sizes and used the Student&#8217;s t-distribution only for sample sizes of at most 30. 
With graphing calculators and computers, the practice now is to use the Student&#8217;s t-distribution whenever <em data-effect="italics">s</em> is used as an estimate for <em data-effect="italics">σ</em>.</p> <p id="eip-449">If you draw a simple random sample of size <em data-effect="italics">n</em> from a population that has an approximately normal distribution with mean <em data-effect="italics">μ</em> and unknown population standard deviation <em data-effect="italics">σ</em> and calculate the <em data-effect="italics">t</em>-score <em data-effect="italics">t</em> = \(\frac{\overline{x}–\mu }{\left(\frac{s}{\sqrt{n}}\right)}\), then the <em data-effect="italics">t</em>-scores follow a <strong>Student&#8217;s t-distribution with <em data-effect="italics">n</em> – 1 degrees of freedom</strong>. The <em data-effect="italics">t</em>-score has the same interpretation as the <span data-type="term"><em data-effect="italics">z</em>-score</span>. It measures how far \(\overline{x}\) is from its mean <em data-effect="italics">μ</em>. For each sample size <em data-effect="italics">n</em>, there is a different Student&#8217;s t-distribution.</p> <p>The <span data-type="term">degrees of freedom</span>, <strong><em data-effect="italics">n</em> – 1</strong>, come from the calculation of the sample standard deviation <strong><em data-effect="italics">s</em></strong>. In <a class="autogenerated-content" href="/contents/dcec5517-4e51-42f5-9b11-8cb11c02d8ae">(Figure)</a>, we used <em data-effect="italics">n</em> deviations \(\left(x–\overline{x}\text{values}\right)\) to calculate <strong><em data-effect="italics">s</em></strong>. Because the sum of the deviations is zero, we can find the last deviation once we know the other <strong><em data-effect="italics">n</em> – 1</strong> deviations. The other <strong><em data-effect="italics">n</em> – 1</strong> deviations can change or vary freely. 
<strong>We call the number <em data-effect="italics">n</em> – 1 the degrees of freedom (df).</strong></p> <div data-type="list"><div data-type="title">Properties of the Student&#8217;s t-Distribution</div> <ul><li>The graph for the Student&#8217;s t-distribution is similar to the standard normal curve.</li> <li>The mean for the Student&#8217;s t-distribution is zero and the distribution is symmetric about zero.</li> <li>The Student&#8217;s t-distribution has more probability in its tails than the standard normal distribution because the spread of the t-distribution is greater than the spread of the standard normal. So the graph of the Student&#8217;s t-distribution will be thicker in the tails and shorter in the center than the graph of the standard normal distribution.</li> <li>The exact shape of the Student&#8217;s t-distribution depends on the degrees of freedom. As the degrees of freedom increases, the graph of Student&#8217;s t-distribution becomes more like the graph of the standard normal distribution.</li> <li>The underlying population of individual observations is assumed to be normally distributed with unknown population mean <em data-effect="italics">μ</em> and unknown population standard deviation <em data-effect="italics">σ</em>. The size of the underlying population is generally not relevant unless it is very small. If it is bell shaped (normal) then the assumption is met and doesn&#8217;t need discussion. Random sampling is assumed, but that is a completely separate assumption from normality.</li> </ul> </div> <p class="finger">Calculators and computers can easily calculate any Student&#8217;s t-probabilities. The TI-83,83+, and 84+ have a tcdf function to find the probability for given values of <em data-effect="italics">t</em>. The grammar for the tcdf command is tcdf(lower bound, upper bound, degrees of freedom). 
However, for confidence intervals, we need to use <strong>inverse</strong> probability to find the value of <em data-effect="italics">t</em> when we know the probability.</p> <p>For the TI-84+ you can use the invT command on the DISTRibution menu. The invT command works similarly to the invNorm command. The invT command requires two inputs: <strong>invT(area to the left, degrees of freedom)</strong>. The output is the t-score that corresponds to the area we specified. <span data-type="newline"><br /> </span> <span data-type="newline"><br /> </span> The TI-83 and 83+ do not have the invT command. (The TI-89 has an inverse T command.)</p> <p>A probability table for the Student&#8217;s t-distribution can also be used. The table gives t-scores that correspond to the confidence level (column) and degrees of freedom (row). (The TI-86 does not have an invT program or command, so if you are using that calculator, you need to use a probability table for the Student&#8217;s t-distribution.) When using a <em data-effect="italics">t</em>-table, note that some tables show the confidence level in the column headings, while others show only the corresponding area in one or both tails.<span data-type="newline"><br /> </span> <span data-type="newline"><br /> </span>A Student&#8217;s t table (see <a class="autogenerated-content" href="/contents/dcec5517-4e51-42f5-9b11-8cb11c02d8ae">(Figure)</a>) gives <em data-effect="italics">t</em>-scores given the degrees of freedom and the right-tailed probability. The table is very limited. 
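</p> <p>For readers working on a computer rather than a TI calculator, the same lookup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names <code>t_pdf</code>, <code>t_cdf</code>, and <code>inv_t</code> are hypothetical, the cumulative probability is approximated with Simpson&#8217;s rule, and the inverse is found by bisection, which is not necessarily how a calculator or a statistics package computes it. The function <code>inv_t(area_left, df)</code> takes the same two inputs as invT.</p>

```python
import math


def t_pdf(x, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Area to the left of x, approximated with Simpson's rule.

    Uses symmetry about zero: integrate the density from 0 to |x|
    and add it to, or subtract it from, one half.
    """
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def inv_t(area_left, df):
    """t-score with the given area to its left, found by bisection."""
    low, high = -100.0, 100.0  # assumed wide enough for typical use
    for _ in range(60):
        mid = (low + high) / 2
        if area_left > t_cdf(mid, df):
            low = mid
        else:
            high = mid
    return (low + high) / 2


# 95% confidence with n = 15: area to the left is 0.975, df = 14.
print(round(inv_t(0.975, 14), 3))
# 90% confidence with n = 20: area to the left is 0.95, df = 19.
print(round(inv_t(0.95, 19), 3))
```

<p>The two calls mirror invT(0.975, 14) and invT(0.95, 19). The fixed search range of –100 to 100 is an assumption of this sketch; it comfortably covers the confidence levels and sample sizes used in this chapter.</p> <p>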
<strong>Calculators and computers can easily calculate any Student&#8217;s t-probabilities.</strong><span data-type="newline"><br /> </span></p> <div data-type="list"><div data-type="title"><strong>The notation for the Student&#8217;s t-distribution (using <em data-effect="italics">T</em> as the random variable) is:</strong></div> <ul><li><em data-effect="italics">T ~ t<sub>df</sub></em> where <em data-effect="italics">df</em> = <em data-effect="italics">n</em> – 1.</li> <li>For example, if we have a sample of size <em data-effect="italics">n</em> = 20 items, then we calculate the degrees of freedom as <em data-effect="italics">df</em> = <em data-effect="italics">n</em> &#8211; 1 = 20 &#8211; 1 = 19 and we write the distribution as <em data-effect="italics">T ~ t<sub>19</sub></em>.</li> </ul> </div> <p id="eip-737"><strong>If the population standard deviation is not known</strong>, the <span data-type="term">error bound for a population mean</span> is:</p> <ul id="eip-168"><li>\(EBM=\left({t}_{\frac{\alpha }{2}}\right)\left(\frac{s}{\sqrt{n}}\right)\),</li> <li>\({t}_{\frac{\alpha }{2}}\) is the <em data-effect="italics">t</em>-score with area to the right equal to \(\frac{\alpha }{2}\),</li> <li>use <em data-effect="italics">df</em> = <em data-effect="italics">n</em> – 1 degrees of freedom, and</li> <li><em data-effect="italics">s</em> = sample standard deviation.</li> </ul> <p><strong>The format for the confidence interval is:</strong><span data-type="newline"><br /> </span>\(\left(\overline{x}-EBM,\overline{x}+EBM\right)\).</p> <div id="fs-idm181176784" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p>To calculate the confidence interval directly: <span data-type="newline"><br /> </span>Press STAT. <span data-type="newline"><br /> </span>Arrow over to TESTS. 
<span data-type="newline"><br /> </span>Arrow down to 8:TInterval and press ENTER (or just press 8).</p> </div> <div class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div id="id1171101195216" data-type="problem"><p>Suppose you do a study of acupuncture to determine how effective it is in relieving pain. You measure sensory rates for 15 subjects with the results given. Use the sample data to construct a 95% confidence interval for the mean sensory rate for the population (assumed normal) from which you took the data. <span data-type="newline"><br /> </span>The solution is shown step-by-step and by using the TI-83, 83+, or 84+ calculators.</p> <div id="id1171109556457"><span id="set-328" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">8.6  </span><span data-type="item">9.4  </span><span data-type="item">7.9  </span><span data-type="item">6.8  </span><span data-type="item">8.3  </span><span data-type="item">7.3  </span><span data-type="item">9.2  </span><span data-type="item">9.6  </span><span data-type="item">8.7  </span><span data-type="item">11.4  </span><span data-type="item">10.3  </span><span data-type="item">5.4  </span><span data-type="item">8.1  </span><span data-type="item">5.5  </span><span data-type="item">6.9</span></span></div> </div> <div id="id1171108330127" data-type="solution"><ul id="eip-id1169415158393"><li>The first solution is step-by-step (Solution A).</li> <li>The second solution uses the TI-83+ and TI-84 calculators (Solution B).</li> </ul> <p><span data-type="title">Solution A</span>To find the confidence interval, you need the sample mean, \(\overline{x}\), and the <em data-effect="italics">EBM</em>.</p> <p id="element-809">\(\overline{x}\) = 8.2267 <em data-effect="italics">s</em> = 1.6722 <em data-effect="italics">n</em> = 15</p> <p id="element-152"><em data-effect="italics">df</em> = 15 – 1 = 14 <em data-effect="italics">CL</em> = 0.95 so <em data-effect="italics">α</em> 
= 1 – <em data-effect="italics">CL</em> = 1 – 0.95 = 0.05</p> <p>\(\frac{\alpha }{2}\) = 0.025 \({t}_{\frac{\alpha }{2}}={t}_{0.025}\)</p> <p id="element-331">The area to the right of <em data-effect="italics">t</em><sub>0.025</sub> is 0.025, and the area to the left of <em data-effect="italics">t</em><sub>0.025</sub> is 1 – 0.025 = 0.975</p> <p>\({t}_{\frac{\alpha }{2}}={t}_{0.025}=2.14\) using invT(0.975,14) on the TI-84+ calculator.</p> <p id="fs-idm31388048">\(EBM=\left({t}_{\frac{\alpha }{2}}\right)\left(\frac{s}{\sqrt{n}}\right)\)</p> <p>\(EBM=\left(2.14\right)\left(\frac{1.6722}{\sqrt{15}}\right)=0.924\)</p> <p>\(\overline{x}\) – <em data-effect="italics">EBM</em> = 8.2267 – 0.9240 = 7.30</p> <p id="element-947">\(\overline{x}\) + <em data-effect="italics">EBM</em> = 8.2267 + 0.9240 = 9.15</p> <p>The 95% confidence interval is (7.30, 9.15).</p> <p>We estimate with 95% confidence that the true population mean sensory rate is between 7.30 and 9.15.</p> </div> <div id="fs-idp148096976" data-type="solution"><p><span data-type="title">Solution B</span></p> <div id="fs-idm8032736" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm125793568">Press <code>STAT</code> and arrow over to <code>TESTS</code>. <span data-type="newline"><br /> </span>Arrow down to <code>8:TInterval</code> and press <code>ENTER</code> (or you can just press <code>8</code>). <span data-type="newline"><br /> </span>Arrow to <code>Data</code> and press <code>ENTER</code>. <span data-type="newline"><br /> </span>Arrow down to <code>List</code> and enter the list name where you put the data. <span data-type="newline"><br /> </span>There should be a 1 after <code>Freq</code>. <span data-type="newline"><br /> </span>Arrow down to <code>C-level</code> and enter 0.95 <span data-type="newline"><br /> </span>Arrow down to <code>Calculate</code> and press <code>ENTER</code>. 
<span data-type="newline"><br /> </span>The 95% confidence interval is (7.3006, 9.1527)</p> </div> <div data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idm137079696">When calculating the error bound, a probability table for the Student&#8217;s t-distribution can also be used to find the value of <em data-effect="italics">t</em>. The table gives <em data-effect="italics">t</em>-scores that correspond to the confidence level (column) and degrees of freedom (row); the <em data-effect="italics">t</em>-score is found where the row and column intersect in the table.</p> </div> </div> </div> </div> <div id="fs-idm181001040" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p>You do a study of hypnotherapy to determine how effective it is in increasing the number of hours of sleep subjects get each night. You measure hours of sleep for 12 subjects with the following results. Construct a 95% confidence interval for the mean number of hours slept for the population (assumed normal) from which you took the data.</p> <p id="eip-idm108202576">8.2;   9.1;   7.7;   8.6;   6.9;   11.2;   10.1;   9.9;   8.9;   9.2;   7.5;   10.5</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div data-type="problem"><p id="eip-645">The Human Toxome Project (HTP) is working to understand the scope of industrial pollution in the human body. Industrial chemicals may enter the body through pollution or as ingredients in consumer products. In October 2008, the scientists at HTP tested cord blood samples for 20 newborn infants in the United States. 
The cord blood of the &#8220;In utero/newborn&#8221; group was tested for 430 industrial compounds, pollutants, and other chemicals, including chemicals linked to brain and nervous system toxicity, immune system toxicity, reproductive toxicity, and fertility problems. There are health concerns about the effects of some chemicals on the brain and nervous system. <a class="autogenerated-content" href="#eip-222">(Figure)</a> shows how many of the targeted chemicals were found in each infant’s cord blood.</p> <table id="eip-222" summary=".."><tbody><tr><td>79</td> <td>145</td> <td>147</td> <td>160</td> <td>116</td> <td>100</td> <td>159</td> <td>151</td> <td>156</td> <td>126</td> </tr> <tr><td>137</td> <td>83</td> <td>156</td> <td>94</td> <td>121</td> <td>144</td> <td>123</td> <td>114</td> <td>139</td> <td>99</td> </tr> </tbody> </table> <p id="eip-idp159676448">Use this sample data to construct a 90% confidence interval for the mean number of targeted industrial chemicals to be found in an infant’s blood.</p> </div> <div id="eip-262" data-type="solution" data-label=""><p><span data-type="title">Solution A</span>From the sample, you can calculate \(\overline{x}\) = 127.45 and <em data-effect="italics">s</em> = 25.965. 
There are 20 infants in the sample, so <em data-effect="italics">n</em> = 20, and <em data-effect="italics">df</em> = 20 – 1 = 19.</p> <p id="eip-idm108372720">You are asked to calculate a 90% confidence interval: <em data-effect="italics">CL</em> = 0.90, so <em data-effect="italics">α</em> = 1 – <em data-effect="italics">CL</em> = 1 – 0.90 = 0.10 <span id="eip-idm108372304" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">\(\frac{\alpha }{2}=0.05,\phantom{\rule{2pt}{0ex}}{t}_{\frac{\alpha }{2}}={t}_{0.05}\)</span></span></p> <p id="eip-idm186375552">By definition, the area to the right of <em data-effect="italics">t</em><sub>0.05</sub> is 0.05 and so the area to the left of <em data-effect="italics">t</em><sub>0.05</sub> is 1 – 0.05 = 0.95.</p> <p id="eip-idm38662128" class="finger">Use a table, calculator, or computer to find that <em data-effect="italics">t</em><sub>0.05</sub> = 1.729.</p> <p id="eip-idm126306320">\(EBM={t}_{\frac{\alpha }{2}}\left(\frac{s}{\sqrt{n}}\right)=1.729\left(\frac{25.965}{\sqrt{20}}\right) \approx  10.038\)</p> <p id="eip-idm123738016">\(\overline{x}\) – <em data-effect="italics">EBM</em> = 127.45 – 10.038 = 117.412</p> <p id="eip-idm1765136">\(\overline{x}\) + <em data-effect="italics">EBM</em> = 127.45 + 10.038 = 137.488</p> <p id="eip-idm27533168">We estimate with 90% confidence that the mean number of all targeted industrial chemicals found in cord blood in the United States is between 117.412 and 137.488.</p> </div> <div id="fs-idp111080464" data-type="solution" data-label=""><p id="eip-idp189552432"><span data-type="title">Solution B</span></p> <div id="fs-idm115636704" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm150426592">Enter the data as a list. <span data-type="newline"><br /> </span>Press <code>STAT</code> and arrow over to <code>TESTS</code>. 
<span data-type="newline"><br /> </span>Arrow down to <code>8:TInterval</code> and press <code>ENTER</code> (or you can just press <code>8</code>). Arrow to Data and press <code>ENTER</code>. <span data-type="newline"><br /> </span>Arrow down to <code>List</code> and enter the list name where you put the data. <span data-type="newline"><br /> </span>Arrow down to <code>Freq</code> and enter 1. <span data-type="newline"><br /> </span>Arrow down to <code>C-level</code> and enter 0.90 <span data-type="newline"><br /> </span>Arrow down to <code>Calculate</code> and press <code>ENTER</code>. <span data-type="newline"><br /> </span>The 90% confidence interval is (117.41, 137.49).</p> </div> </div> </div> </div> <div id="fs-idm12320384" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p>A random sample of statistics students were asked to estimate the total number of hours they spend watching television in an average week. The responses are recorded in <a class="autogenerated-content" href="#eip-672">(Figure)</a>. Use this sample data to construct a 98% confidence interval for the mean number of hours statistics students will spend watching television in one week.</p> <table id="eip-672" summary=".."><tbody><tr><td>0</td> <td>3</td> <td>1</td> <td>20</td> <td>9</td> </tr> <tr><td>5</td> <td>10</td> <td>1</td> <td>10</td> <td>4</td> </tr> <tr><td>14</td> <td>2</td> <td>4</td> <td>4</td> <td>5</td> </tr> </tbody> </table> </div> </div> </div> <div class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="eip-idp72771968">“America’s Best Small Companies.” Forbes, 2013. 
Available online at http://www.forbes.com/best-small-companies/list/ (accessed July 2, 2013).</p> <p id="eip-idm62303392">Data from <em data-effect="italics">Microsoft Bookshelf</em>.</p> <p id="eip-idm46745392">Data from http://www.businessweek.com/.</p> <p id="eip-idp72771584">Data from http://www.forbes.com/.</p> <p id="eip-idp90955712">“Disclosure Data Catalog: Leadership PAC and Sponsors Report, 2012.” Federal Election Commission. Available online at http://www.fec.gov/data/index.jsp (accessed July 2, 2013).</p> <p id="eip-idm101744816">“Human Toxome Project: Mapping the Pollution in People.” Environmental Working Group. Available online at http://www.ewg.org/sites/humantoxome/participants/participant-group.php?group=in+utero%2Fnewborn (accessed July 2, 2013).</p> <p id="eip-idp90954816">“Metadata Description of Leadership PAC List.” Federal Election Commission. Available online at http://www.fec.gov/finance/disclosure/metadata/metadataLeadershipPacList.shtml (accessed July 2, 2013).</p> </div> <div class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="eip-895">In many cases, the researcher does not know the population standard deviation, <em data-effect="italics">σ</em>, of the measure being studied. In these cases, it is common to use the sample standard deviation, <em data-effect="italics">s</em>, as an estimate of <em data-effect="italics">σ</em>. The normal distribution creates accurate confidence intervals when <em data-effect="italics">σ</em> is known, but it is not as accurate when <em data-effect="italics">s</em> is used as an estimate. In this case, the Student’s t-distribution is much better. Define a t-score using the following formula:</p> <p id="eip-828">\(t=\frac{\overline{x}-\mu }{\frac{s}{\sqrt{n}}}\)</p> <p id="eip-305">The <em data-effect="italics">t</em>-score follows the Student’s t-distribution with <em data-effect="italics">n</em> – 1 degrees of freedom. 
The confidence interval under this distribution is calculated with <em data-effect="italics">EBM</em> = \(\left({t}_{\frac{\alpha }{2}}\right)\frac{s}{\sqrt{n}}\) where \({t}_{\frac{\alpha }{2}}\) is the <em data-effect="italics">t</em>-score with area to the right equal to \(\frac{\alpha }{2}\), <em data-effect="italics">s</em> is the sample standard deviation, and <em data-effect="italics">n</em> is the sample size. Use a table, calculator, or computer to find \({t}_{\frac{\alpha }{2}}\) for a given <em data-effect="italics">α</em>.</p> </div> <div class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="eip-987"><em data-effect="italics">s</em> = the standard deviation of sample values.</p> <p>\(t= \frac{\overline{x}-\mu }{\frac{s}{\sqrt{n}}}\) is the formula for the <em data-effect="italics">t</em>-score which measures how far away a measure is from the population mean in the Student’s t-distribution</p> <p id="eip-995"><em data-effect="italics">df</em> = <em data-effect="italics">n</em> &#8211; 1; the degrees of freedom for a Student’s t-distribution where n represents the size of the sample</p> <p><em data-effect="italics">T</em>~<em data-effect="italics">t<sub>df</sub></em> the random variable, <em data-effect="italics">T</em>, has a Student’s t-distribution with <em data-effect="italics">df</em> degrees of freedom</p> <p>\(EBM={t}_{\frac{\alpha }{2}}\frac{s}{\sqrt{n}}\) = the error bound for the population mean when the population standard deviation is unknown</p> <p>\({t}_{\frac{\alpha }{2}}\) is the <em data-effect="italics">t</em>-score in the Student’s t-distribution with area to the right equal to \(\frac{\alpha }{2}\)</p> <p id="fs-idm169440128">The general form for a confidence interval for a single mean, population standard deviation unknown, Student&#8217;s t is given by (lower bound, upper bound) <span data-type="newline"><br /> </span>= (point estimate – <em data-effect="italics">EBM</em>, point estimate + <em 
data-effect="italics">EBM</em>) <span data-type="newline"><br /> </span>= \(\left(\overline{x}-\frac{ts}{\sqrt{n}},\overline{x}+\frac{ts}{\sqrt{n}}\right)\)</p> </div> <div class="practice" data-depth="1"><p><em data-effect="italics">Use the following information to answer the next five exercises.</em> A hospital is trying to cut down on emergency room wait times. It is interested in the amount of time patients must wait before being called back to be examined. An investigation committee randomly surveyed 70 patients. The sample mean was 1.5 hours with a sample standard deviation of 0.5 hours.</p> <div data-type="exercise"><div id="eip-280" data-type="problem"><p>Identify the following:</p> <ol id="fs-idm169488192" type="a"><li>\(\overline{x}\) =_______</li> <li>\({s}_{x}\) =_______</li> <li><em data-effect="italics">n</em> =_______</li> <li><em data-effect="italics">n</em> – 1 =_______</li> </ol> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Define the random variables <em data-effect="italics">X</em> and \(\overline{X}\) in words.</p> </div> <div data-type="solution"><p><em data-effect="italics">X</em> is the number of hours a patient waits in the emergency room before being called back to be examined. \(\overline{X}\) is the mean wait time of 70 patients in the emergency room.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Which distribution should you use for this problem?</p> </div> </div> <div data-type="exercise"><div id="eip-477" data-type="problem"><p>Construct a 95% confidence interval for the population mean time spent waiting. State the confidence interval, sketch the graph, and calculate the error bound.</p> </div> <div data-type="solution"><p>CI: (1.3808, 1.6192)</p> <div id="fs-idm150372688" class="bc-figure figure"><span id="eip-idp2023440" data-type="media" data-alt="This is a normal distribution curve. The peak of the curve coincides with the point 1.5 on the horizontal axis. 
A central region is shaded between points 1.38 and 1.62."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C08_M03_item001annoN-1.jpg" alt="This is a normal distribution curve. The peak of the curve coincides with the point 1.5 on the horizontal axis. A central region is shaded between points 1.38 and 1.62." width="380" data-media-type="image/jpeg" /></span></div> <p id="eip-idm67620928"><em data-effect="italics">EBM</em> = 0.12</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-785">Explain in complete sentences what the confidence interval means.</p> </div> </div> <p><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next six exercises:</em> One hundred eight Americans were surveyed to determine the number of hours they spend watching television each month. It was revealed that they watched an average of 151 hours each month with a standard deviation of 32 hours. 
Assume that the underlying population distribution is normal.</p> <div data-type="exercise"><div id="eip-671" data-type="problem"><p id="eip-716">Identify the following:</p> <ol id="fs-idm137662144" type="a"><li>\(\overline{x}\) =_______</li> <li>\({s}_{x}\) =_______</li> <li><em data-effect="italics">n</em> =_______</li> <li><em data-effect="italics">n</em> – 1 =_______</li> </ol> </div> <div id="eip-123" data-type="solution"><ol id="fs-idm117900288" type="a"><li>\(\overline{x}\) = 151</li> <li>\({s}_{x}\) = 32</li> <li><em data-effect="italics">n</em> = 108</li> <li><em data-effect="italics">n</em> – 1 = 107</li> </ol> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-675">Define the random variable <em data-effect="italics">X</em> in words.</p> </div> </div> <div data-type="exercise"><div id="eip-816" data-type="problem"><p>Define the random variable \(\overline{X}\) in words.</p> </div> <div data-type="solution"><p>\(\overline{X}\) is the mean number of hours spent watching television per month from a sample of 108 Americans.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-578">Which distribution should you use for this problem?</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-489">Construct a 99% confidence interval for the population mean hours spent watching television per month. (a) State the confidence interval, (b) sketch the graph, and (c) calculate the error bound.</p> </div> <div data-type="solution"><p id="eip-id1169812910484">CI: (142.92, 159.08)<span data-type="newline"><br /> </span></p> <div id="fs-idm53713472" class="bc-figure figure"><span id="eip-idm24726752" data-type="media" data-alt="This is a normal distribution curve. The peak of the curve coincides with the point 151 on the horizontal axis. 
A central region is shaded between points 142.92 and 159.08."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C08_M03_item002annoN-1.jpg" alt="This is a normal distribution curve. The peak of the curve coincides with the point 151 on the horizontal axis. A central region is shaded between points 142.92 and 159.08." width="380" data-media-type="image/jpeg" /></span></div> <p><span data-type="newline"><br /> </span><em data-effect="italics">EBM</em> = 8.08</p> </div> </div> <div id="eip-793" data-type="exercise"><div data-type="problem"><p>Why would the error bound change if the confidence level were lowered to 95%?</p> </div> </div> <p><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next 13 exercises:</em> The data in <a class="autogenerated-content" href="#element-181">(Figure)</a> are the result of a random survey of 39 national flags (with replacement between picks) from various countries. We are interested in finding a confidence interval for the true mean number of colors on a national flag. 
Let <em data-effect="italics">X</em> = the number of colors on a national flag.</p> <table summary="This table presents the X values (1-5) in the first column and the frequency in the second column."><thead><tr><th><em data-effect="italics">X</em></th> <th>Freq.</th> </tr> </thead> <tbody><tr><td data-align="center">1</td> <td data-align="center">1</td> </tr> <tr><td data-align="center">2</td> <td data-align="center">7</td> </tr> <tr><td data-align="center">3</td> <td data-align="center">18</td> </tr> <tr><td data-align="center">4</td> <td data-align="center">7</td> </tr> <tr><td data-align="center">5</td> <td data-align="center">6</td> </tr> </tbody> </table> <div data-type="exercise"><div id="id45583482" data-type="problem"><p id="element-234525">Calculate the following:</p> <ol id="list-87678" type="a"><li>\(\overline{x}\) =______</li> <li>\({s}_{x}\) =______</li> <li><em data-effect="italics">n</em> =______</li> </ol> </div> <div id="id45593982" data-type="solution"><ol id="list-827634" type="a"><li>3.26</li> <li>1.02</li> <li>39</li> </ol> </div> </div> <div data-type="exercise"><div id="id45594056" data-type="problem"><p>Define the random variable \(\overline{X}\) in words.</p> </div> </div> <div data-type="exercise"><div id="id45594158" data-type="problem"><p id="element-703">What is \(\overline{x}\) estimating?</p> </div> <div id="id45594200" data-type="solution"><p><em data-effect="italics">μ</em></p> </div> </div> <div data-type="exercise"><div id="id45594258" data-type="problem"><p>Is \({\sigma }_{x}\) known?</p> </div> </div> <div data-type="exercise"><div id="id45594330" data-type="problem"><p>As a result of your answer to <a class="autogenerated-content" href="#element-704">(Figure)</a>, state the exact distribution to use when calculating the confidence interval.</p> </div> <div id="id45594350" data-type="solution"><p><em data-effect="italics">t</em><sub>38</sub></p> </div> </div> <p><span data-type="newline" data-count="1"><br /> </span><em 
data-effect="italics">Construct a 95% confidence interval for the true mean number of colors on national flags.</em></p> <div data-type="exercise"><div id="id45959436" data-type="problem"><p>How much area is in both tails (combined)?</p> </div> </div> <div id="element-945" data-type="exercise"><div id="id45959507" data-type="problem"><p>How much area is in each tail?</p> </div> <div id="id45959554" data-type="solution"><p>0.025</p> </div> </div> <div id="element-526" data-type="exercise"><div id="id45959582" data-type="problem"><p id="element-2354253">Calculate the following:</p> <ol id="list-76787865" type="a"><li>lower limit</li> <li>upper limit</li> <li>error bound</li> </ol> </div> </div> <div data-type="exercise"><div id="id45959730" data-type="problem"><p>The 95% confidence interval is_____.</p> </div> <div id="id45959750" data-type="solution"><p id="element-350">(2.93, 3.59)</p> </div> </div> <div data-type="exercise"><div id="id45959778" data-type="problem"><p>Fill in the blanks on the graph with the areas, the upper and lower limits of the Confidence Interval and the sample mean.</p> <div id="five5" class="bc-figure figure"><span id="id45959805" data-type="media" data-alt="This is a template of a normal distribution curve with the central region shaded to represent a confidence interval. The residual areas are on either side of the shaded region. Blanks indicate that students should label the confidence level, residual areas, and points that define the confidence interval."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch08_07_01-1.jpg" alt="This is a template of a normal distribution curve with the central region shaded to represent a confidence interval. The residual areas are on either side of the shaded region. Blanks indicate that students should label the confidence level, residual areas, and points that define the confidence interval." 
width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div data-type="exercise"><div id="id45959843" data-type="problem"><p id="fs-idp7770160">In one complete sentence, explain what the interval means.</p> </div> <div id="eip-idm51678224" data-type="solution"><p id="eip-idp98208704">We are 95% confident that the true mean number of colors for national flags is between 2.93 colors and 3.59 colors.</p> </div> </div> <div data-type="exercise"><div id="id46274989" data-type="problem"><p id="one">Using the same \(\overline{x}\), \({s}_{x}\), and level of confidence, suppose that <em data-effect="italics">n</em> were 69 instead of 39. Would the error bound become larger or smaller? How do you know?</p> </div> <div id="eip-idm61120128" data-type="solution"><p id="eip-idm61119872">The error bound would become EBM = 0.245. This error bound decreases because as sample sizes increase, variability decreases and we need less interval length to capture the true mean.</p> </div> </div> <div data-type="exercise"><div id="id46275081" data-type="problem"><p>Using the same \(\overline{x}\), \({s}_{x}\), and <em data-effect="italics">n</em> = 39, how would the error bound change if the confidence level were reduced to 90%? Why?</p> </div> </div> </div> <div id="fs-idp293092128" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div data-type="exercise"><div id="id6354520" data-type="problem"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id5182442" data-type="problem"><p id="para7">1) A random survey of enrollment at 35 community colleges across the United States yielded the following figures: 6,414; 1,550; 2,109; 9,350; 21,828; 4,300; 5,944; 5,722; 2,825; 2,044; 5,481; 5,200; 5,853; 2,750; 10,012; 6,357; 27,000; 9,414; 7,681; 3,200; 17,500; 9,200; 7,380; 18,314; 6,557; 13,713; 17,768; 7,493; 2,771; 2,861; 1,263; 7,285; 28,165; 5,080; 11,622. 
Assume the underlying population is normal.</p> <ol id="list10" type="a"><li><ol id="list9" type="i"><li>\(\overline{x}\) = __________</li> <li>\({s}_{x}\) = __________</li> <li><em data-effect="italics">n</em> = __________</li> <li><em data-effect="italics">n</em> – 1 = __________</li> </ol> </li> <li>Define the random variables \(X\) and \(\overline{X}\) in words.</li> <li>Which distribution should you use for this problem? Explain your choice.</li> <li>Construct a 95% confidence interval for the population mean enrollment at community colleges in the United States. <ol id="list11" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> <li>What will happen to the error bound and confidence interval if 500 community colleges were surveyed? Why?</li> </ol> </div> <div id="id6025360" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id5647026" data-type="problem"><p id="para8">2) Suppose that a committee is studying whether or not there is waste of time in our judicial system. It is interested in the mean amount of time individuals waste at the courthouse waiting to be called for jury duty. The committee randomly surveyed 81 people who recently served as jurors. The sample mean wait time was eight hours with a sample standard deviation of four hours.</p> <ol id="list13" type="a"><li><ol id="list12" type="i"><li>\(\overline{x}\) = __________</li> <li>\({s}_{x}\) = __________</li> <li><em data-effect="italics">n</em> = __________</li> <li><em data-effect="italics">n</em> – 1 = __________</li> </ol> </li> <li>Define the random variables \(X\) and \(\overline{X}\) in words.</li> <li>Which distribution should you use for this problem? Explain your choice.</li> <li>Construct a 95% confidence interval for the population mean time wasted. 
<ol id="list14" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> <li>Explain in a complete sentence what the confidence interval means.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="exer10" data-type="exercise"><div id="id6300844" data-type="problem"><p id="para21">3) A pharmaceutical company makes tranquilizers. It is assumed that the distribution for the length of time they last is approximately normal. Researchers in a hospital used the drug on a random sample of nine patients. The effective period of the tranquilizer for each patient (in hours) was as follows: 2.7; 2.8; 3.0; 2.3; 2.3; 2.2; 2.8; 2.1; and 2.4.</p> <ol id="list21" type="a"><li><ol id="list21a" type="i"><li>\(\overline{x}\) = __________</li> <li>\({s}_{x}\) = __________</li> <li><em data-effect="italics">n</em> = __________</li> <li><em data-effect="italics">n</em> – 1 = __________</li> </ol> </li> <li>Define the random variable \(X\) in words.</li> <li>Define the random variable \(\overline{X}\) in words.</li> <li>Which distribution should you use for this problem? Explain your choice.</li> <li>Construct a 95% confidence interval for the population mean length of time. <ol id="list22_5" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> <li>What does it mean to be “95% confident” in this problem?</li> </ol> </div> <div id="fs-idm131198032" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="exer11" data-type="exercise"><div id="id5850872" data-type="problem"><p id="para23">4) Suppose that 14 children, who were learning to ride two-wheel bikes, were surveyed to determine how long they had to use training wheels. It was revealed that they used them an average of six months with a sample standard deviation of three months. 
Assume that the underlying population distribution is normal.</p> <ol id="list23_5" type="a"><li><ol id="list24" type="i"><li>\(\overline{x}\) = __________</li> <li>\({s}_{x}\) = __________</li> <li><em data-effect="italics">n</em> = __________</li> <li><em data-effect="italics">n</em> – 1 = __________</li> </ol> </li> <li>Define the random variable \(X\) in words.</li> <li>Define the random variable \(\overline{X}\) in words.</li> <li>Which distribution should you use for this problem? Explain your choice.</li> <li>Construct a 99% confidence interval for the population mean length of time using training wheels. <ol id="list24_5" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> <li>Why would the error bound change if the confidence level were lowered to 90%?</li> </ol> </div> </div> <div id="eip-390" data-type="exercise"><div data-type="problem"><p>5) The Federal Election Commission (FEC) collects information about campaign contributions and disbursements for candidates and political committees each election cycle. A political action committee (PAC) is a committee formed to raise money for candidates and campaigns. A Leadership PAC is a PAC formed by a federal politician (senator or representative) to raise money to help other candidates’ campaigns.</p> <p id="eip-idm79133520">The FEC has reported financial information for 556 Leadership PACs that operated during the 2011–2012 election cycle. 
The following table shows the total receipts during this cycle for a random selection of 30 Leadership PACs.</p> <table summary=".."><tbody><tr><td>\$46,500.00</td> <td>\$0</td> <td>\$40,966.50</td> <td>\$105,887.20</td> <td>\$5,175.00</td> </tr> <tr><td>\$29,050.00</td> <td>\$19,500.00</td> <td>\$181,557.20</td> <td>\$31,500.00</td> <td>\$149,970.80</td> </tr> <tr><td>\$2,555,363.20</td> <td>\$12,025.00</td> <td>\$409,000.00</td> <td>\$60,521.70</td> <td>\$18,000.00</td> </tr> <tr><td>\$61,810.20</td> <td>\$76,530.80</td> <td>\$119,459.20</td> <td>\$0</td> <td>\$63,520.00</td> </tr> <tr><td>\$6,500.00</td> <td>\$502,578.00</td> <td>\$705,061.10</td> <td>\$708,258.90</td> <td>\$135,810.00</td> </tr> <tr><td>\$2,000.00</td> <td>\$2,000.00</td> <td>\$0</td> <td>\$1,287,933.80</td> <td>\$219,148.30</td> </tr> </tbody> </table> <p id="fs-idm101832672">\(\overline{x}=\$251,854.23\)</p> <p id="fs-idm60266896">\(s=\text{ }\$521,130.41\)</p> <p id="eip-idp69132832">Use this sample data to construct a 96% confidence interval for the mean amount of money raised by all Leadership PACs during the 2011–2012 election cycle. Use the Student&#8217;s t-distribution.</p> </div> <div id="eip-770" data-type="solution"><p id="eip-idp167827024"></p></div> </div> <div data-type="exercise"><div data-type="problem"><p id="para2">6) In six packages of “The Flintstones® Real Fruit Snacks” there were five Bam-Bam snack pieces. The total number of snack pieces in the six bags was 68. We wish to calculate a 96% confidence interval for the population proportion of Bam-Bam snack pieces.</p> <ol id="list5" type="a"><li>Define the random variables <em data-effect="italics">X</em> and <em data-effect="italics">P</em>′ in words.</li> <li>Which distribution should you use for this problem? Explain your choice</li> <li>Calculate <em data-effect="italics">p</em>′.</li> <li>Construct a 96% confidence interval for the population proportion of Bam-Bam snack pieces per bag. 
<ol id="list6" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> <li>Do you think that six packages of fruit snacks yield enough data to give accurate results? Why or why not?</li> </ol> <p id="eip-184">7) <em data-effect="italics">Forbes</em> magazine published data on the best small firms in 2012. These were firms that had been publicly traded for at least a year, had a stock price of at least \$5 per share, and had reported annual revenue between \$5 million and \$1 billion. The <a class="autogenerated-content" href="#eip-884">following table</a> shows the ages of the corporate CEOs for a random sample of these firms.</p> <table summary=".."><tbody><tr><td>48</td> <td>58</td> <td>51</td> <td>61</td> <td>56</td> </tr> <tr><td>59</td> <td>74</td> <td>63</td> <td>53</td> <td>50</td> </tr> <tr><td>59</td> <td>60</td> <td>60</td> <td>57</td> <td>46</td> </tr> <tr><td>55</td> <td>63</td> <td>57</td> <td>47</td> <td>55</td> </tr> <tr><td>57</td> <td>43</td> <td>61</td> <td>62</td> <td>49</td> </tr> <tr><td>67</td> <td>67</td> <td>55</td> <td>55</td> <td>49</td> </tr> </tbody> </table> <p id="eip-idm30943456">Use this sample data to construct a 90% confidence interval for the mean age of CEOs for these top small firms. Use the Student&#8217;s t-distribution.</p> </div> </div> <div data-type="exercise"><div id="eip-190" data-type="problem"><p id="eip-579">Unoccupied seats on flights cause airlines to lose revenue. Suppose a large airline wants to estimate its mean number of unoccupied seats per flight over the past year. To accomplish this, the records of 225 flights are randomly selected and the number of unoccupied seats is noted for each of the sampled flights. 
The sample mean is 11.6 seats and the sample standard deviation is 4.1 seats.</p> <ol id="eip-idm49907776" type="a"><li><ol id="eip-idp70657312" type="i"><li>\(\overline{x}\) = __________</li> <li>\({s}_{x}\) = __________</li> <li><em data-effect="italics">n</em> = __________</li> <li><em data-effect="italics">n</em>-1 = __________</li> </ol> </li> <li>Define the random variables \(X\) and \(\overline{X}\) in words.</li> <li>Which distribution should you use for this problem? Explain your choice.</li> <li>Construct a 92% confidence interval for the population mean number of unoccupied seats per flight. <ol id="eip-idm535456" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> </ol> </div> <div id="fs-idm33461392" data-type="solution"><p>&nbsp;</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="eip-idm117422496" data-type="problem"><p id="eip-idp46649712">8) In a recent sample of 84 used car sales costs, the sample mean was \$6,425 with a standard deviation of \$3,156. Assume the underlying distribution is approximately normal.</p> <ol id="eip-idp46650288" type="a"><li>Which distribution should you use for this problem? Explain your choice.</li> <li>Define the random variable \(\overline{X}\) in words.</li> <li>Construct a 95% confidence interval for the population mean cost of a used car. <ol id="eip-idp2064304" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> <li>Explain what a “95% confidence interval” means for this study.</li> </ol> </div> </div> <div id="eip-789" data-type="exercise"><div id="eip-idp101366832" data-type="problem"><p id="eip-idp114357168">9) Six different national brands of chocolate chip cookies were randomly selected at the supermarket. The grams of fat per serving are as follows: 8; 8; 10; 7; 9; 9. 
Assume the underlying distribution is approximately normal.</p> <ol id="eip-idp114357792" type="a"><li>Construct a 90% confidence interval for the population mean grams of fat per serving of chocolate chip cookies sold in supermarkets. <ol id="eip-idm3541392" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> <li>If you wanted a smaller error bound while keeping the same level of confidence, what should have been changed in the study before it was done?</li> <li>Go to the store and record the grams of fat per serving of six brands of chocolate chip cookies.</li> <li>Calculate the mean.</li> <li>Is the mean within the interval you calculated in part a? Did you expect it to be? Why or why not?</li> </ol> </div> <div id="fs-idm144387776" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="fs-idm170967312" data-type="exercise"><div id="fs-idm181156640" data-type="problem"><p id="fs-idm181156384">10) A survey of the mean number of cents off that coupons give was conducted by randomly surveying one coupon per page from the coupon sections of a recent San Jose Mercury News. The following data were collected: 20¢; 75¢; 50¢; 65¢; 30¢; 55¢; 40¢; 40¢; 30¢; 55¢; \$1.50; 40¢; 65¢; 40¢. Assume the underlying distribution is approximately normal.</p> <ol id="fs-idm115727600" type="a"><li><ol id="fs-idm158307360" type="i"><li>\(\overline{x}\) = __________</li> <li>\({s}_{x}\) = __________</li> <li><em data-effect="italics">n</em> = __________</li> <li><em data-effect="italics">n</em>-1 = __________</li> </ol> </li> <li>Define the random variables \(X\) and \(\overline{X}\) in words.</li> <li>Which distribution should you use for this problem? Explain your choice.</li> <li>Construct a 95% confidence interval for the population mean worth of coupons. 
<ol id="fs-idm87013712" type="i"><li>State the confidence interval.</li> <li>Sketch the graph.</li> <li>Calculate the error bound.</li> </ol> </li> <li>If many random samples were taken of size 14, what percent of the confidence intervals constructed should contain the population mean worth of coupons? Explain why.</li> </ol> </div> </div> <p id="id7660865"><span data-type="newline"><br /> </span><em data-effect="italics">11) Use the following information to answer the next two exercises:</em> A quality control specialist for a restaurant chain takes a random sample of size 12 to check the amount of soda served in the 16 oz. serving size. The sample mean is 13.30 with a sample standard deviation of 1.55. Assume the underlying population is normally distributed.</p> <div id="eeeeeeeeeeeeeeex" data-type="exercise"><div id="id5710642" data-type="problem"><p id="para58">Find the 95% Confidence Interval for the true population mean for the amount of soda served.</p> <ol id="list58" type="a"><li>(12.42, 14.18)</li> <li>(12.32, 14.29)</li> <li>(12.50, 14.10)</li> <li>Impossible to determine</li> </ol> </div> <div id="id6215942" data-type="solution"><p id="solb11"></p></div> </div> <div id="exxxx" data-type="exercise"><div id="id6176057" data-type="problem"><p id="para60">12) What is the error bound?</p> <ol id="list60" type="a"><li>0.87</li> <li>1.98</li> <li>0.99</li> <li>1.74</li> </ol> <p><strong>Answers to odd questions</strong></p> <p>1)</p> <ol id="list_s4" type="a"><li><ol id="list_s5" type="i"><li>8629</li> <li>6944</li> <li>35</li> <li>34</li> </ol> </li> <li>\({t}_{34}\)</li> <li><ol id="list_s7" type="i"><li>CI: (6244, 11,014)</li> <li><div id="fs-idp12588848" class="bc-figure figure"><span id="mmm" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch08_10_01-1.jpg" alt="" width="380" data-media-type="image/jpeg" /></span></div> </li> <li>EB = 2385</li> </ol> </li> <li>It will become 
smaller</li> </ol> <p>3)</p> <ol id="fs-idm86189312" type="a"><li><ol id="fs-idm33946400" type="i"><li>\(\overline{x}\) = 2.51</li> <li>\({s}_{x}\) = 0.318</li> <li><em data-effect="italics">n</em> = 9</li> <li><em data-effect="italics">n</em> &#8211; 1 = 8</li> </ol> </li> <li>the effective length of time for a tranquilizer</li> <li>the mean effective length of time of tranquilizers from a sample of nine patients</li> <li>We need to use a Student’s-t distribution, because we do not know the population standard deviation.</li> <li><ol id="fs-idm270080896" type="i"><li>CI: (2.27, 2.76)</li> <li>Check student&#8217;s solution.</li> <li><em data-effect="italics">EBM</em>: 0.25</li> </ol> </li> <li>If we were to sample many groups of nine patients and construct a confidence interval from each sample, about 95% of those intervals would contain the true population mean length of time.</li> </ol> <p>5)</p> <p id="fs-idm151956208">\(\overline{x}=\$251,854.23\)</p> <p id="fs-idp3598576">\(s=\text{ }\$521,130.41\)</p> <p id="fs-idm60939744">Note that we are not given the population standard deviation, only the standard deviation of the sample.</p> <p id="eip-idm63664256">There are 30 measures in the sample, so <em data-effect="italics">n</em> = 30, and <em data-effect="italics">df</em> = 30 &#8211; 1 = 29</p> <p id="fs-idm117819488"><em data-effect="italics">CL</em> = 0.96, so <em data-effect="italics">α</em> = 1 &#8211; <em data-effect="italics">CL</em> = 1 &#8211; 0.96 = 0.04</p> <p id="fs-idm140901120">\(\frac{\alpha }{2}=0.02\), so \({t}_{\frac{\alpha }{2}}={t}_{0.02}=2.150\)</p> <p id="eip-idp903520">\(EBM={t}_{\frac{\alpha }{2}}\left(\frac{s}{\sqrt{n}}\right)=2.150\left(\frac{521,130.41}{\sqrt{30}}\right)\approx \$204,561.66\)</p> <p id="eip-idm139616976">\(\overline{x}\) &#8211; <em data-effect="italics">EBM</em> = \$251,854.23 &#8211; \$204,561.66 = \$47,292.57</p> <p id="eip-idm95497616">\(\overline{x}\) + <em data-effect="italics">EBM</em> = \$251,854.23 + \$204,561.66 = \$456,415.89</p> <p id="eip-idm43369008">We estimate with 96% 
confidence that the mean amount of money raised by all Leadership PACs during the 2011–2012 election cycle lies between \$47,292.57 and \$456,415.89.<span data-type="newline"><br /> </span></p> <p id="eip-idm19923248">Alternate Solution</p> <div id="fs-idm33267104" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm130936496">Enter the data as a list.</p> <p id="fs-idm33521776">Press <code>STAT</code> and arrow over to <code>TESTS</code>.</p> <p id="fs-idm74740208">Arrow down to <code>8:TInterval</code>.</p> <p id="fs-idm137623792">Press <code>ENTER</code>.</p> <p id="fs-idm117237136">Arrow to Data and press <code>ENTER</code>.</p> <p id="fs-idm121069488">Arrow down and enter the name of the list where the data is stored.</p> <p id="fs-idm76431648">Enter <code>Freq</code>: 1</p> <p id="fs-idm157410432">Enter <code>C-Level</code>: 0.96</p> <p id="fs-idm14318528">Arrow down to <code>Calculate</code> and press <code>Enter</code>.</p> <p id="fs-idm131790576">The 96% confidence interval is (\$47,262, \$456,447).</p> </div> <p>The difference between solutions arises from rounding differences.</p> <p>7)</p> <ol id="fs-idp16278992" type="a"><li><ol id="fs-idm150491184" type="i"><li>\(\overline{x}\) = 11.6</li> <li>\({s}_{x}\) = 4.1</li> <li><em data-effect="italics">n</em> = 225</li> <li><em data-effect="italics">n</em> &#8211; 1 = 224</li> </ol> </li> <li><em data-effect="italics">X</em> is the number of unoccupied seats on a single flight. 
\(\overline{X}\) is the mean number of unoccupied seats from a sample of 225 flights.</li> <li>We will use a Student’s-t distribution, because we do not know the population standard deviation.</li> <li><ol id="fs-idm19293168" type="i"><li>CI: (11.12, 12.08)</li> <li>Check student&#8217;s solution.</li> <li><em data-effect="italics">EBM</em>: 0.48</li> </ol> </li> </ol> <p>9)</p> <ol id="fs-idm144387520" type="a"><li><ol id="fs-idm104467472" type="i"><li>CI: (7.64, 9.36)</li> <li><div id="fs-idm59520752" class="bc-figure figure"><span id="mmm3" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch08_12_01-1.jpg" alt="" width="380" data-media-type="image/jpeg" /></span></div> </li> <li><em data-effect="italics">EBM</em>: 0.86</li> </ol> </li> <li>The sample size should have been increased.</li> <li>Answers will vary.</li> <li>Answers will vary.</li> <li>Answers will vary.</li> </ol> <p>11) b</p> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="degrefree"><dt>Degrees of Freedom (<em data-effect="italics">df</em>)</dt> <dd id="id17799455">the number of objects in a sample that are free to vary</dd> </dl> <dl><dt>Normal Distribution</dt> <dd>a continuous random variable (RV) with pdf \(f\text{(}x\text{)}=\frac{1}{\sigma \sqrt{2\pi }}{e}^{-\frac{{\left(x-\mu \right)}^{2}}{2{\sigma }^{2}}}\), where <em data-effect="italics">μ</em> is the mean of the distribution and <em data-effect="italics">σ</em> is the standard deviation, notation: <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em>(<em data-effect="italics">μ</em>,<em data-effect="italics">σ</em>). 
If <em data-effect="italics">μ</em> = 0 and <em data-effect="italics">σ</em> = 1, the RV is called <strong>the standard normal distribution</strong>.</dd> </dl> <dl><dt>Standard Deviation</dt> <dd>a number that is equal to the square root of the variance and measures how far data values are from their mean; notation: <em data-effect="italics">s</em> for sample standard deviation and <em data-effect="italics">σ</em> for population standard deviation</dd> </dl> <dl id="studenttdist"><dt>Student&#8217;s <strong>t</strong>-Distribution</dt> <dd id="id8759760">investigated and reported by William S. Gosset in 1908 and published under the pseudonym Student; the major characteristics of the random variable (RV) are: <ul id="fs-idp209408240"><li>It is continuous and can assume any real value.</li> <li>The pdf is symmetrical about its mean of zero. However, it is more spread out and flatter at the apex than the normal distribution.</li> <li>It approaches the standard normal distribution as <em data-effect="italics">n</em> gets larger.</li> <li>There is a &#8220;family&#8221; of t–distributions: each representative of the family is completely defined by the number of degrees of freedom, which is one less than the number of data.</li> </ul> </dd> </dl> </div> </div></div>
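<p>The arithmetic in the answer to exercise 5 above can be checked with a short script. This is only a sketch: the critical value 2.150 is copied from the worked answer rather than computed, and the function name <code>t_interval</code> is a label chosen for this illustration.</p>

```python
import math

# Summary statistics quoted in the worked answer to exercise 5.
x_bar = 251_854.23   # sample mean of total receipts (dollars)
s = 521_130.41       # sample standard deviation (dollars)
n = 30               # sample size, so df = n - 1 = 29
t_crit = 2.150       # t_{0.02} with df = 29, copied from the answer (not computed here)


def t_interval(x_bar, s, n, t_crit):
    """Return (lower, upper, ebm) for a mean when the population sd is unknown."""
    ebm = t_crit * s / math.sqrt(n)   # error bound for the mean
    return x_bar - ebm, x_bar + ebm, ebm


lower, upper, ebm = t_interval(x_bar, s, n, t_crit)
print(f"EBM = {ebm:,.2f}")
print(f"CI  = ({lower:,.2f}, {upper:,.2f})")
```

<p>The calculator interval quoted in the alternate solution differs slightly because the calculator carries more digits of the critical value.</p>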
<div class="chapter standard" id="chapter-confidence-interval-place-of-birth" title="Activity 9.4: Confidence Interval (Place of Birth)"><div class="chapter-title-wrap"><h3 class="chapter-number">53</h3><h2 class="chapter-title"><span class="display-none">Activity 9.4: Confidence Interval (Place of Birth)</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1164881012721" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Confidence Interval (Place of Birth)</div> <p id="id6317858">Class Time:</p> <p id="id6317864">Names:</p> <div id="id6317875" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will calculate the 90% confidence interval for the proportion of students in this school who were born in this state.</li> <li>The student will interpret confidence intervals.</li> <li>The student will determine the effects of changing conditions on the confidence interval.</li> </ul> </div> <div id="list-8758234" data-type="list"><div data-type="title">Collect the Data</div> <ol><li>Survey the students in your class, asking them if they were born in this state. Let <em data-effect="italics">X</em> = the number that were born in this state. <ol id="list986y" type="a"><li><em data-effect="italics">n</em> = ____________</li> <li><em data-effect="italics">x</em> = ____________</li> </ol> </li> <li>In words, define the random variable <em data-effect="italics">P′</em>.</li> <li>State the estimated distribution to use.</li> </ol> </div> <div id="list-9758575342" data-type="list"><div data-type="title">Find the Confidence Interval and Error Bound</div> <ol><li>Calculate the confidence interval and the error bound. <ol id="list-858645876" type="a"><li>Confidence Interval: _____</li> <li>Error Bound: _____</li> </ol> </li> <li>How much area is in both tails (combined)? α = _____</li> <li>How much area is in each tail? 
\(\frac{\alpha }{2}\) = _____</li> <li>Fill in the blanks on the graph with the area in each section. Then, fill in the number line with the upper and lower limits of the confidence interval and the sample proportion. <div id="id9596836" class="bc-figure figure"><span id="id9596840" data-type="media" data-alt="Normal distribution curve with two vertical upward lines from the x-axis to the curve. The confidence interval is between these two lines. The residual areas are on either side."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch08_08_01-1.png" alt="Normal distribution curve with two vertical upward lines from the x-axis to the curve. The confidence interval is between these two lines. The residual areas are on either side." width="380" data-media-type="image/png" /></span></div> </li> </ol> </div> <div id="list-97584" data-type="list"><div data-type="title">Describe the Confidence Interval</div> <ol><li>In two to three complete sentences, explain what a confidence interval means (in general), as though you were talking to someone who has not taken statistics.</li> <li>In one to two complete sentences, explain what this confidence interval means for this particular study.</li> <li>Construct a confidence interval for each confidence level given.<br /> <table id="id6699poi200" summary="Partially blank table with confidence level in the first column, EBM/Error Bound in the blank second column, and confidence interval in the blank third column."><thead><tr><th>Confidence level</th> <th>EBP/Error Bound</th> <th>Confidence Interval</th> </tr> </thead> <tbody><tr><td data-align="center">50%</td> <td></td> <td></td> </tr> <tr><td data-align="center">80%</td> <td></td> <td></td> </tr> <tr><td data-align="center">95%</td> <td></td> <td></td> </tr> <tr><td data-align="center">99%</td> <td></td> <td></td> </tr> </tbody> </table> </li> <li>What happens to the EBP as the confidence level increases? 
Does the width of the confidence interval increase or decrease? Explain why this happens.</li> </ol> </div> </div> </div></div>
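<p>The table in the activity above can be filled in by hand or with a short script. The sketch below assumes the normal approximation for a sample proportion and uses made-up class data (<code>n = 40</code>, <code>x = 26</code>); replace these with your own survey counts. The function name <code>proportion_interval</code> is a label chosen for this illustration.</p>

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(x, n, confidence):
    """Return (lower, upper, ebp) using the normal approximation for P'."""
    p_prime = x / n                                      # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    ebp = z * sqrt(p_prime * (1 - p_prime) / n)          # error bound for the proportion
    return p_prime - ebp, p_prime + ebp, ebp


# Hypothetical survey: 26 of 40 students were born in this state.
n, x = 40, 26
results = {cl: proportion_interval(x, n, cl) for cl in (0.50, 0.80, 0.95, 0.99)}

for cl, (low, high, ebp) in results.items():
    print(f"{cl:.0%}: EBP = {ebp:.4f}, CI = ({low:.4f}, {high:.4f})")
```

<p>Comparing the rows shows how the error bound responds as the confidence level rises, which is the pattern the last question asks you to explain.</p>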
<div class="chapter standard" id="chapter-confidence-interval-home-costs" title="Activity 9.5: Confidence Interval (Home Costs)"><div class="chapter-title-wrap"><h3 class="chapter-number">54</h3><h2 class="chapter-title"><span class="display-none">Activity 9.5: Confidence Interval (Home Costs)</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1167595196093" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Confidence Interval (Home Costs)</div> <p id="id3282603">Class Time:</p> <p id="id9826831">Names:</p> <div data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will calculate the 90% confidence interval for the mean cost of a home in the area in which this school is located.</li> <li>The student will interpret confidence intervals.</li> <li>The student will determine the effects of changing conditions on the confidence interval.</li> </ul> </div> <p><span data-type="title">Collect the Data</span> Check the Real Estate section in your local newspaper. Record the sale prices for 35 randomly selected homes recently listed in the county.</p> <div id="fs-idp39253392" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idm76245808">Many newspapers list them only one day per week. 
Also, we will assume that homes come up for sale randomly.</p> </div> <ol id="list-87564324"><li>Complete the table:<br /> <table id="tableone32" summary="Blank table of 35 empty cells"><tbody><tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> <tr><td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> <td>__________</td> </tr> </tbody> </table> </li> </ol> <div id="list-98326954" data-type="list"><div data-type="title">Describe the Data</div> <ol><li>Compute the following: <ol id="list-234758" type="a"><li>\(\overline{x}\) = _____</li> <li>\({s}_{x}\) = _____</li> <li><em data-effect="italics">n</em> = _____</li> </ol> </li> <li>In words, define the random variable \(\overline{X}\).</li> <li>State the estimated distribution to use. Use both words and symbols.</li> </ol> </div> <div id="list-97258644" data-type="list"><div data-type="title">Find the Confidence Interval</div> <ol><li>Calculate the confidence interval and the error bound. <ol id="list-927354875234" type="a"><li>Confidence Interval: _____</li> <li>Error Bound: _____</li> </ol> </li> <li>How much area is in both tails (combined)? <em data-effect="italics">α</em> = _____</li> <li>How much area is in each tail? \(\frac{\alpha }{2}\) = _____</li> <li>Fill in the blanks on the graph with the area in each section. 
Then, fill in the number line with the upper and lower limits of the confidence interval and the sample mean. <div id="id7469736" class="bc-figure figure"><span id="id7469740" data-type="media" data-alt="Normal distribution curve with two vertical upward lines from the x-axis to the curve. The confidence interval is between these two lines. The residual areas are on either side."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch08_07_01-1.png" alt="Normal distribution curve with two vertical upward lines from the x-axis to the curve. The confidence interval is between these two lines. The residual areas are on either side." width="380" data-media-type="image/png" /></span></div> </li> <li>Some students think that a 90% confidence interval contains 90% of the data. Use the list of data on the first page and count how many of the data values lie within the confidence interval. What percent is this? Is this percent close to 90%? Explain why this percent should or should not be close to 90%.</li> </ol> </div> <div id="list-23875744" data-type="list"><div data-type="title">Describe the Confidence Interval</div> <ol><li>In two to three complete sentences, explain what a confidence interval means (in general), as if you were talking to someone who has not taken statistics.</li> <li>In one to two complete sentences, explain what this confidence interval means for this particular study.</li> </ol> </div> <div id="list-89758644" data-type="list"><div data-type="title">Use the Data to Construct Confidence Intervals</div> <ol><li>Using the given information, construct a confidence interval for each confidence level given.<br /> <table id="id3675asdffgh75848jhgf538" summary="Partially blank table with confidence level in the first column, EBM/Error Bound in the blank second column, and confidence interval in the blank third column."><thead><tr><th>Confidence level</th> <th>EBM/Error Bound</th> <th>Confidence Interval</th> 
</tr> </thead> <tbody><tr><td data-align="center">50%</td> <td></td> <td></td> </tr> <tr><td data-align="center">80%</td> <td></td> <td></td> </tr> <tr><td data-align="center">95%</td> <td></td> <td></td> </tr> <tr><td data-align="center">99%</td> <td></td> <td></td> </tr> </tbody> </table> </li> <li>What happens to the EBM as the confidence level increases? Does the width of the confidence interval increase or decrease? Explain why this happens.</li> </ol> </div> </div> </div></div>
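<p>The question in the activity above about whether a 90% confidence interval contains 90% of the data can be explored numerically. The sketch below uses invented, evenly spaced sale prices (not real listings) and a critical value of 1.691 for \({t}_{0.05}\) with 34 degrees of freedom, taken from a t-table rather than computed; both are assumptions for illustration only.</p>

```python
import math
import statistics

# Hypothetical sale prices for 35 homes (evenly spaced, for illustration only).
prices = [150_000 + 10_000 * i for i in range(35)]

t_crit = 1.691  # assumed t-table value: t_{0.05} with df = 34 (90% confidence)

n = len(prices)
x_bar = statistics.mean(prices)
s = statistics.stdev(prices)          # sample standard deviation
ebm = t_crit * s / math.sqrt(n)       # error bound for the mean
lower, upper = x_bar - ebm, x_bar + ebm

# Count how many individual data values fall inside the interval for the MEAN.
inside = sum(lower <= p <= upper for p in prices)

print(f"90% CI for the mean: ({lower:,.0f}, {upper:,.0f})")
print(f"{inside} of {n} data values ({inside / n:.0%}) lie inside the interval")
```

<p>The interval estimates where the population mean lies, not where individual home prices fall, so the share of data values inside it is typically far below 90%.</p>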
<div class="part " id="part-hypothesis-testing-with-one-sample"><div class="part-title-wrap"><h3 class="part-number">X</h3><h1 class="part-title">Chapter 10: Hypothesis Testing with One Sample</h1></div><div class="ugc part-ugc"></div></div>
<div class="chapter standard" id="chapter-introduction-21" title="Chapter 10.1: Introduction"><div class="chapter-title-wrap"><h3 class="chapter-number">55</h3><h2 class="chapter-title"><span class="display-none">Chapter 10.1: Introduction</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-idm165993648" class="splash"><div class="bc-figcaption figcaption">You can use a hypothesis test to decide if a dog breeder’s claim that every Dalmatian has 35 spots is statistically sound. (Credit: Robert Neff)</div> <p><span id="fs-idm52702960" data-type="media" data-alt="This is a picture of a Dalmatian dog covered in black spots. He is wearing a red collar, appears to be in a nature setting, and there is a spout of water from a water fountain in the foreground."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C09_CO-1.jpg" alt="This is a picture of a Dalmatian dog covered in black spots. He is wearing a red collar, appears to be in a nature setting, and there is a spout of water from a water fountain in the foreground." width="380" data-media-type="image/jpeg" /></span></p> </div> <div id="fs-idp21504400" class="chapter-objectives" data-type="note" data-has-label="true" data-label=""><div data-type="title">Chapter Objectives</div> <p>By the end of this chapter, the student should be able to:</p> <ul id="list67"><li>Differentiate between Type I and Type II Errors</li> <li>Describe hypothesis testing in general and in practice</li> <li>Conduct and interpret hypothesis tests for a single population mean, population standard deviation known.</li> <li>Conduct and interpret hypothesis tests for a single population mean, population standard deviation unknown.</li> <li>Conduct and interpret hypothesis tests for a single population proportion.</li> </ul> </div> <p>One job of a statistician is to make statistical inferences about populations based on samples taken from the population. 
<span data-type="term">Confidence intervals</span> are one way to estimate a population parameter. Another way to make a statistical inference is to make a decision about a parameter. For instance, a car dealer advertises that its new small truck gets 35 miles per gallon, on average. A tutoring service claims that its method of tutoring helps 90% of its students get an A or a B. A company says that women managers in their company earn an average of \$60,000 per year.</p> <p>A statistician will make a decision about these claims. This process is called &#8220;<span data-type="term">hypothesis testing</span>.&#8221; A hypothesis test involves collecting data from a sample and evaluating the data. Then, the statistician makes a decision as to whether or not there is sufficient evidence, based upon analyses of the data, to reject the null hypothesis.</p> <p>In this chapter, you will conduct hypothesis tests on single means and single proportions. You will also learn about the errors associated with these tests.</p> <p>Hypothesis testing consists of two contradictory hypotheses or statements, a decision based on the data, and a conclusion. To perform a hypothesis test, a statistician will:</p> <ol><li>Set up two contradictory hypotheses.</li> <li>Collect sample data (in homework problems, the data or summary statistics will be given to you).</li> <li>Determine the correct distribution to perform the hypothesis test.</li> <li>Analyze sample data by performing the calculations that ultimately will allow you to reject or decline to reject the null hypothesis.</li> <li>Make a decision and write a meaningful conclusion.</li> </ol> <p>&nbsp;</p> <div id="id23787041" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idm82174448">To do the hypothesis test homework problems for this chapter and later chapters, make copies of the appropriate special solution sheets. 
See <a href="/contents/c0449c55-aa47-4f1c-bd5f-0521652f0e82">Appendix E</a>.</p> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl><dt>Confidence Interval (CI)</dt> <dd>an interval estimate for an unknown population parameter. This depends on: <ul><li>The desired confidence level.</li> <li>Information that is known about the distribution (for example, known standard deviation).</li> <li>The sample and its size.</li> </ul> </dd> </dl> <dl id="hypotest"><dt>Hypothesis Testing</dt> <dd id="id23787132">Based on sample evidence, a procedure for determining whether the hypothesis stated is a reasonable statement and should not be rejected, or is unreasonable and should be rejected.</dd> </dl> </div> </div></div>
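<p>The five steps listed in the introduction above can be sketched in code for the car dealer's claim of 35 miles per gallon. Everything numeric here is hypothetical: the sample of 40 trucks, the sample mean of 33.8, the known population standard deviation of 4.0, and the significance level of 0.05 are invented for illustration, and comparing a p-value with the significance level is just one common way to carry out step 4.</p>

```python
from math import sqrt
from statistics import NormalDist

# Step 1: set up two contradictory hypotheses.
#   H0: mu >= 35 (the dealer's claim holds)   Ha: mu < 35
mu_0 = 35.0

# Step 2: collect sample data (hypothetical summary statistics).
n = 40          # number of trucks sampled
x_bar = 33.8    # sample mean, miles per gallon
sigma = 4.0     # population standard deviation, assumed known

# Step 3: determine the distribution. With sigma known, the sample mean
# is modeled as normal, so a z statistic is used.
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Step 4: analyze the data. Left-tailed test, so the p-value is P(Z <= z).
p_value = NormalDist().cdf(z)

# Step 5: make a decision and state a conclusion.
alpha = 0.05    # significance level, chosen before looking at the data
decision = "reject H0" if p_value < alpha else "do not reject H0"

print(f"z = {z:.3f}, p-value = {p_value:.4f}: {decision}")
```

<p>With different hypothetical numbers the decision could go the other way; the structure of the five steps stays the same.</p>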
<div class="chapter standard" id="chapter-null-and-alternative-hypotheses" title="Chapter 10.2: Null and Alternative Hypotheses"><div class="chapter-title-wrap"><h3 class="chapter-number">56</h3><h2 class="chapter-title"><span class="display-none">Chapter 10.2: Null and Alternative Hypotheses</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>The actual test begins by considering two <span data-type="term">hypotheses</span>. They are called the <span data-type="term">null hypothesis</span> and the <span data-type="term">alternative hypothesis</span>. These hypotheses contain opposing viewpoints.</p> <p><em data-effect="italics">H<sub>0</sub></em>: <strong>The null hypothesis:</strong> It is a statement of no difference between sample means or proportions or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.</p> <p><em data-effect="italics">H<sub>a</sub></em>: <strong>The alternative hypothesis:</strong> It is a claim about the population that is contradictory to <em data-effect="italics">H<sub>0</sub></em> and what we conclude when we reject <em data-effect="italics">H<sub>0</sub></em>.</p> <p>Since the null and alternative hypotheses are contradictory, you must examine evidence to decide if you have enough evidence to reject the null hypothesis or not. The evidence is in the form of sample data.</p> <p id="element-862">After you have determined which hypothesis the sample supports, you make a <strong>decision.</strong> There are two options for a decision. 
They are &#8220;reject <em data-effect="italics">H<sub>0</sub></em>&#8221; if the sample information favors the alternative hypothesis or &#8220;do not reject <em data-effect="italics">H<sub>0</sub></em>&#8221; or &#8220;decline to reject <em data-effect="italics">H<sub>0</sub></em>&#8221; if the sample information is insufficient to reject the null hypothesis.</p> <p>Mathematical Symbols Used in <em data-effect="italics">H<sub>0</sub></em> and <em data-effect="italics">H<sub>a</sub></em>:</p> <table summary="Table presenting mathematical symbols used with the null and alternate hypothesis. The first column contains the null hypothesis and its 3 mathematical symbols and the second column has the alternate hypothesis and its mathematical symbols." data-frame="none"><thead valign="middle"><tr><th data-align="center"><em data-effect="italics">H<sub>0</sub></em></th> <th data-align="center"><em data-effect="italics">H<sub>a</sub></em></th> </tr> </thead> <tbody><tr><td>equal (=)</td> <td>not equal (≠) <strong>or</strong> greater than (&gt;) <strong>or</strong> less than (&lt;)</td> </tr> <tr><td>greater than or equal to (≥)</td> <td>less than (&lt;)</td> </tr> <tr><td>less than or equal to (≤)</td> <td>more than (&gt;)</td> </tr> </tbody> </table> <div id="id1168218302693" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idp37732128"><em data-effect="italics">H<sub>0</sub></em> always has a symbol with an equal in it. <em data-effect="italics">H<sub>a</sub></em> never has a symbol with an equal in it. The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers (including one of the co-authors in research work) use = in the null hypothesis, even with &gt; or &lt; as the symbol in the alternative hypothesis. 
This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.</p> </div> <div class="textbox textbox--examples" data-type="example"><p id="element-283"><em data-effect="italics">H<sub>0</sub></em>: No more than 30% of the registered voters in Santa Clara County voted in the primary election. <em data-effect="italics">p</em> ≤ 0.30 <span data-type="newline"><br /> </span><em data-effect="italics">H<sub>a</sub></em>: More than 30% of the registered voters in Santa Clara County voted in the primary election. <em data-effect="italics">p</em> &gt; 0.30</p> </div> <div id="fs-idp10178800" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-665" data-type="exercise"><div data-type="problem"><p>A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25%. State the null and alternative hypotheses.</p> </div> </div> </div> <div id="element-641" class="textbox textbox--examples" data-type="example"><p>We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are: <span data-type="newline"><br /> </span><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 2.0 <span data-type="newline"><br /> </span><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> ≠ 2.0</p> </div> <div id="fs-idp91068816" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm576512" data-type="exercise"><div id="fs-idm16416" data-type="problem"><p>We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. 
Fill in the correct symbol (=, ≠, ≥, &lt;, ≤, &gt;) for the null and alternative hypotheses.</p> <ol id="eip-idm122826912" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> __ 66</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> __ 66</li> </ol> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p>We want to test if college students take less than five years to graduate from college, on the average. The null and alternative hypotheses are: <span data-type="newline"><br /> </span><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≥ 5 <span data-type="newline"><br /> </span><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &lt; 5</p> </div> <div id="fs-idp119641264" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm2856320" data-type="exercise"><div id="fs-idp126181968" data-type="problem"><p id="eip-414">We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol ( =, ≠, ≥, &lt;, ≤, &gt;) for the null and alternative hypotheses.</p> <ol id="eip-idm84545584" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> __ 45</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> __ 45</li> </ol> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p id="fs-idm30028400">In an issue of <em data-effect="italics">U. S. News and World Report</em>, an article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third pass. The same article stated that 6.6% of U.S. students take advanced placement exams and 4.4% pass. Test if the percentage of U.S. 
students who take advanced placement exams is more than 6.6%. State the null and alternative hypotheses. <span data-type="newline"><br /> </span><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> ≤ 0.066 <span data-type="newline"><br /> </span><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> &gt; 0.066</p> </div> <div id="fs-idm2566848" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm126344352" data-type="exercise"><div id="fs-idm54604944" data-type="problem"><p id="fs-idm131727840">On a state driver’s test, about 40% pass the test on the first try. We want to test if more than 40% pass on the first try. Fill in the correct symbol (=, ≠, ≥, &lt;, ≤, &gt;) for the null and alternative hypotheses.</p> <ol id="eip-idp142654064" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> __ 0.40</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> __ 0.40</li> </ol> </div> </div> </div> <p>&nbsp;</p> <div id="fs-idm73791920" class="statistics collab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Collaborative Exercise</div> <p id="element-171">Bring to class a newspaper, some news magazines, and some Internet articles. In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.</p> </div> <div class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p>In a <strong>hypothesis test</strong>, sample data is evaluated in order to arrive at a decision about some type of claim. If certain conditions about the sample are satisfied, then the claim can be evaluated for a population. 
In a hypothesis test, we:</p> <ol id="eip-idp72518864" type="1"><li>Evaluate the <strong>null hypothesis</strong>, typically denoted with <em data-effect="italics">H<sub>0</sub></em>. The null is not rejected unless the hypothesis test shows otherwise. The null statement must always contain some form of equality (=, ≤, or ≥).</li> <li>Always write the <strong>alternative hypothesis</strong>, typically denoted with <em data-effect="italics">H<sub>a</sub></em> or <em data-effect="italics">H<sub>1</sub></em>, using less than, greater than, or not equals symbols, i.e., (≠, &gt;, or &lt;).</li> <li>If we reject the null hypothesis, then we can assume there is enough evidence to support the alternative hypothesis.</li> <li>Never state that a claim is proven true or false. Keep in mind the underlying fact that hypothesis testing is based on probability laws; therefore, we can talk only in terms of non-absolute certainties.</li> </ol> </div> <div id="fs-idm23881072" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p><em data-effect="italics">H<sub>0</sub></em> and <em data-effect="italics">H<sub>a</sub></em> are contradictory.</p> <table id="element-377" summary="This table states the mathematical symbols associated with either null or alternate hypothesis. The first row is for the null hypothesis and the second row is for the alternate hypothesis. 
There are three columns of mathematical symbols for each."><tbody><tr><td><strong>If <em data-effect="italics">H<sub>0</sub></em> has:</strong></td> <td>equal (=)</td> <td>greater than or equal to (≥)</td> <td>less than or equal to (≤)</td> </tr> <tr><td><strong>then <em data-effect="italics">H<sub>a</sub></em> has:</strong></td> <td>not equal (≠) <strong>or</strong> greater than (&gt;) <strong>or</strong> less than (&lt;)</td> <td>less than (&lt;)</td> <td>greater than (&gt;)</td> </tr> </tbody> </table> <p>If <em data-effect="italics">α</em> ≤ <em data-effect="italics">p</em>-value, then do not reject <em data-effect="italics">H<sub>0</sub></em>.</p> <p>If <em data-effect="italics">α</em> &gt; <em data-effect="italics">p</em>-value, then reject <em data-effect="italics">H<sub>0</sub></em>.</p> <p><em data-effect="italics">α</em> is chosen in advance: its value is set before the hypothesis test starts. The <em data-effect="italics">p</em>-value is calculated from the data.</p> </div> <div id="fs-idm15064784" class="practice" data-depth="1"><div data-type="exercise"><div data-type="problem"><p id="fs-idm78419312">You are testing that the mean speed of your cable Internet connection is more than three Megabits per second. What is the random variable? Describe in words.</p> </div> <div data-type="solution"><p id="fs-idm73404464">The random variable is the mean Internet speed in Megabits per second.</p> </div> </div> <div id="eip-411" data-type="exercise"><div data-type="problem"><p>You are testing that the mean speed of your cable Internet connection is more than three Megabits per second. State the null and alternative hypotheses.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>The American family has an average of two children. What is the random variable? 
Describe in words.</p> </div> <div id="eip-644" data-type="solution"><p>The random variable is the mean number of children an American family has.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>The mean entry level salary of an employee at a company is \$58,000. You believe it is higher for IT professionals in the company. State the null and alternative hypotheses.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>A sociologist claims the probability that a person picked at random in Times Square in New York City is visiting the area is 0.83. You want to test to see if the proportion is actually less. What is the random variable? Describe in words.</p> </div> <div data-type="solution"><p>The random variable is the proportion of people picked at random in Times Square visiting the city.</p> </div> </div> <div id="eip-235" data-type="exercise"><div id="eip-100" data-type="problem"><p>A sociologist claims the probability that a person picked at random in Times Square in New York City is visiting the area is 0.83. You want to test to see if the claim is correct. State the null and alternative hypotheses.</p> </div> </div> <div id="eip-75" data-type="exercise"><div data-type="problem"><p>In a population of fish, approximately 42% are female. A test is conducted to see if, in fact, the proportion is less. State the null and alternative hypotheses.</p> </div> <div data-type="solution"><ol id="eip-idm74385712" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> = 0.42</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> &lt; 0.42</li> </ol> </div> </div> <div data-type="exercise"><div id="id7396007" data-type="problem"><p>Suppose that a recent article stated that the mean time spent in jail by a first–time convicted burglar is 2.5 years. A study was then done to see if the mean time has increased in the new century. 
A random sample of 26 first-time convicted burglars in a recent year was picked. The mean length of time in jail from the survey was 3 years with a standard deviation of 1.8 years. Suppose that it is somehow known that the population standard deviation is 1.5. If you were conducting a hypothesis test to determine if the mean length of jail time has increased, what would the null and alternative hypotheses be? The distribution of the population is normal.</p> <ol id="list-986798234" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: ________</li> <li><em data-effect="italics">H<sub>a</sub></em>: ________</li> </ol> </div> </div> <div data-type="exercise"><div id="id13907062" data-type="problem"><p id="eip-idm29584912">A random survey of 75 death row inmates revealed that the mean length of time on death row is 17.4 years with a standard deviation of 6.3 years. If you were conducting a hypothesis test to determine if the population mean time on death row could likely be 15 years, what would the null and alternative hypotheses be?</p> <ol id="list-92369" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: __________</li> <li><em data-effect="italics">H<sub>a</sub></em>: __________</li> </ol> </div> <div id="id13907281" data-type="solution"><ol id="list-8927365476" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 15</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> ≠ 15</li> </ol> </div> </div> <div data-type="exercise"><div id="id13929352" data-type="problem"><p>The National Institute of Mental Health published an article stating that in any one-year period, approximately 9.5 percent of American adults suffer from depression or a depressive illness. Suppose that in a survey of 100 people in a certain town, seven of them suffered from depression or a depressive illness. 
If you were conducting a hypothesis test to determine if the true proportion of people in that town suffering from depression or a depressive illness is lower than the percent in the general adult American population, what would the null and alternative hypotheses be?</p> <ol id="eip-idm4137600" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: ________</li> <li><em data-effect="italics">H<sub>a</sub></em>: ________</li> </ol> </div> </div> </div> <div id="fs-idm8472096" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div data-type="exercise"><div id="id8382572" data-type="problem"><p id="element-802">1) Some of the following statements refer to the null hypothesis, some to the alternate hypothesis.</p> <p>State the null hypothesis, <em data-effect="italics">H<sub>0</sub></em>, and the alternative hypothesis, <em data-effect="italics">H<sub>a</sub></em>, in terms of the appropriate parameter (<em data-effect="italics">μ</em> or <em data-effect="italics">p</em>).</p> <ol type="a"><li>The mean number of years Americans work before retiring is 34.</li> <li>At most 60% of Americans vote in presidential elections.</li> <li>The mean starting salary for San Jose State University graduates is at least \$100,000 per year.</li> <li>Twenty-nine percent of high school seniors get drunk each month.</li> <li>Fewer than 5% of adults ride the bus to work in Los Angeles.</li> <li>The mean number of cars a person owns in her lifetime is not more than ten.</li> <li>About half of Americans prefer to live away from cities, given the choice.</li> <li>Europeans have a mean paid vacation each year of six weeks.</li> <li>The chance of developing breast cancer is under 11% for women.</li> <li>Private universities&#8217; mean tuition cost is more than \$20,000 per year.</li> </ol> <p>&nbsp;</p> </div> <div id="id9205906" data-type="solution"></div> </div> <div data-type="exercise"><div id="id9404524" data-type="problem"><p id="id5049073">2) Over the past 
few decades, public health officials have examined the link between weight concerns and teen girls&#8217; smoking. Researchers surveyed a group of 273 randomly selected teen girls living in Massachusetts (between 12 and 15 years old). After four years the girls were surveyed again. Sixty-three said they smoked to stay thin. Is there good evidence that more than thirty percent of the teen girls smoke to stay thin? The alternative hypothesis is:</p> <ol id="id12166958" type="a"><li><em data-effect="italics">p</em> &lt; 0.30</li> <li><em data-effect="italics">p</em> ≤ 0.30</li> <li><em data-effect="italics">p</em> ≥ 0.30</li> <li><em data-effect="italics">p</em> &gt; 0.30</li> </ol> </div> </div> <div id="exer13" data-type="exercise"><div id="id8184432" data-type="problem"><p id="eip-idp133963488">3) A statistics instructor believes that fewer than 20% of Evergreen Valley College (EVC) students attended the opening night midnight showing of the latest Harry Potter movie. She surveys 84 of her students and finds that 11 attended the midnight showing. An appropriate alternative hypothesis is:</p> <ol id="id8618736" type="a"><li><em data-effect="italics">p</em> = 0.20</li> <li><em data-effect="italics">p</em> &gt; 0.20</li> <li><em data-effect="italics">p</em> &lt; 0.20</li> <li><em data-effect="italics">p</em> ≤ 0.20</li> </ol> </div> <div id="id8184727" data-type="solution"><p id="paragrams"></p></div> </div> <div id="exer18" data-type="exercise"><div id="id8185544" data-type="problem"><p id="eip-idm3083104">4) Previously, an organization reported that teenagers spent 4.5 hours per week, on average, on the phone. The organization thinks that, currently, the mean is higher. Fifteen randomly chosen teenagers were asked how many hours per week they spend on the phone. The sample mean was 4.75 hours with a sample standard deviation of 2.0. Conduct a hypothesis test. 
The null and alternative hypotheses are:</p> <ol id="id9908251" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: \(\overline{x}\) = 4.5, <em data-effect="italics">H<sub>a</sub></em> : \(\overline{x}\) &gt; 4.5</li> <li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≥ 4.5, <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &lt; 4.5</li> <li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 4.75, <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &gt; 4.75</li> <li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 4.5, <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &gt; 4.5</li> </ol> </div> </div> </div> <div id="fs-idm48036544" class="footnotes" data-depth="1"><h3 data-type="title"><strong>Answers to odd questions</strong></h3> <p>1)</p> <p><em data-effect="italics">a. H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 34; <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> ≠ 34<br /> <em data-effect="italics">b. H<sub>0</sub></em>: <em data-effect="italics">p</em> ≤ 0.60; <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> &gt; 0.60<br /> <em data-effect="italics">c. H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≥ 100,000; <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &lt; 100,000<br /> <em data-effect="italics">d. H<sub>0</sub></em>: <em data-effect="italics">p</em> = 0.29; <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> ≠ 0.29<br /> <em data-effect="italics">e. H<sub>0</sub></em>: <em data-effect="italics">p</em> = 0.05; <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> &lt; 0.05<br /> <em data-effect="italics">f. 
H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≤ 10; <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &gt; 10<br /> <em data-effect="italics">g. H<sub>0</sub></em>: <em data-effect="italics">p</em> = 0.50; <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> ≠ 0.50<br /> <em data-effect="italics">h. H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 6; <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> ≠ 6<br /> <em data-effect="italics">i. H<sub>0</sub></em>: <em data-effect="italics">p</em> ≥ 0.11; <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> &lt; 0.11<br /> <em data-effect="italics">j. H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≤ 20,000; <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &gt; 20,000</p> <p>3) c</p> <h3 data-type="title">References</h3> <p id="fs-idp5641168">Data from the National Institute of Mental Health. Available online at http://www.nimh.nih.gov/publicat/depression.cfm.</p> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="hypothesis"><dt>Hypothesis</dt> <dd id="id1168218799446">a statement about the value of a population parameter. In the case of two hypotheses, the statement assumed to be true is called the null hypothesis (notation <em data-effect="italics">H</em><sub>0</sub>) and the contradictory statement is called the alternative hypothesis (notation <em data-effect="italics">H<sub>a</sub></em>).</dd> </dl> </div> </div></div>
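<p>The symbol table and the decision rule from the Formula Review of Chapter 10.2 can be written as a short Python sketch. The function and variable names below are hypothetical, chosen only for illustration. The code simply encodes the two rules as stated: the symbol in the alternative hypothesis is determined by the symbol in the null hypothesis, and the null hypothesis is rejected only when α is greater than the p-value.</p>

```python
# Symbols allowed in Ha for each symbol that can appear in H0
# (taken from the Formula Review table).
ALTERNATIVE_SYMBOLS = {
    "=": ("≠", ">", "<"),
    "≥": ("<",),
    "≤": (">",),
}


def alternative_symbols(null_symbol):
    """Return the symbols that Ha may use when H0 uses null_symbol."""
    return ALTERNATIVE_SYMBOLS[null_symbol]


def decide(alpha, p_value):
    """Apply the decision rule: reject H0 only if alpha > p-value."""
    return "reject H0" if alpha > p_value else "do not reject H0"


# Illustrative values only; they do not come from any exercise in the text.
print(alternative_symbols("≥"))
print(decide(alpha=0.05, p_value=0.03))
print(decide(alpha=0.05, p_value=0.20))
```

<p>Note that when α equals the p-value exactly, the rule as stated gives "do not reject H0".</p>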
<div class="chapter standard" id="chapter-outcomes-and-the-type-i-and-type-ii-errors" title="Chapter 10.3: Outcomes and the Type I and Type II Errors"><div class="chapter-title-wrap"><h3 class="chapter-number">57</h3><h2 class="chapter-title"><span class="display-none">Chapter 10.3: Outcomes and the Type I and Type II Errors</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p>When you perform a hypothesis test, there are four possible outcomes depending on the actual truth (or falseness) of the null hypothesis <em data-effect="italics">H<sub>0</sub></em> and the decision to reject or not. The outcomes are summarized in the following table:</p> <table summary=""><thead><tr><th>ACTION</th> <th><em data-effect="italics">H<sub>0</sub></em> IS ACTUALLY</th> <th>&#8230;</th> </tr> </thead> <tbody><tr><td></td> <td>True</td> <td>False</td> </tr> <tr><td><strong>Do not reject <em data-effect="italics">H<sub>0</sub></em></strong></td> <td>Correct Outcome</td> <td>Type II error</td> </tr> <tr><td><strong>Reject <em data-effect="italics">H<sub>0</sub></em></strong></td> <td>Type I error</td> <td>Correct Outcome</td> </tr> </tbody> </table> <p>The four possible outcomes in the table are:</p> <ol><li>The decision is <strong>not to reject <em data-effect="italics">H<sub>0</sub></em></strong> when <strong><em data-effect="italics">H<sub>0</sub></em> is true (correct decision).</strong></li> <li>The decision is to <strong>reject <em data-effect="italics">H<sub>0</sub></em></strong> when <strong><em data-effect="italics">H<sub>0</sub></em> is true</strong> (incorrect decision known as a <span data-type="term">Type I error</span>).</li> <li>The decision is <strong>not to reject <em data-effect="italics">H<sub>0</sub></em></strong> when, in fact, <strong><em data-effect="italics">H<sub>0</sub></em> is false</strong> (incorrect decision known as a <span data-type="term">Type II error</span>).</li> <li>The decision is to <strong>reject <em 
data-effect="italics">H<sub>0</sub></em></strong> when <strong><em data-effect="italics">H<sub>0</sub></em> is false</strong> (<strong>correct decision</strong> whose probability is called the <strong>Power of the Test</strong>).</li> </ol> <p>Each of the errors occurs with a particular probability. The Greek letters <em data-effect="italics">α</em> and <em data-effect="italics">β</em> represent the probabilities.</p> <p id="element-95"><em data-effect="italics">α</em> = probability of a Type I error = <strong><em data-effect="italics">P</em>(Type I error)</strong> = probability of rejecting the null hypothesis when the null hypothesis is true.</p> <p><em data-effect="italics">β</em> = probability of a Type II error = <strong><em data-effect="italics">P</em>(Type II error)</strong> = probability of not rejecting the null hypothesis when the null hypothesis is false.</p> <p id="fs-idp40594672"><em data-effect="italics">α</em> and <em data-effect="italics">β</em> should be as small as possible because they are probabilities of errors. They are rarely zero.</p> <p>The Power of the Test is 1 – <em data-effect="italics">β</em>. Ideally, we want a high power that is as close to one as possible. Increasing the sample size can increase the Power of the Test.</p> <p>The following are examples of Type I and Type II errors.</p> <div class="textbox textbox--examples" data-type="example"><p>Suppose the null hypothesis, <em data-effect="italics">H<sub>0</sub></em>, is: Frank&#8217;s rock climbing equipment is safe.</p> <p><strong>Type I error</strong>: Frank thinks that his rock climbing equipment may not be safe when, in fact, it really is safe. <strong>Type II error</strong>: Frank thinks that his rock climbing equipment may be safe when, in fact, it is not safe.</p> <p id="element-821"><strong><em data-effect="italics">α</em> = probability</strong> that Frank thinks his rock climbing equipment may not be safe when, in fact, it really is safe. 
<strong><em data-effect="italics">β</em> = probability</strong> that Frank thinks his rock climbing equipment may be safe when, in fact, it is not safe.</p> <p>Notice that, in this case, the error with the greater consequence is the Type II error. (If Frank thinks his rock climbing equipment is safe, he will go ahead and use it.)</p> </div> <div id="fs-idm72148080" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div id="eip-569" data-type="problem"><p>Suppose the null hypothesis, <em data-effect="italics">H<sub>0</sub></em>, is: the blood cultures contain no traces of pathogen <em data-effect="italics">X</em>. State the Type I and Type II errors.</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p>Suppose the null hypothesis, <em data-effect="italics">H<sub>0</sub></em>, is: The victim of an automobile accident is alive when he arrives at the emergency room of a hospital.</p> <p id="element-157"><strong>Type I error</strong>: The emergency crew thinks that the victim is dead when, in fact, the victim is alive. <strong>Type II error</strong>: The emergency crew does not know if the victim is alive when, in fact, the victim is dead.</p> <p id="element-558"><strong><em data-effect="italics">α</em> = probability</strong> that the emergency crew thinks the victim is dead when, in fact, he is really alive = <em data-effect="italics">P</em>(Type I error). <strong><em data-effect="italics">β</em> = probability</strong> that the emergency crew does not know if the victim is alive when, in fact, the victim is dead = <em data-effect="italics">P</em>(Type II error).</p> <p>The error with the greater consequence is the Type I error. 
(If the emergency crew thinks the victim is dead, they will not treat him.)</p> </div> <div id="fs-idp97501776" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-200" data-type="exercise"><div data-type="problem"><p>Suppose the null hypothesis, <em data-effect="italics">H<sub>0</sub></em>, is: a patient is not sick. Which type of error has the greater consequence, Type I or Type II?</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p>It’s a Boy Genetic Labs claim to be able to increase the likelihood that a pregnancy will result in a boy being born. Statisticians want to test the claim. Suppose that the null hypothesis, <em data-effect="italics">H<sub>0</sub></em>, is: It’s a Boy Genetic Labs has no effect on gender outcome.</p> <p id="eip-idp12866608"><strong>Type I error</strong>: This results when a true null hypothesis is rejected. In the context of this scenario, we would state that we believe that It’s a Boy Genetic Labs influences the gender outcome, when in fact it has no effect. The probability of this error occurring is denoted by the Greek letter alpha, <em data-effect="italics">α</em>.</p> <p id="eip-idp92458272"><strong>Type II error</strong>: This results when we fail to reject a false null hypothesis. In context, we would state that It’s a Boy Genetic Labs does not influence the gender outcome of a pregnancy when, in fact, it does. 
The probability of this error occurring is denoted by the Greek letter beta, <em data-effect="italics">β</em>.</p> <p id="eip-idp109344384">The error of greater consequence would be the Type I error since couples would use the It’s a Boy Genetic Labs product in hopes of increasing the chances of having a boy.</p> </div> <div id="fs-idp114758848" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-idp106437392" data-type="exercise"><div id="eip-idm60347984" data-type="problem"><p id="eip-326">“Red tide” is a bloom of poison-producing algae–a few different species of a class of plankton called dinoflagellates. When the weather and water conditions cause these blooms, shellfish such as clams living in the area develop dangerous levels of a paralysis-inducing toxin. In Massachusetts, the Division of Marine Fisheries (DMF) monitors levels of the toxin in shellfish by regular sampling of shellfish along the coastline. If the mean level of toxin in clams exceeds 800 μg (micrograms) of toxin per kg of clam meat in any area, clam harvesting is banned there until the bloom is over and levels of toxin in clams subside. Describe both a Type I and a Type II error in this context, and state which error has the greater consequence.</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p id="eip-90">A certain experimental drug claims a cure rate of at least 75% for males with prostate cancer. Describe both the Type I and Type II errors in context. Which error is the more serious?</p> <p id="eip-141"><strong>Type I</strong>: A cancer patient believes the cure rate for the drug is less than 75% when it actually is at least 75%.</p> <p><strong>Type II</strong>: A cancer patient believes the experimental drug has at least a 75% cure rate when it has a cure rate that is less than 75%.</p> <p>In this scenario, the Type II error contains the more severe consequence. 
If a patient believes the drug works at least 75% of the time, this most likely will influence the patient’s (and doctor’s) choice about whether to use the drug as a treatment option.</p> </div> <div id="fs-idm49523424" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <p id="eip-198">Determine both Type I and Type II errors for the following scenario:</p> <p id="eip-idp26159712">Assume a null hypothesis, <em data-effect="italics">H<sub>0</sub></em>, that states the percentage of adults with jobs is at least 88%.</p> <div data-type="exercise"><div id="eip-339" data-type="problem"><p id="eip-idm76027600">Identify the Type I and Type II errors from these four statements.</p> <ol id="eip-idm170151664" type="a"><li>Not to reject the null hypothesis that the percentage of adults who have jobs is at least 88% when that percentage is actually less than 88%</li> <li>Not to reject the null hypothesis that the percentage of adults who have jobs is at least 88% when the percentage is actually at least 88%.</li> <li>Reject the null hypothesis that the percentage of adults who have jobs is at least 88% when the percentage is actually at least 88%.</li> <li>Reject the null hypothesis that the percentage of adults who have jobs is at least 88% when that percentage is actually less than 88%.</li> </ol> </div> </div> </div> <div class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="eip-483">In every hypothesis test, the outcomes are dependent on a correct interpretation of the data. Incorrect calculations or misunderstood summary statistics can yield errors that affect the results. A <strong>Type I</strong> error occurs when a true null hypothesis is rejected. 
A <strong>Type II error</strong> occurs when a false null hypothesis is not rejected.</p> <p id="eip-idm63626992">The probabilities of these errors are denoted by the Greek letters <em data-effect="italics">α</em> and <em data-effect="italics">β</em>, for a Type I and a Type II error respectively. The power of the test, 1 – <em data-effect="italics">β</em>, quantifies the likelihood that a test will yield the correct result of a true alternative hypothesis being accepted. A high power is desirable.</p> </div> <div id="fs-idp7813920" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p><em data-effect="italics">α</em> = probability of a Type I error = <em data-effect="italics">P</em>(Type I error) = probability of rejecting the null hypothesis when the null hypothesis is true.</p> <p><em data-effect="italics">β</em> = probability of a Type II error = <em data-effect="italics">P</em>(Type II error) = probability of not rejecting the null hypothesis when the null hypothesis is false.</p> </div> <div class="practice" data-depth="1"><div data-type="exercise"><div data-type="problem"><p>The mean price of mid-sized cars in a region is \$32,000. A test is conducted to see if the claim is true. State the Type I and Type II errors in complete sentences.</p> </div> <div data-type="solution"><p>Type I: The mean price of mid-sized cars is \$32,000, but we conclude that it is not \$32,000.</p> <p id="eip-idp25290720">Type II: The mean price of mid-sized cars is not \$32,000, but we conclude that it is \$32,000.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-384">A sleeping bag is tested to withstand temperatures of –15 °F. You think the bag cannot stand temperatures that low. 
State the Type I and Type II errors in complete sentences.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>For <a href="#eip-611">Exercise 9.12</a>, what are <em data-effect="italics">α</em> and <em data-effect="italics">β</em> in words?</p> </div> <div data-type="solution"><p id="eip-989"><em data-effect="italics">α</em> = the probability that you think the bag cannot withstand –15 °F, when in fact it can</p> <p id="eip-idp30620656"><em data-effect="italics">β</em> = the probability that you think the bag can withstand –15 °F, when in fact it cannot</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-659">In words, describe 1 – <em data-effect="italics">β</em> for <a href="#eip-611">Exercise 9.12</a>.</p> </div> </div> <div id="eip-861" data-type="exercise"><div id="eip-338" data-type="problem"><p id="eip-46">A group of doctors is deciding whether or not to perform an operation. Suppose the null hypothesis, <em data-effect="italics">H<sub>0</sub></em>, is: the surgical procedure will go well. State the Type I and Type II errors in complete sentences.</p> </div> <div data-type="solution"><p>Type I: The procedure will go well, but the doctors think it will not.</p> <p id="eip-idp34803264">Type II: The procedure will not go well, but the doctors think it will.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>A group of doctors is deciding whether or not to perform an operation. Suppose the null hypothesis, <em data-effect="italics">H<sub>0</sub></em>, is: the surgical procedure will go well. Which is the error with the greater consequence?</p> </div> </div> <div data-type="exercise"><div id="eip-877" data-type="problem"><p>The power of a test is 0.981. What is the probability of a Type II error?</p> </div> <div data-type="solution"><p>0.019</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>A group of divers is exploring an old sunken ship. 
Suppose the null hypothesis, <em data-effect="italics">H<sub>0</sub></em>, is: the sunken ship does not contain buried treasure. State the Type I and Type II errors in complete sentences.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>A microbiologist is testing a water sample for E-coli. Suppose the null hypothesis, <em data-effect="italics">H<sub>0</sub></em>, is: the sample does not contain E-coli. The probability that the sample does not contain E-coli, but the microbiologist thinks it does is 0.012. The probability that the sample does contain E-coli, but the microbiologist thinks it does not is 0.002. What is the power of this test?</p> </div> <div data-type="solution"><p id="eip-61">0.998</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>A microbiologist is testing a water sample for E-coli. Suppose the null hypothesis, <em data-effect="italics">H<sub>0</sub></em>, is: the sample contains E-coli. Which is the error with the greater consequence?</p> </div> </div> </div> <div id="fs-idp36704512" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div data-type="exercise"><div id="id8384344" data-type="problem"><p>1) State the Type I and Type II errors in complete sentences given the following statements.</p> <ol id="fs-idm56908224" type="a"><li>The mean number of years Americans work before retiring is 34.</li> <li>At most 60% of Americans vote in presidential elections.</li> <li>The mean starting salary for San Jose State University graduates is at least \$100,000 per year.</li> <li>Twenty-nine percent of high school seniors get drunk each month.</li> <li>Fewer than 5% of adults ride the bus to work in Los Angeles.</li> <li>The mean number of cars a person owns in his or her lifetime is not more than ten.</li> <li>About half of Americans prefer to live away from cities, given the choice.</li> <li>Europeans have a mean paid vacation each year of six weeks.</li> <li>The chance of developing 
breast cancer is under 11% for women.</li> <li>Private universities' mean tuition cost is more than \$20,000 per year.</li> </ol> </div> <div id="eip-idm137925328" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id8395653" data-type="problem"><p>2) For statements a-j in <a href="#element-612">Exercise 9.109</a>, answer the following in complete sentences.</p> <ol type="a"><li>State a consequence of committing a Type I error.</li> <li>State a consequence of committing a Type II error.</li> </ol> </div> </div> <div data-type="exercise"><div id="id9404374" data-type="problem"><p id="id8618928">3) When a new drug is created, the pharmaceutical company must subject it to testing before receiving the necessary permission from the Food and Drug Administration (FDA) to market the drug. Suppose the null hypothesis is “the drug is unsafe.” What is the Type II error?</p> <ol id="id5046364" type="a"><li>To conclude the drug is safe when, in fact, it is unsafe.</li> <li>Not to conclude the drug is safe when, in fact, it is safe.</li> <li>To conclude the drug is safe when, in fact, it is safe.</li> <li>Not to conclude the drug is unsafe when, in fact, it is unsafe.</li> </ol> </div> <div id="id9404499" data-type="solution"><p id="paragraphs"></p></div> </div> <div id="exer15" data-type="exercise"><div id="id8184871" data-type="problem"><p id="id8435364">4) A statistics instructor believes that fewer than 20% of Evergreen Valley College (EVC) students attended the opening midnight showing of the latest Harry Potter movie. She surveys 84 of her students and finds that 11 of them attended the midnight showing. 
The Type I error is to conclude that the percent of EVC students who attended is ________.</p> <ol id="id7995329" type="a"><li>at least 20%, when in fact, it is less than 20%.</li> <li>20%, when in fact, it is 20%.</li> <li>less than 20%, when in fact, it is at least 20%.</li> <li>less than 20%, when in fact, it is less than 20%.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="exer17" data-type="exercise"><div id="id8185396" data-type="problem"><p id="eip-idm45808736">5) It is believed that Lake Tahoe Community College (LTCC) Intermediate Algebra students get less than seven hours of sleep per night, on average. A survey of 22 LTCC Intermediate Algebra students generated a mean of 7.24 hours with a standard deviation of 1.93 hours. At a level of significance of 5%, do LTCC Intermediate Algebra students get less than seven hours of sleep per night, on average?</p> <p id="id4936080">The Type II error is not to reject that the mean number of hours of sleep LTCC students get per night is at least seven when, in fact, the mean number of hours</p> <ol id="id8434145" type="a"><li>is more than seven hours.</li> <li>is at most seven hours.</li> <li>is at least seven hours.</li> <li>is less than seven hours.</li> </ol> </div> <div id="id8185494" data-type="solution"><p id="parrrra"></p></div> </div> <div id="exer20" data-type="exercise"><div id="id8186343" data-type="problem"><p id="id9140322">6) Previously, an organization reported that teenagers spent 4.5 hours per week, on average, on the phone. The organization thinks that, currently, the mean is higher. Fifteen randomly chosen teenagers were asked how many hours per week they spend on the phone. The sample mean was 4.75 hours with a sample standard deviation of 2.0. 
In this hypothesis test, the Type I error is:</p> <ol id="id9140329" type="a"><li>to conclude that the current mean hours per week is higher than 4.5, when in fact, it is higher</li> <li>to conclude that the current mean hours per week is higher than 4.5, when in fact, it is the same</li> <li>to conclude that the mean hours per week currently is 4.5, when in fact, it is higher</li> <li>to conclude that the mean hours per week currently is no higher than 4.5, when in fact, it is not higher</li> </ol> <p><strong>Answers to odd questions </strong></p> <p>1)</p> <ol id="fs-idm18858336" type="a"><li>Type I error: We conclude that the mean is not 34 years, when it really is 34 years. Type II error: We conclude that the mean is 34 years, when in fact it really is not 34 years.</li> <li>Type I error: We conclude that more than 60% of Americans vote in presidential elections, when the actual percentage is at most 60%. Type II error: We conclude that at most 60% of Americans vote in presidential elections when, in fact, more than 60% do.</li> <li>Type I error: We conclude that the mean starting salary is less than \$100,000, when it really is at least \$100,000. Type II error: We conclude that the mean starting salary is at least \$100,000 when, in fact, it is less than \$100,000.</li> <li>Type I error: We conclude that the proportion of high school seniors who get drunk each month is not 29%, when it really is 29%. Type II error: We conclude that the proportion of high school seniors who get drunk each month is 29% when, in fact, it is not 29%.</li> <li>Type I error: We conclude that fewer than 5% of adults ride the bus to work in Los Angeles, when the percentage that do is really 5% or more. Type II error: We conclude that 5% or more adults ride the bus to work in Los Angeles when, in fact, fewer than 5% do.</li> <li>Type I error: We conclude that the mean number of cars a person owns in his or her lifetime is more than 10, when in reality it is not more than 10. 
Type II error: We conclude that the mean number of cars a person owns in his or her lifetime is not more than 10 when, in fact, it is more than 10.</li> <li>Type I error: We conclude that the proportion of Americans who prefer to live away from cities is not about half, though the actual proportion is about half. Type II error: We conclude that the proportion of Americans who prefer to live away from cities is half when, in fact, it is not half.</li> <li>Type I error: We conclude that the duration of paid vacations each year for Europeans is not six weeks, when in fact it is six weeks. Type II error: We conclude that the duration of paid vacations each year for Europeans is six weeks when, in fact, it is not.</li> <li>Type I error: We conclude that the proportion is less than 11%, when it is really at least 11%. Type II error: We conclude that the proportion of women who develop breast cancer is at least 11%, when in fact it is less than 11%.</li> <li>Type I error: We conclude that the average tuition cost at private universities is more than \$20,000, though in reality it is at most \$20,000. Type II error: We conclude that the average tuition cost at private universities is at most \$20,000 when, in fact, it is more than \$20,000.</li> </ol> <p>3) b</p> <p>5) d</p> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="type1err"><dt>Type I Error</dt> <dd id="id2292288">The decision is to reject the null hypothesis when, in fact, the null hypothesis is true.</dd> </dl> <dl id="type2err"><dt>Type II Error</dt> <dd id="id1169217218486">The decision is not to reject the null hypothesis when, in fact, the null hypothesis is false.</dd> </dl> </div> </div></div>
<div class="chapter standard" id="chapter-distribution-needed-for-hypothesis-testing" title="Chapter 10.4: Distribution Needed for Hypothesis Testing"><div class="chapter-title-wrap"><h3 class="chapter-number">58</h3><h2 class="chapter-title"><span class="display-none">Chapter 10.4: Distribution Needed for Hypothesis Testing</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="fs-idm27402752">Earlier in the course, we discussed sampling distributions. <strong>Particular distributions are associated with hypothesis testing.</strong> Perform tests of a population mean using a <span data-type="term">normal distribution</span> or a <span data-type="term">Student&#8217;s <em data-effect="italics">t</em>-distribution</span>. (Remember, use a Student&#8217;s <em data-effect="italics">t</em>-distribution when the population <span data-type="term">standard deviation</span> is unknown and the distribution of the sample mean is approximately normal.) We perform tests of a population proportion using a normal distribution (usually <em data-effect="italics">n</em> is large).</p> <p id="element-418">If you are testing a <span data-type="term">single population mean</span>, the distribution for the test is for <strong>means</strong>:</p> <p>\(\overline{X}\sim N\left({\mu }_{X},\frac{{\sigma }_{X}}{\sqrt{n}}\right)\) or \({t}_{df}\)</p> <p>The population parameter is <em data-effect="italics">μ</em>. The estimated value (point estimate) for <em data-effect="italics">μ</em> is \(\overline{x}\), the sample mean.</p> <p>If you are testing a <span data-type="term">single population proportion</span>, the distribution for the test is for proportions or percentages:</p> <p>\({P}^{\prime }\sim N\left(p,\sqrt{\frac{p\cdot q}{n}}\right)\)</p> <p id="element-128">The population parameter is <em data-effect="italics">p</em>. The estimated value (point estimate) for <em data-effect="italics">p</em> is <em data-effect="italics">p′</em>. 
<em data-effect="italics">p′</em> = \(\frac{x}{n}\) where <em data-effect="italics">x</em> is the number of successes and <em data-effect="italics">n</em> is the sample size.</p> <div id="fs-idp38268576" class="bc-section section" data-depth="1"><h3 data-type="title">Assumptions</h3> <p id="fs-idp18826944">When you perform a <span data-type="term">hypothesis test</span> <strong>of a single population mean <em data-effect="italics">μ</em></strong> using a <span data-type="term">Student&#8217;s <em data-effect="italics">t</em>-distribution</span> (often called a t-test), there are fundamental assumptions that need to be met in order for the test to work properly. Your data should be a <span data-type="term">simple random sample</span> that comes from a population that is approximately <span data-type="term">normally distributed</span>. You use the sample <span data-type="term">standard deviation</span> to approximate the population standard deviation. (Note that if the sample size is sufficiently large, a t-test will work even if the population is not approximately normally distributed).</p> <p>When you perform a <strong>hypothesis test of a single population mean <em data-effect="italics">μ</em></strong> using a normal distribution (often called a <em data-effect="italics">z</em>-test), you take a simple random sample from the population. The population you are testing is normally distributed or your sample size is sufficiently large. You know the value of the population standard deviation which, in reality, is rarely known.</p> <p>When you perform a <strong>hypothesis test of a single population proportion <em data-effect="italics">p</em></strong>, you take a simple random sample from the population. 
You must meet the conditions for a <span data-type="term">binomial distribution</span> which are: there are a certain number <em data-effect="italics">n</em> of independent trials, the outcomes of any trial are success or failure, and each trial has the same probability of a success <em data-effect="italics">p</em>. The shape of the binomial distribution needs to be similar to the shape of the normal distribution. To ensure this, the quantities <em data-effect="italics">np</em> and <em data-effect="italics">nq</em> must both be greater than five (<em data-effect="italics">np</em> &gt; 5 and <em data-effect="italics">nq</em> &gt; 5). Then the binomial distribution of a sample (estimated) proportion can be approximated by the normal distribution with <em data-effect="italics">μ</em> = <em data-effect="italics">p</em> and \(\sigma =\sqrt{\frac{pq}{n}}\). Remember that <em data-effect="italics">q</em> = 1 – <em data-effect="italics">p</em>.</p> </div> <div id="eip-952" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="eip-948">In order for a hypothesis test’s results to be generalized to a population, certain requirements must be satisfied.</p> <p id="eip-996">When testing for a single population mean:</p> <ol id="eip-idp106869104" type="1" data-mark-suffix="."><li>A Student&#8217;s <em data-effect="italics">t</em>-test should be used if the data come from a simple, random sample and the population is approximately normally distributed, or the sample size is large, with an unknown standard deviation.</li> <li>The normal test will work if the data come from a simple, random sample and the population is approximately normally distributed, or the sample size is large, with a known standard deviation.</li> </ol> <p id="fs-idp12045392">When testing a single population proportion, use a normal test if the data come from a simple random sample, meet the requirements for a binomial distribution, and the mean 
number of successes and the mean number of failures satisfy the conditions: <em data-effect="italics">np</em> &gt; 5 and <em data-effect="italics">nq</em> &gt; 5, where <em data-effect="italics">n</em> is the sample size, <em data-effect="italics">p</em> is the probability of a success, and <em data-effect="italics">q</em> is the probability of a failure.</p> </div> <div class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p>If there is no given preconceived <em data-effect="italics">α</em>, then use <em data-effect="italics">α</em> = 0.05.</p> <div data-type="list"><div data-type="title">Types of Hypothesis Tests</div> <ul><li>Single population mean, <strong>known</strong> population variance (or standard deviation): <strong>Normal test</strong>.</li> <li>Single population mean, <strong>unknown</strong> population variance (or standard deviation): <strong>Student&#8217;s <em data-effect="italics">t</em>-test</strong>.</li> <li>Single population proportion: <strong>Normal test</strong>.</li> <li>For a <strong>single population mean</strong>, we may use a normal distribution with the following mean and standard deviation. Means: \(\mu ={\mu }_{\overline{x}}\) and \({\sigma }_{\overline{x}}=\frac{{\sigma }_{x}}{\sqrt{n}}\)</li> <li>For a <strong>single population proportion</strong>, we may use a normal distribution with the following mean and standard deviation. 
Proportions: <em data-effect="italics">µ</em> = <strong><em data-effect="italics">p</em></strong> and \(\sigma =\sqrt{\frac{pq}{n}}\).</li> </ul> </div> </div> <div class="practice" data-depth="1"><div data-type="exercise"><div data-type="problem"><p>Which two distributions can you use for hypothesis testing for this chapter?</p> </div> <div id="eip-482" data-type="solution"><p>A normal distribution or a Student’s <em data-effect="italics">t</em>-distribution</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-696">Which distribution do you use when you are testing a population mean and the population standard deviation is known? Assume a normal distribution, with n ≥ 30.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Which distribution do you use when the standard deviation is not known and you are testing one population mean? Assume sample size is large.</p> </div> <div data-type="solution"><p>Use a Student’s <em data-effect="italics">t</em>-distribution</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>A population mean is 13. The sample mean is 12.8, and the sample standard deviation is two. The sample size is 20. What distribution should you use to perform a hypothesis test? Assume the underlying population is normal.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>A population has a mean of 25 and a standard deviation of five. The sample mean is 24, and the sample size is 108. What distribution should you use to perform a hypothesis test?</p> </div> <div data-type="solution"><p>a normal distribution for a single population mean</p> </div> </div> <div id="eip-138" data-type="exercise"><div data-type="problem"><p>It is thought that 42% of respondents in a taste test would prefer Brand <em data-effect="italics">A</em>. In a particular test of 100 people, 39% preferred Brand <em data-effect="italics">A</em>. 
What distribution should you use to perform a hypothesis test?</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-239">You are performing a hypothesis test of a single population mean using a Student’s <em data-effect="italics">t</em>-distribution. What must you assume about the distribution of the data?</p> </div> <div data-type="solution"><p>It must be approximately normally distributed.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>You are performing a hypothesis test of a single population mean using a Student’s <em data-effect="italics">t</em>-distribution. The data are not from a simple random sample. Can you accurately perform the hypothesis test?</p> </div> </div> <div data-type="exercise"><div id="eip-543" data-type="problem"><p id="eip-372">You are performing a hypothesis test of a single population proportion. What must be true about the quantities of <em data-effect="italics">np</em> and <em data-effect="italics">nq</em>?</p> </div> <div data-type="solution"><p>They must both be greater than five.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>You are performing a hypothesis test of a single population proportion. You find out that <em data-effect="italics">np</em> is less than five. What must you do to be able to perform a valid hypothesis test?</p> </div> </div> <div id="fs-idp17688688" data-type="exercise"><div data-type="problem"><p id="eip-129">You are performing a hypothesis test of a single population proportion. 
The data come from which distribution?</p> </div> <div data-type="solution"><p>binomial distribution<span data-type="newline"><br /> </span></p> </div> </div> </div> <div id="fs-idm173512608" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div id="fs-idm140519760" data-type="exercise"><div id="fs-idm140519632" data-type="problem"><p id="fs-idp12923264">1) It is believed that Lake Tahoe Community College (LTCC) Intermediate Algebra students get less than seven hours of sleep per night, on average. A survey of 22 LTCC Intermediate Algebra students generated a mean of 7.24 hours with a standard deviation of 1.93 hours. At a level of significance of 5%, do LTCC Intermediate Algebra students get less than seven hours of sleep per night, on average? The distribution to be used for this test is \(\overline{X}\) ~ ________________</p> <ol id="fs-idm141858480" type="a"><li>\(N\left(7.24,\frac{1.93}{\sqrt{22}}\right)\)</li> <li>\(N\left(7.24,1.93\right)\)</li> <li><em data-effect="italics">t</em><sub>22</sub></li> <li><em data-effect="italics">t</em><sub>21</sub></li> </ol> </div> <div id="fs-idp8543664" data-type="solution"><p id="fs-idm76617584">Answer</p> <p>d</p> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl><dt>Binomial Distribution</dt> <dd id="fs-idm119055264">a discrete random variable (RV) that arises from Bernoulli trials. There are a fixed number, <em data-effect="italics">n</em>, of independent trials. “Independent” means that the result of any trial (for example, trial 1) does not affect the results of the following trials, and all trials are conducted under the same conditions. Under these circumstances the binomial RV <em data-effect="italics">X</em> is defined as the number of successes in <em data-effect="italics">n</em> trials. 
The notation is <em data-effect="italics">X ~ B(n, p)</em>. The mean is <em data-effect="italics">μ</em> = <em data-effect="italics">np</em> and the standard deviation is \(\sigma = \sqrt{npq}\). The probability of exactly <em data-effect="italics">x</em> successes in <em data-effect="italics">n</em> trials is \(P\left(X=x\right)=\left(\begin{array}{c}n\\ x\end{array}\right){p}^{x}{q}^{n-x}\).</dd> </dl> <dl><dt>Normal Distribution</dt> <dd>a continuous random variable (RV) with pdf \(f\left(x\right)= \frac{1}{\sigma \sqrt{2\pi }}{e}^{\frac{-{\left(x-\mu \right)}^{2}}{2{\sigma }^{2}}}\), where <em data-effect="italics">μ</em> is the mean of the distribution, and <em data-effect="italics">σ</em> is the standard deviation, notation: <em data-effect="italics">X ~ N</em>(<em data-effect="italics">μ</em>, <em data-effect="italics">σ</em>). If <em data-effect="italics">μ</em> = 0 and <em data-effect="italics">σ</em> = 1, the RV is called <strong>the standard normal distribution</strong>.</dd> </dl> <dl><dt>Standard Deviation</dt> <dd>a number that is equal to the square root of the variance and measures how far data values are from their mean; notation: <em data-effect="italics">s</em> for sample standard deviation and <em data-effect="italics">σ</em> for population standard deviation.</dd> </dl> <dl><dt>Student&#8217;s <em data-effect="italics">t</em>-Distribution</dt> <dd>investigated and reported by William S. Gosset in 1908 and published under the pseudonym Student. The major characteristics of the random variable (RV) are: <ul id="tdist1"><li>It is continuous and assumes any real values.</li> <li>The pdf is symmetrical about its mean of zero. 
However, it is more spread out and flatter at the apex than the normal distribution.</li> <li>It approaches the standard normal distribution as <em data-effect="italics">n</em> gets larger.</li> <li>There is a &#8220;family&#8221; of t distributions: every representative of the family is completely defined by the number of degrees of freedom which is one less than the number of data items.</li> </ul> </dd> </dl> </div> </div></div>
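<div class="textbox textbox--examples" data-type="example"><p>The selection rules from the Formula Review of Chapter 10.4 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original text: the function names <code>choose_test</code>, <code>normal_approx_ok</code>, <code>proportion_sd</code> and <code>t_degrees_of_freedom</code> are hypothetical helpers, and the sketch assumes the data already come from a simple random sample.</p>

```python
import math


def choose_test(parameter, sigma_known=False):
    """Name the test from the 'Types of Hypothesis Tests' list.

    parameter is "mean" or "proportion" (hypothetical labels).
    """
    if parameter == "proportion":
        return "normal test"
    if parameter == "mean":
        return "normal test" if sigma_known else "Student's t-test"
    raise ValueError("parameter must be 'mean' or 'proportion'")


def normal_approx_ok(n, p):
    """Check the proportion-test conditions np > 5 and nq > 5."""
    q = 1 - p
    return n * p > 5 and n * q > 5


def proportion_sd(n, p):
    """Standard deviation sqrt(pq/n) of the sample proportion."""
    return math.sqrt(p * (1 - p) / n)


def t_degrees_of_freedom(n):
    """Degrees of freedom for a one-sample t-test: one less than n."""
    return n - 1


# Taste-test exercise: the claimed proportion is 0.42 and n = 100.
print(choose_test("proportion"))
print(normal_approx_ok(100, 0.42))
print(round(proportion_sd(100, 0.42), 4))

# LTCC sleep homework: sigma is unknown and n = 22.
print(choose_test("mean", sigma_known=False))
print(t_degrees_of_freedom(22))
```

<p>For the LTCC homework question the sketch points to a Student&#8217;s <em data-effect="italics">t</em>-test with 21 degrees of freedom, which agrees with answer d. It does not check normality of the population or how the sample was drawn; those assumptions still have to be judged from the study design.</p></div>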
<div class="chapter standard" id="chapter-rare-events-the-sample-decision-and-conclusion" title="Chapter 10.5: Rare Events, the Sample, Decision and Conclusion"><div class="chapter-title-wrap"><h3 class="chapter-number">59</h3><h2 class="chapter-title"><span class="display-none">Chapter 10.5: Rare Events, the Sample, Decision and Conclusion</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="fs-idm39235328">Establishing the type of distribution, sample size, and known or unknown standard deviation can help you figure out how to go about a hypothesis test. However, there are several other factors you should consider when working out a hypothesis test.</p> <div id="fs-idm88599200" class="bc-section section" data-depth="1"><h3 data-type="title">Rare Events</h3> <p id="fs-idp23331840">Suppose you make an assumption about a property of the population (this assumption is the <span data-type="term">null hypothesis</span>). Then you gather sample data randomly. If the sample has properties that would be very <strong>unlikely</strong> to occur if the assumption were true, then you would conclude that your assumption about the population is probably incorrect. (Remember that your assumption is just an <span data-type="term">assumption</span>: it is not a fact and it may or may not be true. But your sample data are real and the data are showing you a fact that seems to contradict your assumption.)</p> <p>For example, Didi and Ali are at a birthday party of a very wealthy friend. They hurry to be first in line to grab a prize from a tall basket that they cannot see inside because they will be blindfolded. There are 200 plastic bubbles in the basket and Didi and Ali have been told that there is only one with a \$100 bill. Didi is the first person to reach into the basket and pull out a bubble. Her bubble contains a \$100 bill. The probability of this happening is \(\frac{1}{200}\) = 0.005. 
Because this is so unlikely, Ali is hoping that what the two of them were told is wrong and there are more \$100 bills in the basket. A &#8220;rare event&#8221; has occurred (Didi getting the \$100 bill) so Ali doubts the assumption about only one \$100 bill being in the basket.</p> </div> <div id="fs-idp139727497741440" class="bc-section section" data-depth="1"><h3 data-type="title">Using the Sample to Test the Null Hypothesis</h3> <p id="fs-idp165373232">Use the sample data to calculate the actual probability of getting the test result, called the <span data-type="term"><em data-effect="italics">p</em>-value</span>. The <em data-effect="italics">p</em>-value is the <strong>probability that, if the null hypothesis is true, the results from another randomly selected sample will be as extreme as or more extreme than the results obtained from the given sample.</strong></p> <p>A large <em data-effect="italics">p</em>-value calculated from the data indicates that we should not reject the <span data-type="term">null hypothesis</span>. The smaller the <em data-effect="italics">p</em>-value, the more unlikely the outcome, and the stronger the evidence is against the null hypothesis. We would reject the null hypothesis if the evidence is strongly against it.</p> <p id="fs-idm72636976"><strong>Draw a graph that shows the <em data-effect="italics">p</em>-value. The hypothesis test is easier to perform if you use a graph because you see the problem more clearly.</strong></p> <div class="textbox textbox--examples" data-type="example"><p>Suppose a baker claims that his bread height is more than 15 cm, on average. Several of his customers do not believe him. To persuade his customers that he is right, the baker decides to do a hypothesis test. He bakes 10 loaves of bread. The mean height of the sample loaves is 17 cm. The baker knows from baking hundreds of loaves of bread that the <span data-type="term">standard deviation</span> for the height is 0.5 cm 
and the distribution of heights is normal.</p> <p>The null hypothesis could be <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≤ 15. The alternate hypothesis is <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &gt; 15.</p> <p>The words <strong>&#8220;is more than&#8221;</strong> translate as a &#8220;&gt;&#8221; so &#8220;<em data-effect="italics">μ</em> &gt; 15&#8221; goes into the alternate hypothesis. The null hypothesis must contradict the alternate hypothesis.</p> <p>Since <strong><em data-effect="italics">σ</em> is known</strong> (<em data-effect="italics">σ</em> = 0.5 cm) and the heights are normally distributed, the distribution of the sample mean is normal with mean <em data-effect="italics">μ</em> = 15 and standard deviation \(\frac{\sigma }{\sqrt{n}}=\frac{0.5}{\sqrt{10}}\approx 0.16\).</p> <p>Suppose the null hypothesis is true (the mean height of the loaves is no more than 15 cm). Then is the mean height (17 cm) calculated from the sample unexpectedly large? The hypothesis test works by asking how <strong>unlikely</strong> the sample mean would be if the null hypothesis were true. The graph shows how far out the sample mean is on the normal curve. The <em data-effect="italics">p</em>-value is the probability that, if we were to take other samples, any other sample mean would fall at least as far out as 17 cm.</p> <p id="element-537"><strong>The <em data-effect="italics">p</em>-value, then, is the probability that a sample mean is the same as or greater than 17 cm 
when the population mean is, in fact, 15 cm.</strong> We can calculate this probability using the normal distribution for means.</p> <div id="fs-idm112264608" class="bc-figure figure"><span id="id9446267" data-type="media" data-alt="Normal distribution curve on average bread heights with values 15, as the population mean, and 17, as the point to determine the p-value, on the x-axis."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch09_07_01N-1.jpg" alt="Normal distribution curve on average bread heights with values 15, as the population mean, and 17, as the point to determine the p-value, on the x-axis." width="380" data-media-type="image/jpg" /></span></div> <p id="fs-idp85594448"><em data-effect="italics">p</em>-value = <em data-effect="italics">P</em>(\(\overline{x}\) &gt; 17), which is approximately zero.</p> <p>A <em data-effect="italics">p</em>-value of approximately zero tells us that it is highly unlikely that the loaves of bread rise no more than 15 cm, on average. That is, almost 0% of all samples of 10 loaves would have a mean height of at least 17 cm <strong>purely by CHANCE</strong> had the population mean height really been 15 cm. Because the outcome of 17 cm is so <strong>unlikely (meaning it is happening NOT by chance alone)</strong>, we conclude that the evidence is strongly against the null hypothesis (the mean height is at most 15 cm). There is sufficient evidence that the true mean height for the population of the baker&#8217;s loaves of bread is greater than 15 cm.</p> </div> <div id="fs-idm19107408" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div id="eip-747" data-type="problem"><p>A normal distribution has a standard deviation of 1. We want to verify a claim that the mean is greater than 12. 
A sample of 36 is taken with a sample mean of 12.5.</p> <p id="eip-idp83058144"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≤ 12 <span data-type="newline"><br /> </span><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &gt; 12 <span data-type="newline"><br /> </span>The <em data-effect="italics">p</em>-value is 0.0013 <span data-type="newline"><br /> </span>Draw a graph that shows the <em data-effect="italics">p</em>-value.</p> </div> </div> </div> </div> <div id="fs-idp139727505143696" class="bc-section section" data-depth="1"><h3 data-type="title">Decision and Conclusion</h3> <p id="fs-idp169707568">A systematic way to make a decision of whether to reject or not reject the <span data-type="term">null hypothesis</span> is to compare the <em data-effect="italics">p</em>-value and a <strong>preset or preconceived <span data-type="term">α</span> (also called a &#8220;significance level&#8221;)</strong>. A preset <em data-effect="italics">α</em> is the probability of a <span data-type="term">Type I error</span> (rejecting the null hypothesis when the null hypothesis is true). It may or may not be given to you at the beginning of the problem.</p> <p id="element-259">When you make a <strong>decision</strong> to reject or not reject <em data-effect="italics">H<sub>0</sub></em>, do as follows:</p> <ul><li>If <em data-effect="italics">α</em> &gt; <em data-effect="italics">p</em>-value, reject <em data-effect="italics">H<sub>0</sub></em>. The results of the sample data are significant. There is sufficient evidence to conclude that <em data-effect="italics">H<sub>0</sub></em> is an incorrect belief and that the <strong>alternative hypothesis</strong>, <em data-effect="italics">H<sub>a</sub></em>, may be correct.</li> <li>If <em data-effect="italics">α</em> ≤ <em data-effect="italics">p</em>-value, do not reject <em data-effect="italics">H<sub>0</sub></em>. 
The results of the sample data are not significant. There is not sufficient evidence to conclude that the alternative hypothesis, <em data-effect="italics">H<sub>a</sub></em>, may be correct.</li> <li>When you &#8220;do not reject <em data-effect="italics">H<sub>0</sub></em>&#8221;, it does not mean that you should believe that <em data-effect="italics">H<sub>0</sub></em> is true. It simply means that the sample data have <strong>failed</strong> to provide sufficient evidence to cast serious doubt about the truthfulness of <em data-effect="italics">H<sub>0</sub></em>.</li> </ul> <p><strong>Conclusion:</strong> After you make your decision, write a thoughtful <strong>conclusion</strong> about the hypotheses in terms of the given problem.</p> <div class="textbox textbox--examples" data-type="example"><p>When using the <em data-effect="italics">p</em>-value to evaluate a hypothesis test, it is sometimes useful to use the following memory device:</p> <p>If the <em data-effect="italics">p</em>-value is low, the null must go.</p> <p>If the <em data-effect="italics">p</em>-value is high, the null must fly.</p> <p id="eip-244">This memory aid associates a <em data-effect="italics">p</em>-value less than the established alpha (the <em data-effect="italics">p</em> is low) with rejecting the null hypothesis and, likewise, a <em data-effect="italics">p</em>-value higher than the established alpha (the <em data-effect="italics">p</em> is high) with not rejecting the null hypothesis.</p> <div data-type="exercise"><div data-type="problem"><p id="eip-idp111480688">Fill in the blanks.</p> <p id="eip-idp109073024">Reject the null hypothesis when ______________________________________.</p> <p id="eip-idp31062480">The results of the sample data _____________________________________.</p> <p id="eip-idp11219984">Do not reject the null hypothesis when __________________________________________.</p> <p id="eip-idm8475328">The results of the sample data 
____________________________________________.</p> </div> <div data-type="solution"><p id="eip-idm15109872">Reject the null hypothesis when <strong>the <em data-effect="italics">p</em>-value is less than the established alpha value</strong>. The results of the sample data <strong>support the alternative hypothesis</strong>.</p> <p id="eip-idm116402928">Do not reject the null hypothesis when <strong>the <em data-effect="italics">p</em>-value is greater than the established alpha value</strong>. The results of the sample data <strong>do not support the alternative hypothesis</strong>.</p> </div> </div> </div> <div id="fs-idm142970080" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-437" data-type="exercise"><div id="eip-594" data-type="problem"><p id="eip-860">It’s a Boy Genetics Labs claim their procedures improve the chances of a boy being born. The results for a test of a single population proportion are as follows:</p> <p id="eip-idm34336112"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> = 0.50, <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> &gt; 0.50</p> <p id="eip-idp6354960"><em data-effect="italics">α</em> = 0.01</p> <p id="eip-idp119018688"><em data-effect="italics">p</em>-value = 0.025</p> <p id="eip-idm52481600">Interpret the results and state a conclusion in simple, non-technical terms.</p> </div> </div> </div> </div> <div id="fs-idp139727500433008" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp139727500433648">When the probability of an event occurring is low, and it happens, it is called a rare event. Rare events are important to consider in hypothesis testing because they can inform your willingness not to reject or to reject a null hypothesis. To test a null hypothesis, find the <em data-effect="italics">p</em>-value for the sample data and graph the results. 
When deciding whether or not to reject the null hypothesis, keep these two rules in mind:</p> <ol id="fs-idp139727498090720" type="1"><li><em data-effect="italics">α</em> &gt; <em data-effect="italics">p</em>-value, reject the null hypothesis</li> <li><em data-effect="italics">α</em> ≤ <em data-effect="italics">p</em>-value, do not reject the null hypothesis</li> </ol> </div> <div id="fs-idp139727500224832" class="practice" data-depth="1"><div id="fs-idp139727497512736" data-type="exercise"><div id="fs-idp139727497512992" data-type="problem"><p id="fs-idp139727497513248">When do you reject the null hypothesis?</p> </div> </div> <div id="fs-idp139727506576944" data-type="exercise"><div id="fs-idp139727506577200" data-type="problem"><p id="fs-idp139727497559584">The probability of winning the grand prize at a particular carnival game is 0.005. Is the outcome of winning very likely or very unlikely?</p> </div> <div id="fs-idp139727497560256" data-type="solution"><p id="fs-idp139727497560512">The outcome of winning is very unlikely.</p> </div> </div> <div id="fs-idp139727500342720" data-type="exercise"><div id="fs-idp139727500342976" data-type="problem"><p id="fs-idp139727500343232">The probability of winning the grand prize at a particular carnival game is 0.005. Michele wins the grand prize. Is this considered a rare or common event? Why?</p> </div> </div> <div id="fs-idp139727500456320" data-type="exercise"><div id="fs-idp139727500456576" data-type="problem"><p id="fs-idp139727500456832">It is believed that the mean height of high school students who play basketball on the school team is 73 inches with a standard deviation of 1.8 inches. A random sample of 40 players is chosen. The sample mean was 71 inches, and the sample standard deviation was 1.5 inches. Do the data support the claim that the mean height is less than 73 inches? The <em data-effect="italics">p</em>-value is almost zero. 
State the null and alternative hypotheses and interpret the <em data-effect="italics">p</em>-value.</p> </div> <div id="fs-idp139727500328080" data-type="solution"><p id="eip-idp77616272"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≥ 73 <span data-type="newline"><br /> </span><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &lt; 73 <span data-type="newline"><br /> </span>The <em data-effect="italics">p</em>-value is almost zero, which means there is sufficient evidence to conclude that the mean height of high school students who play basketball on the school team is less than 73 inches at the 5% level. The data do support the claim.</p> </div> </div> <div id="fs-idp139727497515136" data-type="exercise"><div id="fs-idp139727497515392" data-type="problem"><p id="fs-idp139727497515648">The mean age of graduate students at a university is at most 31 years with a standard deviation of two years. A random sample of 15 graduate students is taken. The sample mean is 32 years and the sample standard deviation is three years. Are the data significant at the 1% level? The <em data-effect="italics">p</em>-value is 0.0264. 
State the null and alternative hypotheses and interpret the <em data-effect="italics">p</em>-value.</p> </div> </div> <div id="fs-idp139727497492608" data-type="exercise"><div id="fs-idp139727500337904" data-type="problem"><p id="fs-idp139727500338160">Does the shaded region represent a low or a high <em data-effect="italics">p</em>-value compared to a level of significance of 1%?</p> <div id="fs-idp44994304" class="bc-figure figure"><span id="fs-idp139727500338544" data-type="media" data-alt="" data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch09_07_01N-1.jpg" alt="" width="380" data-media-type="image/jpeg" /></span></div> </div> <div id="fs-idp139727500212304" data-type="solution"><p id="fs-idp139727500212560">The shaded region shows a low <em data-effect="italics">p</em>-value.</p> </div> </div> <div id="fs-idp139727500213504" data-type="exercise"><div id="fs-idp139727500319104" data-type="problem"><p id="fs-idp139727500319360">What should you do when <em data-effect="italics">α</em> &gt; <em data-effect="italics">p</em>-value?</p> </div> </div> <div id="fs-idp139727500288288" data-type="exercise"><div id="fs-idp139727500288544" data-type="problem"><p id="fs-idp139727506518704">What should you do if <em data-effect="italics">α</em> = <em data-effect="italics">p</em>-value?</p> </div> <div id="fs-idp139727506519216" data-type="solution"><p id="fs-idp139727506519472">Do not reject <em data-effect="italics">H<sub>0</sub></em>.</p> </div> </div> <div id="fs-idp139727500315024" data-type="exercise"><div id="fs-idp139727500315280" data-type="problem"><p id="fs-idp139727500315536">If you do not reject the null hypothesis, then it must be true. Is this statement correct? 
State why or why not in complete sentences.</p> </div> </div> <p><em data-effect="italics">Use the following information to answer the next seven exercises:</em> Suppose that a recent article stated that the mean time spent in jail by a first-time convicted burglar is 2.5 years. A study was then done to see if the mean time has increased in the new century. A random sample of 26 first-time convicted burglars in a recent year was picked. The mean length of time in jail from the survey was three years with a standard deviation of 1.8 years. Suppose that it is somehow known that the population standard deviation is 1.5. Conduct a hypothesis test to determine if the mean length of jail time has increased. Assume the distribution of the jail times is approximately normal.</p> <div data-type="exercise"><div id="fs-idm64220592" data-type="problem"><p id="fs-idp26378528">Is this a test of means or proportions?</p> </div> <div id="fs-idm27491088" data-type="solution"><p id="fs-idp37854496">means</p> </div> </div> <p>&nbsp;</p> <div data-type="exercise"><div id="id13277236" data-type="problem"><p>What symbol represents the random variable for this test?</p> </div> </div> <div data-type="exercise"><div id="id9153115" data-type="problem"><p id="element-504">In words, define the random variable for this test.</p> </div> <div id="id9153134" data-type="solution"><p>the mean time spent in jail for 26 first time convicted burglars</p> </div> </div> <div data-type="exercise"><div id="id9153163" data-type="problem"><p>Is σ known and, if so, what is it?</p> </div> </div> <div data-type="exercise"><div id="id9153210" data-type="problem"><p>Calculate the following:</p> <ol id="list-8276387645" type="a"><li>\(\overline{x}\) _______</li> <li><em data-effect="italics">σ</em> _______</li> <li><em data-effect="italics">s<sub>x</sub></em> _______</li> <li><em data-effect="italics">n</em> _______</li> </ol> </div> <div id="id13779592" data-type="solution"><ol id="list-23512532" 
type="a"><li>3</li> <li>1.5</li> <li>1.8</li> <li>26</li> </ol> </div> </div> <div data-type="exercise"><div id="id13779679" data-type="problem"><p>Since both σ and \({s}_{x}\) are given, which should be used? In one to two complete sentences, explain why.</p> </div> </div> <div id="element-121" data-type="exercise"><div id="id13779798" data-type="problem"><p>State the distribution to use for the hypothesis test.</p> </div> <div id="id13779818" data-type="solution"><p>\(\overline{X}\sim N\left(2.5,\frac{1.5}{\sqrt{26}}\right)\)<span data-type="newline"><br /> </span></p> </div> </div> <p>&nbsp;</p> <div id="fs-idm67684880" data-type="exercise"><div id="fs-idp79923936" data-type="problem"><p id="eip-idp59803264">A random survey of 75 death row inmates revealed that the mean length of time on death row is 17.4 years with a standard deviation of 6.3 years. Conduct a hypothesis test to determine if the population mean time on death row could likely be 15 years.</p> <ol id="fs-idp120951904" type="a"><li>Is this a test of one mean or proportion?</li> <li>State the null and alternative hypotheses. 
<span data-type="newline"><br /> </span><em data-effect="italics">H<sub>0</sub></em>: ____________________ <em data-effect="italics">H<sub>a</sub></em> : ____________________</li> <li>Is this a right-tailed, left-tailed, or two-tailed test?</li> <li>What symbol represents the random variable for this test?</li> <li>In words, define the random variable for this test.</li> <li>Is the population standard deviation known and, if so, what is it?</li> <li>Calculate the following: <ol id="fs-idp42891616" type="i"><li>\(\overline{x}\) = _____________</li> <li><em data-effect="italics">s</em> = ____________</li> <li><em data-effect="italics">n</em> = ____________</li> </ol> </li> <li>Which test should be used?</li> <li>State the distribution to use for the hypothesis test.</li> <li>Find the <em data-effect="italics">p</em>-value.</li> <li>At a pre-conceived <em data-effect="italics">α</em> = 0.05, what is your: <ol id="fs-idp14952416" type="i"><li>Decision:</li> <li>Reason for the decision:</li> <li>Conclusion (write out in a complete sentence):</li> </ol> </li> </ol> </div> </div> </div> <div id="fs-idp78752032" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <div id="fs-idp20135424" data-type="exercise"><div id="fs-idp185700128" data-type="problem"><p id="eip-idm15690000">1) The National Institute of Mental Health published an article stating that in any one-year period, approximately 9.5 percent of American adults suffer from depression or a depressive illness. Suppose that in a survey of 100 people in a certain town, seven of them suffered from depression or a depressive illness. Conduct a hypothesis test to determine if the true proportion of people in that town suffering from depression or a depressive illness is lower than the percent in the general adult American population.</p> <ol id="fs-idp190659504" type="a"><li>Is this a test of one mean or proportion?</li> <li>State the null and alternative hypotheses. 
<span data-type="newline"><br /> </span><em data-effect="italics">H<sub>0</sub></em>: ____________________ <em data-effect="italics">H<sub>a</sub></em>: ____________________</li> <li>Is this a right-tailed, left-tailed, or two-tailed test?</li> <li>What symbol represents the random variable for this test?</li> <li>In words, define the random variable for this test.</li> <li>Calculate the following: <ol id="fs-idp150758080" type="i"><li><em data-effect="italics">x</em> = ________________</li> <li><em data-effect="italics">n</em> = ________________</li> <li>\({p}^{\prime }\) = _____________</li> </ol> </li> <li>Calculate <em data-effect="italics">σ<sub>x</sub></em> = __________. Show the formula set-up.</li> <li>State the distribution to use for the hypothesis test.</li> <li>Find the <em data-effect="italics">p</em>-value.</li> <li>At a pre-conceived <em data-effect="italics">α</em> = 0.05, what is your: <ol id="fs-idp70896592" type="i"><li>Decision:</li> <li>Reason for the decision:</li> <li>Conclusion (write out in a complete sentence):</li> </ol> </li> </ol> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="fs-idm2388928"><dt>Level of Significance of the Test</dt> <dd id="id18705874">probability of a Type I error (reject the null hypothesis when it is true). Notation: α. In hypothesis testing, the Level of Significance is called the preconceived α or the preset α.</dd> </dl> <dl id="fs-idm119440"><dt><em data-effect="italics">p</em>-value</dt> <dd id="id9446562">the probability that an event will happen purely by chance assuming the null hypothesis is true. The smaller the <em data-effect="italics">p</em>-value, the stronger the evidence is against the null hypothesis.</dd> </dl> </div> </div></div>
<div class="chapter standard" id="chapter-additional-information-and-full-hypothesis-test-examples" title="Chapter 10.6: Additional Information and Full Hypothesis Test Examples"><div class="chapter-title-wrap"><h3 class="chapter-number">60</h3><h2 class="chapter-title"><span class="display-none">Chapter 10.6: Additional Information and Full Hypothesis Test Examples</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <ul><li>In a <span data-type="term">hypothesis test</span> problem, you may see words such as &#8220;the level of significance is 1%.&#8221; The &#8220;1%&#8221; is the preconceived or preset <em data-effect="italics">α</em>.</li> <li>The statistician setting up the hypothesis test selects the value of <em data-effect="italics">α</em> to use <strong>before</strong> collecting the sample data.</li> <li><strong>If no level of significance is given, a common standard to use is <em data-effect="italics">α</em> = 0.05.</strong></li> <li>When you calculate the <em data-effect="italics">p</em>-value and draw the picture, the <em data-effect="italics">p</em>-value is the area in the left tail, the right tail, or split evenly between the two tails. For this reason, we call the hypothesis test left, right, or two tailed.</li> <li>The <strong>alternative hypothesis</strong>, \({H}_{a}\), tells you if the test is left, right, or two-tailed. It is the <strong>key</strong> to conducting the appropriate test.</li> <li><em data-effect="italics">H<sub>a</sub></em> <strong>never</strong> has a symbol that contains an equal sign.</li> <li><strong>Thinking about the meaning of the</strong> <span data-type="term"><em data-effect="italics">p</em>-value</span>: A data analyst (and anyone else) should have more confidence that he made the correct decision to reject the null hypothesis with a smaller <em data-effect="italics">p</em>-value (for example, 0.001 as opposed to 0.04) even if using the 0.05 level for alpha. 
Similarly, for a large <em data-effect="italics">p</em>-value such as 0.4, as opposed to a <em data-effect="italics">p</em>-value of 0.056 (alpha = 0.05 is less than either number), a data analyst should have more confidence that she made the correct decision in not rejecting the null hypothesis. This makes the data analyst use judgment rather than mindlessly applying rules.</li> </ul> <p id="fs-idp42898080">The following examples illustrate a left-, right-, and two-tailed test.</p> <div id="fs-idp41487520" class="textbox textbox--examples" data-type="example"><p id="fs-idp41487648"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 5, <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &lt; 5</p> <p>Test of a single population mean. <em data-effect="italics">H<sub>a</sub></em> tells you the test is left-tailed. The picture of the <em data-effect="italics">p</em>-value is as follows:</p> <div id="fs-idm151356816" class="bc-figure figure"><span id="id14703034" data-type="media" data-alt="Normal distribution curve of a single population mean with a value of 5 on the x-axis and the p-value points to the area on the left tail of the curve."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch09_09_01-02-1.jpg" alt="Normal distribution curve of a single population mean with a value of 5 on the x-axis and the p-value points to the area on the left tail of the curve." 
width="380" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idp42408528" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp67352928" data-type="exercise"><div data-type="problem"><p id="eip-idm132446832"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 10, <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &lt; 10</p> <p id="eip-idp3861728">Assume the <em data-effect="italics">p</em>-value is 0.0935. What type of test is this? Draw the picture of the <em data-effect="italics">p</em>-value.</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> ≤ 0.2  <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> &gt; 0.2</p> <p id="fs-idp117549824">This is a test of a single population proportion. <em data-effect="italics">H<sub>a</sub></em> tells you the test is <strong>right-tailed</strong>. The picture of the <em data-effect="italics">p</em>-value is as follows:</p> <div id="fs-idm2304848" class="bc-figure figure"><span id="id14703170" data-type="media" data-alt="Normal distribution curve of a single population proportion with the value of 0.2 on the x-axis. The p-value points to the area on the right tail of the curve."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch09_09_02a-1.jpg" alt="Normal distribution curve of a single population proportion with the value of 0.2 on the x-axis. The p-value points to the area on the right tail of the curve." 
width="380" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idm126098448" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-417" data-type="exercise"><div id="fs-idp131553552" data-type="problem"><p id="eip-idm89526848"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≤ 1, <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &gt; 1</p> <p id="eip-idm24425280">Assume the <em data-effect="italics">p</em>-value is 0.1243. What type of test is this? Draw the picture of the <em data-effect="italics">p</em>-value.</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p id="element-38a"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 50  <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> ≠ 50</p> <p>This is a test of a single population mean. <em data-effect="italics">H<sub>a</sub></em> tells you the test is <strong>two-tailed</strong>. The picture of the <em data-effect="italics">p</em>-value is as follows.</p> <div id="fs-idm117972816" class="bc-figure figure"><span id="id4829633" data-type="media" data-alt="Normal distribution curve of a single population mean with a value of 50 on the x-axis. The p-value formulas, 1/2(p-value), for a two-tailed test is shown for the areas on the left and right tails of the curve."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch09_09_03N-1.jpg" alt="Normal distribution curve of a single population mean with a value of 50 on the x-axis. The p-value formulas, 1/2(p-value), for a two-tailed test is shown for the areas on the left and right tails of the curve." 
width="380" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idm63250432" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p id="eip-idp41687312"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> = 0.5, <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> ≠ 0.5</p> <p id="eip-idm5668432">Assume the <em data-effect="italics">p</em>-value is 0.2564. What type of test is this? Draw the picture of the <em data-effect="italics">p</em>-value.</p> </div> </div> </div> <div id="fs-idm12892576" class="bc-section section" data-depth="1"><h3 data-type="title">Full Hypothesis Test Examples</h3> <div class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div id="id5300933" data-type="problem"><p>Jeffrey, as an eight-year-old, <strong>established a mean time of 16.43 seconds</strong> for swimming the 25-yard freestyle, with a <strong>standard deviation of 0.8 seconds</strong>. His dad, Frank, thought that Jeffrey could swim the 25-yard freestyle faster using goggles. Frank bought Jeffrey a new pair of expensive goggles and timed Jeffrey for <strong>15 25-yard freestyle swims</strong>. For the 15 swims, <strong>Jeffrey&#8217;s mean time was 16 seconds. Frank thought that the goggles helped Jeffrey to swim faster than the 16.43 seconds.</strong> Conduct a hypothesis test using a preset <em data-effect="italics">α</em> = 0.05. 
Assume that the swim times for the 25-yard freestyle are normal.</p> </div> <div id="id5036614" data-type="solution"><p>Set up the Hypothesis Test:</p> <p>Since the problem is about a mean, this is a <strong>test of a single population mean</strong>.</p> <p><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 16.43  <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &lt; 16.43</p> <p id="element-241">For Jeffrey to swim faster, his time will be less than 16.43 seconds. The &#8220;&lt;&#8221; tells you this is left-tailed.</p> <p>Determine the distribution needed:</p> <p><strong>Random variable: </strong>\(\overline{X}\) = the mean time to swim the 25-yard freestyle.</p> <p><strong>Distribution for the test: </strong>\(\overline{X}\) is normal (population <span data-type="term">standard deviation</span> is known: <em data-effect="italics">σ</em> = 0.8)</p> <p>\(\overline{X}\sim N\left(\mu ,\frac{{\sigma }_{X}}{\sqrt{n}}\right)\) Therefore, \(\overline{X}\sim N\left(16.43,\frac{0.8}{\sqrt{15}}\right)\)</p> <p><em data-effect="italics">μ</em> = 16.43 comes from <em data-effect="italics">H<sub>0</sub></em> and not the data. <em data-effect="italics">σ</em> = 0.8, and <em data-effect="italics">n</em> = 15.</p> <p>Calculate the <em data-effect="italics">p</em>-value using the normal distribution for a mean:</p> <p id="element-706"><em data-effect="italics">p</em>-value = <em data-effect="italics">P</em>(\(\overline{x}\) &lt; 16) = 0.0187 where the sample mean in the problem is given as 16.</p> <p><em data-effect="italics">p</em>-value = 0.0187 (This is called the <strong>actual level of significance</strong>.) 
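</p> <p>As a cross-check outside the calculator, the same <em data-effect="italics">p</em>-value can be reproduced in a few lines of Python. This is only a minimal sketch, assuming Python 3.8 or later and its standard-library <code>statistics.NormalDist</code> class; the variable names are illustrative and do not come from the text.</p>

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the swimming example.
mu_0 = 16.43   # mean under the null hypothesis
sigma = 0.8    # known population standard deviation
n = 15         # number of timed swims
x_bar = 16.0   # observed sample mean
alpha = 0.05   # preset significance level

# Sampling distribution of the sample mean, assuming H0 is true.
standard_error = sigma / sqrt(n)
sampling_dist = NormalDist(mu=mu_0, sigma=standard_error)

# Left-tailed test: the p-value is the area to the left of x_bar.
z_score = (x_bar - mu_0) / standard_error
p_value = sampling_dist.cdf(x_bar)

# Decision rule: reject H0 only when alpha is greater than the p-value.
decision = "reject H0" if alpha > p_value else "do not reject H0"
print(f"z = {z_score:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

<p>For a right-tailed test the area would instead be one minus the cumulative probability, and for a two-tailed test the smaller tail area would be doubled.</p> <p>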
The <em data-effect="italics">p</em>-value is the area to the left of the sample mean, 16.</p> <p><strong>Graph:</strong></p> <div id="hyptest11_ex1" class="bc-figure figure"><span id="id4282005" data-type="media" data-alt="Normal distribution curve for the average time to swim the 25-yard freestyle with values 16, as the sample mean, and 16.43 on the x-axis. A vertical upward line extends from 16 on the x-axis to the curve. An arrow points to the left tail of the curve."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch09_11_01N-1.jpg" alt="Normal distribution curve for the average time to swim the 25-yard freestyle with values 16, as the sample mean, and 16.43 on the x-axis. A vertical upward line extends from 16 on the x-axis to the curve. An arrow points to the left tail of the curve." width="380" data-media-type="image/jpg" /></span></div> <p><em data-effect="italics">μ</em> = 16.43 comes from <em data-effect="italics">H<sub>0</sub></em>. Our assumption is <em data-effect="italics">μ</em> = 16.43.</p> <p><strong>Interpretation of the <em data-effect="italics">p</em>-value: If <em data-effect="italics">H<sub>0</sub></em> is true</strong>, there is a 0.0187 probability (1.87%) that Jeffrey&#8217;s mean time to swim the 25-yard freestyle is 16 seconds or less. Because a 1.87% chance is small, the mean time of 16 seconds or less is unlikely to have happened randomly. 
It is a rare event.</p> <p>Compare <em data-effect="italics">α</em> and the <em data-effect="italics">p</em>-value:</p> <p id="element-890"><em data-effect="italics">α</em> = 0.05 <em data-effect="italics">p</em>-value = 0.0187 <em data-effect="italics">α</em> &gt; <em data-effect="italics">p</em>-value</p> <p id="element-543"><strong>Make a decision:</strong> Since <em data-effect="italics">α</em> &gt; <em data-effect="italics">p</em>-value, reject <em data-effect="italics">H<sub>0</sub></em>.</p> <p>&nbsp;</p> <p>This means that you reject <em data-effect="italics">μ</em> = 16.43. In other words, you do not think Jeffrey swims the 25-yard freestyle in 16.43 seconds but faster with the new goggles.</p> <p><strong>Conclusion:</strong> At the 5% significance level, we conclude that Jeffrey swims faster using the new goggles. The sample data show there is sufficient evidence that Jeffrey&#8217;s mean time to swim the 25-yard freestyle is less than 16.43 seconds.</p> <p id="element-252">The <em data-effect="italics">p</em>-value can easily be calculated.</p> <div id="fs-idm31641280" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="element-362">Press <code>STAT</code> and arrow over to <code>TESTS</code>. Press <code>1:Z-Test</code>. Arrow over to <code>Stats</code> and press <code>ENTER</code>. Arrow down and enter 16.43 for <em data-effect="italics">μ<sub>0</sub></em> (null hypothesis), .8 for <em data-effect="italics">σ</em>, 16 for the sample mean, and 15 for <em data-effect="italics">n</em>. Arrow down to <em data-effect="italics">μ</em> : (alternate hypothesis) and arrow over to &lt; <em data-effect="italics">μ<sub>0</sub></em>. Press <code>ENTER</code>. Arrow down to <code>Calculate</code> and press <code>ENTER</code>. 
The calculator not only calculates the <em data-effect="italics">p</em>-value (<em data-effect="italics">p</em> = 0.0187) but it also calculates the test statistic (<em data-effect="italics">z</em>-score) for the sample mean. <em data-effect="italics">μ</em> &lt; 16.43 is the alternative hypothesis. Do this set of instructions again except arrow to <code>Draw</code> (instead of <code>Calculate</code>). Press <code>ENTER</code>. A shaded graph appears with <em data-effect="italics">z</em> = -2.08 (test statistic) and <em data-effect="italics">p</em> = 0.0187 (<em data-effect="italics">p</em>-value). Make sure when you use <code>Draw</code> that no other equations are highlighted in <em data-effect="italics">Y</em> = and the plots are turned off.</p> </div> <p>When the calculator does a <em data-effect="italics">Z</em>-Test, the <code>Z-Test</code> function finds the <em data-effect="italics">p</em>-value by doing a normal probability calculation using the <span data-type="term">central limit theorem</span>:</p> <p id="fs-idp159988784">\(P\left(\overline{x}&lt;16\right)=\)<code>2nd DISTR normcdf</code>\(\left(-10^{99},16,16.43,0.8/\sqrt{15}\right)\) .</p> <p>The Type I and Type II errors for this problem are as follows:</p> <p id="element-385">The Type I error is to conclude that Jeffrey swims the 25-yard freestyle, on average, in less than 16.43 seconds when, in fact, he actually swims the 25-yard freestyle, on average, in 16.43 seconds. (Reject the null hypothesis when the null hypothesis is true.)</p> <p>The Type II error is that there is not evidence to conclude that Jeffrey swims the 25-yard freestyle, on average, in less than 16.43 seconds when, in fact, he actually does swim the 25-yard freestyle, on average, in less than 16.43 seconds. 
(Do not reject the null hypothesis when the null hypothesis is false.)</p> </div> </div> </div> <div id="fs-idm112426112" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp78508304" data-type="exercise"><div id="fs-idp117367632" data-type="problem"><p>The mean throwing distance of a football for Marco, a high school freshman quarterback, is 40 yards, with a standard deviation of two yards. The team coach tells Marco to adjust his grip to get more distance. The coach records the distances for 20 throws. For the 20 throws, Marco’s mean distance was 45 yards. The coach thought the different grip helped Marco throw farther than 40 yards. Conduct a hypothesis test using a preset <em data-effect="italics">α</em> = 0.05. Assume the throw distances for footballs are normal.</p> <p id="eip-idp30205504">First, determine what type of test this is, set up the hypothesis test, find the <em data-effect="italics">p</em>-value, sketch the graph, and state your conclusion.</p> </div> </div> <div id="fs-idm24767312" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="eip-idm96892800">Press STAT and arrow over to TESTS. Press 1:Z-Test. Arrow over to Stats and press ENTER. Arrow down and enter 40 for <em data-effect="italics">μ<sub>0</sub></em> (null hypothesis), 2 for <em data-effect="italics">σ</em>, 45 for the sample mean, and 20 for <em data-effect="italics">n</em>. Arrow down to <em data-effect="italics">μ</em>: (alternative hypothesis) and set it either as &lt;, ≠, or &gt;. Press ENTER. Arrow down to Calculate and press ENTER. The calculator not only calculates the <em data-effect="italics">p</em>-value but it also calculates the test statistic (<em data-effect="italics">z</em>-score) for the sample mean. Select &lt;, ≠, or &gt; for the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate). Press ENTER. 
A shaded graph appears with test statistic and <em data-effect="italics">p</em>-value. Make sure when you use Draw that no other equations are highlighted in <em data-effect="italics">Y</em> = and the plots are turned off.</p> </div> </div> </div> <div data-type="note" data-has-label="true" data-label=""><div data-type="title">Historical Note (<a class="autogenerated-content" href="#fs-idp41487520">(Figure)</a>)</div> <p id="eip-idp233123200">The traditional way to compare the two probabilities, <em data-effect="italics">α</em> and the <em data-effect="italics">p</em>-value, is to compare the critical value (<em data-effect="italics">z</em>-score from <em data-effect="italics">α</em>) to the test statistic (<em data-effect="italics">z</em>-score from data). The calculated test statistic for the <em data-effect="italics">p</em>-value is –2.08. (From the Central Limit Theorem, the test statistic formula is \(z=\frac{\overline{x}-{\mu }_{X}}{\left(\frac{{\sigma }_{X}}{\sqrt{n}}\right)}\). For this problem, \(\overline{x}\) = 16, <em data-effect="italics">μ<sub>X</sub></em> = 16.43 from the null hypothesis, <em data-effect="italics">σ<sub>X</sub></em> = 0.8, and <em data-effect="italics">n</em> = 15.) You can find the critical value for <em data-effect="italics">α</em> = 0.05 in the normal table (see <strong>15.Tables</strong> in the Table of Contents). The <em data-effect="italics">z</em>-score for an area to the left equal to 0.05 is midway between –1.65 and –1.64 (0.05 is midway between 0.0505 and 0.0495). The <em data-effect="italics">z</em>-score is –1.645. Since –1.645 &gt; –2.08 (which demonstrates that α &gt; <em data-effect="italics">p</em>-value), reject <em data-effect="italics">H<sub>0</sub></em>. Traditionally, the decision to reject or not reject was done in this way. Today, comparing the two probabilities <em data-effect="italics">α</em> and the <em data-effect="italics">p</em>-value is very common. 
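</p> <p>The traditional comparison described above can also be sketched in a few lines of Python. This is only an illustrative sketch, not part of the calculator instructions: it assumes Python 3.8 or later (for <code>statistics.NormalDist</code>), and the variable names are chosen here for illustration.</p>

```python
from math import sqrt
from statistics import NormalDist

# Values from the goggles example; the variable names are illustrative.
mu_0, sigma, n, x_bar, alpha = 16.43, 0.8, 15, 16.0, 0.05

std_normal = NormalDist()                    # standard normal: mean 0, sd 1
z_test = (x_bar - mu_0) / (sigma / sqrt(n))  # test statistic (z-score from data)
z_crit = std_normal.inv_cdf(alpha)           # critical value (z-score from alpha)
p_value = std_normal.cdf(z_test)             # area to the left of the test statistic

# Left-tailed test: reject H0 when the critical value exceeds the test statistic.
reject_h0 = z_crit > z_test
print(round(z_test, 2), round(z_crit, 3), round(p_value, 4), reject_h0)
```

<p>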
For this problem, the <em data-effect="italics">p</em>-value, 0.0187 is considerably smaller than <em data-effect="italics">α</em>, 0.05. You can be confident about your decision to reject. The graph shows <em data-effect="italics">α</em>, the <em data-effect="italics">p</em>-value, and the test statistic and the critical value.</p> <div id="eip-idm47467424" class="bc-figure figure"><span id="eip-idp126239824" data-type="media" data-alt="Distribution curve comparing the α to the p-value. Values of -2.15 and -1.645 are on the x-axis. Vertical upward lines extend from both of these values to the curve. The p-value is equal to 0.0158 and points to the area to the left of -2.15. α is equal to 0.05 and points to the area between the values of -2.15 and -1.645."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch09_11_02-1.png" alt="Distribution curve comparing the α to the p-value. Values of -2.15 and -1.645 are on the x-axis. Vertical upward lines extend from both of these values to the curve. The p-value is equal to 0.0158 and points to the area to the left of -2.15. α is equal to 0.05 and points to the area between the values of -2.15 and -1.645." width="380" data-media-type="image/png" /></span></div> </div> <div class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div id="id5291254" data-type="problem"><p>A college football coach records the mean weight that his players can bench press as <strong>275 pounds</strong>, with a <strong>standard deviation of 55 pounds</strong>. Three of his players thought that the mean weight was <strong>more than</strong> that amount. They asked <strong>30</strong> of their teammates for their estimated maximum lift on the bench press exercise. The data ranged from 205 pounds to 385 pounds. 
The actual different weights were (frequencies are in parentheses) <span id="set-1" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">205(3)   </span><span data-type="item">215(3)   </span><span data-type="item">225(1)   </span><span data-type="item">241(2)   </span><span data-type="item">252(2)   </span><span data-type="item">265(2)   </span><span data-type="item">275(2)    </span><span data-type="item">313(2)   </span><span data-type="item">316(5)   </span><span data-type="item">338(2)   </span><span data-type="item">341(1)   </span><span data-type="item">345(2)   </span><span data-type="item">368(2)   </span><span data-type="item">385(1)</span></span>.</p> <p>Conduct a hypothesis test using a 2.5% level of significance to determine if the bench press mean is <strong>more than 275 pounds</strong>.</p> </div> <div id="id5342777" data-type="solution"><p>Set up the Hypothesis Test:</p> <p>Since the problem is about a mean weight, this is a <strong>test of a single population mean</strong>.</p> <p><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 275<span data-type="newline"><br /> </span><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &gt; 275<span data-type="newline"><br /> </span>This is a right-tailed test.</p> <p>Calculating the distribution needed:</p> <p id="element-975">Random variable: \(\overline{X}\) = the mean weight, in pounds, lifted by the football players.</p> <p><strong>Distribution for the test:</strong> It is normal because <em data-effect="italics">σ</em> is known.</p> <p>\(\overline{X}\sim N\left(275,\frac{55}{\sqrt{30}}\right)\)</p> <p>\(\overline{x}=286.2\) pounds (from the data).</p> <p><em data-effect="italics">σ</em> = 55 pounds <strong>(Always use <em data-effect="italics">σ</em> if you know it.)</strong> We assume <em data-effect="italics">μ</em> = 275 pounds unless our data shows us otherwise.</p> <p>Calculate the <em 
data-effect="italics">p</em>-value using the normal distribution for a mean and using the sample mean as input (see <a class="autogenerated-content" href="/contents/d0ba1833-f0d2-4195-8765-3c436745f0fb">(Figure)</a> for using the data as input):</p> <p>\(p\text{-value}=P\left(\overline{x}&gt;286.2\right)=0.1323\).</p> <p><strong>Interpretation of the <em data-effect="italics">p</em>-value:</strong> If <em data-effect="italics">H<sub>0</sub></em> is true, then there is a 0.1323 probability (13.23%) that the football players can lift a mean weight of 286.2 pounds or more. Because a 13.23% chance is large enough, a mean weight lift of 286.2 pounds or more is not a rare event.</p> <div id="hyptest11_ex3" class="bc-figure figure"><span id="id4317824" data-type="media" data-alt="Normal distribution curve of the average weight lifted by football players with values of 275 and 286.2 on the x-axis. A vertical upward line extends from 286.2 to the curve. The p-value points to the area to the right of 286.2."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch09_11_03N-1.jpg" alt="Normal distribution curve of the average weight lifted by football players with values of 275 and 286.2 on the x-axis. A vertical upward line extends from 286.2 to the curve. The p-value points to the area to the right of 286.2." 
width="380" data-media-type="image/jpg" /></span></div> <p>Compare <em data-effect="italics">α</em> and the <em data-effect="italics">p</em>-value:</p> <p><em data-effect="italics">α</em> = 0.025 <em data-effect="italics">p</em>-value = 0.1323</p> <p><strong>Make a decision:</strong> Since <em data-effect="italics">α</em> &lt;<em data-effect="italics">p</em>-value, do not reject <em data-effect="italics">H<sub>0</sub></em>.</p> <p><strong>Conclusion:</strong> At the 2.5% level of significance, from the sample data, there is not sufficient evidence to conclude that the true mean weight lifted is more than 275 pounds.</p> <p>The <em data-effect="italics">p</em>-value can easily be calculated.</p> <div id="fs-idm79549936" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="element-233">Put the data and frequencies into lists. Press <code>STAT</code> and arrow over to <code>TESTS</code>. Press <code>1:Z-Test</code>. Arrow over to <code>Data</code> and press <code>ENTER</code>. Arrow down and enter 275 for <em data-effect="italics">μ<sub>0</sub></em>, 55 for <em data-effect="italics">σ</em>, the name of the list where you put the data, and the name of the list where you put the frequencies. Arrow down to <em data-effect="italics">μ:</em> and arrow over to &gt; <em data-effect="italics">μ<sub>0</sub></em>. Press <code>ENTER</code>. Arrow down to <code>Calculate</code> and press <code>ENTER</code>. The calculator not only calculates the <em data-effect="italics">p</em>-value (<em data-effect="italics">p</em> = 0.1331, a little different from the previous calculation &#8211; in it we used the sample mean rounded to one decimal place instead of the data) but it also calculates the test statistic (<em data-effect="italics">z</em>-score) for the sample mean, the sample mean, and the sample standard deviation. <em data-effect="italics">μ</em> &gt; 275 is the alternative hypothesis. 
Do this set of instructions again except arrow to <code>Draw</code> (instead of <code>Calculate</code>). Press <code>ENTER</code>. A shaded graph appears with <em data-effect="italics">z</em> = 1.112 (test statistic) and <em data-effect="italics">p</em> = 0.1331 (<em data-effect="italics">p</em>-value). Make sure when you use <code>Draw</code> that no other equations are highlighted in <em data-effect="italics">Y</em> = and the plots are turned off.</p> </div> </div> </div> </div> <div id="element-207" class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div id="id4319892" data-type="problem"><p id="fs-idp90231232">Statistics students believe that the mean score on the first statistics test is 65. A statistics instructor thinks the mean score is higher than 65. He samples ten statistics students and obtains the scores <span id="set-2" data-type="list" data-list-type="labeled-item" data-display="inline"><span data-type="item">65   </span><span data-type="item">65   </span><span data-type="item">70   </span><span data-type="item">67   </span><span data-type="item">66   </span><span data-type="item">63   </span><span data-type="item">63   </span><span data-type="item">68   </span><span data-type="item">72   </span><span data-type="item">71</span></span>. He performs a hypothesis test using a 5% level of significance. The data are assumed to be from a normal distribution.</p> </div> <div id="id5147242" data-type="solution"><p>Set up the hypothesis test:</p> <p>A 5% level of significance means that <em data-effect="italics">α</em> = 0.05. This is a test of a <strong>single population mean</strong>.</p> <p><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 65  <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &gt; 65</p> <p>Since the instructor thinks the average score is higher, use a &#8220;&gt;&#8221;. 
The &#8220;&gt;&#8221; means the test is right-tailed.</p> <p id="fs-idp165624976">Determine the distribution needed:</p> <p><strong>Random variable:</strong> \(\overline{X}\) = average score on the first statistics test.</p> <p><strong>Distribution for the test:</strong> If you read the problem carefully, you will notice that there is <strong>no population standard deviation given</strong>. You are only given <em data-effect="italics">n</em> = 10 sample data values. Notice also that the data come from a normal distribution. This means that the distribution for the test is a Student&#8217;s <em data-effect="italics">t</em>.</p> <p id="element-240">Use <em data-effect="italics">t</em><sub>df</sub>. Therefore, the distribution for the test is <em data-effect="italics">t</em><sub>9</sub> where <em data-effect="italics">n</em> = 10 and <em data-effect="italics">df</em> = 10 &#8211; 1 = 9.</p> <p>Calculate the <em data-effect="italics">p</em>-value using the Student&#8217;s <em data-effect="italics">t</em>-distribution:</p> <p><em data-effect="italics">p</em>-value = <em data-effect="italics">P</em>(\(\overline{x}\) &gt; 67) = 0.0396 where the sample mean and sample standard deviation are calculated as 67 and 3.1972 from the data.</p> <p><strong>Interpretation of the <em data-effect="italics">p</em>-value:</strong> If the null hypothesis is true, then there is a 0.0396 probability (3.96%) that the sample mean is 67 or more.</p> <div id="hyptest11_ex4" class="bc-figure figure"><span id="id5241154" data-type="media" data-alt="Normal distribution curve of average scores on the first statistic tests with 65 and 67 values on the x-axis. A vertical upward line extends from 67 to the curve. The p-value points to the area to the right of 67."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch09_11_04N-1.jpg" alt="Normal distribution curve of average scores on the first statistic tests with 65 and 67 values on the x-axis. 
A vertical upward line extends from 67 to the curve. The p-value points to the area to the right of 67." width="380" data-media-type="image/jpg" /></span></div> <p>Compare <em data-effect="italics">α</em> and the <em data-effect="italics">p</em>-value:</p> <p>Since <em data-effect="italics">α</em> = 0.05 and <em data-effect="italics">p</em>-value = 0.0396, <em data-effect="italics">α</em> &gt; <em data-effect="italics">p</em>-value.</p> <p><strong>Make a decision:</strong> Since <em data-effect="italics">α</em> &gt; <em data-effect="italics">p</em>-value, reject <em data-effect="italics">H<sub>0</sub></em>.</p> <p id="fs-idm21057040">This means you reject <em data-effect="italics">μ</em> = 65. In other words, you believe the average test score is more than 65.</p> <p><strong>Conclusion:</strong> At a 5% level of significance, the sample data show sufficient evidence that the mean (average) test score is more than 65, just as the statistics instructor thinks.</p> <p>The <em data-effect="italics">p</em>-value can easily be calculated.</p> <div id="fs-idm14472336" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p>Put the data into a list. Press <code>STAT</code> and arrow over to <code>TESTS</code>. Press <code>2:T-Test</code>. Arrow over to <code>Data</code> and press <code>ENTER</code>. Arrow down and enter 65 for <em data-effect="italics">μ</em><sub>0</sub>, the name of the list where you put the data, and 1 for <code>Freq:</code>. Arrow down to <em data-effect="italics">μ</em>: and arrow over to &gt; <em data-effect="italics">μ</em><sub>0</sub>. Press <code>ENTER</code>. Arrow down to <code>Calculate</code> and press <code>ENTER</code>. The calculator not only calculates the <em data-effect="italics">p</em>-value (<em data-effect="italics">p</em> = 0.0396) but it also calculates the test statistic (<em data-effect="italics">t</em>-score) for the sample mean, the sample mean, and the sample standard deviation. 
<em data-effect="italics">μ</em> &gt; 65 is the alternative hypothesis. Do this set of instructions again except arrow to <code>Draw</code> (instead of <code>Calculate</code>). Press <code>ENTER</code>. A shaded graph appears with <em data-effect="italics">t</em> = 1.9781 (test statistic) and <em data-effect="italics">p</em> = 0.0396 (<em data-effect="italics">p</em>-value). Make sure when you use <code>Draw</code> that no other equations are highlighted in <em data-effect="italics">Y</em> = and the plots are turned off.</p> </div> </div> </div> </div> <div id="fs-idm186895712" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p id="eip-idm59977568">It is believed that a stock price for a particular company will grow at a rate of \$5 per week with a standard deviation of \$1. An investor believes the stock won’t grow as quickly. The changes in stock price are recorded for ten weeks and are as follows: \$4, \$3, \$2, \$3, \$1, \$7, \$2, \$1, \$1, \$2. Perform a hypothesis test using a 5% level of significance. State the null and alternative hypotheses, find the <em data-effect="italics">p</em>-value, state your conclusion, and identify the Type I and Type II errors.</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div id="id5304801" data-type="problem"><p>Joon believes that 50% of first-time brides in the United States are younger than their grooms. She performs a hypothesis test to determine if the percentage is <strong>the same or different from 50%</strong>. Joon samples <strong>100 first-time brides</strong> and <strong>53</strong> reply that they are younger than their grooms. 
For the hypothesis test, she uses a 1% level of significance.</p> </div> <div id="id4150001" data-type="solution"><p id="element-399">Set up the hypothesis test:</p> <p>The 1% level of significance means that <em data-effect="italics">α</em> = 0.01. This is a <strong>test of a single population proportion</strong>.</p> <p><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> = 0.50  <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> ≠ 0.50</p> <p>The words <strong>&#8220;is the same or different from&#8221;</strong> tell you this is a two-tailed test.</p> <p>Calculate the distribution needed:</p> <p id="element-783"><strong>Random variable:</strong> <em data-effect="italics">P′</em> = the percent of first-time brides who are younger than their grooms.</p> <p id="element-581"><strong>Distribution for the test:</strong> The problem contains no mention of a mean. The information is given in terms of percentages. Use the distribution for <em data-effect="italics">P′</em>, the estimated proportion.</p> <p id="fs-idm81154768">\({P}^{\prime }\sim N\left(p,\sqrt{\frac{p\cdot q}{n}}\right)\) Therefore, \({P}^{\prime }\sim N\left(0.5,\sqrt{\frac{0.5\cdot 0.5}{100}}\right)\)</p> <p id="fs-idm107085552">where <em data-effect="italics">p</em> = 0.50, <em data-effect="italics">q</em> = 1−<em data-effect="italics">p</em> = 0.50, and <em data-effect="italics">n</em> = 100</p> <p id="fs-idp14246672">Calculate the <em data-effect="italics">p</em>-value using the normal distribution for proportions:</p> <p><em data-effect="italics">p</em>-value = <em data-effect="italics">P</em>(<em data-effect="italics">p′</em> &lt; 0.47 or <em data-effect="italics">p′</em> &gt; 0.53) = 0.5485</p> <p id="fs-idp133040416">where <em data-effect="italics">x</em> = 53, <em data-effect="italics">p′</em> = \(\frac{x}{n}\text{ = }\frac{\text{53}}{\text{100}}\) = 0.53.</p> <p><strong>Interpretation of the <em data-effect="italics">p</em>-value:</strong> If the 
null hypothesis is true, there is a 0.5485 probability (54.85%) that the sample (estimated) proportion \({p}^{\prime }\) is 0.53 or more OR 0.47 or less (see the graph in <a class="autogenerated-content" href="#hyptest11_ex5">(Figure)</a>).</p> <div id="hyptest11_ex5" class="bc-figure figure"><span id="id5140475" data-type="media" data-alt="Normal distribution curve of the percent of first time brides who are younger than the groom with values of 0.47, 0.50, and 0.53 on the x-axis. Vertical upward lines extend from 0.47 and 0.53 to the curve. 1/2(p-values) are calculated for the areas on outsides of 0.47 and 0.53."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch09_11_05N-1.jpg" alt="Normal distribution curve of the percent of first time brides who are younger than the groom with values of 0.47, 0.50, and 0.53 on the x-axis. Vertical upward lines extend from 0.47 and 0.53 to the curve. 1/2(p-values) are calculated for the areas on outsides of 0.47 and 0.53." width="380" data-media-type="image/jpg" /></span></div> <p id="element-327"><em data-effect="italics">μ</em> = <em data-effect="italics">p</em> = 0.50 comes from <em data-effect="italics">H<sub>0</sub></em>, the null hypothesis.</p> <p><em data-effect="italics">p′</em> = 0.53. Since the curve is symmetrical and the test is two-tailed, the <em data-effect="italics">p′</em> for the left tail is equal to 0.50 – 0.03 = 0.47 where <em data-effect="italics">μ</em> = <em data-effect="italics">p</em> = 0.50. (0.03 is the difference between 0.53 and 0.50.)</p> <p>Compare <em data-effect="italics">α</em> and the <em data-effect="italics">p</em>-value:</p> <p>Since <em data-effect="italics">α</em> = 0.01 and <em data-effect="italics">p</em>-value = 0.5485, 
<em data-effect="italics">α</em> &lt; <em data-effect="italics">p</em>-value.</p> <p><strong>Make a decision:</strong> Since <em data-effect="italics">α</em> &lt; <em data-effect="italics">p</em>-value, you cannot reject <em data-effect="italics">H<sub>0</sub></em>.</p> <p><strong>Conclusion:</strong> At the 1% level of significance, the sample data do not show sufficient evidence that the percentage of first-time brides who are younger than their grooms is different from 50%.</p> <p>The <em data-effect="italics">p</em>-value can easily be calculated.</p> <div id="fs-idm35225600" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p>Press <code>STAT</code> and arrow over to <code>TESTS</code>. Press <code>5:1-PropZTest</code>. Enter .5 for <em data-effect="italics">p</em><sub>0</sub>, 53 for <em data-effect="italics">x</em> and 100 for <em data-effect="italics">n</em>. Arrow down to <code>Prop</code> and arrow to <code>not equals</code> <em data-effect="italics">p</em><sub>0</sub>. Press <code>ENTER</code>. Arrow down to <code>Calculate</code> and press <code>ENTER</code>. The calculator calculates the <em data-effect="italics">p</em>-value (<em data-effect="italics">p</em> = 0.5485) and the test statistic (<em data-effect="italics">z</em>-score). <code>Prop not equals</code> .5 is the alternate hypothesis. Do this set of instructions again except arrow to <code>Draw</code> (instead of <code>Calculate</code>). Press <code>ENTER</code>. A shaded graph appears with <em data-effect="italics">z</em> = 0.6 (test statistic) and <em data-effect="italics">p</em> = 0.5485 (<em data-effect="italics">p</em>-value). 
Make sure when you use <code>Draw</code> that no other equations are highlighted in <em data-effect="italics">Y</em> = and the plots are turned off.</p> </div> <p>The Type I and Type II errors are as follows:</p> <p>The Type I error is to conclude that the proportion of first-time brides who are younger than their grooms is different from 50% when, in fact, the proportion is actually 50%. (Reject the null hypothesis when the null hypothesis is true.)</p> <p id="fs-idp134008544">The Type II error is that there is not enough evidence to conclude that the proportion of first-time brides who are younger than their grooms differs from 50% when, in fact, the proportion does differ from 50%. (Do not reject the null hypothesis when the null hypothesis is false.)</p> </div> </div> </div> <div id="fs-idp8711616" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p id="eip-idp103490704">A teacher believes that 85% of students in the class will want to go on a field trip to the local zoo. She performs a hypothesis test to determine if the percentage is the same or different from 85%. The teacher samples 50 students and 39 reply that they would want to go to the zoo. For the hypothesis test, use a 1% level of significance.</p> <p id="eip-idp10599024">First, determine what type of test this is, set up the hypothesis test, find the <em data-effect="italics">p</em>-value, sketch the graph, and state your conclusion.</p> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><div id="element-312" data-type="exercise"><div id="id5318755" data-type="problem"><p>Suppose a consumer group suspects that the proportion of households that have three cell phones is 30%. A cell phone company has reason to believe that the proportion is not 30%. Before they start a big advertising campaign, they conduct a hypothesis test. 
Their marketing people survey 150 households with the result that 43 of the households have three cell phones.</p> </div> <div id="id4213559" data-type="solution"><p>Set up the Hypothesis Test:</p> <p><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> = 0.30 <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> ≠ 0.30</p> <p>Determine the distribution needed:</p> <p>The <strong>random variable</strong> is <em data-effect="italics">P′</em> = proportion of households that have three cell phones.</p> <p>The <strong>distribution</strong> for the hypothesis test is \({P}^{\prime }\sim N\left(0.30,\sqrt{\frac{\left(0.30\right)\cdot \left(0.70\right)}{150}}\right)\)</p> <p>&nbsp;</p> <div data-type="exercise"><div id="id4146441" data-type="problem"><p>a. The value that helps determine the <em data-effect="italics">p</em>-value is <em data-effect="italics">p′</em>. Calculate <em data-effect="italics">p′</em>.</p> </div> <div id="id4274849" data-type="solution" data-print-placement="end"><p>a. <em data-effect="italics">p′</em> = \(\frac{x}{n}\) where <em data-effect="italics">x</em> is the number of successes and <em data-effect="italics">n</em> is the total number in the sample.</p> <p><em data-effect="italics">x</em> = 43, <em data-effect="italics">n</em> = 150</p> <p><em data-effect="italics">p′</em> = \(\frac{\text{43}}{\text{150}}\)</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id5150803" data-type="problem"><p>b. What is a <strong>success</strong> for this problem?</p> </div> <div id="id5036464" data-type="solution" data-print-placement="end"><p>b. A success is having three cell phones in a household.</p> <p>&nbsp;</p> </div> </div> <div id="element-519" data-type="exercise"><div id="id4292096" data-type="problem"><p id="element-781">c. What is the level of significance?</p> </div> <div id="id5242701" data-type="solution" data-print-placement="end"><p>c. 
The level of significance is the preset <em data-effect="italics">α</em>. Since <em data-effect="italics">α</em> is not given, assume that <em data-effect="italics">α</em> = 0.05.</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id4673389" data-type="problem"><p>d. Draw the graph for this problem. Draw the horizontal axis. Label and shade appropriately.<span data-type="newline"><br /> </span>Calculate the <em data-effect="italics">p</em>-value.</p> </div> <div id="id5212164" data-type="solution" data-print-placement="end"><p>d. <em data-effect="italics">p</em>-value = 0.7216</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id5104601" data-type="problem"><p>e. Make a decision. _____________(Reject/Do not reject) <em data-effect="italics">H<sub>0</sub></em> because____________.</p> </div> <div id="id5137008" data-type="solution" data-print-placement="end"><p>e. Assuming that <em data-effect="italics">α</em> = 0.05, <em data-effect="italics">α</em> &lt; <em data-effect="italics">p</em>-value. The decision is do not reject <em data-effect="italics">H<sub>0</sub></em> because there is not sufficient evidence to conclude that the proportion of households that have three cell phones is not 30%.</p> </div> </div> </div> </div> </div> <div id="fs-idm14663600" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div data-type="exercise"><div data-type="problem"><p id="eip-idm17000432">Marketers believe that 92% of adults in the United States own a cell phone. A cell phone manufacturer believes that number is actually lower. 200 American adults are surveyed, of which, 174 report having cell phones. Use a 5% level of significance. 
State the null and alternative hypothesis, find the <em data-effect="italics">p</em>-value, state your conclusion, and identify the Type I and Type II errors.</p> </div> </div> </div> <p id="fs-idp32935184">The next example is a poem written by a statistics student named Nicole Hart. The solution to the problem follows the poem. Notice that the hypothesis test is for a single population proportion. This means that the null and alternate hypotheses use the parameter <em data-effect="italics">p</em>. The distribution for the test is normal. The estimated proportion <em data-effect="italics">p</em>′ is the proportion of fleas killed to the total fleas found on Fido. This is sample information. The problem gives a preconceived <em data-effect="italics">α</em> = 0.01, for comparison, and a 95% confidence interval computation. The poem is clever and humorous, so please enjoy it!</p> <div class="textbox textbox--examples" data-type="example"><div id="fs-idp65427488" data-type="exercise"><div id="id5257658" data-type="problem"><p id="eip-idm151173424">My dog has so many fleas,<span data-type="newline"><br /> </span> They do not come off with ease.<span data-type="newline"><br /> </span> As for shampoo, I have tried many types<span data-type="newline"><br /> </span> Even one called Bubble Hype,<span data-type="newline"><br /> </span> Which only killed 25% of the fleas,<span data-type="newline"><br /> </span> Unfortunately I was not pleased.</p> <p>I&#8217;ve used all kinds of soap,<span data-type="newline"><br /> </span> Until I had given up hope<span data-type="newline"><br /> </span> Until one day I saw<span data-type="newline"><br /> </span> An ad that put me in awe.</p> <p>A shampoo used for dogs<span data-type="newline"><br /> </span> Called GOOD ENOUGH to Clean a Hog<span data-type="newline"><br /> </span> Guaranteed to kill more fleas.</p> <p>I gave Fido a bath<span data-type="newline"><br /> </span> And after doing the math<span data-type="newline"><br /> </span> 
His number of fleas<span data-type="newline"><br /> </span> Started dropping by 3&#8217;s!</p> <p>Before his shampoo<span data-type="newline"><br /> </span> I counted 42.<span data-type="newline"><br /> </span> At the end of his bath,<span data-type="newline"><br /> </span> I redid the math<span data-type="newline"><br /> </span> And the new shampoo had killed 17 fleas.<span data-type="newline"><br /> </span> So now I was pleased.</p> <p>Now it is time for you to have some fun<span data-type="newline"><br /> </span> With the level of significance being .01,<span data-type="newline"><br /> </span> You must help me figure out<span data-type="newline"><br /> </span> Use the new shampoo or go without?</p> </div> <div id="id5100583" data-type="solution"><p>Set up the hypothesis test:</p> <p><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> ≤ 0.25   <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> &gt; 0.25</p> <p>Determine the distribution needed:</p> <p>In words, CLEARLY state what your random variable \(\overline{X}\) or <em data-effect="italics">P′</em> represents.</p> <p id="fs-idp88897968"><em data-effect="italics">P′</em> = The proportion of fleas that are killed by the new shampoo</p> <p>State the distribution to use for the test.</p> <p><strong>Normal:</strong>\(N\left(0.25,\sqrt{\frac{\left(0.25\right)\left(1-0.25\right)}{42}}\right)\)</p> <p><strong>Test Statistic:</strong><em data-effect="italics">z</em> = 2.3163</p> <p>Calculate the <em data-effect="italics">p</em>-value using the normal distribution for proportions:</p> <p><em data-effect="italics">p</em>-value = 0.0103</p> <p>In one to two complete sentences, explain what the <em data-effect="italics">p</em>-value means for this problem.</p> <p>If the null hypothesis is true (the proportion is 0.25), then there is a 0.0103 probability that the sample (estimated) proportion is 0.4048 \(\left(\frac{17}{42}\right)\) or more.</p> <p>Use the previous 
information to sketch a picture of this situation. CLEARLY, label and scale the horizontal axis and shade the region(s) corresponding to the <em data-effect="italics">p</em>-value.</p> <div id="hyptest11_ex6" class="bc-figure figure"><span id="id4295819" data-type="media" data-alt="Normal distribution graph of the proportion of fleas killed by the new shampoo with values of 0.25 and 0.4048 on the x-axis. A vertical upward line extends from 0.4048 to the curve and the area to the left of this is shaded in. The test statistic of the sample proportion is listed."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch09_11_06-1.jpg" alt="Normal distribution graph of the proportion of fleas killed by the new shampoo with values of 0.25 and 0.4048 on the x-axis. A vertical upward line extends from 0.4048 to the curve and the area to the left of this is shaded in. The test statistic of the sample proportion is listed." width="380" data-media-type="image/jpg" /></span></div> <p id="element-372">Compare <em data-effect="italics">α</em> and the <em data-effect="italics">p</em>-value:</p> <p>Indicate the correct decision (“reject” or “do not reject” the null hypothesis), the reason for it, and write an appropriate conclusion, using complete sentences.</p> <table summary=""><thead valign="middle"><tr><th data-align="center">alpha</th> <th data-align="center">decision</th> <th data-align="center">reason for decision</th> </tr> </thead> <tbody><tr><td data-align="center">0.01</td> <td data-align="center">Do not reject \({H}_{0}\)</td> <td data-align="center"><em data-effect="italics">α</em> &lt; <em data-effect="italics">p</em>-value</td> </tr> </tbody> </table> <p><strong>Conclusion:</strong> At the 1% level of significance, the sample data do not show sufficient evidence that the percentage of fleas that are killed by the new shampoo is more than 25%.</p> <p>Construct a 95% confidence interval for the true mean or proportion. 
Include a sketch of the graph of the situation. Label the point estimate and the lower and upper bounds of the confidence interval.</p> <div id="hyptest11_ex7" class="bc-figure figure"><span id="id5244822" data-type="media" data-alt="Normal distribution graph of the proportion of fleas killed by the new shampoo with values of 0.26, 17/42, and 0.55 on the x-axis. A vertical upward line extends from 0.26 and 0.55. The area between these two points is equal to 0.95."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch09_11_07-1.jpg" alt="Normal distribution graph of the proportion of fleas killed by the new shampoo with values of 0.26, 17/42, and 0.55 on the x-axis. A vertical upward line extends from 0.26 and 0.55. The area between these two points is equal to 0.95." width="380" data-media-type="image/jpg" /></span></div> <p><strong>Confidence Interval:</strong> (0.26,0.55) We are 95% confident that the true population proportion <em data-effect="italics">p</em> of fleas that are killed by the new shampoo is between 26% and 55%.</p> <div id="id5328435" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idm81920704">This test result is not very definitive since the <em data-effect="italics">p</em>-value is very close to alpha. In reality, one would probably do more tests by giving the dog another bath after the fleas have had a chance to return.</p> </div> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div data-type="problem"><p id="eip-idm29000496">The National Institute of Standards and Technology provides exact data on conductivity properties of materials. 
Following are conductivity measurements for 11 randomly selected pieces of a particular type of glass.</p> <p id="eip-idm46067920">1.11;   1.07;   1.11;   1.07;   1.12;   1.08;   0.98;   0.98;   1.02;   0.95;   0.95 <span data-type="newline"><br /> </span>Is there convincing evidence that the average conductivity of this type of glass is greater than one? Use a significance level of 0.05. Assume the population is normal.</p> </div> <div data-type="solution"><p id="eip-idp69766080">Let’s follow a four-step process to answer this statistical question.</p> <ol id="eip-idp69766512" type="1"><li><strong>State the Question</strong>: We need to determine if, at a 0.05 significance level, the average conductivity of the selected glass is greater than one. Our hypotheses will be <ol id="eip-idp129554720" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≤ 1</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &gt; 1</li> </ol> </li> <li><strong>Plan</strong>: We are testing a sample mean without a known population standard deviation. Therefore, we need to use a Student&#8217;s-t distribution. Assume the underlying population is normal.</li> <li><strong>Do the calculations</strong>: We will input the sample data into the TI-83 as follows. 
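<p>For readers without a TI-83, the same one-sample <em data-effect="italics">t</em>-test can be sketched in Python using only the standard library. This is an illustrative cross-check, not the textbook's method: the helper names t_pdf and t_upper_tail are hypothetical, and the tail area is approximated numerically with Simpson's rule rather than taken from a statistics package.</p>

```python
import math
from statistics import mean, stdev

# Conductivity measurements from the example (n = 11).
data = [1.11, 1.07, 1.11, 1.07, 1.12, 1.08, 0.98, 0.98, 1.02, 0.95, 0.95]
mu_0 = 1.0  # hypothesized mean under H0


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Area to the right of t, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric about 0, so the area right of 0 is 0.5.
    return 0.5 - total * h / 3


n = len(data)
x_bar = mean(data)
s = stdev(data)  # sample standard deviation (n - 1 divisor)
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
p_value = t_upper_tail(t_stat, n - 1)  # right-tailed test, df = n - 1

print(f"x-bar = {x_bar:.4f}, s = {s:.4f}, t = {t_stat:.3f}, p-value = {p_value:.3f}")
```

<p>The printed test statistic and <em data-effect="italics">p</em>-value should agree, up to rounding, with the calculator screens and with the <em data-effect="italics">p</em> = 0.036 quoted in the conclusion.</p>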
<div id="eip-idp122053552" class="bc-figure figure"><span id="eip-idp167220912" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C09_M09_001a-1.jpg" alt="" width="250" data-media-type="image/jpeg" /></span></div> <div id="eip-idp29927184" class="bc-figure figure"><span id="eip-idp29927312" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C09_M09_001b-1.jpg" alt="" width="250" data-media-type="image/jpeg" /></span></div> <div id="eip-idp126675984" class="bc-figure figure"><span id="eip-idp156983600" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C09_M09_001c-1.jpg" alt="" width="250" data-media-type="image/jpeg" /></span></div> <div id="eip-idp154420944" class="bc-figure figure"><span id="eip-idp154421072" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C09_M09_001d-1.jpg" alt="" width="250" data-media-type="image/jpeg" /></span></div> </li> <li><strong>State the Conclusions</strong>: Since the <em data-effect="italics">p</em>-value (<em data-effect="italics">p</em> = 0.036) is less than our alpha value, we will reject the null hypothesis. It is reasonable to state that the data support the claim that the average conductivity level is greater than one.</li> </ol> </div> </div> </div> <div id="eip-767" class="textbox textbox--examples" data-type="example"><div id="eip-178" data-type="exercise"><div data-type="problem"><p id="eip-idp173526416">In a study of 420,019 cell phone users, 172 of the subjects developed brain cancer. Test the claim that cell phone users developed brain cancer at a greater rate than that for non-cell phone users (the rate of brain cancer for non-cell phone users is 0.0340%). 
Since this is a critical issue, use a 0.005 significance level. Explain why the significance level should be so low in terms of a Type I error.</p> </div> <div data-type="solution"><p id="eip-idp3953440">We will follow the four-step process.</p> <ol id="eip-idp191218624" type="1"><li>We need to conduct a hypothesis test on the claimed cancer rate. Our hypotheses will be <ol id="eip-idp171226608" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> ≤ 0.00034</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> &gt; 0.00034</li> </ol> <p id="eip-idm39752176" class="finger">If we commit a Type I error, we are essentially accepting a false claim. Since the claim describes cancer-causing environments, we want to minimize the chances of incorrectly identifying causes of cancer.</p> </li> <li>We will be testing a sample proportion with <em data-effect="italics">x</em> = 172 and <em data-effect="italics">n</em> = 420,019. The sample is sufficiently large because we have <em data-effect="italics">np</em> = 420,019(0.00034) = 142.8, <em data-effect="italics">nq</em> = 420,019(0.99966) = 419,876.2, two independent outcomes, and a fixed probability of success <em data-effect="italics">p</em> = 0.00034. 
Thus we will be able to generalize our results to the population.</li> <li>The associated TI results are <span data-type="newline"><br /> </span> <div id="fs-idm80776672" class="bc-figure figure"><span id="eip-idp39356816" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C09_M09_002a-1.jpg" alt="" width="250" data-media-type="image/jpeg" /></span></div> <div id="eip-idp38098208" class="bc-figure figure"><span id="eip-idp38098336" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C09_M09_002b-1.jpg" alt="" width="250" data-media-type="image/jpeg" /></span></div> </li> <li>Since the <em data-effect="italics">p</em>-value = 0.0073 is greater than our alpha value = 0.005, we cannot reject the null hypothesis. Therefore, we conclude that there is not enough evidence to support the claim of higher brain cancer rates for the cell phone users.</li> </ol> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div data-type="problem"><p id="eip-idp47042256">According to the US Census there are approximately 268,608,618 residents aged 12 and older. Statistics from the Rape, Abuse, and Incest National Network indicate that, on average, 207,754 rapes occur each year (male and female) for persons aged 12 and older. This translates into a percentage of sexual assaults of 0.078%. In Daviess County, KY, there were 11 reported rapes for a population of 37,937. Conduct an appropriate hypothesis test to determine if there is a statistically significant difference between the local sexual assault percentage and the national sexual assault percentage. 
Use a significance level of 0.01.</p> </div> <div data-type="solution"><p>We will follow the four-step plan.</p> <ol id="eip-idp113522624" type="1"><li>We need to test whether the proportion of sexual assaults in Daviess County, KY is significantly different from the national average.</li> <li>Since we are presented with proportions, we will use a one-proportion <em data-effect="italics">z</em>-test. The hypotheses for the test will be <ol id="eip-idp17854992" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> = 0.00078</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> ≠ 0.00078</li> </ol> </li> <li>The following screen shots display the summary statistics from the hypothesis test. <div id="fs-idm3544720" class="bc-figure figure"><span id="eip-idp26791072" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C09_M09_003a-1.jpg" alt="" width="250" data-media-type="image/jpeg" /></span></div> <div id="eip-idp38471600" class="bc-figure figure"><span id="eip-idp38471728" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C09_M09_003b-1.jpg" alt="" width="250" data-media-type="image/jpeg" /></span></div> </li> <li>Since the <em data-effect="italics">p</em>-value, <em data-effect="italics">p</em> = 0.00063, is less than the alpha level of 0.01, the sample data indicate that we should reject the null hypothesis. In conclusion, the sample data support the claim that the proportion of sexual assaults in Daviess County, Kentucky is different from the national average proportion.</li> </ol> </div> </div> </div> <div id="fs-idp69608272" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp168110512">The <span data-type="term">hypothesis test</span> itself has an established process. 
This can be summarized as follows:</p> <ol><li>Determine <em data-effect="italics">H<sub>0</sub></em> and <em data-effect="italics">H<sub>a</sub></em>. Remember, they are contradictory.</li> <li>Determine the random variable.</li> <li>Determine the distribution for the test.</li> <li>Draw a graph, calculate the test statistic, and use the test statistic to calculate the <em data-effect="italics">p</em>-value. (A <em data-effect="italics">z</em>-score and a <em data-effect="italics">t</em>-score are examples of test statistics.)</li> <li>Compare the preconceived <em data-effect="italics">α</em> with the <em data-effect="italics">p</em>-value, make a decision (reject or do not reject <em data-effect="italics">H<sub>0</sub></em>), and write a clear conclusion using English sentences.</li> </ol> <p>Notice that in performing the hypothesis test, you use <em data-effect="italics">α</em> and not <em data-effect="italics">β</em>. <em data-effect="italics">β</em> is needed to help determine the sample size of the data that is used in calculating the <em data-effect="italics">p</em>-value. Remember that the quantity 1 – <em data-effect="italics">β</em> is called the <strong>Power of the Test</strong>. A high power is desirable. If the power is too low, statisticians typically increase the sample size while keeping <em data-effect="italics">α</em> the same. If the power is low, the null hypothesis might not be rejected when it should be.</p> </div> <div class="practice" data-depth="1"><div data-type="exercise"><div id="eip-135" data-type="problem"><p>Assume <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 9 and <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &lt; 9. 
Is this a left-tailed, right-tailed, or two-tailed test?</p> </div> <div data-type="solution"><p>This is a left-tailed test.</p> </div> </div> <div data-type="exercise"><div id="eip-286" data-type="problem"><p>Assume <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≤ 6 and <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &gt; 6. Is this a left-tailed, right-tailed, or two-tailed test?</p> </div> </div> <div id="fs-idp45725152" data-type="exercise"><div id="eip-116" data-type="problem"><p>Assume <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> = 0.25 and <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> ≠ 0.25. Is this a left-tailed, right-tailed, or two-tailed test?</p> </div> <div data-type="solution"><p>This is a two-tailed test.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-195">Draw the general graph of a left-tailed test.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Draw the graph of a two-tailed test.</p> </div> <div data-type="solution"><div id="fs-idm149132800" class="bc-figure figure"><span id="eip-idp62459136" data-type="media" data-alt=""><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C09_M09_item002annoN-1.jpg" alt="" width="380" data-media-type="image/jpeg" /></span></div> </div> </div> <div data-type="exercise"><div id="eip-648" data-type="problem"><p>A bottle of water is labeled as containing 16 fluid ounces of water. You believe it is less than that. What type of test would you use?</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Your friend claims that his mean golf score is 63. You want to show that it is higher than that. 
What type of test would you use?</p> </div> <div data-type="solution"><p id="eip-68">a right-tailed test</p> </div> </div> <div id="eip-729" data-type="exercise"><div id="eip-491" data-type="problem"><p>A bathroom scale claims to be able to identify correctly any weight within a pound. You think that it cannot be that accurate. What type of test would you use?</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-458">You flip a coin and record whether it shows heads or tails. You know the probability of getting heads is 50%, but you think it is less for this particular coin. What type of test would you use?</p> </div> <div id="fs-idp66689312" data-type="solution"><p id="eip-139">a left-tailed test</p> </div> </div> <div data-type="exercise"><div id="eip-519" data-type="problem"><p>If the alternative hypothesis has a not equals ( ≠ ) symbol, you know to use which type of test?</p> </div> </div> <div id="eip-504" data-type="exercise"><div id="eip-634" data-type="problem"><p>Assume the null hypothesis states that the mean is at least 18. Is this a left-tailed, right-tailed, or two-tailed test?</p> </div> <div data-type="solution"><p>This is a left-tailed test.</p> </div> </div> <div id="eip-320" data-type="exercise"><div data-type="problem"><p>Assume the null hypothesis states that the mean is at most 12. Is this a left-tailed, right-tailed, or two-tailed test?</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>Assume the null hypothesis states that the mean is equal to 88. The alternative hypothesis states that the mean is not equal to 88. 
Is this a left-tailed, right-tailed, or two-tailed test?</p> </div> <div data-type="solution"><p>This is a two-tailed test.</p> </div> </div> <p>&nbsp;</p> </div> <div id="fs-idp139727497473120" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <p id="eip-idm30094416"><em data-effect="italics">For each of the word problems, use a solution sheet to do the hypothesis test. The solution sheet is found in <a class="autogenerated-content" href="/contents/c0449c55-aa47-4f1c-bd5f-0521652f0e82">(Figure)</a>. Please feel free to make copies of the solution sheets. For the online version of the book, it is suggested that you copy the .doc or the .pdf files.</em></p> <div id="id8395739" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idm120632832">If you are using a Student&#8217;s-<em data-effect="italics">t</em> distribution for one of the following homework problems, you may assume that the underlying population is normally distributed. (In general, you must first prove that assumption, however.)</p> </div> <div data-type="exercise"><div id="id8395759" data-type="problem"><p id="fs-idp44498592">1) A particular brand of tires claims that its deluxe tire averages at least 50,000 miles before it needs to be replaced. From past studies of this tire, the standard deviation is known to be 8,000. A survey of owners of that tire design is conducted. From the 28 tires surveyed, the mean lifespan was 46,500 miles with a standard deviation of 9,800 miles. Using alpha = 0.05, is the data highly inconsistent with the claim?</p> </div> <div id="fs-idm164410224" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id8395793" data-type="problem"><p>2) From generation to generation, the mean age when smokers first start to smoke varies. However, the standard deviation of that age remains constant at around 2.1 years. 
A survey of 40 smokers of this generation was done to see if the mean starting age is at least 19. The sample mean was 18.1 with a sample standard deviation of 1.3. Do the data support the claim at the 5% level?</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id9424298" data-type="problem"><p>3) The cost of a daily newspaper varies from city to city. However, the variation among prices remains steady with a standard deviation of 20¢. A study was done to test the claim that the mean cost of a daily newspaper is \$1.00. Twelve costs yield a mean cost of 95¢ with a standard deviation of 18¢. Do the data support the claim at the 1% level?</p> </div> <div id="fs-idm17421856" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id9424338" data-type="problem"><p id="element-977">4) An article in the <em data-effect="italics">San Jose Mercury News</em> stated that students in the California state university system take 4.5 years, on average, to finish their undergraduate degrees. Suppose you believe that the mean time is longer. You conduct a survey of 49 students and obtain a sample mean of 5.1 with a sample standard deviation of 1.2. Do the data support your claim at the 1% level?</p> <p>&nbsp;</p> </div> </div> <div id="element-932a" data-type="exercise"><div id="id9424572" data-type="problem"><p>5) The mean number of sick days an employee takes per year is believed to be about ten. Members of a personnel department do not believe this figure. They randomly survey eight employees. The number of sick days they took for the past year are as follows: 12;  4;  15;  3;  11;  8;  6;  8. Let <em data-effect="italics">x</em> = the number of sick days they took for the past year. 
Should the personnel team believe that the mean number is ten?</p> </div> <div id="fs-idm143535568" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id9424624" data-type="problem"><p>6) In 1955, <em data-effect="italics">Life Magazine</em> reported that the 25 year-old mother of three worked, on average, an 80 hour week. Recently, many groups have been studying whether or not the women&#8217;s movement has, in fact, resulted in an increase in the average work week for women (combining employment and at-home work). Suppose a study was done to determine if the mean work week has increased. 81 women were surveyed with the following results. The sample mean was 83; the sample standard deviation was ten. Does it appear that the mean work week has increased for women at the 5% level?</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id9400577" data-type="problem"><p>7) Your statistics instructor claims that 60 percent of the students who take her Elementary Statistics class go through life feeling more enriched. For some reason that she can&#8217;t quite figure out, most people don&#8217;t believe her. You decide to check this out on your own. You randomly survey 64 of her past Elementary Statistics students and find that 34 feel more enriched as a result of her class. Now, what do you think?</p> </div> <div id="fs-idm68403536" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id9400611" data-type="problem"><p>8) A Nissan Motor Corporation advertisement read, “The average man’s I.Q. is 107. The average brown trout’s I.Q. is 4. So why can’t man catch brown trout?” Suppose you believe that the brown trout’s mean I.Q. is greater than four. You catch 12 brown trout. A fish psychologist determines the I.Q.s as follows: 5;   4;   7;   3;   6;   4;   5;   3;   6;   3;   8;   5. 
Conduct a hypothesis test of your belief.</p> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id9400954" data-type="problem"><p>9) Refer to <a href="#element-410">Exercise 8</a>. Conduct a hypothesis test to see if your decision and conclusion would change if your belief were that the brown trout’s mean I.Q. is <strong>not</strong> four.</p> </div> <div id="fs-idm153124944" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id9400994" data-type="problem"><p>10) According to an article in <em data-effect="italics">Newsweek</em>, the natural ratio of girls to boys is 100:105. In China, the birth ratio is 100:114 (46.7% girls). Suppose you don’t believe the reported figures of the percent of girls born in China. You conduct a study. In this study, you count the number of girls and boys born in 150 randomly chosen recent births. There are 60 girls and 90 boys born of the 150. Based on your study, do you believe that the percent of girls born in China is 46.7?</p> <p>&nbsp;</p> </div> </div> <div id="element-658" data-type="exercise"><div id="id9401185" data-type="problem"><p>11) A poll done for <em data-effect="italics">Newsweek</em> found that 13% of Americans have seen or sensed the presence of an angel. A contingent doubts that the percent is really that high. It conducts its own survey. Out of 76 Americans surveyed, only two had seen or sensed the presence of an angel. As a result of the contingent’s survey, would you agree with the <em data-effect="italics">Newsweek</em> poll? In complete sentences, also give three reasons why the two polls might give different results.</p> </div> <div id="fs-idm3310512" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id9401240" data-type="problem"><p id="element-140a">12) The mean work week for engineers in a start-up company is believed to be about 60 hours. A newly hired engineer hopes that it’s shorter. 
She asks ten engineering friends in start-ups for the lengths of their mean work weeks. Based on the results that follow, should she count on the mean work week to be shorter than 60 hours?</p> <p id="fs-idp187110944">Data (length of mean work week): 70;   45;   55;   60;   65;   55;   55;   60;   50;   55.</p> </div> </div> <p>&nbsp;</p> <div id="element-761" data-type="exercise"><div id="id9401488" data-type="problem"><p id="element-926">13) Use the “Lap time” data for Lap 4 (see <a class="autogenerated-content" href="/contents/3ef830bc-5247-460a-9007-e3fd762e5e93">(Figure)</a>) to test the claim that Terri finishes Lap 4, on average, in less than 129 seconds. Use all twenty races given.</p> </div> <div id="fs-idp13518384" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="fs-idp191961952" data-type="exercise"><div id="id9401521" data-type="problem"><p>&nbsp;</p> <p id="element-564a">14) Use the “Initial Public Offering” data (see <a class="autogenerated-content" href="/contents/3ef830bc-5247-460a-9007-e3fd762e5e93">(Figure)</a>) to test the claim that the mean offer price was \$18 per share. Do not use all the data. Use your random number generator to randomly survey 15 prices.</p> </div> </div> <div id="id9401550" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> </div> <div data-type="exercise"><div data-type="solution"><p><strong>Answers to odd questions</strong></p> <p>1)</p> <ol id="fs-idp21859504" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≥ 50,000</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &lt; 50,000</li> <li>Let \(\overline{X}\) = the average lifespan of a brand of tires.</li> <li>normal distribution</li> <li><em data-effect="italics">z</em> = -2.315</li> <li><em data-effect="italics">p</em>-value = 0.0103</li> <li>Check student’s solution. 
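<p>As one way to check parts e, f, and h of this answer, the calculation can be sketched in Python with the standard library's NormalDist class. The variable names are illustrative, and the sketch assumes the known population standard deviation of 8,000 miles is used (a <em data-effect="italics">z</em>-test), as in the answer shown here.</p>

```python
import math
from statistics import NormalDist

# Values from homework problem 1 (tire lifespan, population sigma known).
mu_0 = 50_000    # claimed mean lifespan under H0 (miles)
sigma = 8_000    # known population standard deviation (miles)
n = 28           # tires surveyed
x_bar = 46_500   # sample mean (miles)

std_error = sigma / math.sqrt(n)
z = (x_bar - mu_0) / std_error

# Left-tailed test: the p-value is the area to the left of z.
p_value = NormalDist().cdf(z)

# 95% confidence interval for the mean, using the known sigma.
z_crit = NormalDist().inv_cdf(0.975)
lower = x_bar - z_crit * std_error
upper = x_bar + z_crit * std_error

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print(f"95% CI: ({lower:,.0f}, {upper:,.0f})")
```

<p>The output should agree, up to rounding, with the test statistic, <em data-effect="italics">p</em>-value, and confidence interval listed in this answer.</p>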
<ol id="fs-idm113459088" type="i"><li>alpha: 0.05</li> <li>Decision: Reject the null hypothesis.</li> <li>Reason for decision: The <em data-effect="italics">p</em>-value is less than 0.05.</li> <li>Conclusion: There is sufficient evidence to conclude that the mean lifespan of the tires is less than 50,000 miles.</li> </ol> </li> <li>(43,537, 49,463)</li> </ol> <p>3)</p> <ol id="fs-idm17421600" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = \$1.00</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> ≠ \$1.00</li> <li>Let \(\overline{X}\) = the average cost of a daily newspaper.</li> <li>normal distribution</li> <li><em data-effect="italics">z</em> = –0.866</li> <li><em data-effect="italics">p</em>-value = 0.3865</li> <li>Check student’s solution. <ol id="fs-idm69728368" type="i"><li>Alpha: 0.01</li> <li>Decision: Do not reject the null hypothesis.</li> <li>Reason for decision: The <em data-effect="italics">p</em>-value is greater than 0.01.</li> <li>Conclusion: There is insufficient evidence to conclude that the mean cost of daily papers differs from \$1. The mean cost could be \$1.</li> </ol> </li> <li>(\$0.84, \$1.06)</li> </ol> <p>5)</p> <ol id="fs-idm8405344" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 10</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> ≠ 10</li> <li>Let \(\overline{X}\) = the mean number of sick days an employee takes per year.</li> <li>Student’s <em data-effect="italics">t</em>-distribution</li> <li><em data-effect="italics">t</em> = –1.12</li> <li><em data-effect="italics">p</em>-value = 0.300</li> <li>Check student’s solution. 
<ol id="fs-idm68554224" type="i"><li>Alpha: 0.05</li> <li>Decision: Do not reject the null hypothesis.</li> <li>Reason for decision: The <em data-effect="italics">p</em>-value is greater than 0.05.</li> <li>Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the mean number of sick days is not ten.</li> </ol> </li> <li>(4.9443, 11.806)</li> </ol> <p>7)</p> <ol id="fs-idm150879872" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> ≥ 0.6</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> &lt; 0.6</li> <li>Let <em data-effect="italics">P′</em> = the proportion of students who feel more enriched as a result of taking Elementary Statistics.</li> <li>normal for a single proportion</li> <li><em data-effect="italics">z</em> = –1.12</li> <li><em data-effect="italics">p</em>-value = 0.1308</li> <li>Check student’s solution. <ol id="fs-idp14435968" type="i"><li>Alpha: 0.05</li> <li>Decision: Do not reject the null hypothesis.</li> <li>Reason for decision: The <em data-effect="italics">p</em>-value is greater than 0.05.</li> <li>Conclusion: There is insufficient evidence to conclude that fewer than 60 percent of her students feel more enriched.</li> </ol> </li> <li>Confidence Interval: (0.409, 0.654)<span data-type="newline"><br /> </span>The “plus-4s” confidence interval is (0.411, 0.648)</li> </ol> <p>9)</p> <ol id="fs-idm153124688" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> = 4</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> ≠ 4</li> <li>Let \(\overline{X}\) = the average I.Q. of a set of brown trout.</li> <li>Student&#8217;s <em data-effect="italics">t</em>-distribution (two-tailed test)</li> <li><em data-effect="italics">t</em> = 1.95</li> <li><em data-effect="italics">p</em>-value = 0.076</li> <li>Check student’s solution. 
<ol id="fs-idp21135088" type="i"><li>Alpha: 0.05</li> <li>Decision: Do not reject the null hypothesis.</li> <li>Reason for decision: The <em data-effect="italics">p</em>-value is greater than 0.05.</li> <li>Conclusion: There is insufficient evidence to conclude that the average IQ of brown trout is not four.</li> </ol> </li> <li>(3.8865, 5.9468)</li> </ol> <p>11)</p> <ol id="fs-idm3310256" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p</em> ≥ 0.13</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p</em> &lt; 0.13</li> <li>Let <em data-effect="italics">P′</em> = the proportion of Americans who have seen or sensed angels.</li> <li>normal for a single proportion</li> <li><em data-effect="italics">z</em> = –2.688</li> <li><em data-effect="italics">p</em>-value = 0.0036</li> <li>Check student’s solution. <ol id="fs-idm15778400" type="i"><li>Alpha: 0.05</li> <li>Decision: Reject the null hypothesis.</li> <li>Reason for decision: The <em data-effect="italics">p</em>-value is less than 0.05.</li> <li>Conclusion: There is sufficient evidence to conclude that the percentage of Americans who have seen or sensed an angel is less than 13%.</li> </ol> </li> <li>(0, 0.0623)<span data-type="newline"><br /> </span>The “plus-4s” confidence interval is (0.0022, 0.0978)</li> </ol> <p>13)</p> <ol id="fs-idp13518640" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em> ≥ 129</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em> &lt; 129</li> <li>Let \(\overline{X}\) = the average time, in seconds, in which Terri finishes Lap 4.</li> <li>Student&#8217;s <em data-effect="italics">t</em>-distribution</li> <li><em data-effect="italics">t</em> = 1.209</li> <li><em data-effect="italics">p</em>-value = 0.8792</li> <li>Check student’s solution. 
<ol id="fs-idm130980080" type="i"><li>Alpha: 0.05</li> <li>Decision: Do not reject the null hypothesis.</li> <li>Reason for decision: The <em data-effect="italics">p</em>-value is greater than 0.05.</li> <li>Conclusion: There is insufficient evidence to conclude that Terri’s mean lap time is less than 129 seconds.</li> </ol> </li> <li>(128.63, 130.37)</li> </ol> <p>&nbsp;</p> </div> </div> </div> <div id="fs-idp118682672" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p>Data from Amit Schitai. Director of Instructional Technology and Distance Learning. LBCC.</p> <p>Data from <em data-effect="italics">Bloomberg Businessweek</em>. Available online at http://www.businessweek.com/news/2011- 09-15/nyc-smoking-rate-falls-to-record-low-of-14-bloomberg-says.html.</p> <p>Data from energy.gov. Available online at http://energy.gov (accessed June 27. 2013).</p> <p>Data from Gallup®. Available online at www.gallup.com (accessed June 27, 2013).</p> <p>Data from <em data-effect="italics">Growing by Degrees</em> by Allen and Seaman.</p> <p>Data from La Leche League International. Available online at http://www.lalecheleague.org/Law/BAFeb01.html.</p> <p>Data from the American Automobile Association. Available online at www.aaa.com (accessed June 27, 2013).</p> <p>Data from the American Library Association. Available online at www.ala.org (accessed June 27, 2013).</p> <p>Data from the Bureau of Labor Statistics. Available online at http://www.bls.gov/oes/current/oes291111.htm.</p> <p id="fs-idp1">Data from the Centers for Disease Control and Prevention. Available online at www.cdc.gov (accessed June 27, 2013)</p> <p>Data from the U.S. Census Bureau, available online at http://quickfacts.census.gov/qfd/states/00000.html (accessed June 27, 2013).</p> <p>Data from the United States Census Bureau. Available online at http://www.census.gov/hhes/socdemo/language/.</p> <p id="eip-79">Data from Toastmasters International. 
Available online at http://toastmasters.org/artisan/detail.asp?CategoryID=1&amp;SubCategoryID=10&amp;ArticleID=429&amp;Page=1.</p> <p>Data from Weather Underground. Available online at www.wunderground.com (accessed June 27, 2013).</p> <p>Federal Bureau of Investigation. “Uniform Crime Reports and Index of Crime in Daviess in the State of Kentucky enforced by Daviess County from 1985 to 2005.” Available online at http://www.disastercenter.com/kentucky/crime/3868.htm (accessed June 27, 2013).</p> <p>“Foothill-De Anza Community College District.” De Anza College, Winter 2006. Available online at http://research.fhda.edu/factbook/DAdemofs/Fact_sheet_da_2006w.pdf.</p> <p>Johansen, C., J. Boice, Jr., J. McLaughlin, J. Olsen. “Cellular Telephones and Cancer—a Nationwide Cohort Study in Denmark.” Institute of Cancer Epidemiology and the Danish Cancer Society, 93(3):203-7. Available online at http://www.ncbi.nlm.nih.gov/pubmed/11158188 (accessed June 27, 2013).</p> <p>Rape, Abuse &amp; Incest National Network. “How often does sexual assault occur?” RAINN, 2009. Available online at http://www.rainn.org/get-information/statistics/frequency-of-sexual-assault (accessed June 27, 2013).</p> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl><dt>Central Limit Theorem</dt> <dd>Given a random variable (RV) with known mean \(\mu \) and known standard deviation \(\sigma \), suppose we sample with size <em data-effect="italics">n</em> and are interested in two new RVs: the sample mean, \(\overline{X}\), and the sample sum, \(\Sigma X\). If the size <em data-effect="italics">n</em> of the sample is sufficiently large, then \(\overline{X}\sim N\left(\mu \text{,}\frac{\sigma }{\sqrt{n}}\right)\) and \(\Sigma X\sim N\left(n\mu ,\sqrt{n}\sigma \right)\). 
If the size <em data-effect="italics">n</em> of the sample is sufficiently large, then the distribution of the sample means and the distribution of the sample sums will approximate a normal distribution regardless of the shape of the population. The mean of the sample means will equal the population mean and the mean of the sample sums will equal <em data-effect="italics">n</em> times the population mean. The standard deviation of the distribution of the sample means, \(\frac{\sigma }{\sqrt{n}}\), is called the standard error of the mean.</dd> </dl> </div> </div></div>
<div class="chapter standard" id="chapter-hypothesis-testing-of-a-single-mean-and-single-proportion" title="Activity 10.7: Hypothesis Testing of a Single Mean and Single Proportion"><div class="chapter-title-wrap"><h3 class="chapter-number">61</h3><h2 class="chapter-title"><span class="display-none">Activity 10.7: Hypothesis Testing of a Single Mean and Single Proportion</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1171854730054" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Hypothesis Testing of a Single Mean and Single Proportion</div> <p id="id9325443">Class Time:</p> <p id="id9325450">Names:</p> <div id="id3727711" data-type="list"><div data-type="title">Student Learning Outcomes</div> <ul><li>The student will select the appropriate distributions to use in each case.</li> <li>The student will conduct hypothesis tests and interpret the results.</li> </ul> </div> <p><span data-type="title">Television Survey</span> In a recent survey, it was stated that Americans watch television on average four hours per day. Assume that <em data-effect="italics">σ</em> = 2. Using your class as the sample, conduct a hypothesis test to determine if the average for students at your school is lower.</p> <ol id="list-9283749" data-mark-suffix="."><li><em data-effect="italics">H<sub>0</sub></em>: _____________</li> <li><em data-effect="italics">H<sub>a</sub></em>: _____________</li> <li>In words, define the random variable. __________ = ______________________</li> <li>The distribution to use for the test is _______________________.</li> <li>Determine the test statistic using your data.</li> <li>Draw a graph and label it appropriately. Shade the actual level of significance. 
<ol id="list-25986787634" type="a"><li>Graph: <div id="id11610612" class="bc-figure figure"><span id="id11610617" data-type="media" data-alt="Blank graph with vertical and horizontal axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch09_18_01-1.png" alt="Blank graph with vertical and horizontal axes." width="380" data-media-type="image/png" /></span></div> </li> <li>Determine the <em data-effect="italics">p</em>-value.</li> </ol> </li> <li>Do you or do you not reject the null hypothesis? Why?</li> <li>Write a clear conclusion using a complete sentence.</li> </ol> <p id="element-40a"><span data-type="title">Language Survey</span>About 42.3% of Californians and 19.6% of all Americans over age five speak a language other than English at home. Using your class as the sample, conduct a hypothesis test to determine if the percent of the students at your school who speak a language other than English at home is different from 42.3%.</p> <ol id="list-2349867965" data-mark-suffix="."><li><em data-effect="italics">H<sub>0</sub></em>: ___________</li> <li><em data-effect="italics">H<sub>a</sub></em>: ___________</li> <li>In words, define the random variable. __________ = _______________</li> <li>The distribution to use for the test is ________________</li> <li>Determine the test statistic using your data.</li> <li>Draw a graph and label it appropriately. Shade the actual level of significance. <ol id="list-234978687" type="a"><li>Graph: <div id="id11513524" class="bc-figure figure"><span id="id11513529" data-type="media" data-alt="Blank graph with vertical and horizontal axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch09_18_02-1.png" alt="Blank graph with vertical and horizontal axes." 
width="380" data-media-type="image/png" /></span></div> </li> <li>Determine the <em data-effect="italics">p</em>-value.</li> </ol> </li> <li>Do you or do you not reject the null hypothesis? Why?</li> <li>Write a clear conclusion using a complete sentence.</li> </ol> <p><span data-type="title">Jeans Survey</span>Suppose that young adults own an average of three pairs of jeans. Survey eight people from your class to determine if the average is higher than three. Assume the population is normal.</p> <ol id="list-234876875" data-mark-suffix="."><li><em data-effect="italics">H<sub>0</sub></em>: _____________</li> <li><em data-effect="italics">H<sub>a</sub></em>: _____________</li> <li>In words, define the random variable. __________ = ______________________</li> <li>The distribution to use for the test is _______________________.</li> <li>Determine the test statistic using your data.</li> <li>Draw a graph and label it appropriately. Shade the actual level of significance. <ol id="list-23492347" type="a"><li>Graph: <div id="id11513704" class="bc-figure figure"><span id="id11513708" data-type="media" data-alt="Blank graph with vertical and horizontal axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch09_18_03-1.png" alt="Blank graph with vertical and horizontal axes." width="380" data-media-type="image/png" /></span></div> </li> <li>Determine the <em data-effect="italics">p</em>-value.</li> </ol> </li> <li>Do you or do you not reject the null hypothesis? Why?</li> <li>Write a clear conclusion using a complete sentence.</li> </ol> </div> </div></div>
<div class="part " id="part-hypothesis-testing-with-two-samples"><div class="part-title-wrap"><h3 class="part-number">XI</h3><h1 class="part-title">Chapter 11: Hypothesis Testing with Two Samples</h1></div><div class="ugc part-ugc"></div></div>
<div class="chapter standard" id="chapter-introduction-22" title="Chapter 11.1: Introduction"><div class="chapter-title-wrap"><h3 class="chapter-number">62</h3><h2 class="chapter-title"><span class="display-none">Chapter 11.1: Introduction</span></h2></div><div class="ugc chapter-ugc"><p>[latexpage]</p> <div id="fs-idp119584608" class="splash"><div class="bc-figcaption figcaption">If you want to test a claim that involves two groups (the types of breakfasts eaten east and west of the Mississippi River) you can use a slightly different technique when conducting a hypothesis test. (credit: Chloe Lim)</div> <p><span id="fs-idm16785584" data-type="media" data-alt="This is a photo of a plate with a large pile of eggs in the foreground and six slices of toast in the background. There is a small dish of red jam sitting near the toast on the plate."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_C10_CO-1.jpg" alt="This is a photo of a plate with a large pile of eggs in the foreground and six slices of toast in the background. There is a small dish of red jam sitting near the toast on the plate." width="380" data-media-type="image/jpeg" /></span></p> </div> <div id="fs-idm64219152" class="chapter-objectives" data-type="note" data-has-label="true" data-label=""><div data-type="title">Chapter Objectives</div> <p>By the end of this chapter, the student should be able to:</p> <ul id="list5267"><li>Classify hypothesis tests by type.</li> <li>Conduct and interpret hypothesis tests for two population means, population standard deviations known.</li> <li>Conduct and interpret hypothesis tests for two population means, population standard deviations unknown.</li> <li>Conduct and interpret hypothesis tests for two population proportions.</li> <li>Conduct and interpret hypothesis tests for matched or paired samples.</li> </ul> </div> <p>Studies often compare two groups. 
For example, researchers are interested in the effect aspirin has in preventing heart attacks. Over the last few years, newspapers and magazines have reported various aspirin studies involving two groups. Typically, one group is given aspirin and the other group is given a placebo. Then, the heart attack rate is studied over several years.</p> <p>There are other situations that deal with the comparison of two groups. For example, studies compare various diet and exercise programs. Politicians compare the proportion of individuals from different income brackets who might vote for them. Students are interested in whether SAT or GRE preparatory courses really help raise their scores.</p> <p>You have learned to conduct hypothesis tests on single means and single proportions. You will expand upon that in this chapter. You will compare two means or two proportions to each other. The general procedure is still the same, just expanded.</p> <p>To compare two means or two proportions, you work with two groups. The groups are classified either as <strong>independent</strong> or <span data-type="term">matched pairs</span>. <span data-type="term">Independent groups</span> consist of two samples that are independent, that is, sample values selected from one population are not related in any way to sample values selected from the other population. <strong>Matched pairs</strong> consist of two samples that are dependent. The parameter tested using matched pairs is the population mean. The parameters tested using independent groups are either population means or population proportions.</p> <div id="fs-idp1161216" class="finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idm27664608">This chapter relies on either a calculator or a computer to calculate the degrees of freedom, the test statistics, and <em data-effect="italics">p</em>-values. TI-83+ and TI-84 instructions are included as well as the test statistic formulas. 
When using a TI-83+ or TI-84 calculator, we do not need to separate the test of two population means (independent groups, population variances unknown) into large-sample and small-sample cases. However, most statistical computer software has the ability to differentiate these tests.</p> </div> <p><span data-type="newline"><br /> </span>This chapter deals with the following hypothesis tests:</p> <div data-type="list"><div data-type="title">Independent groups (samples are independent)</div> <ul><li>Test of two population means.</li> <li>Test of two population proportions.</li> </ul> </div> <div data-type="list"><div data-type="title">Matched or paired samples (samples are dependent)</div> <ul><li>Test of the two population means by testing one population mean of differences.</li> </ul> </div> </div></div>
<div class="chapter standard" id="chapter-comparing-two-independent-population-proportions" title="Chapter 11.2: Comparing Two Independent Population Proportions"><div class="chapter-title-wrap"><h3 class="chapter-number">63</h3><h2 class="chapter-title"><span class="display-none">Chapter 11.2: Comparing Two Independent Population Proportions</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="fs-idp16810352">When conducting a hypothesis test that compares two independent population proportions, the following characteristics should be present:</p> <ol id="element-381"><li>The two samples are simple random samples that are independent of each other.</li> <li>The number of successes is at least five, and the number of failures is at least five, for each of the samples.</li> <li>A growing body of literature states that each population must be at least 10 or 20 times the size of its sample. This keeps each population from being over-sampled and causing incorrect results.</li> </ol> <p>Comparing two proportions, like comparing two means, is common. If two estimated proportions are different, it may be due to a difference in the populations or it may be due to chance. A hypothesis test can help determine if a difference in the estimated proportions reflects a difference in the population proportions.</p> <p id="fs-idm58849136">The difference of two sample proportions follows an approximately normal distribution. Generally, the null hypothesis states that the two population proportions are the same. That is, <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>A</sub></em> = <em data-effect="italics">p<sub>B</sub></em>. 
To conduct the test, we use a pooled proportion, <em data-effect="italics">p<sub>c</sub></em>.</p> <div id="element-845" data-type="equation"><div data-type="title">The pooled proportion is calculated as follows:</div> <p>\({p}_{c}=\frac{{x}_{A}+{x}_{B}}{{n}_{A}+{n}_{B}}\)</p> </div> <div data-type="equation"><div data-type="title">The distribution for the differences is:</div> <p>\({{P}^{\prime }}_{A}-{{P}^{\prime }}_{B}\sim N\left[0,\sqrt{{p}_{c}\left(1-{p}_{c}\right)\left(\frac{1}{{n}_{A}}+\frac{1}{{n}_{B}}\right)}\right]\)</p> </div> <div data-type="equation"><div data-type="title">The test statistic (<em data-effect="italics">z</em>-score) is:</div> <p>\(z=\frac{\left({{p}^{\prime }}_{A}-{{p}^{\prime }}_{B}\right)-\left({p}_{A}-{p}_{B}\right)}{\sqrt{{p}_{c}\left(1-{p}_{c}\right)\left(\frac{1}{{n}_{A}}+\frac{1}{{n}_{B}}\right)}}\)</p> </div> <div id="element-845" class="textbox textbox--examples" data-type="example"><div id="fs-idm214596080" data-type="exercise"><div id="fs-idm72417744" data-type="problem"><p>Two types of medication for hives are being tested to determine if there is a <strong>difference in the proportions of adult patient reactions. Twenty</strong> out of a random <strong>sample of 200</strong> adults given medication A still had hives 30 minutes after taking the medication. <strong>Twelve</strong> out of another <strong>random sample of 200 adults</strong> given medication B still had hives 30 minutes after taking the medication. Test at a 1% level of significance.</p> </div> <div id="fs-idm163830832" data-type="solution"><p>The problem asks for a difference in proportions, making it a test of two proportions.</p> <p>Let <em data-effect="italics">A</em> and <em data-effect="italics">B</em> be the subscripts for medication A and medication B, respectively. 
Then <em data-effect="italics">p<sub>A</sub></em> and <em data-effect="italics">p<sub>B</sub></em> are the desired population proportions.</p> <p><span data-type="title">Random Variable:</span> <em data-effect="italics">P′<sub>A</sub></em> – <em data-effect="italics">P′<sub>B</sub></em> = difference in the proportions of adult patients who did not react after 30 minutes to medication A and to medication B.</p> <p><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>A</sub></em> = <em data-effect="italics">p<sub>B</sub></em></p> <p id="fs-idm35710192"><em data-effect="italics">p<sub>A</sub></em> – <em data-effect="italics">p<sub>B</sub></em> = 0</p> <p><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>A</sub></em> ≠ <em data-effect="italics">p<sub>B</sub></em></p> <p id="fs-idp82118096"><em data-effect="italics">p<sub>A</sub></em> – <em data-effect="italics">p<sub>B</sub></em> ≠ 0</p> <p>The words <strong>&#8220;is a difference&#8221;</strong> tell you the test is two-tailed.</p> <p id="element-610"><strong>Distribution for the test:</strong> Since this is a test of two binomial population proportions, the distribution is normal:</p> <p>\({p}_{c}=\frac{{x}_{A}+{x}_{B}}{{n}_{A}+{n}_{B}}=\frac{20+12}{200+200}=0.08,\quad 1-{p}_{c}=0.92\)</p> <p>\({{P}^{\prime }}_{A}-{{P}^{\prime }}_{B}\sim N\left[0,\sqrt{\left(0.08\right)\left(0.92\right)\left(\frac{1}{200}+\frac{1}{200}\right)}\right]\)</p> <p><em data-effect="italics">P′<sub>A</sub></em> – <em data-effect="italics">P′<sub>B</sub></em> follows an approximately normal distribution.</p> <p><strong>Calculate the <em data-effect="italics">p</em>-value using the normal distribution:</strong> <em data-effect="italics">p</em>-value = 0.1404.</p> <p id="element-270">Estimated proportion for group A: \({{p}^{\prime }}_{A}=\frac{{x}_{A}}{{n}_{A}}=\frac{20}{200}=0.1\)</p> <p id="fs-idm78563552">Estimated proportion for group B: \({{p}^{\prime 
}}_{B}=\frac{{x}_{B}}{{n}_{B}}=\frac{12}{200}=0.06\)</p> <p><span data-type="title">Graph:</span></p> <div id="hyptest22_cmp_3_1" class="bc-figure figure"><span id="id3811123" data-type="media" data-alt="Normal distribution curve of the difference in the percentages of adult patients who don't react to medication A and B after 30 minutes. The mean is equal to zero, and the values -0.04, 0, and 0.04 are labeled on the horizontal axis. Two vertical lines extend from -0.04 and 0.04 to the curve. The region to the left of -0.04 and the region to the right of 0.04 are each shaded to represent 1/2(p-value) = 0.0702."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch10_04_01-1.jpg" alt="Normal distribution curve of the difference in the percentages of adult patients who don't react to medication A and B after 30 minutes. The mean is equal to zero, and the values -0.04, 0, and 0.04 are labeled on the horizontal axis. Two vertical lines extend from -0.04 and 0.04 to the curve. The region to the left of -0.04 and the region to the right of 0.04 are each shaded to represent 1/2(p-value) = 0.0702." width="380" data-media-type="image/jpg" /></span></div> <p><em data-effect="italics">P′<sub>A</sub></em> – <em data-effect="italics">P′<sub>B</sub></em> = 0.1 – 0.06 = 0.04.</p> <p>Half the <em data-effect="italics">p</em>-value is below –0.04, and half is above 0.04.</p> <p>Compare <em data-effect="italics">α</em> and the <em data-effect="italics">p</em>-value: <em data-effect="italics">α</em> = 0.01 and the <em data-effect="italics">p</em>-value = 0.1404. 
<em data-effect="italics">α</em> &lt; <em data-effect="italics">p</em>-value.</p> <p>Make a decision: Since <em data-effect="italics">α</em> &lt; <em data-effect="italics">p</em>-value, do not reject <em data-effect="italics">H<sub>0</sub></em>.</p> <p><strong>Conclusion:</strong> At a 1% level of significance, from the sample data, there is not sufficient evidence to conclude that there is a difference in the proportions of adult patients who did not react after 30 minutes to medication <em data-effect="italics">A</em> and medication <em data-effect="italics">B</em>.</p> <div id="fs-idm4331904" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm55744576">Press <code>STAT</code>. Arrow over to <code>TESTS</code> and press <code>6:2-PropZTest</code>. Arrow down and enter <code>20</code> for x1, <code>200</code> for n1, <code>12</code> for x2, and <code>200</code> for n2. Arrow down to <code>p1</code>: and arrow to <code>not equal p2</code>. Press <code>ENTER</code>. Arrow down to <code>Calculate</code> and press <code>ENTER</code>. The <em data-effect="italics">p</em>-value is <em data-effect="italics">p</em> = 0.1404 and the test statistic is 1.47. Do the procedure again, but instead of <code>Calculate</code> do <code>Draw</code>.</p> </div> </div> </div> </div> <div id="fs-idp38726464" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm6482160" data-type="exercise"><div id="fs-idm135837088" data-type="problem"><p id="fs-idm99465232">Two types of valves are being tested to determine if there is a difference in pressure tolerances. Fifteen out of a random sample of 100 of Valve <em data-effect="italics">A</em> cracked under 4,500 psi. Six out of a random sample of 100 of Valve <em data-effect="italics">B</em> cracked under 4,500 psi. 
Test at a 5% level of significance.</p> </div> </div> </div> <div id="fs-idp34418816" class="textbox textbox--examples" data-type="example"><div id="fs-idm5388112" data-type="exercise"><div id="fs-idm139653568" data-type="problem"><p id="fs-idm29484272">A research study was conducted about gender differences in “sexting.” The researcher believed that the proportion of girls involved in “sexting” is less than the proportion of boys involved. The data collected in the spring of 2010 among a random sample of middle and high school students in a large school district in the southern United States is summarized in <a class="autogenerated-content" href="#fs-idm29483888">(Figure)</a>. Is the proportion of girls sending sexts less than the proportion of boys “sexting?” Test at a 1% level of significance.</p> <table id="fs-idm29483888" summary=""><thead><tr><th></th> <th>Males</th> <th>Females</th> </tr> </thead> <tbody><tr><td>Sent “sexts”</td> <td>183</td> <td>156</td> </tr> <tr><td>Total number surveyed</td> <td>2231</td> <td>2169</td> </tr> </tbody> </table> </div> <div id="fs-idm119696944" data-type="solution"><p id="fs-idm140706224">This is a test of two population proportions. Let M and F be the subscripts for males and females. 
Then <em data-effect="italics">p<sub>M</sub></em> and <em data-effect="italics">p<sub>F</sub></em> are the desired population proportions.</p> <p id="fs-idp37876608"><span data-type="title">Random variable:</span> <em data-effect="italics">p′<sub>F</sub></em> − <em data-effect="italics">p′<sub>M</sub></em> = difference in the proportions of females and males who sent “sexts.”</p> <p id="fs-idp63610272"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>F</sub></em> = <em data-effect="italics">p<sub>M</sub></em> <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>F</sub></em> – <em data-effect="italics">p<sub>M</sub></em> = 0</p> <p id="fs-idp67543312"><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>F</sub></em> &lt; <em data-effect="italics">p<sub>M</sub></em> <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>F</sub></em> – <em data-effect="italics">p<sub>M</sub></em> &lt; 0</p> <p id="fs-idp120133328">The words <strong>&#8220;less than&#8221;</strong> tell you the test is left-tailed.</p> <p id="fs-idp27327440"><strong>Distribution for the test:</strong> Since this is a test of two population proportions, the distribution is normal:</p> <p id="fs-idm83512624">\({p}_{c}=\frac{{x}_{F}+{x}_{M}}{{n}_{F}+{n}_{M}}=\frac{156+183}{2169+2231}=0.077\)<span data-type="newline"><br /> </span>\(1-{p}_{c}=0.923\)<span data-type="newline"><br /> </span>Therefore, <span data-type="newline"><br /> </span>\({{p}^{\prime }}_{F}-{{p}^{\prime }}_{M}\sim N\left(0,\sqrt{\left(0.077\right)\left(0.923\right)\left(\frac{1}{2169}+\frac{1}{2231}\right)}\right)\)<span data-type="newline"><br /> </span><em data-effect="italics">p′<sub>F</sub></em> – <em data-effect="italics">p′<sub>M</sub></em> follows an approximately normal distribution.</p> <p id="fs-idm37724704"><strong>Calculate the <em data-effect="italics">p</em>-value using the normal distribution:</strong><span 
data-type="newline"><br /> </span><em data-effect="italics">p</em>-value = 0.1045 <span data-type="newline"><br /> </span>Estimated proportion for females: 0.0719 <span data-type="newline"><br /> </span>Estimated proportion for males: 0.082</p> <p id="fs-idm93495920"><span data-type="title">Graph:</span></p> <div id="fs-idp39056128" class="bc-figure figure"><span id="fs-idp49042304" data-type="media" data-alt="This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the left of zero extends from the axis to the curve. The region under the curve to the left of the line is shaded representing p-value = 0.1045." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C10_M04_001N-1.jpg" alt="This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the left of zero extends from the axis to the curve. The region under the curve to the left of the line is shaded representing p-value = 0.1045." width="380" data-media-type="image/jpg" /></span></div> <p id="fs-idm29799616"><strong>Decision:</strong> Since <em data-effect="italics">α</em> &lt; <em data-effect="italics">p</em>-value, Do not reject <em data-effect="italics">H<sub>0</sub></em></p> <p id="fs-idp51077664"><strong>Conclusion:</strong> At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that the proportion of girls sending “sexts” is less than the proportion of boys sending “sexts.”</p> <div id="fs-idm151141472" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm112689104">Press STAT. Arrow over to TESTS and press 6:2-PropZTest. Arrow down and enter 156 for x1, 2169 for n1, 183 for x2, and 2231 for n2. Arrow down to p1: and arrow to less than p2. Press <code>ENTER</code>. Arrow down to Calculate and press ENTER. 
The <em data-effect="italics">p</em>-value is <em data-effect="italics">P</em> = 0.1045 and the test statistic is <em data-effect="italics">z</em> = -1.256.</p> </div> </div> </div> </div> <div id="fs-idp9035712" class="textbox textbox--examples" data-type="example"><div id="fs-idm70907200" data-type="exercise"><div id="fs-idm90651728" data-type="problem"><p id="fs-idp52326208">Researchers conducted a study of smartphone use among adults. A cell phone company claimed that iPhone smartphones are more popular with whites (non-Hispanic) than with African Americans. The results of the survey indicate that of the 232 African American cell phone owners randomly sampled, 5% have an iPhone. Of the 1,343 white cell phone owners randomly sampled, 10% own an iPhone. Test at the 5% level of significance. Is the proportion of white iPhone owners greater than the proportion of African American iPhone owners?</p> </div> <div id="fs-idm244330848" data-type="solution"><p id="fs-idp4139616">This is a test of two population proportions. Let W and A be the subscripts for the whites and African Americans. 
Then <em data-effect="italics">p<sub>W</sub></em> and <em data-effect="italics">p<sub>A</sub></em> are the desired population proportions.</p> <p id="fs-idp85441184"><span data-type="title">Random variable:</span> <em data-effect="italics">p′<sub>W</sub></em> – <em data-effect="italics">p′<sub>A</sub></em> = difference in the proportions of white and African American cell phone owners who have an iPhone.</p> <p id="fs-idp129126752"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>W</sub></em> = <em data-effect="italics">p<sub>A</sub></em> or <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>W</sub></em> – <em data-effect="italics">p<sub>A</sub></em> = 0</p> <p id="fs-idm200846096"><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>W</sub></em> &gt; <em data-effect="italics">p<sub>A</sub></em> or <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>W</sub></em> – <em data-effect="italics">p<sub>A</sub></em> &gt; 0</p> <p id="fs-idm11108992">The words &#8220;more popular&#8221; indicate that the test is right-tailed.</p> <p id="fs-idp137504160">Distribution for the test: The distribution is approximately normal. The sample counts are 10% of 1,343, or about 134, white iPhone owners and 5% of 232, or about 12, African American iPhone owners:</p> <p id="fs-idm35602240">\({p}_{c}=\frac{{x}_{W}+{x}_{A}}{{n}_{W}+{n}_{A}}=\frac{134+12}{1343+232}=0.0927\)</p> <p id="fs-idp116231296">\(1-{p}_{c}=0.9073\)</p> <p id="fs-idp68984592">Therefore,</p> <p id="fs-idp87185952">\({{p}^{\prime }}_{W}-{{p}^{\prime }}_{A}\sim N\left(0,\sqrt{\left(0.0927\right)\left(0.9073\right)\left(\frac{1}{1343}+\frac{1}{232}\right)}\right)\)</p> <p id="fs-idp16865808">\({{p}^{\prime }}_{W}-{{p}^{\prime }}_{A}\) approximately follows a normal distribution.</p> <p id="fs-idp153346256"><span data-type="title">Calculate the <em data-effect="italics">p</em>-value using the normal distribution:</span><span data-type="newline"><br /> </span><em data-effect="italics">p</em>-value = 0.0099<span data-type="newline"><br /> </span> Estimated proportion for whites: 
0.10<span data-type="newline"><br /> </span> Estimated proportion for African Americans: 0.05</p> <p>&nbsp;</p> <p id="fs-idp26516352"><strong>Graph:</strong></p> <div id="id12575638" class="bc-figure figure"><span id="id12575642" data-type="media" data-alt="This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded, representing the p-value."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C10_M04_007anno2-1.jpg" alt="This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded, representing the p-value." width="380" data-media-type="image/jpg" /></span></div> <p id="fs-idm17894912"><strong>Decision:</strong> Since <em data-effect="italics">α</em> &gt; <em data-effect="italics">p</em>-value, reject <em data-effect="italics">H<sub>0</sub></em>.</p> <p id="fs-idp74257664"><strong>Conclusion:</strong> At the 5% level of significance, from the sample data, there is sufficient evidence to conclude that a larger proportion of white cell phone owners than of African American cell phone owners use iPhones.</p> <div id="fs-idp15799920" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm46927376">TI-83+ and TI-84: Press STAT. Arrow over to TESTS and press 6:2-PropZTest. Arrow down and enter 134 for x1, 1343 for n1, 12 for x2, and 232 for n2. Arrow down to p1: and arrow to greater than p2. Press ENTER. Arrow down to Calculate and press ENTER. 
The <em data-effect="italics">p</em>-value is <em data-effect="italics">P</em> = 0.0099 and the test statistic is <em data-effect="italics">z</em> = 2.33.</p> </div> </div> </div> </div> <div id="fs-idp142829376" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <p id="fs-idp32122976">A concerned group of citizens wanted to know if the proportion of forcible rapes in Texas was different in 2011 than in 2010. Their research showed that of the 113,231 violent crimes in Texas in 2010, 7,622 of them were forcible rapes. In 2011, 7,439 of the 104,873 violent crimes were in the forcible rape category. Test at a 5% significance level. Answer the following questions:<span data-type="newline"><br /> </span></p> <div id="fs-idm127205008" data-type="exercise"><div id="fs-idm286855008" data-type="problem"><p id="fs-idm65044288">a. Is this a test of two means or two proportions?</p> </div> </div> <div id="fs-idm45452624" data-type="exercise"><div id="fs-idm212581120" data-type="problem"><p id="fs-idm11093104">b. Which distribution do you use to perform the test?</p> </div> </div> <div id="fs-idm13955600" data-type="exercise"><div id="fs-idm176424272" data-type="problem"><p id="fs-idp13677968">c. What is the random variable?</p> </div> </div> <div id="fs-idm6924128" data-type="exercise"><div id="fs-idm152098288" data-type="problem"><p id="fs-idp5734352">d. What are the null and alternative hypotheses? Write the null and alternative hypotheses in symbols.</p> </div> </div> <div id="fs-idm131708672" data-type="exercise"><div id="fs-idm129757040" data-type="problem"><p id="fs-idm32329328">e. Is this test right-, left-, or two-tailed?</p> </div> </div> <div id="fs-idm51666240" data-type="exercise"><div id="fs-idm109483616" data-type="problem"><p id="fs-idm36523264">f. What is the <em data-effect="italics">p</em>-value?</p> </div> </div> <div id="fs-idm58266784" data-type="exercise"><div id="fs-idm17415680" data-type="problem"><p id="fs-idm11091776">g. 
Do you reject or not reject the null hypothesis?</p> </div> </div> <div id="fs-idm46567952" data-type="exercise"><div id="fs-idm149856720" data-type="problem"><p id="fs-idm37627040">h. At the ___ level of significance, from the sample data, there ______ (is/is not) sufficient evidence to conclude that ____________.</p> </div> </div> </div> <div class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idp111696960">Data from <em data-effect="italics">Educational Resources</em>, December catalog.</p> <p id="fs-idp8154992">Data from Hilton Hotels. Available online at http://www.hilton.com (accessed June 17, 2013).</p> <p id="fs-idm87912912">Data from Hyatt Hotels. Available online at http://hyatt.com (accessed June 17, 2013).</p> <p id="fs-idm35651936">Data from Statistics, United States Department of Health and Human Services.</p> <p id="fs-idm37685328">Data from Whitney Exhibit on loan to San Jose Museum of Art.</p> <p id="fs-idm33903216">Data from the American Cancer Society. Available online at http://www.cancer.org/index (accessed June 17, 2013).</p> <p id="fs-idp189890192">Data from the Chancellor’s Office, California Community Colleges, November 1994.</p> <p id="fs-idp23470640">“State of the States.” Gallup, 2013. Available online at http://www.gallup.com/poll/125066/State-States.aspx?ref=interactive (accessed June 17, 2013).</p> <p id="fs-idm99404960">“West Nile Virus.” Centers for Disease Control and Prevention. 
Available online at http://www.cdc.gov/ncidod/dvbid/westnile/index.htm (accessed June 17, 2013).</p> </div> <div id="fs-idm55267024" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp68961936">Test of two population proportions from independent samples.</p> <ul id="fs-idp68962192"><li>Random variable: \({{p}^{\prime }}_{A}-{{p}^{\prime }}_{B}=\) difference between the two estimated proportions</li> <li>Distribution: normal distribution</li> </ul> </div> <div id="fs-idp63662736" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idm84295280">Pooled Proportion: <em data-effect="italics">p<sub>c</sub></em> = \(\frac{{x}_{A}+{x}_{B}}{{n}_{A}+{n}_{B}}\)</p> <p id="fs-idm34404208">Distribution for the differences: <span data-type="newline"><br /> </span>\({{p}^{\prime }}_{A}-{{p}^{\prime }}_{B}\sim N\left[0,\sqrt{{p}_{c}\left(1-{p}_{c}\right)\left(\frac{1}{{n}_{A}}+\frac{1}{{n}_{B}}\right)}\right]\)</p> <p id="fs-idm64589136">where the null hypothesis is <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>A</sub></em> = <em data-effect="italics">p<sub>B</sub></em> or <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>A</sub></em> – <em data-effect="italics">p<sub>B</sub></em> = 0.</p> <p id="fs-idp35988640">Test Statistic (<em data-effect="italics">z</em>-score): \(z=\frac{\left({p}^{\prime }{}_{A}-{p}^{\prime }{}_{B}\right)}{\sqrt{{p}_{c}\left(1-{p}_{c}\right)\left(\frac{1}{{n}_{A}}+\frac{1}{{n}_{B}}\right)}}\)</p> <p id="fs-idp60078944">where the null hypothesis is <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>A</sub></em> = <em data-effect="italics">p<sub>B</sub></em> or <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>A</sub></em> − <em data-effect="italics">p<sub>B</sub></em> = 0.</p> <p id="fs-idm12253536">where</p> <p id="fs-idm12253152"><em 
data-effect="italics">p′<sub>A</sub></em> and <em data-effect="italics">p′<sub>B</sub></em> are the sample proportions, <em data-effect="italics">p<sub>A</sub></em> and <em data-effect="italics">p<sub>B</sub></em> are the population proportions,</p> <p id="fs-idp68599952"><em data-effect="italics">p<sub>c</sub></em> is the pooled proportion, and <strong><em data-effect="italics">n<sub>A</sub></em></strong> and <strong><em data-effect="italics">n<sub>B</sub></em></strong> are the sample sizes.</p> </div> <div id="fs-idp47236640" class="practice" data-depth="1"><p><em data-effect="italics">Use the following information for the next five exercises.</em> Two types of phone operating systems are being tested to determine if there is a difference in the proportions of system failures (crashes). Fifteen out of a random sample of 150 phones with OS<sub>1</sub> had system failures within the first eight hours of operation. Nine out of another random sample of 150 phones with OS<sub>2</sub> had system failures within the first eight hours of operation. 
OS<sub>2</sub> is believed to be more stable (have fewer crashes) than OS<sub>1</sub>.</p> <div data-type="exercise"><div id="eip-850" data-type="problem"><p>Is this a test of means or proportions?</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-678">What is the random variable?</p> </div> <div data-type="solution"><p><em data-effect="italics">P</em>′<sub>OS1</sub> – <em data-effect="italics">P</em>′<sub>OS2</sub> = difference in the proportions of phones that had system failures within the first eight hours of operation with OS<sub>1</sub> and OS<sub>2</sub>.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p id="eip-32">State the null and alternative hypotheses.</p> </div> </div> <div id="eip-56" data-type="exercise"><div data-type="problem"><p>What is the <em data-effect="italics">p</em>-value?</p> </div> <div id="eip-599" data-type="solution"><p>0.1018</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>What can you conclude about the two operating systems?</p> </div> </div> <p><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next twelve exercises.</em> In the recent Census, three percent of the U.S. population reported being of two or more races. However, the percent varies tremendously from state to state. Suppose that two random surveys are conducted. In the first random survey, out of 1,000 North Dakotans, only nine people reported being of two or more races. In the second random survey, out of 500 Nevadans, 17 people reported being of two or more races. 
Conduct a hypothesis test to determine if the population percents are the same for the two states or if the percent for Nevada is statistically higher than for North Dakota.</p> <div data-type="exercise"><div id="id11329019" data-type="problem"><p>Is this a test of means or proportions?</p> </div> <div id="id20348129" data-type="solution"><p>proportions</p> </div> </div> <div data-type="exercise"><div id="id20756414" data-type="problem"><p>State the null and alternative hypotheses.</p> <ol id="list-978623794" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: _________</li> <li><em data-effect="italics">H<sub>a</sub></em>: _________</li> </ol> </div> </div> <div data-type="exercise"><div id="id22206121" data-type="problem"><p>Is this a right-tailed, left-tailed, or two-tailed test? How do you know?</p> </div> <div id="id18377639" data-type="solution"><p>right-tailed</p> </div> </div> <div data-type="exercise"><div id="id22505556" data-type="problem"><p>What is the random variable of interest for this test?</p> </div> </div> <div data-type="exercise"><div id="id18170891" data-type="problem"><p>In words, define the random variable for this test.</p> </div> <div id="eip-idm88064848" data-type="solution"><p id="eip-idm88064592">The random variable is the difference in proportions (percents) of the populations that are of two or more races in Nevada and North Dakota.</p> </div> </div> <div data-type="exercise"><div id="id16485244" data-type="problem"><p>Which distribution (normal or Student&#8217;s <em data-effect="italics">t</em>) would you use for this hypothesis test?</p> </div> </div> <div id="element-464" data-type="exercise"><div id="id13843081" data-type="problem"><p>Explain why you chose the distribution you did for <a href="#element-7">Exercise 10.56</a>.</p> </div> <div id="eip-idp31729824" data-type="solution"><p id="eip-idm107273104">Each sample contains at least five successes and at least five failures, so we use the normal distribution for two proportions for this 
hypothesis test.</p> </div> </div> <div data-type="exercise"><div id="id17798140" data-type="problem"><p id="fs-idm20738192">Calculate the test statistic.</p> </div> </div> <div data-type="exercise"><div id="id20752849" data-type="problem"><p>Sketch a graph of the situation. Mark the hypothesized difference and the sample difference. Shade the area corresponding to the <em data-effect="italics">p</em>-value.</p> <div id="id12575638a" class="bc-figure figure"><span id="id12575642a" data-type="media" data-alt="This is a horizontal axis with arrows at each end. The axis is labeled p'N - p'ND"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch10_07_01-1.jpg" alt="This is a horizontal axis with arrows at each end. The axis is labeled p'N - p'ND" width="380" data-media-type="image/jpg" /></span></div> </div> <div id="fs-idm55153680" data-type="solution"><p id="fs-idm55153552">Check student’s solution.</p> </div> </div> <div data-type="exercise"><div id="id17163768" data-type="problem"><p>Find the <em data-effect="italics">p</em>-value.</p> </div> </div> <div data-type="exercise"><div id="id11271894" data-type="problem"><p id="fs-idm34769936">At a pre-conceived <em data-effect="italics">α</em> = 0.05, what is your:</p> <ol id="list-2353627788" type="a"><li>Decision:</li> <li>Reason for the decision:</li> <li>Conclusion (write out in a complete sentence):</li> </ol> </div> <div id="id18024766" data-type="solution"><ol id="fs-idp68589888" type="a"><li>Reject the null hypothesis.</li> <li><em data-effect="italics">p</em>-value &lt; alpha</li> <li>At the 5% significance level, there is sufficient evidence to conclude that the proportion (percent) of the population that is of two or more races in Nevada is statistically higher than that in North Dakota.</li> </ol> </div> </div> <div id="exerciselkj" data-type="exercise"><div id="id17796096" data-type="problem"><p>Does it appear that the proportion of Nevadans who are two or 
more races is higher than the proportion of North Dakotans? Why or why not?</p> </div> </div> </div> <div id="fs-idp49279360" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <p id="fs-idm14172688"><em data-effect="italics">DIRECTIONS: For each of the word problems, use a solution sheet to do the hypothesis test. The solution sheet is found in <a class="autogenerated-content" href="/contents/c0449c55-aa47-4f1c-bd5f-0521652f0e82">(Figure)</a>. Please feel free to make copies of the solution sheets. For the online version of the book, it is suggested that you copy the .doc or the .pdf files.</em></p> <div id="id17215905" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idm44775184">If you are using a Student&#8217;s <em data-effect="italics">t</em>-distribution for one of the following homework problems, including for paired data, you may assume that the underlying population is normally distributed. (In general, however, you must first verify that assumption.)</p> </div> <div data-type="exercise"><div id="id9504371" data-type="problem"><p id="id11960146"></p></div> </div> <div data-type="exercise"><div id="id4902537" data-type="problem"><p id="id12924042">1) We are interested in whether the proportions of female suicide victims for ages 15 to 24 are the same for the white and black races in the United States. We randomly pick one year, 1992, to compare the races. The number of suicides estimated in the United States in 1992 for white females is 4,930. Five hundred eighty were aged 15 to 24. The estimate for black females is 330. Forty were aged 15 to 24. 
We will let female suicide victims be our population.</p> </div> <div id="fs-idm32452768" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id3785760" data-type="problem"><p id="id12924247">2) Elizabeth Mjelde, an art history professor, was interested in whether the value from the Golden Ratio formula, \(\left(\frac{\text{larger + smaller dimension}}{\text{larger dimension}}\right)\), was the same in the Whitney Exhibit for works from 1900 to 1919 as for works from 1920 to 1942. Thirty-seven early works were sampled, averaging 1.74 with a standard deviation of 0.11. Sixty-five of the later works were sampled, averaging 1.746 with a standard deviation of 0.1064. Do you think that there is a significant difference in the Golden Ratio calculation?</p> <p>&nbsp;</p> </div> </div> <div id="exer24" data-type="exercise"><div id="id19336074" data-type="problem"><p id="id12116088">3) A recent year was randomly picked from 1985 to the present. In that year, there were 2,051 Hispanic students at Cabrillo College out of a total of 12,328 students. At Lake Tahoe College, there were 321 Hispanic students out of a total of 2,441 students. In general, do you think that the percent of Hispanic students at the two colleges is basically the same or different?</p> </div> <div id="fs-idm6549984" data-type="solution"><p>&nbsp;</p> </div> </div> <p><em data-effect="italics">Use the following information to answer the next three exercises.</em> Neuroinvasive West Nile virus is a severe disease that affects a person’s nervous system. It is spread by the Culex species of mosquito. In the United States in 2010, there were 629 reported cases of neuroinvasive West Nile virus out of a total of 1,021 reported cases; in 2011, there were 486 reported neuroinvasive cases out of a total of 712 reported cases. Is the 2011 proportion of neuroinvasive West Nile virus cases greater than the 2010 proportion of neuroinvasive West Nile virus cases? 
Using a 1% level of significance, conduct an appropriate hypothesis test.</p> <ul id="list2342352"><li>“2011” subscript: 2011 group.</li> <li>“2010” subscript: 2010 group</li> </ul> <div data-type="exercise"><div id="id4870569" data-type="problem"><p>4) This is:</p> <ol type="a"><li>a test of two proportions</li> <li>a test of two independent means</li> <li>a test of a single mean</li> <li>a test of matched pairs.</li> </ol> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id12760036" data-type="problem"><p>5) An appropriate null hypothesis is:</p> <ol type="a"><li><em data-effect="italics">p<sub>2011</sub></em> ≤ <em data-effect="italics">p<sub>2010</sub></em></li> <li><em data-effect="italics">p<sub>2011</sub></em> ≥ <em data-effect="italics">p<sub>2010</sub></em></li> <li><em data-effect="italics">μ<sub>2011</sub></em> ≤ <em data-effect="italics">μ<sub>2010</sub></em></li> <li><em data-effect="italics">p<sub>2011</sub></em> &gt; <em data-effect="italics">p<sub>2010</sub></em></li> </ol> </div> <div id="id9070418" data-type="solution"><p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id12134941" data-type="problem"><p>6) The <em data-effect="italics">p</em>-value is 0.0022. 
At a 1% level of significance, the appropriate conclusion is</p> <ol type="a"><li>There is sufficient evidence to conclude that the proportion of people in the United States in 2011 who contracted neuroinvasive West Nile disease is less than the proportion of people in the United States in 2010 who contracted neuroinvasive West Nile disease.</li> <li>There is insufficient evidence to conclude that the proportion of people in the United States in 2011 who contracted neuroinvasive West Nile disease is more than the proportion of people in the United States in 2010 who contracted neuroinvasive West Nile disease.</li> <li>There is insufficient evidence to conclude that the proportion of people in the United States in 2011 who contracted neuroinvasive West Nile disease is less than the proportion of people in the United States in 2010 who contracted neuroinvasive West Nile disease.</li> <li>There is sufficient evidence to conclude that the proportion of people in the United States in 2011 who contracted neuroinvasive West Nile disease is more than the proportion of people in the United States in 2010 who contracted neuroinvasive West Nile disease.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="fs-idp30249712" data-type="exercise"><div id="fs-idp30249968" data-type="problem"><p id="fs-idm5258112">7) Researchers conducted a study to find out if there is a difference in the use of eReaders by different age groups. Randomly selected participants were divided into two age groups. In the 16- to 29-year-old group, 7% of the 628 surveyed use eReaders, while 11% of the 2,309 participants 30 years old and older use eReaders.</p> </div> <div id="fs-idm101357040" data-type="solution"><p id="fs-idm37090256"></p></div> </div> <div id="fs-idm48453008" data-type="exercise"><div id="fs-idm48452752" data-type="problem"><p id="fs-idm48452496">8) Adults aged 18 years old and older were randomly selected for a survey on obesity. 
Adults are considered obese if their body mass index (BMI) is at least 30. The researchers wanted to determine if the proportion of women who are obese in the south is less than the proportion of southern men who are obese. The results are shown in <a class="autogenerated-content" href="#fs-idp62517488">(Figure)</a>. Test at the 1% level of significance.</p> <table id="fs-idp62517488" summary=""><thead><tr><th></th> <th>Number who are obese</th> <th>Sample size</th> </tr> </thead> <tbody><tr><td>Men</td> <td>42,769</td> <td>155,525</td> </tr> <tr><td>Women</td> <td>67,169</td> <td>248,775</td> </tr> </tbody> </table> <p>&nbsp;</p> </div> </div> <div id="fs-idm16853120" data-type="exercise"><div id="fs-idm16852864" data-type="problem"><p id="fs-idm16852608">9) Two computer users were discussing tablet computers. A higher proportion of people ages 16 to 29 use tablets than the proportion of people age 30 and older. <a class="autogenerated-content" href="#fs-idp42650880">(Figure)</a> details the number of tablet owners for each age group. Test at the 1% level of significance.</p> <table id="fs-idp42650880" summary=""><thead><tr><th></th> <th>16–29 year olds</th> <th>30 years old and older</th> </tr> </thead> <tbody><tr><td>Own a Tablet</td> <td>69</td> <td>231</td> </tr> <tr><td>Sample Size</td> <td>628</td> <td>2,309</td> </tr> </tbody> </table> </div> <div id="fs-idp35035792" data-type="solution"><p id="fs-idp19014416"></p></div> </div> <div id="fs-idp44091424" data-type="exercise"><div id="fs-idp44091680" data-type="problem"><p id="fs-idp44091936">10) A group of friends debated whether more men use smartphones than women. They consulted a research study of smartphone use among adults. The results of the survey indicate that of the 973 men randomly sampled, 379 use smartphones. For women, 404 of the 1,304 who were randomly sampled use smartphones. 
Test at the 5% level of significance.</p> <p>&nbsp;</p> </div> </div> <div id="fs-idm174644848" data-type="exercise"><div id="fs-idm174644592" data-type="problem"><p id="fs-idm174644336">11) While her husband spent 2½ hours picking out new speakers, a statistician decided to determine whether the percent of men who enjoy shopping for electronic equipment is higher than the percent of women who enjoy shopping for electronic equipment. The population was Saturday afternoon shoppers. Out of 67 men, 24 said they enjoyed the activity. Eight of the 24 women surveyed claimed to enjoy the activity. Interpret the results of the survey.</p> </div> <div id="fs-idm111468704" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="fs-idm111076192" data-type="exercise"><div id="fs-idm111075936" data-type="problem"><p id="fs-idm11241056">12) We are interested in whether children’s educational computer software costs less, on average, than children’s entertainment software. Thirty-six educational software titles were randomly picked from a catalog. The mean cost was \$31.14 with a standard deviation of \$4.69. Thirty-five entertainment software titles were randomly picked from the same catalog. The mean cost was \$33.86 with a standard deviation of \$10.87. Decide whether children’s educational software costs less, on average, than children’s entertainment software.</p> </div> </div> <div id="fs-idm98770112" data-type="exercise"><div id="fs-idm10004272" data-type="problem"><p id="fs-idm10004016">13) Joan Nguyen recently claimed that the proportion of college-age males with at least one pierced ear is as high as the proportion of college-age females. She conducted a survey in her classes. Out of 107 males, 20 had at least one pierced ear. Out of 92 females, 47 had at least one pierced ear. 
Do you believe that the proportion of males has reached the proportion of females?</p> </div> <div id="fs-idm75362416" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="fs-idm103490000" data-type="exercise"><div id="fs-idm103489744" data-type="problem"><p id="fs-idm103489488">14) Use the data sets found in <a class="autogenerated-content" href="/contents/3ef830bc-5247-460a-9007-e3fd762e5e93">(Figure)</a> to answer this exercise. Is the proportion of race laps Terri completes slower than 130 seconds less than the proportion of practice laps she completes slower than 135 seconds?</p> </div> </div> <div id="fs-idm104246192" data-type="exercise"><div id="fs-idp505328" data-type="problem"></div> <div id="eip-idp83945904" data-type="solution"><p><strong>Answers to odd questions</strong></p> <p>1)</p> <ol id="fs-idm32452512" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">P<sub>W</sub></em> = <em data-effect="italics">P<sub>B</sub></em></li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">P<sub>W</sub></em> ≠ <em data-effect="italics">P<sub>B</sub></em></li> <li>The random variable is the difference in the proportions of white and black suicide victims, aged 15 to 24.</li> <li>normal for two proportions</li> <li>test statistic: –0.1944</li> <li><em data-effect="italics">p</em>-value: 0.8458</li> <li>Check student’s solution. 
<ol id="fs-idp6439728" type="i"><li>Alpha: 0.05</li> <li>Decision: Do not reject the null hypothesis.</li> <li>Reason for decision: <em data-effect="italics">p</em>-value &gt; alpha</li> <li>Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the proportions of white and black female suicide victims, aged 15 to 24, are different.</li> </ol> </li> </ol> <p>3)</p> <p id="fs-idm23261552">Subscripts: 1 = Cabrillo College, 2 = Lake Tahoe College</p> <ol id="fs-idp93991872" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>1</sub></em> = <em data-effect="italics">p<sub>2</sub></em></li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>1</sub></em> ≠ <em data-effect="italics">p<sub>2</sub></em></li> <li>The random variable is the difference between the proportions of Hispanic students at Cabrillo College and Lake Tahoe College.</li> <li>normal for two proportions</li> <li>test statistic: 4.29</li> <li><em data-effect="italics">p</em>-value: 0.00002</li> <li>Check student’s solution. 
<ol id="fs-idm32250112" type="i"><li>Alpha: 0.05</li> <li>Decision: Reject the null hypothesis.</li> <li>Reason for decision: <em data-effect="italics">p</em>-value &lt; alpha</li> <li>Conclusion: There is sufficient evidence to conclude that the proportions of Hispanic students at Cabrillo College and Lake Tahoe College are different.</li> </ol> </li> </ol> <p>5) a</p> <p>7)</p> <p id="fs-idm101356784">Test: two independent sample proportions.</p> <p id="fs-idm101356400">Random variable: <em data-effect="italics">p</em>′<sub>1</sub> &#8211; <em data-effect="italics">p</em>′<sub>2</sub></p> <p id="fs-idm29110192">Distribution: <span data-type="newline"><br /> </span><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>1</sub></em> = <em data-effect="italics">p<sub>2</sub></em> <span data-type="newline"><br /> </span><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>1</sub></em> ≠ <em data-effect="italics">p<sub>2</sub></em></p> <p id="fs-idp1703488">The proportion of eReader users is different for the 16- to 29-year-old users from that of the 30 and older users.</p> <p id="fs-idp1703872">Graph: two-tailed</p> <div id="fs-idm17839568" class="bc-figure figure"><span id="fs-idp73177440" data-type="media" data-alt="This is a normal distribution curve with mean equal to zero. Both the right and left tails of the curve are shaded. Each tail represents 1/2(p-value) = 0.0017." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C10_M04_004annoN-1.jpg" alt="This is a normal distribution curve with mean equal to zero. Both the right and left tails of the curve are shaded. Each tail represents 1/2(p-value) = 0.0017." 
width="380" data-media-type="image/jpg" /></span></div> <p id="fs-idp1704256"><em data-effect="italics">p</em>-value : 0.0033</p> <p id="fs-idm37090640">Decision: Reject the null hypothesis.</p> <p>Conclusion: At the 5% level of significance, from the sample data, there is sufficient evidence to conclude that the proportion of eReader users 16 to 29 years old is different from the proportion of eReader users 30 and older.</p> <p>9)</p> <p id="fs-idp35036048">Test: two independent sample proportions</p> <p id="fs-idp35036432">Random variable: <em data-effect="italics">p′<sub>1</sub></em> − <em data-effect="italics">p′<sub>2</sub></em></p> <p id="fs-idm9404992">Distribution:</p> <p id="fs-idp46050528"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>1</sub></em> = <em data-effect="italics">p<sub>2</sub></em><span data-type="newline"><br /> </span><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>1</sub></em> &gt; <em data-effect="italics">p<sub>2</sub></em></p> <p id="fs-idm39674464">A higher proportion of tablet owners are aged 16 to 29 years old than are 30 years old and older.</p> <p id="fs-idm39673968">Graph: right-tailed</p> <div id="fs-idp128478512" class="bc-figure figure"><span id="fs-idp24706912" data-type="media" data-alt="This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded representing p-value = 0.2354." data-display="block"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C10_M04_006annoN-1.jpg" alt="This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded representing p-value = 0.2354." 
width="380" data-media-type="image/jpg" /></span></div> <p><em data-effect="italics">p</em>-value: 0.2354</p> <p id="fs-idm11093536">Decision: Do not reject the <em data-effect="italics">H<sub>0</sub></em>.</p> <p>Conclusion: At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that a higher proportion of tablet owners are aged 16 to 29 years old than are 30 years old and older.</p> </div> <p>11)</p> <p id="fs-idp35259328">Subscripts: 1: men; 2: women</p> <ol id="fs-idp66491504" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>1</sub></em> ≤ <em data-effect="italics">p<sub>2</sub></em></li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>1</sub></em> &gt; <em data-effect="italics">p<sub>2</sub></em></li> <li>\({{P}^{\prime }}_{1}-{{P}^{\prime }}_{2}\) is the difference between the proportions of men and women who enjoy shopping for electronic equipment.</li> <li>normal for two proportions</li> <li>test statistic: 0.22</li> <li><em data-effect="italics">p</em>-value: 0.4133</li> <li>Check student’s solution. 
<ol id="fs-idp134826688" type="i"><li>Alpha: 0.05</li> <li>Decision: Do not reject the null hypothesis.</li> <li>Reason for Decision: <em data-effect="italics">p</em>-value &gt; alpha</li> <li>Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the proportion of men who enjoy shopping for electronic equipment is greater than the proportion of women.</li> </ol> </li> </ol> <p>13)</p> <ol id="fs-idm11471360" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>1</sub></em> = <em data-effect="italics">p<sub>2</sub></em></li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>1</sub></em> ≠ <em data-effect="italics">p<sub>2</sub></em></li> <li>\({{P}^{\prime }}_{1}-{{P}^{\prime }}_{2}\) is the difference between the proportions of men and women who have at least one pierced ear.</li> <li>normal for two proportions</li> <li>test statistic: –4.82</li> <li><em data-effect="italics">p</em>-value: approximately zero</li> <li>Check student’s solution. <ol id="fs-idp115516720" type="i"><li>Alpha: 0.05</li> <li>Decision: Reject the null hypothesis.</li> <li>Reason for Decision: <em data-effect="italics">p</em>-value &lt; alpha</li> <li>Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the proportions of males and females with at least one pierced ear are different.</li> </ol> </li> </ol> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl id="fs-idm110633744"><dt>Pooled Proportion</dt> <dd id="fs-idm110633104">estimate of the common value of <em data-effect="italics">p<sub>1</sub></em> and <em data-effect="italics">p<sub>2</sub></em>.</dd> </dl> </div> </div></div>
<div class="chapter standard" id="chapter-matched-or-paired-samples" title="Chapter 11.3: Matched or Paired Samples"><div class="chapter-title-wrap"><h3 class="chapter-number">64</h3><h2 class="chapter-title"><span class="display-none">Chapter 11.3: Matched or Paired Samples</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <p id="fs-idp164310080">When using a hypothesis test for matched or paired samples, the following characteristics should be present:</p> <ol id="element-139"><li>Simple random sampling is used.</li> <li>Sample sizes are often small.</li> <li>Two measurements (samples) are drawn from the same pair of individuals or objects.</li> <li>Differences are calculated from the matched or paired samples.</li> <li>The differences form the sample that is used for the hypothesis test.</li> <li>Either the matched pairs have differences that come from a population that is normal or the number of differences is sufficiently large so that the distribution of the sample mean of differences is approximately normal.</li> </ol> <p>In a hypothesis test for matched or paired samples, subjects are matched in pairs and differences are calculated. The differences are the data. The population mean for the differences, <em data-effect="italics">μ<sub>d</sub></em>, is then tested using a Student&#8217;s <em data-effect="italics">t</em>-test for a single population mean with <em data-effect="italics">n</em> – 1 degrees of freedom, where <em data-effect="italics">n</em> is the number of differences.</p> <div data-type="equation"><div data-type="title">The test statistic (<em data-effect="italics">t</em>-score) is:</div> <p>\(t=\frac{{\overline{x}}_{d}-{\mu }_{d}}{\left(\frac{{s}_{d}}{\sqrt{n}}\right)}\)</p> </div> <div class="textbox textbox--examples" data-type="example"><div data-type="exercise"><div id="id24601264" data-type="problem"><p><strong>A study was conducted to investigate the effectiveness of hypnotism in reducing pain. 
Results for randomly selected subjects are shown in <a class="autogenerated-content" href="#table-2345">(Figure)</a>. A lower score indicates less pain. The &#8220;before&#8221; value is matched to an &#8220;after&#8221; value and the differences are calculated. The differences have a normal distribution. Are the sensory measurements, on average, lower after hypnotism? Test at a 5% significance level.</strong></p> <table id="table-2345" summary="This table presents results of subjects and effect of hypnotism in reducing pain. The second through ninth column are all subjects. The first row is for before and the second row is for after."><thead><tr><th>Subject:</th> <th>A</th> <th>B</th> <th>C</th> <th>D</th> <th>E</th> <th>F</th> <th>G</th> <th>H</th> </tr> </thead> <tbody><tr><td>Before</td> <td>6.6</td> <td>6.5</td> <td>9.0</td> <td>10.3</td> <td>11.3</td> <td>8.1</td> <td>6.3</td> <td>11.6</td> </tr> <tr><td>After</td> <td>6.8</td> <td>2.4</td> <td>7.4</td> <td>8.5</td> <td>8.1</td> <td>6.1</td> <td>3.4</td> <td>2.0</td> </tr> </tbody> </table> </div> <div id="id24601284" data-type="solution"><p id="element-215">Corresponding &#8220;before&#8221; and &#8220;after&#8221; values form matched pairs. 
(Calculate &#8220;after&#8221; – &#8220;before.&#8221;)</p> <table id="table-25832" summary="This table shows the after data in the first column, before data in the second column, and the difference in the third column."><thead><tr><th>After Data</th> <th>Before Data</th> <th>Difference</th> </tr> </thead> <tbody><tr><td>6.8</td> <td>6.6</td> <td>0.2</td> </tr> <tr><td>2.4</td> <td>6.5</td> <td>-4.1</td> </tr> <tr><td>7.4</td> <td>9.0</td> <td>-1.6</td> </tr> <tr><td>8.5</td> <td>10.3</td> <td>-1.8</td> </tr> <tr><td>8.1</td> <td>11.3</td> <td>-3.2</td> </tr> <tr><td>6.1</td> <td>8.1</td> <td>-2.0</td> </tr> <tr><td>3.4</td> <td>6.3</td> <td>-2.9</td> </tr> <tr><td>2.0</td> <td>11.6</td> <td>-9.6</td> </tr> </tbody> </table> <p>The data <strong>for the test</strong> are the differences: {0.2, –4.1, –1.6, –1.8, –3.2, –2, –2.9, –9.6}</p> <p>The sample mean and sample standard deviation of the differences are: \({\overline{x}}_{d}=-3.13\) and \({s}_{d}=2.91\). Verify these values.</p> <p>Let \({\mu }_{d}\) be the population mean for the differences. We use the subscript \(d\) to denote &#8220;differences.&#8221;</p> <p id="element-354"><strong>Random variable:</strong> \({\overline{X}}_{d}\) = the mean difference of the sensory measurements</p> <p id="fs-idp39645840"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ<sub>d</sub></em> ≥ 0</p> <p>The null hypothesis states that the mean difference is zero or positive, meaning that the same or more pain is felt after hypnotism. That is, the subjects show no improvement. (<em data-effect="italics">μ<sub>d</sub></em> is the population mean of the differences.)</p> <p id="fs-idp150170576"><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ<sub>d</sub></em> &lt; 0</p> <p>The alternative hypothesis states that the mean difference is negative, meaning that less pain is felt after hypnotism. That is, the subjects show improvement. 
The score should be lower after hypnotism, so the difference ought to be negative to indicate improvement.</p> <p><strong>Distribution for the test:</strong> The distribution is a Student&#8217;s <em data-effect="italics">t</em>-distribution with <em data-effect="italics">df</em> = <em data-effect="italics">n</em> – 1 = 8 – 1 = 7. Use <em data-effect="italics">t</em><sub>7</sub>. <strong>(Notice that the test is for a single population mean.)</strong></p> <p><strong>Calculate the <em data-effect="italics">p</em>-value using the Student&#8217;s <em data-effect="italics">t</em>-distribution:</strong> <em data-effect="italics">p</em>-value = 0.0095</p> <p id="element-562"><strong>Graph:</strong></p> <div id="hyptest22_samp1" class="bc-figure figure"><span id="id11476386" data-type="media" data-alt="Normal distribution curve of the average difference of sensory measurements with values of -3.13 and 0. A vertical upward line extends from -3.13 to the curve, and the p-value is indicated in the area to the left of this value."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch10_05_01-1.jpg" alt="Normal distribution curve of the average difference of sensory measurements with values of -3.13 and 0. A vertical upward line extends from -3.13 to the curve, and the p-value is indicated in the area to the left of this value." width="380" data-media-type="image/jpg" /></span></div> <p>\({\overline{X}}_{d}\) is the random variable for the differences.</p> <p>The sample mean and sample standard deviation of the differences are:</p> <p>\({\overline{x}}_{d}\) = –3.13</p> <p>\({s}_{d}\) = 2.91</p> <p><strong>Compare <em data-effect="italics">α</em> and the <em data-effect="italics">p</em>-value:</strong> <em data-effect="italics">α</em> = 0.05 and <em data-effect="italics">p</em>-value = 0.0095. 
<em data-effect="italics">α</em> &gt; <em data-effect="italics">p</em>-value.</p> <p><strong>Make a decision:</strong> Since <em data-effect="italics">α</em> &gt; <em data-effect="italics">p</em>-value, reject <em data-effect="italics">H<sub>0</sub></em>. This means that <em data-effect="italics">μ<sub>d</sub></em> &lt; 0 and there is improvement.</p> <p><strong>Conclusion:</strong> At a 5% level of significance, from the sample data, there is sufficient evidence to conclude that the sensory measurements, on average, are lower after hypnotism. Hypnotism appears to be effective in reducing pain.</p> </div> </div> <div id="id11476665" class="finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idp48132064">For the TI-83+ and TI-84 calculators, you can either calculate the differences ahead of time (<strong>after &#8211; before</strong>) and put the differences into a list or you can put the <strong>after</strong> data into a first list and the <strong>before</strong> data into a second list. Then go to a third list and arrow up to the name. Enter 1<sup>st</sup> list name &#8211; 2<sup>nd</sup> list name. The calculator will do the subtraction, and you will have the differences in the third list.</p> </div> <div id="id11476702" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idp28416112">Use your list of differences as the data. Press <code>STAT</code> and arrow over to <code>TESTS</code>. Press <code>2:T-Test</code>. Arrow over to <code>Data</code> and press <code>ENTER</code>. Arrow down and enter <code>0</code> for \({\mu }_{0}\), the name of the list where you put the data, and <code>1</code> for Freq:. Arrow down to <code>μ</code>: and arrow over to <code>&lt;</code> \({\mu }_{0}\). Press <code>ENTER</code>. Arrow down to <code>Calculate</code> and press <code>ENTER</code>. The <em data-effect="italics">p</em>-value is 0.0094, and the test statistic is -3.04. 
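</p> <p>As a cross-check on the calculator output, the same left-tailed paired test can be sketched in Python. This is a minimal sketch using only the standard library and is not part of the original text: the helper names <code>t_pdf</code> and <code>t_cdf</code> are hypothetical, and the <em data-effect="italics">p</em>-value is approximated by integrating the Student&#8217;s <em data-effect="italics">t</em> density numerically with Simpson&#8217;s rule rather than read from a table or a statistics package.</p>

```python
import math
import statistics


def t_pdf(x, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """Approximate P(T is at most t) with Simpson's rule (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2      # Simpson weights: 4 on odd nodes, 2 on even
        total += weight * t_pdf(i * h, df)
    area = total * h / 3                # area under the density between 0 and |t|
    return 0.5 + math.copysign(area, t)


# Data from the hypnotism example (pain scores for subjects A through H).
before = [6.6, 6.5, 9.0, 10.3, 11.3, 8.1, 6.3, 11.6]
after = [6.8, 2.4, 7.4, 8.5, 8.1, 6.1, 3.4, 2.0]

# The differences (after minus before) are the data for the test.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
mean_d = statistics.mean(diffs)
s_d = statistics.stdev(diffs)           # sample standard deviation (n - 1 divisor)

# Test statistic with a hypothesized mean difference of 0.
t_stat = (mean_d - 0) / (s_d / math.sqrt(n))

# Left-tailed test: the p-value is the area to the left of t_stat.
p_value = t_cdf(t_stat, n - 1)

print(f"mean = {mean_d:.3f}, s_d = {s_d:.2f}, t = {t_stat:.2f}, p = {p_value:.4f}")
```

<p>Under these assumptions the sketch should land close to the calculator&#8217;s values (a test statistic of about -3.04 and a <em data-effect="italics">p</em>-value of about 0.0094); any difference in the last digit comes from rounding.</p> <p>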
Repeat the calculator instructions, but arrow to <code>Draw</code> instead of <code>Calculate</code>, then press <code>ENTER</code>.</p> </div> </div> <div id="fs-idp46905536" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp1097104" data-type="exercise"><div id="fs-idp1097360" data-type="problem"><p id="fs-idm54581328">A study was conducted to investigate how effective a new diet was in lowering cholesterol. Results for the randomly selected subjects are shown in the table. The differences have a normal distribution. Are the subjects’ cholesterol levels lower on average after the diet? Test at the 5% level.</p> <table id="fs-idm36821616" summary=""><tbody><tr><td>Subject</td> <td>A</td> <td>B</td> <td>C</td> <td>D</td> <td>E</td> <td>F</td> <td>G</td> <td>H</td> <td>I</td> </tr> <tr><td>Before</td> <td>209</td> <td>210</td> <td>205</td> <td>198</td> <td>216</td> <td>217</td> <td>238</td> <td>240</td> <td>222</td> </tr> <tr><td>After</td> <td>199</td> <td>207</td> <td>189</td> <td>209</td> <td>217</td> <td>202</td> <td>211</td> <td>223</td> <td>201</td> </tr> </tbody> </table> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p id="element-486">A college football coach was interested in whether the college&#8217;s strength development class increased his players&#8217; maximum lift (in pounds) on the bench press exercise. He asked four of his players to participate in a study. The amount of weight they could each lift was recorded before they took the strength development class. After completing the class, the amount of weight they could each lift was again measured. The data are as follows:</p> <table id="table-234678" summary="This table shows players and the amount of weight they are able to lift. The first column is the weight lifted and the second through the fifth columns represent the players. 
The first row is the amount of weight lifted before the class and the second row is the amount of weight lifted after the class."><thead><tr><th data-align="center">Weight (in pounds)</th> <th data-align="center">Player 1</th> <th data-align="center">Player 2</th> <th data-align="center">Player 3</th> <th data-align="center">Player 4</th> </tr> </thead> <tbody><tr><td>Amount of weight lifted prior to the class</td> <td>205</td> <td>241</td> <td>338</td> <td>368</td> </tr> <tr><td>Amount of weight lifted after the class</td> <td>295</td> <td>252</td> <td>330</td> <td>360</td> </tr> </tbody> </table> <p id="element-651"><strong>The coach wants to know if the strength development class makes his players stronger, on average.</strong><span data-type="newline"><br /> </span>Record the <strong>differences</strong> data. Calculate the differences by subtracting the amount of weight lifted prior to the class from the weight lifted after completing the class. The data for the differences are: {90, 11, -8, -8}. Assume the differences have a normal distribution.</p> <p>Using the differences data, calculate the sample mean and the sample standard deviation.</p> <p id="element-258">\({\overline{x}}_{d}\) = 21.3, <em data-effect="italics">s<sub>d</sub></em> = 46.7</p> <div id="fs-idp43851248" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idp15765136">The data given here would indicate that the distribution is actually right-skewed. The difference of 90 may be an extreme outlier. It pulls the sample mean up to 21.3 (positive). 
The mean of the other three data values is actually negative.</p> </div> <p>Using the difference data, this becomes a test of a single __________ (fill in the blank).</p> <p><strong>Define the random variable:</strong> \({\overline{X}}_{d}\) = mean difference in the maximum lift per player.</p> <p>The distribution for the hypothesis test is <em data-effect="italics">t<sub>3</sub></em>.</p> <p id="element-3201"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ<sub>d</sub></em> ≤ 0, <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ<sub>d</sub></em> &gt; 0</p> <p><strong>Graph:</strong></p> <div id="hyptest22_samp2" class="bc-figure figure"><span id="id11476943" data-type="media" data-alt="Normal distribution curve with values of 0 and 21.3. A vertical upward line extends from 21.3 to the curve and the p-value is indicated in the area to the right of this value."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch10_05_02-1.jpg" alt="Normal distribution curve with values of 0 and 21.3. A vertical upward line extends from 21.3 to the curve and the p-value is indicated in the area to the right of this value." 
width="380" data-media-type="image/jpg" /></span></div> <p><strong>Calculate the <em data-effect="italics">p</em>-value:</strong> The <em data-effect="italics">p</em>-value is 0.2150.</p> <p><strong>Decision:</strong> If the level of significance is 5%, the decision is not to reject the null hypothesis, because α &lt; <em data-effect="italics">p</em>-value.</p> <p><strong>What is the conclusion?</strong></p> <p id="e106soln">At a 5% level of significance, from the sample data, there is not sufficient evidence to conclude that the strength development class helped to make the players stronger, on average.</p> </div> <div id="fs-idm13751200" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idp4946544" data-type="exercise"><div id="fs-idp4946672" data-type="problem"><p id="fs-idp112606336">A new prep class was designed to improve SAT test scores. Five students were selected at random. Their scores on two practice exams were recorded, one before the class and one after. The data are recorded in <a class="autogenerated-content" href="#fs-idp42085840">(Figure)</a>. Are the scores, on average, higher after the class? Test at a 5% level.</p> <table id="fs-idp42085840" summary=""><thead><tr><th>SAT Scores</th> <th>Student 1</th> <th>Student 2</th> <th>Student 3</th> <th>Student 4</th> </tr> </thead> <tbody><tr><td>Score before class</td> <td>1840</td> <td>1960</td> <td>1920</td> <td>2150</td> </tr> <tr><td>Score after class</td> <td>1920</td> <td>2160</td> <td>2200</td> <td>2100</td> </tr> </tbody> </table> </div> </div> </div> <div class="textbox textbox--examples" data-type="example"><p>Seven eighth graders at Kennedy Middle School measured how far they could push the shot-put with their dominant (writing) hand and their weaker (non-writing) hand. They thought that they could push equal distances with either hand. 
The data were collected and recorded in <a class="autogenerated-content" href="#table-2535678">(Figure)</a>.</p> <table id="table-2535678" summary="This table presents the students shot-put distances by their dominant and non-dominant hand. The first column lists the hand type and the second through the eighth column represent the students. The first row is for the dominant hand and the second row is for the weaker hand."><thead valign="top"><tr><th data-align="center">Distance (in feet) using</th> <th>Student 1</th> <th>Student 2</th> <th>Student 3</th> <th>Student 4</th> <th>Student 5</th> <th>Student 6</th> <th>Student 7</th> </tr> </thead> <tbody valign="top"><tr><td>Dominant Hand</td> <td>30</td> <td>26</td> <td>34</td> <td>17</td> <td>19</td> <td>26</td> <td>20</td> </tr> <tr><td>Weaker Hand</td> <td>28</td> <td>14</td> <td>27</td> <td>18</td> <td>17</td> <td>26</td> <td>16</td> </tr> </tbody> </table> <p id="fs-idm44594480">Conduct a hypothesis test to determine whether the mean difference in distances between the children’s dominant versus weaker hands is significant.</p> <p id="fs-idp12305744">Record the <strong>differences</strong> data. Calculate the differences by subtracting the distances with the weaker hand from the distances with the dominant hand. The data for the differences are: {2, 12, 7, –1, 2, 0, 4}. The differences have a normal distribution.</p> <p id="fs-idp35626608">Using the differences data, calculate the sample mean and the sample standard deviation. 
\({\overline{x}}_{d}\) = 3.71, \({s}_{d}\) = 4.5.</p> <p id="fs-idm28861824"><strong>Random variable:</strong> \({\overline{X}}_{d}\) = mean difference in the distances between the hands.</p> <p id="fs-idm33656496"><strong>Distribution for the hypothesis test:</strong> <em data-effect="italics">t<sub>6</sub></em></p> <p id="fs-idm15045744"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ<sub>d</sub></em> = 0, <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ<sub>d</sub></em> ≠ 0</p> <p id="fs-idm14789184"><strong>Graph:</strong></p> <div id="fs-idp32743168" class="bc-figure figure"><span id="fs-idp32743424" data-type="media" data-alt="This is a normal distribution curve with mean equal to zero. Both the right and left tails of the curve are shaded. Each tail represents 1/2(p-value) = 0.0358."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C10_M05_001N-1.jpg" alt="This is a normal distribution curve with mean equal to zero. Both the right and left tails of the curve are shaded. Each tail represents 1/2(p-value) = 0.0358." width="380" data-media-type="image/jpg" /></span></div> <p id="fs-idm45951248"><strong>Calculate the <em data-effect="italics">p</em>-value:</strong> The <em data-effect="italics">p</em>-value is 0.0716 (using the data directly).</p> <p id="fs-idp16197920">(Test statistic = 2.18. The <em data-effect="italics">p</em>-value is 0.0719 when computed from the rounded values \({\overline{x}}_{d}=3.71\) and \({s}_{d}=4.5\).)</p> <p id="fs-idp45183040"><strong>Decision:</strong> Assume <em data-effect="italics">α</em> = 0.05. 
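</p> <p>Before making the decision, the test statistic and two-tailed <em data-effect="italics">p</em>-value quoted for the shot-put data can be reproduced from the differences. The following is a minimal sketch using only the Python standard library and is not part of the original text: the helper names <code>t_pdf</code> and <code>upper_tail</code> are hypothetical, and the tail area is approximated with Simpson&#8217;s rule instead of being taken from a calculator or statistics package.</p>

```python
import math
import statistics


def t_pdf(x, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """Approximate P(T is at least |t|) with Simpson's rule (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2      # Simpson weights: 4 on odd nodes, 2 on even
        total += weight * t_pdf(i * h, df)
    return 0.5 - total * h / 3          # half the curve minus the area from 0 to |t|


# Shot-put distances (in feet) for the seven students.
dominant = [30, 26, 34, 17, 19, 26, 20]
weaker = [28, 14, 27, 18, 17, 26, 16]

# Differences: dominant hand minus weaker hand.
diffs = [d - w for d, w in zip(dominant, weaker)]
n = len(diffs)
mean_d = statistics.mean(diffs)
s_d = statistics.stdev(diffs)           # sample standard deviation (n - 1 divisor)

# Test statistic for a hypothesized mean difference of 0.
t_stat = mean_d / (s_d / math.sqrt(n))

# Two-tailed test: double the area beyond |t_stat|.
p_value = 2 * upper_tail(t_stat, n - 1)

alpha = 0.05
decision = "reject H0" if alpha > p_value else "do not reject H0"
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, decision: {decision}")
```

<p>Doubling the one-tail area is what makes this a two-tailed test; for a one-sided alternative the sketch would use a single tail instead.</p> <p>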
Since <em data-effect="italics">α</em> &lt; <em data-effect="italics">p</em>-value, do not reject <em data-effect="italics">H<sub>0</sub></em>.</p> <p id="fs-idp71022560"><strong>Conclusion:</strong> At the 5% level of significance, from the sample data, there is not sufficient evidence to conclude that there is a difference in how far the children can push the shot-put with their weaker and dominant hands.</p> </div> <div id="fs-idp98011888" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm3639952" data-type="exercise"><div id="fs-idp119669552" data-type="problem"><p id="fs-idp119669808">Five ball players think they can throw the same distance with their dominant hand (throwing hand) and off-hand (catching hand). The data were collected and recorded in <a class="autogenerated-content" href="#fs-idp26621360">(Figure)</a>. Conduct a hypothesis test to determine whether the mean difference in distances between the dominant and off-hand is significant. 
Test at the 5% level.</p> <table id="fs-idp26621360" summary=""><thead><tr><th></th> <th>Player 1</th> <th>Player 2</th> <th>Player 3</th> <th>Player 4</th> <th>Player 5</th> </tr> </thead> <tbody><tr><td>Dominant Hand</td> <td>120</td> <td>111</td> <td>135</td> <td>140</td> <td>125</td> </tr> <tr><td>Off-hand</td> <td>105</td> <td>109</td> <td>98</td> <td>111</td> <td>99</td> </tr> </tbody> </table> </div> </div> </div> <div id="fs-idp51883760" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idp192128">A hypothesis test for matched or paired samples (t-test) has these characteristics:</p> <ul id="fs-idp192512"><li>Test the differences by subtracting one measurement from the other measurement</li> <li>Random Variable: \({\overline{x}}_{d}\) = mean of the differences</li> <li>Distribution: Student’s-t distribution with <em data-effect="italics">n</em> – 1 degrees of freedom</li> <li>If the number of differences is small (less than 30), the differences must follow a normal distribution.</li> <li>Two samples are drawn from the same set of objects.</li> <li>Samples are dependent.</li> </ul> </div> <div id="fs-idm5585024" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idp95152944">Test Statistic (<em data-effect="italics">t</em>-score): <em data-effect="italics">t</em> = \(\frac{{\overline{x}}_{d}-{\mu }_{d}}{\left(\frac{{s}_{d}}{\sqrt{n}}\right)}\)</p> <p id="fs-idm30659744">where:</p> <p id="fs-idm8251264">\({\overline{x}}_{d}\) is the mean of the sample differences. <em data-effect="italics">μ</em><sub>d</sub> is the mean of the population differences. <em data-effect="italics">s<sub>d</sub></em> is the sample standard deviation of the differences. 
<em data-effect="italics">n</em> is the sample size.</p> </div> <div id="fs-idm5499920" class="practice" data-depth="1"><p id="fs-idp116024816"><em data-effect="italics">Use the following information to answer the next five exercises.</em> A study was conducted to test the effectiveness of a software patch in reducing system failures over a six-month period. Results for randomly selected installations are shown in <a class="autogenerated-content" href="#fs-idp116025200">(Figure)</a>. The “before” value is matched to an “after” value, and the differences are calculated. The differences have a normal distribution. Test at the 1% significance level.</p> <table id="fs-idp116025200" summary=""><thead><tr><th>Installation</th> <th>A</th> <th>B</th> <th>C</th> <th>D</th> <th>E</th> <th>F</th> <th>G</th> <th>H</th> </tr> </thead> <tbody><tr><td>Before</td> <td>3</td> <td>6</td> <td>4</td> <td>2</td> <td>5</td> <td>8</td> <td>2</td> <td>6</td> </tr> <tr><td>After</td> <td>1</td> <td>5</td> <td>2</td> <td>0</td> <td>1</td> <td>0</td> <td>2</td> <td>2</td> </tr> </tbody> </table> <div id="fs-idm5499280" data-type="exercise"><div id="fs-idp107935744" data-type="problem"><p id="fs-idp9763376">What is the random variable?</p> </div> <div id="fs-idp27834352" data-type="solution"><p id="fs-idp27834608">the mean difference of the system failures</p> </div> </div> <div id="fs-idp71083104" data-type="exercise"><div id="fs-idp22060800" data-type="problem"><p id="fs-idp22061056">State the null and alternative hypotheses.</p> </div> </div> <div id="fs-idm10056224" data-type="exercise"><div id="fs-idm120978416" data-type="problem"><p id="fs-idm120978160">What is the <em data-effect="italics">p</em>-value?</p> </div> <div id="fs-idp56985488" data-type="solution"><p id="fs-idp56985744">0.0067</p> </div> </div> <div id="fs-idp119203856" data-type="exercise"><div id="fs-idp119204112" data-type="problem"><p id="fs-idp119204368">Draw the graph of the <em data-effect="italics">p</em>-value.</p> 
</div> </div> <div id="fs-idp92713024" data-type="exercise"><div id="fs-idm6212864" data-type="problem"><p id="fs-idm6212608">What conclusion can you draw about the software patch?</p> </div> <div id="fs-idp48368" data-type="solution"><p id="fs-idp48624">With a <em data-effect="italics">p</em>-value of 0.0067, which is less than the 1% significance level, we can reject the null hypothesis. There is enough evidence to support the claim that the software patch is effective in reducing the number of system failures.</p> </div> </div> <p id="fs-idp5775856"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next five exercises.</em> A study was conducted to test the effectiveness of a juggling class. Before the class started, six subjects juggled as many balls as they could at once. After the class, the same six subjects juggled as many balls as they could. The differences in the number of balls are calculated. The differences have a normal distribution. Test at the 1% significance level.</p> <table id="fs-idp110216832" summary=""><thead><tr><th>Subject</th> <th>A</th> <th>B</th> <th>C</th> <th>D</th> <th>E</th> <th>F</th> </tr> </thead> <tbody><tr><td>Before</td> <td>3</td> <td>4</td> <td>3</td> <td>2</td> <td>4</td> <td>5</td> </tr> <tr><td>After</td> <td>4</td> <td>5</td> <td>6</td> <td>4</td> <td>5</td> <td>7</td> </tr> </tbody> </table> <div id="fs-idm940496" data-type="exercise"><div id="fs-idm54492608" data-type="problem"><p id="fs-idp81273744">State the null and alternative hypotheses.</p> </div> </div> <div id="fs-idp15011664" data-type="exercise"><div id="fs-idp15011920" data-type="problem"><p id="fs-idp15012176">What is the <em data-effect="italics">p</em>-value?</p> </div> <div id="fs-idp112806592" data-type="solution"><p id="fs-idp112806848">0.0021</p> </div> </div> <div id="fs-idp12451552" data-type="exercise"><div id="fs-idp12451808" data-type="problem"><p id="fs-idm14064240">What is the sample mean difference?</p> </div> </div> <div id="fs-idp9592768" 
data-type="exercise"><div id="fs-idp9593024" data-type="problem"><p id="fs-idp9593280">Draw the graph of the <em data-effect="italics">p</em>-value.</p> </div> <div id="fs-idp27275328" data-type="solution"><div id="fs-idp27275584" class="bc-figure figure"><span id="fs-idp79897888" data-type="media" data-alt="This is a normal distribution curve with mean equal to zero. The values 0 and 1.67 are labeled on the horizontal axis. A vertical line extends from 1.67 to the curve. The region under the curve to the right of the line is shaded to represent p-value = 0.0021."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C10_M05_item002anno-1.jpg" alt="This is a normal distribution curve with mean equal to zero. The values 0 and 1.67 are labeled on the horizontal axis. A vertical line extends from 1.67 to the curve. The region under the curve to the right of the line is shaded to represent p-value = 0.0021." width="380" data-media-type="image/jpg" /></span></div> </div> </div> <div id="fs-idm16412208" data-type="exercise"><div id="fs-idp41873440" data-type="problem"><p id="fs-idp41873696">What conclusion can you draw about the juggling class?</p> </div> </div> <p id="fs-idm27879072"><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next five exercises.</em> A doctor wants to know if a blood pressure medication is effective. Six subjects have their blood pressures recorded. After twelve weeks on the medication, the same six subjects have their blood pressure recorded again. For this test, only systolic pressure is of concern. 
Test at the 1% significance level.</p> <table id="fs-idm32701392" summary=""><thead><tr><th>Patient</th> <th>A</th> <th>B</th> <th>C</th> <th>D</th> <th>E</th> <th>F</th> </tr> </thead> <tbody><tr><td>Before</td> <td>161</td> <td>162</td> <td>165</td> <td>162</td> <td>166</td> <td>171</td> </tr> <tr><td>After</td> <td>158</td> <td>159</td> <td>166</td> <td>160</td> <td>167</td> <td>169</td> </tr> </tbody> </table> <div id="fs-idp5612992" data-type="exercise"><div id="fs-idp5613248" data-type="problem"><p id="fs-idp119813056">State the null and alternative hypotheses.</p> </div> <div id="fs-idp119813568" data-type="solution"><p id="fs-idm49836624"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ<sub>d</sub></em> ≥ 0</p> <p id="fs-idm47657296"><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ<sub>d</sub></em> &lt; 0</p> </div> </div> <div id="fs-idm60519136" data-type="exercise"><div id="fs-idm60518880" data-type="problem"><p id="fs-idp8433040">What is the test statistic?</p> </div> </div> <div id="fs-idp69509280" data-type="exercise"><div id="fs-idp8847888" data-type="problem"><p id="fs-idp8848144">What is the <em data-effect="italics">p</em>-value?</p> </div> <div id="fs-idp113313936" data-type="solution"><p id="fs-idp113314192">0.0699</p> </div> </div> <div id="fs-idp22543728" data-type="exercise"><div id="fs-idp22543984" data-type="problem"><p id="fs-idm72860048">What is the sample mean difference?</p> </div> </div> <div id="fs-idm55198624" data-type="exercise"><div id="fs-idm27177184" data-type="problem"><p id="fs-idm27176928">What is the conclusion?</p> </div> <div id="fs-idp22337696" data-type="solution"><p id="fs-idp22337952">We decline to reject the null hypothesis. 
There is not sufficient evidence to support the claim that the medication is effective.</p> </div> </div> </div> <div id="fs-idp14136208" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <p id="fs-idp84148336"><em data-effect="italics">DIRECTIONS: For each of the word problems, use a solution sheet to do the hypothesis test. The solution sheet is found in <a href="/contents/c0449c55-aa47-4f1c-bd5f-0521652f0e82">Appendix E</a>. Please feel free to make copies of the solution sheets. For the online version of the book, it is suggested that you copy the .doc or the .pdf files.</em></p> <div data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idm140675200">If you are using a Student&#8217;s <em data-effect="italics">t</em>-distribution for the homework problems, including for paired data, you may assume that the underlying population is normally distributed. (When using these tests in a real situation, however, you must first verify that assumption.)</p> </div> <div id="fs-idp64398288" data-type="exercise"><div id="fs-idp64398544" data-type="problem"><p id="fs-idp64398800">1) Ten individuals went on a low-fat diet for 12 weeks to lower their cholesterol. The data are recorded in <a class="autogenerated-content" href="#fs-idp16051952">(Figure)</a>. 
Do you think that their cholesterol levels were significantly lowered?</p> <table id="fs-idp16051952" summary="The table presents the starting cholesterol level in the first row and the ending cholesterol level in the second row."><thead><tr><th>Starting cholesterol level</th> <th>Ending cholesterol level</th> </tr> </thead> <tbody><tr><td>140</td> <td>140</td> </tr> <tr><td>220</td> <td>230</td> </tr> <tr><td>110</td> <td>120</td> </tr> <tr><td>240</td> <td>220</td> </tr> <tr><td>200</td> <td>190</td> </tr> <tr><td>180</td> <td>150</td> </tr> <tr><td>190</td> <td>200</td> </tr> <tr><td>360</td> <td>300</td> </tr> <tr><td>280</td> <td>300</td> </tr> <tr><td>260</td> <td>240</td> </tr> </tbody> </table> </div> <div id="fs-idp21115136" data-type="solution"><p id="fs-idp6749856"><em data-effect="italics">p</em>-value = 0.1494</p> <p id="fs-idp164535872"></p></div> </div> <p><em data-effect="italics">Use the following information to answer the next two exercises.</em></p> <p>A new AIDS prevention drug was tried on a group of 224 HIV positive patients. Forty-five patients developed AIDS after four years. In a control group of 224 HIV positive patients, 68 developed AIDS after four years. 
We want to test whether the method of treatment reduces the proportion of patients that develop AIDS after four years or if the proportions of the treated group and the untreated group stay the same.</p> <p>Let the subscript <em data-effect="italics">t</em> = treated patient and <em data-effect="italics">ut</em> = untreated patient.</p> <div id="element-719" data-type="exercise"><div id="id17683497" data-type="problem"><p id="element-505">2) The appropriate hypotheses are:</p> <ol type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>t</sub></em> &lt; <em data-effect="italics">p<sub>ut</sub></em> and <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>t</sub></em> ≥ <em data-effect="italics">p<sub>ut</sub></em></li> <li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>t</sub></em> ≤ <em data-effect="italics">p<sub>ut</sub></em> and <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>t</sub></em> &gt; <em data-effect="italics">p<sub>ut</sub></em></li> <li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>t</sub></em> = <em data-effect="italics">p<sub>ut</sub></em> and <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>t</sub></em> ≠ <em data-effect="italics">p<sub>ut</sub></em></li> <li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">p<sub>t</sub></em> = <em data-effect="italics">p<sub>ut</sub></em> and <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">p<sub>t</sub></em> &lt; <em data-effect="italics">p<sub>ut</sub></em></li> </ol> <p>&nbsp;</p> </div> </div> <div data-type="exercise"><div id="id12277184" data-type="problem"><p>3) If the <em data-effect="italics">p</em>-value is 0.0062 what is the conclusion (use <em data-effect="italics">α</em> = 0.05)?</p> <ol id="list1111" type="a"><li>The method has no effect.</li> <li>There is sufficient evidence 
to conclude that the method reduces the proportion of HIV positive patients who develop AIDS after four years.</li> <li>There is sufficient evidence to conclude that the method increases the proportion of HIV positive patients who develop AIDS after four years.</li> <li>There is insufficient evidence to conclude that the method reduces the proportion of HIV positive patients who develop AIDS after four years.</li> </ol> </div> <div id="id7050591" data-type="solution"><p>&nbsp;</p> </div> </div> <p><em data-effect="italics">Use the following information to answer the next two exercises.</em></p> <p>An experiment is conducted to show that blood pressure can be consciously reduced in people trained in a “biofeedback exercise program.” Six subjects were randomly selected and blood pressure measurements were recorded before and after the training. The difference between blood pressures was calculated (after &#8211; before), producing the following results: \({\overline{x}}_{d}\) = −10.2 and <em data-effect="italics">s<sub>d</sub></em> = 8.4. 
Using the data, test the hypothesis that the blood pressure has decreased after the training.</p> <div data-type="exercise"><div id="id4488558" data-type="problem"><p>4) The distribution for the test is:</p> <ol type="a"><li><em data-effect="italics">t<sub>5</sub></em></li> <li><em data-effect="italics">t<sub>6</sub></em></li> <li><em data-effect="italics">N</em>(−10.2, 8.4)</li> <li>N(−10.2, \(\frac{8.4}{\sqrt{6}}\))</li> </ol> <p>&nbsp;</p> </div> </div> <div id="element-954" data-type="exercise"><div id="id6323430" data-type="problem"><p>5) If <em data-effect="italics">α</em> = 0.05, the <em data-effect="italics">p</em>-value and the conclusion are</p> <ol id="element-595" type="a"><li>0.0014; There is sufficient evidence to conclude that the blood pressure decreased after the training.</li> <li>0.0014; There is sufficient evidence to conclude that the blood pressure increased after the training.</li> <li>0.0155; There is sufficient evidence to conclude that the blood pressure decreased after the training.</li> <li>0.0155; There is sufficient evidence to conclude that the blood pressure increased after the training.</li> </ol> </div> <div id="id12581096" data-type="solution"><p id="element-981"></p></div> </div> <div data-type="exercise"><div id="id4419960" data-type="problem"><p>6) A golf instructor is interested in determining if her new technique for improving players’ golf scores is effective. She takes four new students. She records their 18-hole scores before learning the technique and then after having taken her class. She conducts a hypothesis test. 
The data are as follows.</p> <table summary="..."><thead><tr><th></th> <th>Player 1</th> <th>Player 2</th> <th>Player 3</th> <th>Player 4</th> </tr> </thead> <tbody><tr><td>Mean score before class</td> <td>83</td> <td>78</td> <td>93</td> <td>87</td> </tr> <tr><td>Mean score after class</td> <td>80</td> <td>80</td> <td>86</td> <td>86</td> </tr> </tbody> </table> <p>The correct decision is:</p> <ol type="a"><li>Reject <em data-effect="italics">H<sub>0</sub></em>.</li> <li>Do not reject the <em data-effect="italics">H<sub>0</sub></em>.</li> </ol> <p>&nbsp;</p> </div> </div> <div id="fs-idp31663264" data-type="exercise"><div id="fs-idp31663520" data-type="problem"><p id="fs-idm44180368">7) A local cancer support group believes that the estimate for new female breast cancer cases in the south is higher in 2013 than in 2012. The group compared the estimates of new female breast cancer cases by southern state in 2012 and in 2013. The results are in <a class="autogenerated-content" href="#fs-idp20856160">(Figure)</a>.</p> <table id="fs-idp20856160" summary=""><thead><tr><th>Southern States</th> <th>2012</th> <th>2013</th> </tr> </thead> <tbody><tr><td>Alabama</td> <td>3,450</td> <td>3,720</td> </tr> <tr><td>Arkansas</td> <td>2,150</td> <td>2,280</td> </tr> <tr><td>Florida</td> <td>15,540</td> <td>15,710</td> </tr> <tr><td>Georgia</td> <td>6,970</td> <td>7,310</td> </tr> <tr><td>Kentucky</td> <td>3,160</td> <td>3,300</td> </tr> <tr><td>Louisiana</td> <td>3,320</td> <td>3,630</td> </tr> <tr><td>Mississippi</td> <td>1,990</td> <td>2,080</td> </tr> <tr><td>North Carolina</td> <td>7,090</td> <td>7,430</td> </tr> <tr><td>Oklahoma</td> <td>2,630</td> <td>2,690</td> </tr> <tr><td>South Carolina</td> <td>3,570</td> <td>3,580</td> </tr> <tr><td>Tennessee</td> <td>4,680</td> <td>5,070</td> </tr> <tr><td>Texas</td> <td>15,050</td> <td>14,980</td> </tr> <tr><td>Virginia</td> <td>6,190</td> <td>6,280</td> </tr> </tbody> </table> </div> <div id="fs-idp65561056" data-type="solution"><p 
id="fs-idp84781280"></p></div> </div> <div id="fs-idp83186240" data-type="exercise"><div id="fs-idp70867824" data-type="problem"><p id="fs-idp70868080">8) A traveler wanted to know if the prices of hotels are different in the ten cities that he visits the most often. The list of the cities with the corresponding hotel prices for his two favorite hotel chains is in <a class="autogenerated-content" href="#fs-idp126693504">(Figure)</a>. Test at the 1% level of significance.</p> <table id="fs-idp126693504" summary=""><thead><tr><th>Cities</th> <th>Hyatt Regency prices in dollars</th> <th>Hilton prices in dollars</th> </tr> </thead> <tbody><tr><td>Atlanta</td> <td>107</td> <td>169</td> </tr> <tr><td>Boston</td> <td>358</td> <td>289</td> </tr> <tr><td>Chicago</td> <td>209</td> <td>299</td> </tr> <tr><td>Dallas</td> <td>209</td> <td>198</td> </tr> <tr><td>Denver</td> <td>167</td> <td>169</td> </tr> <tr><td>Indianapolis</td> <td>179</td> <td>214</td> </tr> <tr><td>Los Angeles</td> <td>179</td> <td>169</td> </tr> <tr><td>New York City</td> <td>625</td> <td>459</td> </tr> <tr><td>Philadelphia</td> <td>179</td> <td>159</td> </tr> <tr><td>Washington, DC</td> <td>245</td> <td>239</td> </tr> </tbody> </table> </div> </div> <div id="fs-idp111440192" data-type="exercise"><div id="fs-idp111440448" data-type="problem"><p id="fs-idp14986256">9) A politician asked his staff to determine whether the underemployment rate in the northeast decreased from 2011 to 2012. 
The results are in <a class="autogenerated-content" href="#fs-idp10131504">(Figure)</a>.</p> <table id="fs-idp10131504" summary=""><thead><tr><th>Northeastern States</th> <th>2011</th> <th>2012</th> </tr> </thead> <tbody><tr><td>Connecticut</td> <td>17.3</td> <td>16.4</td> </tr> <tr><td>Delaware</td> <td>17.4</td> <td>13.7</td> </tr> <tr><td>Maine</td> <td>19.3</td> <td>16.1</td> </tr> <tr><td>Maryland</td> <td>16.0</td> <td>15.5</td> </tr> <tr><td>Massachusetts</td> <td>17.6</td> <td>18.2</td> </tr> <tr><td>New Hampshire</td> <td>15.4</td> <td>13.5</td> </tr> <tr><td>New Jersey</td> <td>19.2</td> <td>18.7</td> </tr> <tr><td>New York</td> <td>18.5</td> <td>18.7</td> </tr> <tr><td>Ohio</td> <td>18.2</td> <td>18.8</td> </tr> <tr><td>Pennsylvania</td> <td>16.5</td> <td>16.9</td> </tr> <tr><td>Rhode Island</td> <td>20.7</td> <td>22.4</td> </tr> <tr><td>Vermont</td> <td>14.7</td> <td>12.3</td> </tr> <tr><td>West Virginia</td> <td>15.5</td> <td>17.3</td> </tr> </tbody> </table> </div> <div id="fs-idm813936" data-type="solution"><p id="fs-idp36520624"></p></div> </div> </div> <div id="fs-idm68580000" class="bring-together-homework" data-depth="1"><h3 data-type="title">Bringing It Together</h3> <p id="fs-idp56242144"><em data-effect="italics">Use the following information to answer the next ten exercises.</em> Indicate which of the following choices best identifies the hypothesis test.</p> <ol type="a"><li>independent group means, population standard deviations and/or variances known</li> <li>independent group means, population standard deviations and/or variances unknown</li> <li>matched or paired samples</li> <li>single mean</li> <li>two proportions</li> <li>single proportion</li> </ol> <p>&nbsp;</p> <div id="fs-idm69867264" data-type="exercise"><div id="fs-idp4767888" data-type="problem"><p id="fs-idp4768144">10) A powder diet is tested on 49 people, and a liquid diet is tested on 36 different people. 
The population standard deviations are two pounds and three pounds, respectively. Of interest is whether the liquid diet yields a higher mean weight loss than the powder diet.</p> <p>&nbsp;</p> </div> </div> <div id="fs-idm89593504" data-type="exercise"><div id="fs-idm89593248" data-type="problem"><p id="fs-idm68777296">11) A new chocolate bar is taste-tested on consumers. Of interest is whether the proportion of children who like the new chocolate bar is greater than the proportion of adults who like it.</p> </div> <div id="fs-idp32894736" data-type="solution"><p id="fs-idp32894992"></p></div> </div> <div id="fs-idm66474352" data-type="exercise"><div id="fs-idm151765440" data-type="problem"><p id="fs-idm151765184">12) The mean number of English courses taken in a two–year time period by male and female college students is believed to be about the same. An experiment is conducted and data are collected from nine males and 16 females.</p> <p>&nbsp;</p> </div> </div> <div id="fs-idm53146784" data-type="exercise"><div id="fs-idm20367904" data-type="problem"><p id="fs-idm20367648">13) A football league reported that the mean number of touchdowns per game was five. A study is done to determine if the mean number of touchdowns has decreased.</p> </div> <div id="fs-idp145517728" data-type="solution"><p id="fs-idp145517984"></p></div> </div> <div id="fs-idm118453520" data-type="exercise"><div id="fs-idm118453264" data-type="problem"><p id="fs-idm66098144">14) A study is done to determine if students in the California state university system take longer to graduate than students enrolled in private universities. One hundred students from both the California state university system and private universities are surveyed. 
From years of research, it is known that the population standard deviations are 1.5811 years and one year, respectively.</p> <p>&nbsp;</p> </div> </div> <div id="fs-idp12095488" data-type="exercise"><div id="fs-idm80698688" data-type="problem"><p id="fs-idm80698432">15) According to a YWCA Rape Crisis Center newsletter, 75% of rape victims know their attackers. A study is done to verify this.</p> </div> <div id="fs-idm10304768" data-type="solution"><p id="fs-idm10304512"></p></div> </div> <div id="fs-idm63898096" data-type="exercise"><div id="fs-idm63897840" data-type="problem"><p id="fs-idm84604768">16) According to a recent study, U.S. companies have a mean maternity-leave of six weeks.</p> </div> </div> <div id="fs-idm45776544" data-type="exercise"><div id="fs-idm45776288" data-type="problem"><p id="fs-idm135539360">17) A recent drug survey showed an increase in use of drugs and alcohol among local high school students as compared to the national percent. Suppose that a survey of 100 local youths and 100 national youths is conducted to see if the proportion of drug and alcohol use is higher locally than nationally.</p> </div> <div id="fs-idm13080224" data-type="solution"><p id="fs-idp101006144"></p></div> </div> <div id="fs-idm129031904" data-type="exercise"><div id="fs-idm129031648" data-type="problem"><p id="fs-idm88824880">18) A new SAT study course is tested on 12 individuals. Pre-course and post-course scores are recorded. Of interest is the mean increase in SAT scores. 
The following data are collected:</p> <table id="eip-idp6708160" summary=""><thead><tr><th>Pre-course score</th> <th>Post-course score</th> </tr> </thead> <tbody><tr><td>1</td> <td>300</td> </tr> <tr><td>960</td> <td>920</td> </tr> <tr><td>1010</td> <td>1100</td> </tr> <tr><td>840</td> <td>880</td> </tr> <tr><td>1100</td> <td>1070</td> </tr> <tr><td>1250</td> <td>1320</td> </tr> <tr><td>860</td> <td>860</td> </tr> <tr><td>1330</td> <td>1370</td> </tr> <tr><td>790</td> <td>770</td> </tr> <tr><td>990</td> <td>1040</td> </tr> <tr><td>1110</td> <td>1200</td> </tr> <tr><td>740</td> <td>850</td> </tr> </tbody> </table> <p>&nbsp;</p> </div> </div> <div id="fs-idm28383952" data-type="exercise"><div id="fs-idm28383696" data-type="problem"><p id="fs-idm15629152">19) University of Michigan researchers reported in the <cite><span data-type="cite-title">Journal of the National Cancer Institute</span></cite> that quitting smoking is especially beneficial for those under age 49. In this American Cancer Society study, the risk (probability) of dying of lung cancer was about the same as for those who had never smoked.</p> </div> <div id="fs-idp58402592" data-type="solution"><p id="fs-idp58402848"></p></div> </div> <div data-type="exercise"><div data-type="problem"><p>20) Lesley E. Tan investigated the relationship between left-handedness vs. right-handedness and motor competence in preschool children. Random samples of 41 left-handed preschool children and 41 right-handed preschool children were given several tests of motor skills to determine if there is evidence of a difference between the children based on this experiment. The experiment produced the means and standard deviations shown <a class="autogenerated-content" href="#fs-idp128816944">(Figure)</a>. 
Determine the appropriate test and best distribution to use for that test.</p> <table id="fs-idp128816944" summary="..."><tbody><tr><td></td> <td>Left-handed</td> <td>Right-handed</td> </tr> <tr><td>Sample size</td> <td>41</td> <td>41</td> </tr> <tr><td>Sample mean</td> <td>97.5</td> <td>98.1</td> </tr> <tr><td>Sample standard deviation</td> <td>17.5</td> <td>19.2</td> </tr> </tbody> </table> <ol id="eip-idm67605488" type="a"><li>Two independent means, normal distribution</li> <li>Two independent means, Student’s-t distribution</li> <li>Matched or paired samples, Student’s-t distribution</li> <li>Two population proportions, normal distribution</li> </ol> <p>&nbsp;</p> </div> </div> <div id="fs-idp144471136" data-type="exercise"><div id="fs-idp144471392" data-type="problem"><p>21) A golf instructor is interested in determining if her new technique for improving players’ golf scores is effective. She takes four (4) new students. She records their 18-hole scores before learning the technique and then after having taken her class. She conducts a hypothesis test. 
The data are as <a class="autogenerated-content" href="#fs-idp136885584">(Figure)</a>.</p> <table id="fs-idp136885584" summary="table question"><thead><tr><th></th> <th>Player 1</th> <th>Player 2</th> <th>Player 3</th> <th>Player 4</th> </tr> </thead> <tbody><tr><td>Mean score before class</td> <td>83</td> <td>78</td> <td>93</td> <td>87</td> </tr> <tr><td>Mean score after class</td> <td>80</td> <td>80</td> <td>86</td> <td>86</td> </tr> </tbody> </table> <p>This is:</p> <ol id="eip-idm74615072" type="a"><li>a test of two independent means.</li> <li>a test of two proportions.</li> <li>a test of a single mean.</li> <li>a test of a single proportion.</li> </ol> </div> <div data-type="solution"><p><strong>Answers to odd questions</strong></p> <p>1) At the 5% significance level, there is insufficient evidence to conclude that the medication lowered cholesterol levels after 12 weeks.</p> <p>3) b</p> <p>5) c</p> <p>7)</p> <p id="fs-idp65561312">Test: two matched pairs or paired samples (<em data-effect="italics">t</em>-test)</p> <p id="fs-idp123141440">Random variable: \({\overline{X}}_{d}\)</p> <p id="fs-idp45658064">Distribution: <em data-effect="italics">t</em><sub>12</sub></p> <p id="fs-idp85679904"><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em><em data-effect="italics"><sub>d</sub></em> = 0 <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em><em data-effect="italics"><sub>d</sub></em> &gt; 0</p> <p id="fs-idm70033392">The mean of the differences of new female breast cancer cases in the south between 2013 and 2012 is greater than zero. The estimate for new female breast cancer cases in the south is higher in 2013 than in 2012.</p> <p id="fs-idp16018656">Graph: right-tailed</p> <p id="fs-idp14099296"><em data-effect="italics">p</em>-value: 0.0004</p> <div id="fs-idp14099680" class="bc-figure figure"><span id="fs-idm6747376" data-type="media" data-alt="This is a normal distribution curve with mean equal to zero. 
A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded representing p-value = 0.0004."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C10_M05_002annoN-1.jpg" alt="This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded representing p-value = 0.0004." width="380" data-media-type="image/jpg" /></span></div> <p id="fs-idp11877696">Decision: Reject <em data-effect="italics">H<sub>0</sub></em></p> <p>Conclusion: At the 5% level of significance, from the sample data, there is sufficient evidence to conclude that there was a higher estimate of new female breast cancer cases in 2013 than in 2012.</p> <p>9)</p> <p id="fs-idm813680">Test: matched or paired samples (<em data-effect="italics">t</em>-test)</p> <p id="fs-idp27663120">Difference data: {–0.9, –3.7, –3.2, –0.5, 0.6, –1.9, –0.5, 0.2, 0.6, 0.4, 1.7, –2.4, 1.8}</p> <p id="fs-idp27663504">Random Variable: \({\overline{X}}_{d}\)</p> <p id="fs-idm32460304">Distribution: <em data-effect="italics">t</em><sub>12</sub></p> <p><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ<sub>d</sub></em> = 0 <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ<sub>d</sub></em> &lt; 0</p> <p id="fs-idp91580896">The mean of the differences of the rate of underemployment in the northeastern states between 2012 and 2011 is less than zero. The underemployment rate went down from 2011 to 2012.</p> <p id="fs-idp91581472">Graph: left-tailed.</p> <div id="fs-idm60026752" class="bc-figure figure"><span id="fs-idm60026496" data-type="media" data-alt="This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. 
The region under the curve to the right of the line is shaded representing p-value = 0.1207."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C10_M05_004annoN-1.jpg" alt="This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded representing p-value = 0.1207." width="380" data-media-type="image/jpg" /></span></div> <p id="fs-idm57153840"><em data-effect="italics">p</em>-value: 0.1207</p> <p id="fs-idm57153456">Decision: Do not reject <em data-effect="italics">H<sub>0</sub></em>.</p> <p>Conclusion: At the 5% level of significance, from the sample data, there is not sufficient evidence to conclude that there was a decrease in the underemployment rates of the northeastern states from 2011 to 2012.</p> <p>11) e</p> <p>13) d</p> <p>15) f</p> <p>17) e</p> <p>19) f</p> <p>21) a</p> </div> </div> </div> </div></div>
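<p>The matched-pairs answer to exercise 9 above can be checked numerically. The following is a minimal sketch, not the textbook's own method: it uses only the Python standard library, the function names <code>paired_t_statistic</code> and <code>left_tail_p_value</code> are hypothetical helpers introduced here for illustration, and the tail area is approximated with Simpson's rule rather than a statistics package.</p>

```python
import math

# Difference data (2012 rate minus 2011 rate) as listed in the answer to exercise 9.
DIFFERENCES = [-0.9, -3.7, -3.2, -0.5, 0.6, -1.9, -0.5, 0.2, 0.6, 0.4, 1.7, -2.4, 1.8]


def paired_t_statistic(differences):
    """Return (t, df) for testing H0: mu_d = 0 from a list of paired differences."""
    n = len(differences)
    mean_d = sum(differences) / n
    # Sample variance of the differences (divides by n - 1).
    var_d = sum((d - mean_d) ** 2 for d in differences) / (n - 1)
    standard_error = math.sqrt(var_d / n)
    return mean_d / standard_error, n - 1


def left_tail_p_value(t, df, steps=2000):
    """Approximate P(T is at most t) for a Student's t with df degrees of freedom.

    Integrates the t density from 0 to |t| with Simpson's rule
    (steps must be even), then uses the symmetry of the curve about zero.
    """
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def density(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


t_stat, df = paired_t_statistic(DIFFERENCES)
p_value = left_tail_p_value(t_stat, df)
print(f"t = {t_stat:.4f}, df = {df}, left-tailed p-value = {p_value:.4f}")
```

<p>With the thirteen differences listed, the sketch should give a left-tailed <em data-effect="italics">p</em>-value close to the 0.1207 stated in the answer, which is larger than 0.05 and so agrees with the decision not to reject <em data-effect="italics">H<sub>0</sub></em>.</p>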
<div class="chapter standard" id="chapter-two-population-means-with-unknown-standard-deviations" title="Chapter 11.4: Two Population Means with Unknown Standard Deviations"><div class="chapter-title-wrap"><h3 class="chapter-number">65</h3><h2 class="chapter-title"><span class="display-none">Chapter 11.4: Two Population Means with Unknown Standard Deviations</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <ol><li>The two independent samples are simple random samples from two distinct populations.</li> <li>For the two distinct populations: <ul id="fs-idm39244848"><li>if the sample sizes are small, the distributions are important (should be normal)</li> <li>if the sample sizes are large, the distributions are not important (need not be normal)</li> </ul> </li> </ol> <div id="fs-idm107349216" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idm165699296">The test comparing two independent population means with unknown and possibly unequal population standard deviations is called the Aspin-Welch t-test. The degrees of freedom formula was developed by Aspin-Welch.</p> </div> <p>The comparison of two population means is very common. A difference between the two samples depends on both the means and the standard deviations. Very different means can occur by chance if there is great variation among the individual samples. In order to account for the variation, we take the difference of the sample means, \({\overline{X}}_{1}\) – \({\overline{X}}_{2}\), and divide by the standard error in order to standardize the difference. The result is a t-score test statistic.</p> <p>Because we do not know the population standard deviations, we estimate them using the two sample standard deviations from our independent samples. 
For the hypothesis test, we calculate the estimated standard deviation, or <span data-type="term">standard error</span>, of <strong>the difference in sample means</strong>, \({\overline{X}}_{1}\) – \({\overline{X}}_{2}\).</p> <div id="std_err" data-type="equation"><span data-type="title">The standard error is:</span>\(\sqrt{\frac{{\left({s}_{1}\right)}^{2}}{{n}_{1}}+\frac{{\left({s}_{2}\right)}^{2}}{{n}_{2}}}\)</div> <p>The test statistic (<em data-effect="italics">t</em>-score) is calculated as follows:</p> <div id="t-score2" data-type="equation">\(t=\frac{\left({\overline{x}}_{1}-{\overline{x}}_{2}\right)-\left({\mu }_{1}-{\mu }_{2}\right)}{\sqrt{\frac{{\left({s}_{1}\right)}^{2}}{{n}_{1}}+\frac{{\left({s}_{2}\right)}^{2}}{{n}_{2}}}}\)</div> <div data-type="list"><div data-type="title">where:</div> <ul><li><em data-effect="italics">s</em><sub>1</sub> and <em data-effect="italics">s</em><sub>2</sub>, the sample standard deviations, are estimates of <em data-effect="italics">σ</em><sub>1</sub> and <em data-effect="italics">σ</em><sub>2</sub>, respectively.</li> <li><em data-effect="italics">σ</em><sub>1</sub> and <em data-effect="italics">σ</em><sub>2</sub> are the unknown population standard deviations.</li> <li>\({\overline{x}}_{1}\) and \({\overline{x}}_{2}\) are the sample means. <em data-effect="italics">μ</em><sub>1</sub> and <em data-effect="italics">μ</em><sub>2</sub> are the population means.</li> </ul> </div> <p id="element-256" class="finger">The number of <span data-type="term">degrees of freedom (<em data-effect="italics">df</em>)</span> requires a somewhat complicated calculation. However, a computer or calculator calculates it easily. The <em data-effect="italics">df</em> are not always a whole number. 
The test statistic calculated previously is approximated by the Student&#8217;s <em data-effect="italics">t</em>-distribution with <em data-effect="italics">df</em> as follows:</p> <div id="eq_df" data-type="equation"><span data-type="title">Degrees of freedom</span>\(df=\frac{{\left(\frac{{\left({s}_{1}\right)}^{2}}{{n}_{1}}+\frac{{\left({s}_{2}\right)}^{2}}{{n}_{2}}\right)}^{2}}{\left(\frac{1}{{n}_{1}-1}\right){\left(\frac{{\left({s}_{1}\right)}^{2}}{{n}_{1}}\right)}^{2}+\left(\frac{1}{{n}_{2}-1}\right){\left(\frac{{\left({s}_{2}\right)}^{2}}{{n}_{2}}\right)}^{2}}\)</div> <p id="element-316">When both sample sizes <em data-effect="italics">n</em><sub>1</sub> and <em data-effect="italics">n</em><sub>2</sub> are five or larger, the Student&#8217;s <em data-effect="italics">t</em> approximation is very good. Notice that the sample variances (<em data-effect="italics">s</em><sub>1</sub>)<sup>2</sup> and (<em data-effect="italics">s</em><sub>2</sub>)<sup>2</sup> are not pooled. (If the question comes up, do not pool the variances.)</p> <div id="fs-idm74088160" class="finger" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idm378573120">It is not necessary to compute this by hand. A calculator or computer easily computes it.</p> </div> <div class="textbox textbox--examples" data-type="example"><div data-type="title">Independent groups</div> <p>The average amount of time boys and girls aged seven to 11 spend playing sports each day is believed to be the same. A study is done and data are collected, resulting in the data in <a class="autogenerated-content" href="#uid888">(Figure)</a>. Each population has a normal distribution.</p> <table id="uid888" summary="This table presents the sample size in the second column, average hours a day in the third column, and the sample standard deviation in the fourth column. 
The first row is for girls and the second row is for boys."><thead valign="top"><tr><th data-align="center"></th> <th data-align="center">Sample Size</th> <th data-align="center">Average Number of Hours Playing Sports Per Day</th> <th data-align="center">Sample Standard Deviation</th> </tr> </thead> <tbody valign="top"><tr><td>Girls</td> <td>9</td> <td>2</td> <td>\(0.866\)</td> </tr> <tr><td>Boys</td> <td>16</td> <td>3.2</td> <td>1.00</td> </tr> </tbody> </table> <div data-type="exercise"><div id="id6322791" data-type="problem"><p>Is there a difference in the mean amount of time boys and girls aged seven to 11 play sports each day? Test at the 5% level of significance.</p> </div> <div id="id21329984" data-type="solution"><p><strong>The population standard deviations are not known.</strong> Let <em data-effect="italics">g</em> be the subscript for girls and <em data-effect="italics">b</em> be the subscript for boys. Then, <em data-effect="italics">μ<sub>g</sub></em> is the population mean for girls and <em data-effect="italics">μ<sub>b</sub></em> is the population mean for boys. This is a test of two <strong>independent groups</strong>, two population <strong>means</strong>.</p> <p><span data-type="term">Random variable</span>: \({\overline{X}}_{g}-{\overline{X}}_{b}\) = difference in the sample mean amount of time girls and boys play sports each day. 
<span data-type="newline"><br /> </span><em data-effect="italics">H</em><sub>0</sub>: <em data-effect="italics">μ<sub>g</sub></em> = <em data-effect="italics">μ<sub>b</sub></em>  <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ<sub>g</sub></em> – <em data-effect="italics">μ<sub>b</sub></em> = 0 <span data-type="newline"><br /> </span><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ<sub>g</sub></em> ≠ <em data-effect="italics">μ<sub>b</sub></em>  <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ<sub>g</sub></em> – <em data-effect="italics">μ<sub>b</sub></em> ≠ 0 <span data-type="newline"><br /> </span>The words <strong>&#8220;the same&#8221;</strong> tell you <em data-effect="italics">H<sub>0</sub></em> has an &#8220;=&#8221;. Since there are no other words to indicate <em data-effect="italics">H<sub>a</sub></em>, assume it says <strong>&#8220;is different.&#8221;</strong> This is a two-tailed test.</p> <p><strong>Distribution for the test:</strong> Use <em data-effect="italics">t<sub>df</sub></em> where <em data-effect="italics">df</em> is calculated using the <em data-effect="italics">df</em> formula for independent groups, two population means. Using a calculator, <em data-effect="italics">df</em> is approximately 18.8462. <strong>Do not pool the variances.</strong></p> <p><strong>Calculate the <em data-effect="italics">p</em>-value using a Student&#8217;s <em data-effect="italics">t</em>-distribution:</strong> <em data-effect="italics">p</em>-value = 0.0054</p> <p id="element-90"><strong>Graph:</strong></p> <div id="hyptest22_cmp1" class="bc-figure figure"><span id="id18963169" data-type="media" data-alt="This is a normal distribution curve representing the difference in the average amount of time girls and boys play sports all day. The mean is equal to zero, and the values -1.2, 0, and 1.2 are labeled on the horizontal axis. Two vertical lines extend from -1.2 and 1.2 to the curve. 
The region to the left of x = -1.2 and the region to the right of x = 1.2 are shaded to represent the p-value. The area of each region is 0.0028."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch10_02_01-1.jpg" alt="This is a normal distribution curve representing the difference in the average amount of time girls and boys play sports all day. The mean is equal to zero, and the values -1.2, 0, and 1.2 are labeled on the horizontal axis. Two vertical lines extend from -1.2 and 1.2 to the curve. The region to the left of x = -1.2 and the region to the right of x = 1.2 are shaded to represent the p-value. The area of each region is 0.0028." width="380" data-media-type="image/jpg" /></span></div> <p><span data-type="newline"><br /> </span>\({s}_{g}=0.866\)<span data-type="newline"><br /> </span>\({s}_{b}=1\)<span data-type="newline"><br /> </span>So, \({\overline{x}}_{g}-{\overline{x}}_{b}\) = 2 – 3.2 = –1.2 <span data-type="newline"><br /> </span>Half the <em data-effect="italics">p</em>-value is below –1.2 and half is above 1.2.</p> <p><strong>Make a decision:</strong> Since <em data-effect="italics">α</em> &gt; <em data-effect="italics">p</em>-value, reject <em data-effect="italics">H<sub>0</sub></em>. This means you reject <em data-effect="italics">μ<sub>g</sub></em> = <em data-effect="italics">μ<sub>b</sub></em>. The means are different.</p> <div id="fs-idp61535456" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idp144406384">Press <code>STAT</code>. Arrow over to <code>TESTS</code> and press <code>4:2-SampTTest</code>. Arrow over to Stats and press <code>ENTER</code>. Arrow down and enter <code>2</code> for the first sample mean, <code>0.866</code> for Sx1, <code>9</code> for n1, <code>3.2</code> for the second sample mean, <code>1</code> for Sx2, and <code>16</code> for n2. Arrow down to μ1: and arrow to <code>does not equal</code> μ2. Press <code>ENTER</code>. 
Arrow down to Pooled: and select <code>No</code>. Press <code>ENTER</code>. Arrow down to <code>Calculate</code> and press <code>ENTER</code>. The <em data-effect="italics">p</em>-value is <em data-effect="italics">p</em> = 0.0054, the degrees of freedom are approximately 18.8462, and the test statistic is -3.14. Repeat the procedure, but select <code>Draw</code> instead of <code>Calculate</code>.</p> </div> <p><strong>Conclusion:</strong> At the 5% level of significance, the sample data show there is sufficient evidence to conclude that the mean number of hours that girls and boys aged seven to 11 play sports per day is different (mean number of hours boys aged seven to 11 play sports per day is greater than the mean number of hours played by girls OR the mean number of hours girls aged seven to 11 play sports per day is greater than the mean number of hours played by boys).</p> </div> </div> </div> <div id="fs-idp67783984" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="fs-idm235264144" data-type="exercise"><div id="fs-idm162906288" data-type="problem"><p id="fs-idm193306272">Two samples are shown in <a class="autogenerated-content" href="#fs-idm99631856">(Figure)</a>. Both have normal distributions. The means for the two populations are thought to be the same. Is there a difference in the means? 
Test at the 5% level of significance.</p> <table id="fs-idm99631856" summary=""><thead><tr><th></th> <th>Sample Size</th> <th>Sample Mean</th> <th>Sample Standard Deviation</th> </tr> </thead> <tbody><tr><td>Population A</td> <td>25</td> <td>5</td> <td>1</td> </tr> <tr><td>Population B</td> <td>16</td> <td>4.7</td> <td>1.2</td> </tr> </tbody> </table> </div> </div> </div> <div id="fs-idm50414208" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="fs-idm204425680">When the sum of the sample sizes is larger than 30 (<em data-effect="italics">n</em><sub>1</sub> + <em data-effect="italics">n</em><sub>2</sub> &gt; 30) you can use the normal distribution to approximate the Student&#8217;s <em data-effect="italics">t</em>.</p> </div> <div class="textbox textbox--examples" data-type="example"><p id="element-980">A study is done by a community group in two neighboring colleges to determine which one graduates students with more math classes. College A samples 11 graduates. Their average is four math classes with a standard deviation of 1.5 math classes. College B samples nine graduates. Their average is 3.5 math classes with a standard deviation of one math class. The community group believes that a student who graduates from college A <strong>has taken more math classes,</strong> on the average. Both populations have a normal distribution. Test at a 1% significance level. Answer the following questions.</p> <p>&nbsp;</p> <div id="fs-idm178911728" data-type="exercise"><div id="fs-idm177203296" data-type="problem"><p id="fs-idm171395200">a. Is this a test of two means or two proportions?</p> </div> <div id="fs-idm36568576" data-type="solution"><p id="fs-idm78962816">a. two means</p> <p>&nbsp;</p> </div> </div> <div id="fs-idm194746624" data-type="exercise"><div id="fs-idm227630576" data-type="problem"><p id="fs-idm227630448">b. 
Are the population standard deviations known or unknown?</p> </div> <div id="fs-idm217237904" data-type="solution"><p id="fs-idm217237776">b. unknown</p> <p>&nbsp;</p> </div> </div> <div id="fs-idm164304912" data-type="exercise"><div id="fs-idm35318288" data-type="problem"><p id="fs-idm220657088">c. Which distribution do you use to perform the test?</p> </div> <div id="fs-idm153443152" data-type="solution"><p id="fs-idm145309056">c. Student&#8217;s <em data-effect="italics">t</em></p> <p>&nbsp;</p> </div> </div> <div id="fs-idm130995312" data-type="exercise"><div id="fs-idm143520064" data-type="problem"><p id="fs-idm143519936">d. What is the random variable?</p> </div> <div id="fs-idm222952320" data-type="solution"><p id="fs-idm222952192">d. \({\overline{X}}_{A}-{\overline{X}}_{B}\)</p> <p>&nbsp;</p> </div> </div> <div id="fs-idm301029872" data-type="exercise"><div id="fs-idm176701568" data-type="problem"><p id="fs-idm151272736">e. What are the null and alternative hypotheses? Write the null and alternative hypotheses in words and in symbols.</p> </div> <div id="fs-idm147502464" data-type="solution"><p id="fs-idm206451408">e.</p> <ul id="fs-idp21552016"><li>\({H}_{0}:{\mu }_{A}\le {\mu }_{B}\) Null hypothesis: on average, graduates of college A have taken no more math classes than graduates of college B.</li> <li>\({H}_{a}:{\mu }_{A}&gt;{\mu }_{B}\) Alternative hypothesis: on average, graduates of college A have taken more math classes than graduates of college B.</li> </ul> <p>&nbsp;</p> <p>&nbsp;</p> </div> </div> <div id="fs-idm39370224" data-type="exercise"><div id="fs-idm154517184" data-type="problem"><p id="fs-idm154517056">f. Is this test right-, left-, or two-tailed?</p> </div> <div id="fs-idm168558752" data-type="solution"><p id="fs-idm168558624">f.</p> <div id="eip-idm29208032" class="bc-figure figure"><span id="eip-idm4786768" data-type="media" data-alt="This is a normal distribution curve with mean equal to 0. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. 
The region under the curve to the right of the line is shaded."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C10_M02_001anno-1.jpg" alt="This is a normal distribution curve with mean equal to 0. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded." width="380" data-media-type="image/jpg" /></span></div> <p><span data-type="newline"><br /> </span>right</p> <p>&nbsp;</p> </div> </div> <div id="fs-idm146752960" data-type="exercise"><div id="fs-idm148661808" data-type="problem"><p id="fs-idm148661680">g. What is the <em data-effect="italics">p</em>-value?</p> </div> <div id="fs-idm79675952" data-type="solution"><p id="fs-idm106536048">g. 0.1928</p> <p>&nbsp;</p> </div> </div> <div id="fs-idm176691440" data-type="exercise"><div id="fs-idp29101808" data-type="problem"><p id="fs-idp29101936">h. Do you reject or not reject the null hypothesis?</p> </div> <div id="fs-idm36842944" data-type="solution"><p id="fs-idm36842816">h. Do not reject.</p> <p>&nbsp;</p> </div> </div> <div id="fs-idm65144592" data-type="exercise"><div id="fs-idm153855872" data-type="problem"><p id="fs-idm41056320">i. <strong>Conclusion:</strong></p> </div> <div id="fs-idm31976880" data-type="solution"><p id="fs-idm31976752">i. At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that a student who graduates from college A has taken more math classes, on the average, than a student who graduates from college B.</p> </div> </div> </div> <div id="fs-idp145514592" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <div id="eip-idp166826064" data-type="exercise"><div id="eip-idp166826320" data-type="problem"><p id="fs-idm170467472">A study is done to determine if Company A retains its workers longer than Company B. 
Company A samples 15 workers, and their average time with the company is five years with a standard deviation of 1.2. Company B samples 20 workers, and their average time with the company is 4.5 years with a standard deviation of 0.8. The populations are normally distributed.</p> <ol id="fs-idp55515664" type="a"><li>Are the population standard deviations known?</li> <li>Conduct an appropriate hypothesis test. At the 5% significance level, what is your conclusion?</li> </ol> </div> </div> </div> <div id="fs-idm22935408" class="textbox textbox--examples" data-type="example"><p id="fs-idm10086800">A professor at a large community college wanted to determine whether there is a difference in the means of final exam scores between students who took his statistics course online and the students who took his face-to-face statistics class. He believed that the mean of the final exam scores for the online class would be lower than that of the face-to-face class. Was the professor correct? The randomly selected 30 final exam scores from each group are listed in <a class="autogenerated-content" href="#fs-idm34056944">(Figure)</a> and <a class="autogenerated-content" href="#fs-idm124378288">(Figure)</a>.</p> <table id="fs-idm34056944" summary=""><caption><span data-type="title">Online Class</span></caption> <tbody><tr><td>67.6</td> <td>41.2</td> <td>85.3</td> <td>55.9</td> <td>82.4</td> <td>91.2</td> <td>73.5</td> <td>94.1</td> <td>64.7</td> <td>64.7</td> </tr> <tr><td>70.6</td> <td>38.2</td> <td>61.8</td> <td>88.2</td> <td>70.6</td> <td>58.8</td> <td>91.2</td> <td>73.5</td> <td>82.4</td> <td>35.5</td> </tr> <tr><td>94.1</td> <td>88.2</td> <td>64.7</td> <td>55.9</td> <td>88.2</td> <td>97.1</td> <td>85.3</td> <td>61.8</td> <td>79.4</td> <td>79.4</td> </tr> </tbody> </table> <table id="fs-idm124378288" summary=""><caption><span data-type="title">Face-to-face Class</span></caption> <tbody><tr><td>77.9</td> <td>95.3</td> <td>81.2</td> <td>74.1</td> <td>98.8</td> <td>88.2</td> 
<td>85.9</td> <td>92.9</td> <td>87.1</td> <td>88.2</td> </tr> <tr><td>69.4</td> <td>57.6</td> <td>69.4</td> <td>67.1</td> <td>97.6</td> <td>85.9</td> <td>88.2</td> <td>91.8</td> <td>78.8</td> <td>71.8</td> </tr> <tr><td>98.8</td> <td>61.2</td> <td>92.9</td> <td>90.6</td> <td>97.6</td> <td>100</td> <td>95.3</td> <td>83.5</td> <td>92.9</td> <td>89.4</td> </tr> </tbody> </table> <div id="fs-idm298748224" data-type="exercise"><div id="fs-idm181409216" data-type="problem"><p id="fs-idm28696464">Is the mean of the Final Exam scores of the online class lower than the mean of the Final Exam scores of the face-to-face class? Test at a 5% significance level. Answer the following questions:</p> <ol id="fs-idp180387984" type="a"><li>Is this a test of two means or two proportions?</li> <li>Are the population standard deviations known or unknown?</li> <li>Which distribution do you use to perform the test?</li> <li>What is the random variable?</li> <li>What are the null and alternative hypotheses? Write the null and alternative hypotheses in words and in symbols.</li> <li>Is this test right, left, or two tailed?</li> <li>What is the <em data-effect="italics">p</em>-value?</li> <li>Do you reject or not reject the null hypothesis?</li> <li>At the ___ level of significance, from the sample data, there ______ (is/is not) sufficient evidence to conclude that ______.</li> </ol> <p id="fs-idp169691824">(See the conclusion in <a class="autogenerated-content" href="#element-968">(Figure)</a>, and write yours in a similar fashion)</p> <div id="fs-idm11616112" class="statistics calculator" data-type="note" data-has-label="true" data-label=""><p id="fs-idm206958928">First put the data for each group into two lists (such as L1 and L2). Press STAT. Arrow over to TESTS and press 4:2SampTTest. Make sure Data is highlighted and press ENTER. Arrow down and enter L1 for the first list and L2 for the second list. 
Arrow down to <em data-effect="italics">μ</em><sub>1</sub>: and arrow to &lt; <em data-effect="italics">μ</em><sub>2</sub> (less than), because the alternative hypothesis is left-tailed. Press ENTER. Arrow down to Pooled: No. Press ENTER. Arrow down to Calculate and press ENTER.</p> </div> <div id="fs-idm107478368" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idm218153776">Be careful not to mix up the information for Group 1 and Group 2!</p> </div> </div> <div id="fs-idm168184384" data-type="solution"><ol id="fs-idm11713040" type="a"><li>two means</li> <li>unknown</li> <li>Student&#8217;s <em data-effect="italics">t</em></li> <li>\({\overline{X}}_{1}-{\overline{X}}_{2}\)</li> <li><ol id="fs-idm109578096"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ<sub>1</sub></em> = <em data-effect="italics">μ<sub>2</sub></em> Null hypothesis: the means of the final exam scores are equal for the online and face-to-face statistics classes.</li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ<sub>1</sub></em> &lt; <em data-effect="italics">μ<sub>2</sub></em> Alternative hypothesis: the mean of the final exam scores of the online class is less than the mean of the final exam scores of the face-to-face class.</li> </ol> </li> <li>left-tailed</li> <li><em data-effect="italics">p</em>-value = 0.0011 <div id="fs-idm98197120" class="bc-figure figure"><span id="fs-idp598000" data-type="media" data-alt="This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the left of zero extends from the axis to the curve. The region under the curve to the left of the line is shaded representing p-value = 0.0011."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C10_M02_002annoN-1.jpg" alt="This is a normal distribution curve with mean equal to zero. 
A vertical line near the tail of the curve to the left of zero extends from the axis to the curve. The region under the curve to the left of the line is shaded representing p-value = 0.0011." width="380" data-media-type="image/jpg" /></span></div> </li> <li>Reject the null hypothesis</li> <li>The professor was correct. The evidence shows that the mean of the final exam scores for the online class is lower than that of the face-to-face class. <span data-type="newline"><br /> </span>At the <u data-effect="underline">5%</u> level of significance, from the sample data, there is (is/is not) sufficient evidence to conclude that the mean of the final exam scores for the online class is less than <u data-effect="underline">the mean of final exam scores of the face-to-face class.</u></li> </ol> </div> </div> </div> <p><span data-type="title">Cohen&#8217;s Standards for Small, Medium, and Large Effect Sizes</span><span data-type="term">Cohen&#8217;s <em data-effect="italics">d</em></span> is a measure of effect size based on the differences between two means. Cohen’s <em data-effect="italics">d</em>, named for United States statistician Jacob Cohen, measures the relative strength of the differences between the means of two populations based on sample data. 
The calculated value of effect size is then compared to Cohen’s standards of small, medium, and large effect sizes.</p> <table id="fs-idp93235584" summary=""><caption><span data-type="title">Cohen&#8217;s Standard Effect Sizes</span></caption> <thead><tr><th>Size of effect</th> <th><em data-effect="italics">d</em></th> </tr> </thead> <tbody><tr><td>Small</td> <td>0.2</td> </tr> <tr><td>Medium</td> <td>0.5</td> </tr> <tr><td>Large</td> <td>0.8</td> </tr> </tbody> </table> <p id="fs-idm52480352">Cohen&#8217;s <em data-effect="italics">d</em> is the measure of the difference between two means divided by the pooled standard deviation: \(d=\frac{{\overline{x}}_{1}-{\overline{x}}_{2}}{{s}_{pooled}}\) where \({s}_{pooled}=\sqrt{\frac{\left({n}_{1}-1\right){s}_{1}^{2}+\left({n}_{2}-1\right){s}_{2}^{2}}{{n}_{1}+{n}_{2}-2}}\)</p> <div id="fs-idm100194752" class="textbox textbox--examples" data-type="example"><div id="fs-idm122689856" data-type="exercise"><div id="fs-idm122689600" data-type="problem"><p id="fs-idm1441280">Calculate Cohen’s <em data-effect="italics">d</em> for <a class="autogenerated-content" href="#element-968">(Figure)</a>. Is the size of the effect small, medium, or large? Explain what the size of the effect means for this problem.</p> </div> <div id="fs-idm42685344" data-type="solution"><p id="fs-idp179952160">\({\overline{x}}_{1}\) = 4 <em data-effect="italics">s</em><sub>1</sub> = 1.5 <em data-effect="italics">n</em><sub>1</sub> = 11 <span data-type="newline"><br /> </span>\({\overline{x}}_{2}\) = 3.5 <em data-effect="italics">s</em><sub>2</sub> = 1 <em data-effect="italics">n</em><sub>2</sub> = 9 <span data-type="newline"><br /> </span><em data-effect="italics">d</em> = 0.384 <span data-type="newline"><br /> </span>The effect is small because 0.384 is between Cohen’s value of 0.2 for small effect size and 0.5 for medium effect size. 
The size of the differences of the means for the two colleges is small indicating that there is not a significant difference between them.</p> </div> </div> </div> <div id="fs-idm112602048" class="textbox textbox--examples" data-type="example"><div id="fs-idm163588752" data-type="exercise"><div id="fs-idm118829104" data-type="problem"><p id="fs-idm174261728">Calculate Cohen’s <em data-effect="italics">d</em> for <a class="autogenerated-content" href="#fs-idm22935408">(Figure)</a>. Is the size of the effect small, medium or large? Explain what the size of the effect means for this problem.</p> </div> <div id="fs-idm177693104" data-type="solution"><p id="fs-idm86592608"><em data-effect="italics">d</em> = 0.834; Large, because 0.834 is greater than Cohen’s 0.8 for a large effect size. The size of the differences between the means of the Final Exam scores of online students and students in a face-to-face class is large indicating a significant difference.</p> </div> </div> </div> <div id="fs-idp145071264" class="statistics try" data-type="note" data-has-label="true" data-label=""><div data-type="title">Try It</div> <p id="fs-idm62120944">Weighted alpha is a measure of risk-adjusted performance of stocks over a period of a year. A high positive weighted alpha signifies a stock whose price has risen while a small positive weighted alpha indicates an unchanged stock price during the time period. Weighted alpha is used to identify companies with strong upward or downward trends. 
The weighted alpha for the top 30 stocks of banks in the northeast and in the west as identified by Nasdaq on May 24, 2013 are listed in <a class="autogenerated-content" href="#fs-idm242901296">(Figure)</a> and <a class="autogenerated-content" href="#fs-idm167801152">(Figure)</a>, respectively.</p> <table id="fs-idm242901296" summary=""><caption><span data-type="title">Northeast</span></caption> <tbody><tr><td>94.2</td> <td>75.2</td> <td>69.6</td> <td>52.0</td> <td>48.0</td> <td>41.9</td> <td>36.4</td> <td>33.4</td> <td>31.5</td> <td>27.6</td> </tr> <tr><td>77.3</td> <td>71.9</td> <td>67.5</td> <td>50.6</td> <td>46.2</td> <td>38.4</td> <td>35.2</td> <td>33.0</td> <td>28.7</td> <td>26.5</td> </tr> <tr><td>76.3</td> <td>71.7</td> <td>56.3</td> <td>48.7</td> <td>43.2</td> <td>37.6</td> <td>33.7</td> <td>31.8</td> <td>28.5</td> <td>26.0</td> </tr> </tbody> </table> <table id="fs-idm167801152" summary=""><caption><span data-type="title">West</span></caption> <tbody><tr><td>126.0</td> <td>70.6</td> <td>65.2</td> <td>51.4</td> <td>45.5</td> <td>37.0</td> <td>33.0</td> <td>29.6</td> <td>23.7</td> <td>22.6</td> </tr> <tr><td>116.1</td> <td>70.6</td> <td>58.2</td> <td>51.2</td> <td>43.2</td> <td>36.0</td> <td>31.4</td> <td>28.7</td> <td>23.5</td> <td>21.6</td> </tr> <tr><td>78.2</td> <td>68.2</td> <td>55.6</td> <td>50.3</td> <td>39.0</td> <td>34.1</td> <td>31.0</td> <td>25.3</td> <td>23.4</td> <td>21.5</td> </tr> </tbody> </table> <div id="fs-idm120507648" data-type="exercise"><div id="eip-idp51089200" data-type="problem"><p id="fs-idm100910544">Is there a difference in the weighted alpha of the top 30 stocks of banks in the northeast and in the west? Test at a 5% significance level. 
Answer the following questions:</p> <ol id="fs-idp33249296" type="a"><li>Is this a test of two means or two proportions?</li> <li>Are the population standard deviations known or unknown?</li> <li>Which distribution do you use to perform the test?</li> <li>What is the random variable?</li> <li>What are the null and alternative hypotheses? Write the null and alternative hypotheses in words and in symbols.</li> <li>Is this test right, left, or two tailed?</li> <li>What is the <em data-effect="italics">p</em>-value?</li> <li>Do you reject or not reject the null hypothesis?</li> <li>At the ___ level of significance, from the sample data, there ______ (is/is not) sufficient evidence to conclude that ______.</li> <li>Calculate Cohen’s <em data-effect="italics">d</em> and interpret it.</li> </ol> </div> </div> </div> <div class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p>Data from Graduating Engineer + Computer Careers. Available online at http://www.graduatingengineer.com</p> <p id="fs-idp6735712">Data from <em data-effect="italics">Microsoft Bookshelf</em>.</p> <p id="fs-idm54110288">Data from the United States Senate website, available online at www.Senate.gov (accessed June 17, 2013).</p> <p id="fs-idp26924704">“List of current United States Senators by Age.” Wikipedia. Available online at http://en.wikipedia.org/wiki/List_of_current_United_States_Senators_by_age (accessed June 17, 2013).</p> <p id="fs-idp67778720">“Sectoring by Industry Groups.” Nasdaq. Available online at http://www.nasdaq.com/markets/barchart-sectors.aspx?page=sectors&amp;base=industry (accessed June 17, 2013).</p> <p id="fs-idp92967520">“Strip Clubs: Where Prostitution and Trafficking Happen.” Prostitution Research and Education, 2013. Available online at www.prostitutionresearch.com/ProsViolPosttrauStress.html (accessed June 17, 2013).</p> <p id="fs-idp130720928">“World Series History.” Baseball-Almanac, 2013. 
Available online at http://www.baseball-almanac.com/ws/wsmenu.shtml (accessed June 17, 2013).</p> </div> <div id="fs-idm178063424" class="summary" data-depth="1"><h3 data-type="title">Chapter Review</h3> <p id="fs-idm116979232">Two population means from independent samples where the population standard deviations are not known</p> <ul id="fs-idm111534432"><li>Random Variable: \({\overline{X}}_{1}-{\overline{X}}_{2}\) = the difference of the sample means</li> <li>Distribution: Student&#8217;s <em data-effect="italics">t</em>-distribution with degrees of freedom computed from the sample data (variances not pooled)</li> </ul> </div> <div id="fs-idm147993376" class="formula-review" data-depth="1"><h3 data-type="title">Formula Review</h3> <p id="fs-idm5111056">Standard error: <em data-effect="italics">SE</em> = \(\sqrt{\frac{{\left({s}_{1}\right)}^{2}}{{n}_{1}}+\frac{{\left({s}_{2}\right)}^{2}}{{n}_{2}}}\)</p> <p id="fs-idm105811424">Test statistic (<em data-effect="italics">t</em>-score): <em data-effect="italics">t</em> = \(\frac{\left({\overline{x}}_{1}-{\overline{x}}_{2}\right)-\left({\mu }_{1}-{\mu }_{2}\right)}{\sqrt{\frac{{\left({s}_{1}\right)}^{2}}{{n}_{1}}+\frac{{\left({s}_{2}\right)}^{2}}{{n}_{2}}}}\)</p> <p id="fs-idm96358640">Degrees of freedom:<span data-type="newline"><br /> </span> \(df= \frac{{\left(\frac{{\left({s}_{1}\right)}^{2}}{{n}_{1}}+ \frac{{\left({s}_{2}\right)}^{2}}{{n}_{2}}\right)}^{2}}{\left(\frac{1}{{n}_{1}-1}\right){\left(\frac{{\left({s}_{1}\right)}^{2}}{{n}_{1}}\right)}^{2}+\left(\frac{1}{{n}_{2}-1}\right){\left(\frac{{\left({s}_{2}\right)}^{2}}{{n}_{2}}\right)}^{2}}\)</p> <p id="fs-idm98284352">where:</p> <p id="fs-idm92032864"><em data-effect="italics">s</em><sub>1</sub> and <em data-effect="italics">s</em><sub>2</sub> are the sample standard deviations, and <em data-effect="italics">n</em><sub>1</sub> and <em data-effect="italics">n</em><sub>2</sub> are the sample sizes.</p> <p id="fs-idm32210368">\({\overline{x}}_{1}\) and \({\overline{x}}_{2}\) are the sample 
means.</p> <p id="fs-idm117522208">Cohen’s <em data-effect="italics">d</em> is a measure of effect size:</p> <p id="fs-idm117521824">\(d=\frac{{\overline{x}}_{1}-{\overline{x}}_{2}}{{s}_{pooled}}\)<span data-type="newline"><br /> </span>where \({s}_{pooled}=\sqrt{\frac{\left({n}_{1}-1\right){s}_{1}^{2}+\left({n}_{2}-1\right){s}_{2}^{2}}{{n}_{1}+{n}_{2}-2}}\)</p> </div> <div id="fs-idm9464368" class="practice" data-depth="1"><p id="fs-idm124875168"><em data-effect="italics">Use the following information to answer the next 15 exercises:</em> Indicate if the hypothesis test is for</p> <ol id="fs-idm45665008" type="a"><li>independent group means, population standard deviations and/or variances known</li> <li>independent group means, population standard deviations and/or variances unknown</li> <li>matched or paired samples</li> <li>single mean</li> <li>two proportions</li> <li>single proportion</li> </ol> <div id="fs-idm22935904" data-type="exercise"><div id="fs-idm124875424" data-type="problem"><p id="fs-idm84013008">It is believed that 70% of males pass their driver&#8217;s test on the first attempt, while 65% of females pass the test on the first attempt. Of interest is whether the proportions are in fact equal.</p> </div> <div id="fs-idm235455488" data-type="solution"><p id="fs-idm173588752">two proportions</p> </div> </div> <div id="fs-idm146907648" data-type="exercise"><div id="fs-idm146907392" data-type="problem"><p id="fs-idm146907136">A new laundry detergent is tested on consumers. Of interest is the proportion of consumers who prefer the new brand over the leading competitor. A study is done to test this.</p> </div> </div> <div id="fs-idm161821520" data-type="exercise"><div id="fs-idm161821264" data-type="problem"><p id="fs-idm164006592">A new windshield treatment claims to repel water more effectively. Ten windshields are tested by simulating rain without the new treatment. The same windshields are then treated, and the experiment is run again. 
A hypothesis test is conducted.</p> </div> <div id="fs-idm127047328" data-type="solution"><p id="fs-idm127047072">matched or paired samples</p> </div> </div> <div id="fs-idm120536064" data-type="exercise"><div id="fs-idm68027168" data-type="problem"><p id="fs-idm68026912">The known standard deviation in salary for all mid-level professionals in the financial industry is \$11,000. Company A and Company B are in the financial industry. Suppose samples are taken of mid-level professionals from Company A and from Company B. The sample mean salary for mid-level professionals in Company A is \$80,000. The sample mean salary for mid-level professionals in Company B is \$96,000. Company A and Company B management want to know if their mid-level professionals are paid differently, on average.</p> </div> </div> <div id="fs-idp21825264" data-type="exercise"><div id="fs-idp21825520" data-type="problem"><p id="fs-idm175592768">The average worker in Germany gets eight weeks of paid vacation.</p> </div> <div id="fs-idm81349024" data-type="solution"><p id="fs-idm81348768">single mean</p> </div> </div> <div id="fs-idm170345552" data-type="exercise"><div id="fs-idm170345296" data-type="problem"><p id="fs-idm39186912">According to a television commercial, 80% of dentists agree that Ultrafresh toothpaste is the best on the market.</p> </div> </div> <div id="fs-idp3576048" data-type="exercise"><div id="fs-idm27505984" data-type="problem"><p id="fs-idm27505728">It is believed that the average grade on an English essay in a particular school system for females is higher than for males. 
A random sample of 31 females had a mean score of 82 with a standard deviation of three, and a random sample of 25 males had a mean score of 76 with a standard deviation of four.</p> </div> <div id="fs-idm27710704" data-type="solution"><p id="fs-idm54785888">independent group means, population standard deviations and/or variances unknown</p> </div> </div> <div id="fs-idm32687376" data-type="exercise"><div id="fs-idm32687120" data-type="problem"><p id="fs-idm32686864">The league mean batting average is 0.280 with a known standard deviation of 0.06. The Rattlers and the Vikings belong to the league. The mean batting average for a sample of eight Rattlers is 0.210, and the mean batting average for a sample of eight Vikings is 0.260. There are 24 players on the Rattlers and 19 players on the Vikings. Are the batting averages of the Rattlers and Vikings statistically different?</p> </div> </div> <div id="fs-idm36046944" data-type="exercise"><div id="fs-idm36046688" data-type="problem"><p id="fs-idm3190160">In a random sample of 100 forests in the United States, 56 were coniferous or contained conifers. In a random sample of 80 forests in Mexico, 40 were coniferous or contained conifers. Is the proportion of conifers in the United States statistically more than the proportion of conifers in Mexico?</p> </div> <div id="fs-idm7896320" data-type="solution"><p id="fs-idm7896064">two proportions</p> </div> </div> <div id="fs-idm84811344" data-type="exercise"><div id="fs-idm84811088" data-type="problem"><p id="fs-idm16418448">A new medicine is said to help improve sleep. Eight subjects are picked at random and given the medicine. The mean hours slept for each person were recorded before starting the medication and after.</p> </div> </div> <div id="fs-idm111997184" data-type="exercise"><div id="fs-idm111996928" data-type="problem"><p id="fs-idm35434864">It is thought that teenagers sleep more than adults on average. A study is done to verify this. 
A sample of 16 teenagers has a mean of 8.9 hours slept and a standard deviation of 1.2. A sample of 12 adults has a mean of 6.9 hours slept and a standard deviation of 0.6.</p> </div> <div id="fs-idm5318624" data-type="solution"><p id="fs-idm5318368">independent group means, population standard deviations and/or variances unknown</p> </div> </div> <div id="fs-idm83613168" data-type="exercise"><div id="fs-idm83612912" data-type="problem"><p id="fs-idm122472032">Varsity athletes practice five times a week, on average.</p> </div> </div> <div id="fs-idm155074576" data-type="exercise"><div id="fs-idm187988384" data-type="problem"><p id="fs-idm187988128">A sample of 12 in-state graduate school programs at school A has a mean tuition of \$64,000 with a standard deviation of \$8,000. At school B, a sample of 16 in-state graduate programs has a mean of \$80,000 with a standard deviation of \$6,000. On average, are the mean tuitions different?</p> </div> <div id="fs-idm63313632" data-type="solution"><p id="fs-idm86332592">independent group means, population standard deviations and/or variances unknown</p> </div> </div> <div id="fs-idm86331952" data-type="exercise"><div id="fs-idm109695328" data-type="problem"><p id="fs-idm109695072">A new WiFi range booster is being offered to consumers. A researcher tests the native range of 12 different routers under the same conditions. The ranges are recorded. Then the researcher uses the new WiFi range booster and records the new ranges. Does the new WiFi range booster do a better job?</p> </div> </div> <div id="fs-idm152397632" data-type="exercise"><div id="fs-idm187246864" data-type="problem"><p id="fs-idm187246608">A high school principal claims that 30% of student athletes drive themselves to school, while 4% of non-athletes drive themselves to school. In a sample of 20 student athletes, 45% drive themselves to school. In a sample of 35 non-athlete students, 6% drive themselves to school. 
Is the percent of student athletes who drive themselves to school more than the percent of nonathletes?</p> </div> <div id="fs-idm178089840" data-type="solution"><p id="fs-idm174648448">two proportions</p> </div> </div> <p><span data-type="newline"><br /> </span><em data-effect="italics">Use the following information to answer the next three exercises:</em> A study is done to determine which of two soft drinks has more sugar. There are 13 cans of Beverage A in a sample and six cans of Beverage B. The mean amount of sugar in Beverage A is 36 grams with a standard deviation of 0.6 grams. The mean amount of sugar in Beverage B is 38 grams with a standard deviation of 0.8 grams. The researchers believe that Beverage B has more sugar than Beverage A, on average. Both populations have normal distributions.</p> <div id="fs-idm174647808" data-type="exercise"><div id="fs-idm80739952" data-type="problem"><p id="fs-idm174134640">Are standard deviations known or unknown?</p> </div> </div> <div id="fs-idm213711120" data-type="exercise"><div id="fs-idm213710864" data-type="problem"><p id="fs-idm213710608">What is the random variable?</p> </div> <div id="fs-idm10993824" data-type="solution"><p id="fs-idm10993568">The random variable is the difference between the mean amounts of sugar in the two soft drinks.</p> </div> </div> <div id="fs-idm106857024" data-type="exercise"><div id="fs-idm38595520" data-type="problem"><p id="fs-idm38595264">Is this a one-tailed or two-tailed test?</p> </div> </div> <p><em data-effect="italics">Use the following information to answer the next 12 exercises:</em> The U.S. Center for Disease Control reports that the mean life expectancy was 47.6 years for whites born in 1900 and 33.0 years for nonwhites. Suppose that you randomly survey death records for people born in 1900 in a certain county. Of the 124 whites, the mean life span was 45.3 years with a standard deviation of 12.7 years. 
Of the 82 nonwhites, the mean life span was 34.1 years with a standard deviation of 15.6 years. Conduct a hypothesis test to see if the mean life spans in the county were the same for whites and nonwhites.</p> <div id="fs-idm77134224" data-type="exercise"><div id="fs-idm77133968" data-type="problem"><p id="fs-idm82811296">Is this a test of means or proportions?</p> </div> <div id="fs-idm82810784" data-type="solution"><p id="fs-idm5951248">means</p> </div> </div> <div id="fs-idm87532368" data-type="exercise"><div id="fs-idm87532112" data-type="problem"><p id="fs-idm87531856">State the null and alternative hypotheses.</p> <ol id="fs-idm126031792" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: __________</li> <li><em data-effect="italics">H<sub>a</sub></em>: __________</li> </ol> </div> </div> <div id="fs-idp13932064" data-type="exercise"><div id="fs-idm162071840" data-type="problem"><p id="fs-idm162071584">Is this a right-tailed, left-tailed, or two-tailed test?</p> </div> <div id="fs-idm104897488" data-type="solution"><p id="fs-idm104897232">two-tailed</p> </div> </div> <div id="fs-idm104977072" data-type="exercise"><div id="fs-idm104976816" data-type="problem"><p id="fs-idm161761248">In symbols, what is the random variable of interest for this test?</p> </div> </div> <div id="fs-idm147822240" data-type="exercise"><div id="fs-idm147821984" data-type="problem"><p id="fs-idm147821728">In words, define the random variable of interest for this test.</p> </div> <div id="fs-idm2327840" data-type="solution"><p id="fs-idm2327584">the difference between the mean life spans of whites and nonwhites</p> </div> </div> <div id="fs-idm17957104" data-type="exercise"><div id="fs-idm17956848" data-type="problem"><p id="fs-idm100280816">Which distribution (normal or Student&#8217;s <em data-effect="italics">t</em>) would you use for this hypothesis test?</p> </div> </div> <div id="fs-idm121049024" data-type="exercise"><div id="fs-idm121048768" data-type="problem"><p 
id="fs-idm121048512">Explain why you chose the distribution you did for <a class="autogenerated-content" href="#fs-idm17957104">the previous exercise</a>.</p> </div> <div id="eip-idp105137344" data-type="solution"><p id="eip-idp105137600">This is a comparison of two population means with unknown population standard deviations.</p> </div> </div> <div id="fs-idm96874480" data-type="exercise"><div id="fs-idm96874224" data-type="problem"><p id="fs-idm263989856">Calculate the test statistic and <em data-effect="italics">p</em>-value.</p> </div> </div> <div id="fs-idm120924688" data-type="exercise"><div id="fs-idm120924432" data-type="problem"><p id="fs-idm120924176">Sketch a graph of the situation. Label the horizontal axis. Mark the hypothesized difference and the sample difference. Shade the area corresponding to the <em data-effect="italics">p</em>-value.</p> </div> <div id="fs-idp104610560" data-type="solution"><p id="fs-idp104610816">Check student’s solution.</p> </div> </div> <div id="fs-idm47382320" data-type="exercise"><div id="fs-idm47382064" data-type="problem"><p id="fs-idm174192304">Find the <em data-effect="italics">p</em>-value.</p> </div> </div> <div id="fs-idm129306608" data-type="exercise"><div id="fs-idm129306352" data-type="problem"><p id="fs-idm148755040">At a pre-conceived <em data-effect="italics">α</em> = 0.05, what is your:</p> <ol id="fs-idp598432" type="a"><li>Decision:</li> <li>Reason for the decision:</li> <li>Conclusion (write out in a complete sentence):</li> </ol> </div> <div id="fs-idm110404384" data-type="solution"><ol id="fs-idp229240368" type="a"><li>Reject the null hypothesis</li> <li><em data-effect="italics">p</em>-value &lt; 0.05</li> <li>There is enough evidence at the 5% level of significance to support the claim that life expectancy for people born in 1900 is different between whites and nonwhites.</li> </ol> </div> </div> <div id="fs-idm127412752" data-type="exercise"><div id="fs-idm126248464" data-type="problem"><p id="fs-idm126248208">Does it 
appear that the means are the same? Why or why not?</p> </div> </div> </div> <div id="fs-idm173509824" class="free-response" data-depth="1"><h3 data-type="title">Homework</h3> <p><em data-effect="italics">DIRECTIONS: For each of the word problems, use a solution sheet to do the hypothesis test. The solution sheet is found in <a href="/contents/c0449c55-aa47-4f1c-bd5f-0521652f0e82">Appendix E</a>. Please feel free to make copies of the solution sheets. For the online version of the book, it is suggested that you copy the .doc or the .pdf files.</em></p> <div id="eip-792" data-type="note" data-has-label="true" data-label=""><div data-type="title">NOTE</div> <p id="eip-idp140444868962512">If you are using a Student&#8217;s <em data-effect="italics">t</em>-distribution for a homework problem in what follows, including for paired data, you may assume that the underlying population is normally distributed. (When using these tests in a real situation, you must first check that assumption, however.)</p> </div> <div id="fs-idm164128" data-type="exercise"><div id="fs-idm163872" data-type="problem"><p id="fs-idm163616"></p></div> </div> <div id="fs-idm8894688" data-type="exercise"><div id="fs-idm8894432" data-type="problem"><p id="fs-idm185269312">1) A student at a four-year college claims that mean enrollment at four-year colleges is higher than at two-year colleges in the United States. Two surveys are conducted. Of the 35 two-year colleges surveyed, the mean enrollment was 5,068 with a standard deviation of 4,777. Of the 35 four-year colleges surveyed, the mean enrollment was 5,466 with a standard deviation of 8,191.</p> </div> <div id="fs-idp138385488" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="fs-idm39083984" data-type="exercise"><div id="fs-idm39083728" data-type="problem"><p id="fs-idm53880736">2) At Rachel’s 11<sup>th</sup> birthday party, eight girls were timed to see how long (in seconds) they could hold their breath in a relaxed position. 
After a two-minute rest, they timed themselves while jumping. The girls thought that the mean difference between their jumping and relaxed times would be zero. Test their hypothesis.</p> <table id="fs-idm53880480" summary="The table presents the first column as the relaxed time in seconds and the second column as the jumping time in seconds."><thead><tr><th>Relaxed time (seconds)</th> <th>Jumping time (seconds)</th> </tr> </thead> <tbody><tr><td>26</td> <td>21</td> </tr> <tr><td>47</td> <td>40</td> </tr> <tr><td>30</td> <td>28</td> </tr> <tr><td>22</td> <td>21</td> </tr> <tr><td>23</td> <td>25</td> </tr> <tr><td>45</td> <td>43</td> </tr> <tr><td>37</td> <td>35</td> </tr> <tr><td>29</td> <td>32</td> </tr> </tbody> </table> <p>&nbsp;</p> </div> </div> <div id="fs-idm122985232" data-type="exercise"><div id="fs-idm122984976" data-type="problem"><p id="fs-idm122984720">3) Mean entry-level salaries for college graduates with mechanical engineering degrees and electrical engineering degrees are believed to be approximately the same. A recruiting office thinks that the mean mechanical engineering salary is actually lower than the mean electrical engineering salary. The recruiting office randomly surveys 50 entry level mechanical engineers and 60 entry level electrical engineers. Their mean salaries were \$46,100 and \$46,700, respectively. Their standard deviations were \$3,450 and \$4,210, respectively. Conduct a hypothesis test to determine if you agree that the mean entry-level mechanical engineering salary is lower than the mean entry-level electrical engineering salary.</p> </div> <div id="fs-idm188189904" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="fs-idm62206688" data-type="exercise"><div id="fs-idm62206432" data-type="problem"><p id="fs-idm105723088">4) Marketing companies have collected data implying that teenage girls use more ring tones on their cellular phones than teenage boys do. 
In one particular study of 40 randomly chosen teenage girls and boys (20 of each) with cellular phones, the mean number of ring tones for the girls was 3.2 with a standard deviation of 1.5. The mean for the boys was 1.7 with a standard deviation of 0.8. Conduct a hypothesis test to determine if the means are approximately the same or if the girls’ mean is higher than the boys’ mean.</p> <p>&nbsp;</p> </div> </div> <p><em data-effect="italics">Use the information from <a class="autogenerated-content" href="/contents/3ef830bc-5247-460a-9007-e3fd762e5e93">(Figure)</a> to answer the next four exercises.</em></p> <div id="fs-idm50796704" data-type="exercise"><div id="fs-idm50796448" data-type="problem"><p id="fs-idm98236496">5) Using the data from Lap 1 only, conduct a hypothesis test to determine if the mean time for completing a lap in races is the same as it is in practices.</p> </div> <div id="fs-idm73210192" data-type="solution"><p>&nbsp;</p> </div> </div> <div id="fs-idm2195248" data-type="exercise"><div id="fs-idm2194992" data-type="problem"><p id="fs-idm241101904">6) Repeat the test in <a href="#fs-idm50796704">Exercise 5</a>, but use Lap 5 data this time.</p> <p>&nbsp;</p> </div> </div> <div id="fs-idm170930784" data-type="exercise"><div id="fs-idm54834800" data-type="problem"><p id="fs-idm54834544">7) Repeat the test in <a href="#fs-idm50796704">Exercise 5</a>, but this time combine the data from Laps 1 and 5.</p> </div> <div id="fs-idm39528240" data-type="solution"></div> </div> <div id="fs-idm80120272" data-type="exercise"><div id="fs-idm80120016" data-type="problem"><p id="fs-idm80119760"></p></div> </div> <p><em data-effect="italics">Use the following information to answer the next two exercises.</em> The Eastern and Western Major League Soccer conferences have a new Reserve Division that allows new players to develop their skills. 
Data for a randomly picked date showed the following annual goals.</p> <table summary="Eastern and Western Major League Soccer"><thead><tr><th>Western</th> <th>Eastern</th> </tr> </thead> <tbody><tr><td>Los Angeles 9</td> <td>D.C. United 9</td> </tr> <tr><td>FC Dallas 3</td> <td>Chicago 8</td> </tr> <tr><td>Chivas USA 4</td> <td>Columbus 7</td> </tr> <tr><td>Real Salt Lake 3</td> <td>New England 6</td> </tr> <tr><td>Colorado 4</td> <td>MetroStars 5</td> </tr> <tr><td>San Jose 4</td> <td>Kansas City 3</td> </tr> </tbody> </table> <p id="eip-549"><em data-effect="italics">Conduct a hypothesis test to answer the next two exercises.</em></p> <p>&nbsp;</p> <div id="fs-idm67199056" data-type="exercise"><div id="fs-idm67198800" data-type="problem"><p id="fs-idm67198544">8) The <strong>exact</strong> distribution for the hypothesis test is:</p> <ol id="fs-idm91569200" type="a"><li>the normal distribution</li> <li>the Student&#8217;s <em data-effect="italics">t</em>-distribution</li> <li>the uniform distribution</li> <li>the exponential distribution</li> </ol> <p>&nbsp;</p> </div> </div> <div id="fs-idm3207760" data-type="exercise"><div id="fs-idm104292672" data-type="problem"><p id="fs-idm104292416">9) If the level of significance is 0.05, the conclusion is:</p> <ol id="fs-idm104292032" type="a"><li>There is sufficient evidence to conclude that the <strong>W</strong> Division teams score fewer goals, on average, than the <strong>E</strong> teams</li> <li>There is insufficient evidence to conclude that the <strong>W</strong> Division teams score more goals, on average, than the <strong>E</strong> teams.</li> <li>There is insufficient evidence to conclude that the <strong>W</strong> teams score fewer goals, on average, than the <strong>E</strong> teams score.</li> <li>Unable to determine</li> </ol> </div> <div id="fs-idm178991312" data-type="solution"><p id="fs-idm178991056"></p></div> </div> <div id="fs-idm108118736" data-type="exercise"><div id="fs-idm122622576" 
data-type="problem"><p id="eip-idp273849808">10) Suppose a statistics instructor believes that there is no significant difference between the mean class scores of statistics day students on Exam 2 and statistics night students on Exam 2. She takes random samples from each of the populations. The mean and standard deviation for 35 statistics day students were 75.86 and 16.91. The mean and standard deviation for 37 statistics night students were 75.41 and 19.73. The “day” subscript refers to the statistics day students. The “night” subscript refers to the statistics night students. A concluding statement is:</p> <ol id="fs-idm122621936" type="a"><li>There is sufficient evidence to conclude that statistics night students&#8217; mean on Exam 2 is better than the statistics day students&#8217; mean on Exam 2.</li> <li>There is insufficient evidence to conclude that the statistics day students&#8217; mean on Exam 2 is better than the statistics night students&#8217; mean on Exam 2.</li> <li>There is insufficient evidence to conclude that there is a significant difference between the means of the statistics day students and night students on Exam 2.</li> <li>There is sufficient evidence to conclude that there is a significant difference between the means of the statistics day students and night students on Exam 2.</li> </ol> </div> </div> <div id="fs-idm81478720" data-type="exercise"><div id="fs-idp12884432" data-type="problem"><p id="fs-idp12884688">11) Researchers interviewed street prostitutes in Canada and the United States. The mean age of the 100 Canadian prostitutes upon entering prostitution was 18 with a standard deviation of six. The mean age of the 130 United States prostitutes upon entering prostitution was 20 with a standard deviation of eight. Is the mean age of entering prostitution in Canada lower than the mean age in the United States? 
Test at a 1% significance level.</p> </div> <div id="fs-idm189266736" data-type="solution"><p id="fs-idm125542800"></p></div> </div> <div id="fs-idm65734896" data-type="exercise"><div id="fs-idm65734640" data-type="problem"><p id="fs-idp44866512">12) A powder diet is tested on 49 people, and a liquid diet is tested on 36 different people. Of interest is whether the liquid diet yields a higher mean weight loss than the powder diet. The powder diet group had a mean weight loss of 42 pounds with a standard deviation of 12 pounds. The liquid diet group had a mean weight loss of 45 pounds with a standard deviation of 14 pounds.</p> </div> </div> <div data-type="exercise"><div data-type="problem"><p>13) Suppose a statistics instructor believes that there is no significant difference between the mean class scores of statistics day students on Exam 2 and statistics night students on Exam 2. She takes random samples from each of the populations. The mean and standard deviation for 35 statistics day students were 75.86 and 16.91, respectively. The mean and standard deviation for 37 statistics night students were 75.41 and 19.73. The “day” subscript refers to the statistics day students. The “night” subscript refers to the statistics night students. 
An appropriate alternative hypothesis for the hypothesis test is:</p> <ol id="eip-idp47881856" type="a"><li><em data-effect="italics">μ</em><sub>day</sub> &gt; <em data-effect="italics">μ</em><sub>night</sub></li> <li><em data-effect="italics">μ</em><sub>day</sub> &lt; <em data-effect="italics">μ</em><sub>night</sub></li> <li><em data-effect="italics">μ</em><sub>day</sub> = <em data-effect="italics">μ</em><sub>night</sub></li> <li><em data-effect="italics">μ</em><sub>day</sub> ≠ <em data-effect="italics">μ</em><sub>night</sub></li> </ol> </div> <div id="eip-605" data-type="solution"><p>&nbsp;</p> <p>1)</p> <p id="fs-idm89316432">Subscripts: 1: two-year colleges; 2: four-year colleges</p> <ol id="fs-idm89316048" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ<sub>1</sub></em> ≥ <em data-effect="italics">μ<sub>2</sub></em></li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ<sub>1</sub></em> &lt; <em data-effect="italics">μ<sub>2</sub></em></li> <li>\({\overline{X}}_{1}-{\overline{X}}_{2}\) is the difference between the mean enrollments of the two-year colleges and the four-year colleges.</li> <li>Student’s <em data-effect="italics">t</em></li> <li>test statistic: -0.2480</li> <li><em data-effect="italics">p</em>-value: 0.4019</li> <li>Check student’s solution.</li> <li><ol id="fs-idp113692848" type="i"><li>Alpha: 0.05</li> <li>Decision: Do not reject the null hypothesis.</li> <li>Reason for Decision: <em data-effect="italics">p</em>-value &gt; alpha</li> <li>Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the mean enrollment at four-year colleges is higher than at two-year colleges.</li> </ol> </li> </ol> <p>3)</p> <p id="fs-idp57225296">Subscripts: 1: mechanical engineering; 2: electrical engineering</p> <ol id="fs-idp195702880" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">µ<sub>1</sub></em> ≥ <em 
data-effect="italics">µ<sub>2</sub></em></li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">µ<sub>1</sub></em> &lt; <em data-effect="italics">µ<sub>2</sub></em></li> <li>\({\overline{X}}_{1}-{\overline{X}}_{2}\) is the difference between the mean entry-level salaries of mechanical engineers and electrical engineers.</li> <li><em data-effect="italics">t</em><sub>108</sub></li> <li>test statistic: <em data-effect="italics">t</em> = –0.82</li> <li><em data-effect="italics">p</em>-value: 0.2061</li> <li>Check student’s solution.</li> <li><ol id="fs-idp3469344" type="i"><li>Alpha: 0.05</li> <li>Decision: Do not reject the null hypothesis.</li> <li>Reason for Decision: <em data-effect="italics">p</em>-value &gt; alpha</li> <li>Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the mean entry-level salary of mechanical engineers is lower than that of electrical engineers.</li> </ol> </li> </ol> <p>5)</p> <ol id="fs-idp184583584" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">µ<sub>1</sub></em> = <em data-effect="italics">µ<sub>2</sub></em></li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">µ<sub>1</sub></em> ≠ <em data-effect="italics">µ<sub>2</sub></em></li> <li>\({\overline{X}}_{1}-{\overline{X}}_{2}\) is the difference between the mean times for completing a lap in races and in practices.</li> <li><em data-effect="italics">t</em><sub>20.32</sub></li> <li>test statistic: –4.70</li> <li><em data-effect="italics">p</em>-value: 0.0001</li> <li>Check student’s solution.</li> <li><ol id="fs-idp173433584" type="i"><li>Alpha: 0.05</li> <li>Decision: Reject the null hypothesis.</li> <li>Reason for Decision: <em data-effect="italics">p</em>-value &lt; alpha</li> <li>Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the mean time for completing a lap in races is different from that in 
practices.</li> </ol> </li> </ol> <p>7)</p> <div data-type="exercise"><div data-type="solution"><ol id="fs-idp172770560" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">µ<sub>1</sub></em> = <em data-effect="italics">µ<sub>2</sub></em></li> <li><em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">µ<sub>1</sub></em> ≠ <em data-effect="italics">µ<sub>2</sub></em></li> <li>\({\overline{X}}_{1}-{\overline{X}}_{2}\) is the difference between the mean times for completing a lap in races and in practices.</li> <li><em data-effect="italics">t</em><sub>40.94</sub></li> <li>test statistic: –5.08</li> <li><em data-effect="italics">p</em>-value: approximately zero</li> <li>Check student’s solution.</li> <li><ol id="fs-idp77676896" type="i"><li>Alpha: 0.05</li> <li>Decision: Reject the null hypothesis.</li> <li>Reason for Decision: <em data-effect="italics">p</em>-value &lt; alpha</li> <li>Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the mean time for completing a lap in races is different from that in practices.</li> </ol> </li> </ol> </div> </div> <div data-type="exercise"><div data-type="problem"><p> In two to three complete sentences, explain in detail how you might use Terri Vogel’s data to answer the following question. 
“Does Terri Vogel drive faster in races than she does in practices?”</p> <p>9) c</p> <p>11)</p> <p id="fs-idm189266480">Test: two independent sample means, population standard deviations unknown.</p> <p id="fs-idm7792944">Random variable: \({\overline{X}}_{1}-{\overline{X}}_{2}\)</p> <p id="fs-idm96995296">Hypotheses: <em data-effect="italics">H<sub>0</sub></em>: <em data-effect="italics">μ</em><sub>1</sub> = <em data-effect="italics">μ</em><sub>2</sub>; <em data-effect="italics">H<sub>a</sub></em>: <em data-effect="italics">μ</em><sub>1</sub> &lt; <em data-effect="italics">μ</em><sub>2</sub> (the mean age of entering prostitution in Canada is lower than the mean age in the United States).</p> <div id="fs-idm122708400" class="bc-figure figure"><span id="fs-idm5558192" data-type="media" data-alt="This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the left of zero extends from the axis to the curve. The region under the curve to the left of the line is shaded representing p-value = 0.0157."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_C10_M02_005anno-1.jpg" alt="This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the left of zero extends from the axis to the curve. The region under the curve to the left of the line is shaded representing p-value = 0.0157." 
width="380" data-media-type="image/jpg" /></span></div> <p id="fs-idm95276832">Graph: left-tailed</p> <p id="fs-idm5467280"><em data-effect="italics">p</em>-value: 0.0151</p> <p id="fs-idm171100928">Decision: Do not reject <em data-effect="italics">H</em><sub>0</sub>.</p> <p>Conclusion: At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that the mean age of entering prostitution in Canada is lower than the mean age in the United States.</p> <p>13) d</p> </div> </div> </div> </div> </div> <div class="textbox shaded" data-type="glossary"><h3 data-type="glossary-title">Glossary</h3> <dl><dt>Degrees of Freedom (<em data-effect="italics">df</em>)</dt> <dd id="id21146024">the number of objects in a sample that are free to vary.</dd> </dl> <dl><dt>Standard Deviation</dt> <dd id="id10539417">A number that is equal to the square root of the variance and measures how far data values are from their mean; notation: <em data-effect="italics">s</em> for sample standard deviation and <em data-effect="italics">σ</em> for population standard deviation.</dd> </dl> <dl id="variable"><dt>Variable (Random Variable)</dt> <dd id="id1167450448490">a characteristic of interest in a population being studied. Variables are commonly denoted by upper-case Latin letters <em data-effect="italics">X</em>, <em data-effect="italics">Y</em>, <em data-effect="italics">Z</em>,&#8230; A specific value from the domain (the set of all possible values of a variable) is commonly denoted by a lower-case Latin letter <em data-effect="italics">x</em>, <em data-effect="italics">y</em>, <em data-effect="italics">z</em>,&#8230;. For example, if <em data-effect="italics">X</em> is the number of children in a family, then <em data-effect="italics">x</em> represents a specific integer 0, 1, 2, 3, &#8230;. Variables in statistics differ from variables in intermediate algebra in the following two ways. 
<ul id="fs-idp40452080"><li>The domain of the random variable (RV) is not necessarily a numerical set; the domain may be expressed in words; for example, if <em data-effect="italics">X</em> = hair color, then the domain is {black, blond, gray, green, orange}.</li> <li>We can tell what specific value <em data-effect="italics">x</em> of the random variable <em data-effect="italics">X</em> takes only after performing the experiment.</li> </ul> </dd> </dl> </div> </div></div>
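<p>The test statistic, degrees of freedom, and <em data-effect="italics">p</em>-value reported in homework solution 3 (mechanical versus electrical engineering salaries) can be checked with a short script. The following is only a sketch, assuming the unpooled (Welch) two-sample <em data-effect="italics">t</em> procedure with the usual approximate degrees of freedom; the function names <code>welch_t</code> and <code>t_left_tail</code> are illustrative and do not come from the text. The left-tail area is found by numerically integrating the Student's <em data-effect="italics">t</em> density, so small rounding differences from a calculator are to be expected.</p>

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for two independent samples with unknown,
    possibly unequal population standard deviations (hypothetical helper)."""
    v1 = sd1 ** 2 / n1          # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2          # estimated variance of the second sample mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Approximate (Welch-Satterthwaite) degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def t_left_tail(t, df, steps=2000):
    """Approximate P(T <= t) for a Student's t variable using Simpson's rule."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    # Integrate the density from 0 to |t|, then use symmetry about zero.
    b = abs(t)
    h = b / steps
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += pdf(i * h) * (4 if i % 2 else 2)
    area = total * h / 3
    return 0.5 - area if t < 0 else 0.5 + area

# Summary statistics from homework exercise 3
t_stat, df = welch_t(46100, 3450, 50, 46700, 4210, 60)
p_value = t_left_tail(t_stat, df)   # left-tailed test: Ha is mu1 < mu2
print(round(t_stat, 2), round(df), round(p_value, 3))
```

<p>The same two functions can be reused for the other independent-means problems by substituting the sample means, standard deviations, and sample sizes. For a right-tailed test the <em data-effect="italics">p</em>-value is one minus the left-tail area, and for a two-tailed test it is twice the smaller tail area.</p>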
<div class="chapter standard" id="chapter-hypothesis-testing-for-two-means-and-two-proportions" title="Activity 11.5: Hypothesis Testing for Two Means and Two Proportions"><div class="chapter-title-wrap"><h3 class="chapter-number">66</h3><h2 class="chapter-title"><span class="display-none">Activity 11.5: Hypothesis Testing for Two Means and Two Proportions</span></h2></div><div class="ugc chapter-ugc"><p>&nbsp;</p> <div id="fs-id1918441" class="statistics lab" data-type="note" data-has-label="true" data-label=""><div data-type="title">Hypothesis Testing for Two Means and Two Proportions</div> <p id="id5260693">Class Time:</p> <p id="id6186867">Names:</p> <p id="fs-idm110643216"><span data-type="title">Student Learning Outcomes</span></p> <ul id="id3866304"><li>The student will select the appropriate distributions to use in each case.</li> <li>The student will conduct hypothesis tests and interpret the results.</li> </ul> <div id="id6083673" data-type="list"><div data-type="title">Supplies:</div> <ul><li>the business section from two consecutive days’ newspapers</li> <li>three small packages of M&amp;Ms®</li> <li>five small packages of Reese&#8217;s Pieces®</li> </ul> </div> <p><span data-type="title">Increasing Stocks Survey</span><span data-type="newline"><br /> </span>Look at yesterday’s newspaper business section. Conduct a hypothesis test to determine if the proportion of New York Stock Exchange (NYSE) stocks that increased is greater than the proportion of NASDAQ stocks that increased. 
As randomly as possible, choose 40 NYSE stocks, and 32 NASDAQ stocks and complete the following statements.</p> <ol id="list-86476434" data-mark-suffix="."><li><em data-effect="italics">H<sub>0</sub>: _________</em></li> <li><em data-effect="italics">H<sub>a</sub>: _________</em></li> <li>In words, define the random variable.</li> <li>The distribution to use for the test is _____________.</li> <li>Calculate the test statistic using your data.</li> <li>Draw a graph and label it appropriately. Shade the actual level of significance. <ol id="list-875872354" type="a"><li>Graph: <div id="id6690350" class="bc-figure figure"><span id="id6690354" data-type="media" data-alt="This is a blank graph template. The vertical and horizontal axes are unlabeled."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch10_11_02-1.png" alt="This is a blank graph template. The vertical and horizontal axes are unlabeled." width="380" data-media-type="image/png" /></span></div> </li> <li>Calculate the <em data-effect="italics">p</em>-value.</li> </ol> </li> <li>Do you reject or not reject the null hypothesis? Why?</li> <li>Write a clear conclusion using a complete sentence.</li> </ol> <p><span data-type="title">Decreasing Stocks Survey</span> <span data-type="newline"><br /> </span>Randomly pick eight stocks from the newspaper. Using two consecutive days’ business sections, test whether the stocks went down, on average, for the second day.</p> <ol data-mark-suffix="."><li><em data-effect="italics">H<sub>0</sub></em>: ________</li> <li><em data-effect="italics">H<sub>a</sub>: ________</em></li> <li>In words, define the random variable.</li> <li>The distribution to use for the test is _____________.</li> <li>Calculate the test statistic using your data.</li> <li>Draw a graph and label it appropriately. Shade the actual level of significance. 
<ol id="list-8758723552564" type="a"><li>Graph: <div id="id8343352" class="bc-figure figure"><span id="id8343356" data-type="media" data-alt="This is a blank graph template. The vertical and horizontal axes are unlabeled."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/fig-ch10_11_02-1.png" alt="This is a blank graph template. The vertical and horizontal axes are unlabeled." width="380" data-media-type="image/png" /></span></div> </li> <li>Calculate the <em data-effect="italics">p</em>-value:</li> </ol> </li> <li>Do you reject or not reject the null hypothesis? Why?</li> <li>Write a clear conclusion using a complete sentence.</li> </ol> <p id="element-97568234"><span data-type="title">Candy Survey</span> Buy three small packages of M&amp;Ms and five small packages of Reese&#8217;s Pieces (same net weight as the M&amp;Ms). Test whether or not the mean number of candy pieces per package is the same for the two brands.</p> <ol id="list343" data-mark-suffix="."><li><em data-effect="italics">H<sub>0</sub></em>: ________</li> <li><em data-effect="italics">H<sub>a</sub></em>: ________</li> <li>In words, define the random variable.</li> <li>What distribution should be used for this test?</li> <li>Calculate the test statistic using your data.</li> <li>Draw a graph and label it appropriately. Shade the actual level of significance. <ol id="list-896293765" type="a"><li>Graph: <div id="id8343564" class="bc-figure figure"><span id="id8343568" data-type="media" data-alt="This is a blank graph template. The vertical and horizontal axes are unlabeled."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch10_11_03-1.png" alt="This is a blank graph template. The vertical and horizontal axes are unlabeled." 
width="380" data-media-type="image/png" /></span></div> </li> <li>Calculate the <em data-effect="italics">p</em>-value.</li> </ol> </li> <li>Do you reject or not reject the null hypothesis? Why?</li> <li>Write a clear conclusion using a complete sentence.</li> </ol> <p id="element-858234"><span data-type="title">Shoe Survey</span> Test whether women have, on average, more pairs of shoes than men. Include all forms of sneakers, shoes, sandals, and boots. Use your class as the sample.</p> <ol id="list287" data-mark-suffix="."><li><em data-effect="italics">H<sub>0</sub></em>: ________</li> <li><em data-effect="italics">H<sub>a</sub></em>: ________</li> <li>In words, define the random variable.</li> <li>The distribution to use for the test is ________________.</li> <li>Calculate the test statistic using your data.</li> <li>Draw a graph and label it appropriately. Shade the actual level of significance. <ol id="list-8758723875552564" type="a"><li>Graph: <div id="id8357842" class="bc-figure figure"><span id="id8357847" data-type="media" data-alt="This is a blank graph template. The vertical and horizontal axes are unlabeled."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/fig-ch10_11_04-1.png" alt="This is a blank graph template. The vertical and horizontal axes are unlabeled." width="380" data-media-type="image/png" /></span></div> </li> <li>Calculate the <em data-effect="italics">p</em>-value.</li> </ol> </li> <li>Do you reject or not reject the null hypothesis? Why?</li> <li>Write a clear conclusion using a complete sentence.</li> </ol> </div> </div></div>
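<p>For the Increasing Stocks Survey in the activity above, the test statistic and <em data-effect="italics">p</em>-value can be computed once the counts are collected. The sketch below assumes the pooled two-proportion <em data-effect="italics">z</em>-test with a right-tailed alternative (NYSE proportion greater than NASDAQ proportion). The sample sizes 40 and 32 come from the activity, but the counts of increasing stocks (22 and 14) are placeholder values chosen only for illustration and must be replaced with your own data; the function name <code>two_proportion_z</code> is likewise hypothetical.</p>

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and right-tailed p-value
    (hypothetical helper; Ha: p1 > p2)."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)            # pooled proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_right = 0.5 * math.erfc(z / math.sqrt(2))   # P(Z >= z) for standard normal Z
    return z, p_right

# Placeholder counts: 22 of 40 NYSE stocks and 14 of 32 NASDAQ stocks increased.
z, p_value = two_proportion_z(22, 40, 14, 32)
print(round(z, 3), round(p_value, 3))
```

<p>With real newspaper data, compare the resulting <em data-effect="italics">p</em>-value with your chosen level of significance to decide whether to reject the null hypothesis.</p>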
<div class="back-matter miscellaneous" id="back-matter-group-and-partner-projects" title="Group and Partner Projects"><div class="back-matter-title-wrap"><h3 class="back-matter-number">2</h3><h1 class="back-matter-title"><span class="display-none">Group and Partner Projects</span></h1></div><div class="ugc back-matter-ugc"><p>&nbsp;</p> <div id="fs-id1168805567129" class="bc-section section" data-depth="1"><h3 data-type="title">Univariate Data</h3> <div id="element-141" class="bc-section section" data-depth="2"><h4 data-type="title">Student Learning Objectives</h4> <ul id="element-815"><li>The student will design and carry out a survey.</li> <li>The student will analyze and graphically display the results of the survey.</li> </ul> </div> <div id="element-782" class="bc-section section" data-depth="2"><h4 data-type="title">Instructions</h4> <p>As you complete each task below, check it off. Answer all questions in your summary. <span data-type="newline"><br /> </span>____ Decide what data you are going to study.</p> <div id="eip-idm151881920" data-type="note"><p id="eip-idm149175552">Here are two examples, but you may <strong>NOT</strong> use them: number of M&amp;M&#8217;s per bag, number of pencils students have in their backpacks.</p> </div> <p><span data-type="newline"><br /> </span>____ Are your data discrete or continuous? How do you know? <span data-type="newline"><br /> </span>____ Decide how you are going to collect the data (for instance, buy 30 bags of M&amp;M&#8217;s; collect data from the World Wide Web). <span data-type="newline"><br /> </span>____ Describe your sampling technique in detail. Use cluster, stratified, systematic, or simple random (using a random number generator) sampling. Do not use convenience sampling. Which method did you use? Why did you pick that method? <span data-type="newline"><br /> </span>____ Conduct your survey. 
<strong>Your data size must be at least 30.</strong><span data-type="newline"><br /> </span>____ Summarize your data in a chart with columns showing <strong>data value, frequency, relative frequency and cumulative relative frequency.</strong><span data-type="newline"><br /> </span>Answer the following (rounded to two decimal places):</p> <ol id="eip-idm192589760" type="a"><li>\(\overline{x}\) = _____</li> <li><em data-effect="italics">s</em> = _____</li> <li>First quartile = _____</li> <li>Median = _____</li> <li>70<sup>th</sup> percentile = _____</li> </ol> <p>____ What value is two standard deviations above the mean?</p> <p id="eip-778">____ What value is 1.5 standard deviations below the mean? <span data-type="newline"><br /> </span>____ Construct a histogram displaying your data. <span data-type="newline"><br /> </span>____ In complete sentences, describe the shape of your graph. <span data-type="newline"><br /> </span>____ Do you notice any potential outliers? If so, what values are they? Show your work in how you used the potential outlier formula to determine whether or not the values might be outliers. <span data-type="newline"><br /> </span>____ Construct a box plot displaying your data. <span data-type="newline"><br /> </span>____ Does the middle 50% of the data appear to be concentrated together or spread apart? Explain how you determined this. 
<span data-type="newline"><br /> </span>____ Looking at both the histogram and the box plot, discuss the distribution of your data.</p> </div> <div id="element-175" class="bc-section section" data-depth="2"><h4 data-type="title">Assignment Checklist</h4> <p id="element-810">You need to turn in the following typed and stapled packet, with pages in the following order:</p> <ul id="element-276" data-labeled-item="true" data-mark-suffix=""><li data-label="____"><strong>Cover sheet</strong>: name, class time, and name of your study</li> <li data-label="____"><strong>Summary page</strong>: This should contain paragraphs written with complete sentences. It should include answers to all the questions above. It should also include statements describing the population under study, the sample, a parameter or parameters being studied, and the statistic or statistics produced.</li> <li data-label="____"><strong>URL</strong> for data, if your data are from the World Wide Web</li> <li data-label="____"><strong>Chart of data, frequency, relative frequency, and cumulative relative frequency</strong></li> <li data-label="____"><strong>Page(s) of graphs:</strong> histogram and box plot</li> </ul> </div> </div> <div id="eip-143" class="bc-section section" data-depth="1"><h3 data-type="title">Continuous Distributions and Central Limit Theorem</h3> <div id="element-958" class="bc-section section" data-depth="2"><h4 data-type="title">Student Learning Objectives</h4> <ul id="element-721"><li>The student will collect a sample of continuous data.</li> <li>The student will attempt to fit the data sample to various distribution models.</li> <li>The student will validate the central limit theorem.</li> </ul> </div> <div id="element-463" class="bc-section section" data-depth="2"><h4 data-type="title">Instructions</h4> <p id="element-419">As you complete each task below, check it off. 
Answer all questions in your summary.</p> </div> <div class="bc-section section" data-depth="2"><h4 data-type="title">Part I: Sampling</h4> <p id="eip-705">____ Decide what <strong>continuous</strong> data you are going to study. (Here are two examples, but you may NOT use them: the amount of money a student spent on college supplies this term, or the length of time a long-distance telephone call lasts.) <span data-type="newline"><br /> </span>____ Describe your sampling technique in detail. Use cluster, stratified, systematic, or simple random (using a random number generator) sampling. Do not use convenience sampling. What method did you use? Why did you pick that method? <span data-type="newline"><br /> </span>____ Conduct your survey. Gather <strong>at least 150 pieces of continuous, quantitative data</strong>. <span data-type="newline"><br /> </span>____ Define (in words) the random variable for your data. <em data-effect="italics">X</em> = _______ <span data-type="newline"><br /> </span>____ Create two lists of your data: (1) unordered data, (2) data ordered from smallest to largest. <span data-type="newline"><br /> </span>____ Find the sample mean and the sample standard deviation (rounded to two decimal places).</p> <ol id="list-168" type="a"><li>\(\overline{x}\) = ______</li> <li><em data-effect="italics">s</em> = ______</li> </ol> <p>____ Construct a histogram of your data containing five to ten intervals of equal width. The histogram should be a representative display of your data. Label and scale it.</p> </div> <div id="element-747" class="bc-section section" data-depth="2"><h4 data-type="title">Part II: Possible Distributions</h4> <p id="eip-156">____ Suppose that <em data-effect="italics">X</em> followed each of the following theoretical distributions. Set up each distribution using the appropriate information from your data. 
<span data-type="newline"><br /> </span>____ Uniform: <em data-effect="italics">X</em> ~ <em data-effect="italics">U</em> ____________ Use the lowest and highest values as <em data-effect="italics">a</em> and <em data-effect="italics">b</em>. <span data-type="newline"><br /> </span>____ Normal: <em data-effect="italics">X</em> ~ <em data-effect="italics">N</em> ____________ Use \(\overline{x}\) to estimate <em data-effect="italics">μ</em> and <em data-effect="italics">s</em> to estimate <em data-effect="italics">σ</em>. <span data-type="newline"><br /> </span>____ <strong>Must</strong> your data fit one of the above distributions? Explain why or why not. <span data-type="newline"><br /> </span>____ <strong>Could</strong> the data fit two or three of the previous distributions (at the same time)? Explain. <span data-type="newline"><br /> </span>____ Calculate the value <em data-effect="italics">k</em> (an <em data-effect="italics">X</em> value) that is 1.75 standard deviations above the sample mean. 
<em data-effect="italics">k</em> = _________ (rounded to two decimal places) Note: <em data-effect="italics">k</em> = \(\overline{x}\) + (1.75)<em data-effect="italics">s</em> <span data-type="newline"><br /> </span>____ Determine the relative frequencies (<em data-effect="italics">RF</em>) rounded to four decimal places.</p> <div id="eip-354" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idm12100736">\(RF=\frac{\text{frequency}}{\text{total number surveyed}}\)</p> </div> <ol id="eip-965" type="a"><li><em data-effect="italics">RF</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">k</em>) = ______</li> <li><em data-effect="italics">RF</em>(<em data-effect="italics">X</em> &gt; <em data-effect="italics">k</em>) = ______</li> <li><em data-effect="italics">RF</em>(<em data-effect="italics">X</em> = <em data-effect="italics">k</em>) = ______</li> </ol> <div id="id16765214" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="fs-idp165084800">You should have one page for the uniform distribution, one page for the exponential distribution, and one page for the normal distribution.</p> </div> <p id="eip-917">____ State the distribution: <em data-effect="italics">X</em> ~ _________ <span data-type="newline"><br /> </span>____ Draw a graph for each of the three theoretical distributions. Label the axes and mark them appropriately. 
<span data-type="newline"><br /> </span>____ Find the following theoretical probabilities (rounded to four decimal places).</p> <ol id="eip-idm28652656" type="a"><li><em data-effect="italics">P</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">k</em>) = ______</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">X</em> &gt; <em data-effect="italics">k</em>) = ______</li> <li><em data-effect="italics">P</em>(<em data-effect="italics">X</em> = <em data-effect="italics">k</em>) = ______</li> </ol> <p>____ Compare the relative frequencies to the corresponding probabilities. Are the values close? <span data-type="newline"><br /> </span>____ Does it appear that the data fit the distribution well? Justify your answer by comparing the probabilities to the relative frequencies, and the histograms to the theoretical graphs.</p> </div> <div class="bc-section section" data-depth="2"><h4 data-type="title">Part III: CLT Experiments</h4> <p id="eip-159">______ From your original data (before ordering), use a random number generator to pick 40 samples of size five. For each sample, calculate the average. <span data-type="newline"><br /> </span>______ On a separate page, attached to the summary, include the 40 samples of size five, along with the 40 sample averages. <span data-type="newline"><br /> </span>______ List the 40 averages in order from smallest to largest. <span data-type="newline"><br /> </span>______ Define the random variable, \(\overline{X}\), in words. \(\overline{X}\) = _______________ <span data-type="newline"><br /> </span>______ State the approximate theoretical distribution of \(\overline{X}\). \(\overline{X}\) ~ ______________ <span data-type="newline"><br /> </span>______ Base this on the mean and standard deviation from your original data. <span data-type="newline"><br /> </span>______ Construct a histogram displaying your data. Use five to six intervals of equal width. Label and scale it. 
<span data-type="newline"><br /> </span>Calculate the value \(\overline{k}\) (an \(\overline{X}\) value) that is 1.75 standard deviations above the sample mean. \(\overline{k}\) = _____ (rounded to two decimal places) <span data-type="newline"><br /> </span>Determine the relative frequencies (<em data-effect="italics">RF</em>) rounded to four decimal places.</p> <ol id="eip-idp20014992" type="a"><li><em data-effect="italics">RF</em>(\(\overline{X}\) &lt; \(\overline{k}\)) = _______</li> <li><em data-effect="italics">RF</em>(\(\overline{X}\) &gt; \(\overline{k}\)) = _______</li> <li><em data-effect="italics">RF</em>(\(\overline{X}\) = \(\overline{k}\)) = _______</li> </ol> <p>Find the following theoretical probabilities (rounded to four decimal places).</p> <ol id="list-12-2" type="a"><li><em data-effect="italics">P</em>(\(\overline{X}\) &lt; \(\overline{k}\)) = _______</li> <li><em data-effect="italics">P</em>(\(\overline{X}\) &gt; \(\overline{k}\)) = _______</li> <li><em data-effect="italics">P</em>(\(\overline{X}\) = \(\overline{k}\)) = _______</li> </ol> <p>______ Draw the graph of the theoretical distribution of \(\overline{X}\). <span data-type="newline"><br /> </span>______ Compare the relative frequencies to the probabilities. Are the values close? <span data-type="newline"><br /> </span>______ Does it appear that the data of averages fit the distribution of \(\overline{X}\) well? Justify your answer by comparing the probabilities to the relative frequencies, and the histogram to the theoretical graph. <span data-type="newline"><br /> </span>In three to five complete sentences for each, answer the following questions. Give thoughtful explanations. <span data-type="newline"><br /> </span>______ In summary, do your original data seem to fit the uniform, exponential, or normal distributions? Answer why or why not for each distribution. If the data do not fit any of those distributions, explain why. 
<span data-type="newline"><br /> </span>______ What happened to the shape and distribution when you averaged your data? <strong>In theory,</strong> what should have happened? In theory, would “it” always happen? Why or why not? <span data-type="newline"><br /> </span>______ Were the relative frequencies compared to the theoretical probabilities closer when comparing the \(X\) or \(\overline{X}\) distributions? Explain your answer.</p> </div> <div id="element-413" class="bc-section section" data-depth="2"><h4 data-type="title">Assignment Checklist</h4> <p id="element-394">You need to turn in the following typed and stapled packet, with pages in the following order: <span data-type="newline"><br /> </span>____ <strong>Cover sheet</strong>: name, class time, and name of your study <span data-type="newline"><br /> </span>____ <strong>Summary pages</strong>: These should contain several paragraphs written with complete sentences that describe the experiment, including what you studied and your sampling technique, as well as answers to all of the previously asked questions <span data-type="newline"><br /> </span>____ <strong>URL</strong> for data, if your data are from the World Wide Web <span data-type="newline"><br /> </span>____ <strong>Pages, one for each theoretical distribution</strong>, with the distribution stated, the graph, and the probability questions answered <span data-type="newline"><br /> </span>____ <strong>Pages of the data requested</strong> <span data-type="newline"><br /> </span>____ <strong>All graphs required</strong></p> </div> </div> <div id="eip-496" class="bc-section section" data-depth="1"><h3 data-type="title">Hypothesis Testing-Article</h3> <div id="element-517" class="bc-section section" data-depth="2"><h4 data-type="title">Student Learning Objectives</h4> <ul id="element-599"><li>The student will identify a hypothesis testing problem in print.</li> <li>The student will conduct a survey to verify or dispute the results of the 
hypothesis test.</li> <li>The student will summarize the article, analysis, and conclusions in a report.</li> </ul> </div> <div id="element-708" class="bc-section section" data-depth="2"><h4 data-type="title">Instructions</h4> <p id="element-262">As you complete each task, check it off. Answer all questions in your summary. <span data-type="newline"><br /> </span>____<strong>Find an article</strong> in a newspaper, magazine, or on the internet which makes a claim about <strong>ONE</strong> population mean or <strong>ONE</strong> population proportion. The claim may be based upon a survey that the article was reporting on. Decide whether this claim is the null or alternate hypothesis. <span data-type="newline"><br /> </span>____<strong>Copy or print out the article</strong> and include a copy in your project, along with the source. <span data-type="newline"><br /> </span>____<strong>State how you will collect your data.</strong> (Convenience sampling is not acceptable.) <span data-type="newline"><br /> </span>____<strong>Conduct your survey. You must have more than 50 responses in your sample.</strong> When you hand in your final project, attach the tally sheet or the packet of questionnaires that you used to collect data. Your data must be real. <span data-type="newline"><br /> </span>____<strong>State the statistics</strong> that are a result of your data collection: sample size, sample mean, and sample standard deviation, OR sample size and number of successes. <span data-type="newline"><br /> </span>____<strong>Make two copies of the appropriate solution sheet.</strong> <span data-type="newline"><br /> </span>____<strong>Record the hypothesis test</strong> on the solution sheet, based on your experiment. <strong>Do a DRAFT solution</strong> first on one of the solution sheets and check it over carefully. Have a classmate check your solution to see if it is done correctly. Make your decision using a 5% level of significance. 
Include the 95% confidence interval on the solution sheet. <span data-type="newline"><br /> </span>____<strong>Create a graph that illustrates your data.</strong> This may be a pie or bar graph or may be a histogram or box plot, depending on the nature of your data. Produce a graph that makes sense for your data and gives useful visual information about your data. You may need to look at several types of graphs before you decide which is the most appropriate for the type of data in your project. <span data-type="newline"><br /> </span>____<strong>Write your summary</strong> (in complete sentences and paragraphs, with proper grammar and correct spelling) that describes the project. The summary <strong>MUST</strong> include:</p> <ol id="eip-idm60281968" type="a"><li>Brief discussion of the article, including the source</li> <li>Statement of the claim made in the article (one of the hypotheses).</li> <li>Detailed description of how, where, and when you collected the data, including the sampling technique; did you use cluster, stratified, systematic, or simple random sampling (using a random number generator)? 
As previously mentioned, convenience sampling is not acceptable.</li> <li>Conclusion about the article claim in light of your hypothesis test; this is the conclusion of your hypothesis test, stated in words, in the context of the situation in your project in sentence form, as if you were writing this conclusion for a non-statistician.</li> <li>Sentence interpreting your confidence interval in the context of the situation in your project</li> </ol> </div> <div id="element-253" class="bc-section section" data-depth="2"><h4 data-type="title">Assignment Checklist</h4> <p id="element-22">Turn in the following typed (12 point) and stapled packet for your final project: <span data-type="newline"><br /> </span>____<strong>Cover sheet</strong> containing your name(s), class time, and the name of your study <span data-type="newline"><br /> </span>____<strong>Summary</strong>, which includes all items listed on summary checklist <span data-type="newline"><br /> </span>____<strong>Solution sheet</strong> neatly and completely filled out. The solution sheet does not need to be typed. <span data-type="newline"><br /> </span>____<strong>Graphic representation of your data</strong>, created following the guidelines previously discussed; include only graphs which are appropriate and useful. <span data-type="newline"><br /> </span>____<strong>Raw data collected AND a table summarizing the sample data</strong> (<em data-effect="italics">n</em>, \(\overline{x}\) and <em data-effect="italics">s</em>; or <em data-effect="italics">x</em>, <em data-effect="italics">n</em>, and <em data-effect="italics">p</em>’, as appropriate for your hypotheses); the raw data does not need to be typed, but the summary does. Hand in the data as you collected it. 
(Either attach your tally sheet or an envelope containing your questionnaires.)</p> </div> </div> <div id="eip-625" class="bc-section section" data-depth="1"><h3 data-type="title">Bivariate Data, Linear Regression, and Univariate Data</h3> <div id="element-564" class="bc-section section" data-depth="2"><h4 data-type="title">Student Learning Objectives</h4> <ul id="list-564"><li>The student will collect a bivariate data sample through the use of appropriate sampling techniques.</li> <li>The student will attempt to fit the data to a linear model.</li> <li>The student will determine the appropriateness of the linear fit of the model.</li> <li>The student will analyze and graph univariate data.</li> </ul> </div> <div id="element-765" class="bc-section section" data-depth="2"><h4 data-type="title">Instructions</h4> <ol id="list-765" type="1"><li>As you complete each task below, check it off. Answer all questions in your introduction or summary.</li> <li>Check your course calendar for intermediate and final due dates.</li> <li>Graphs may be constructed by hand or by computer, unless your instructor informs you otherwise. All graphs must be neat and accurate.</li> <li>All other responses must be done on the computer.</li> <li>Neatness and quality of explanations are used to determine your final grade.</li> </ol> </div> <div id="element-108" class="bc-section section" data-depth="2"><h4 data-type="title">Part I: Bivariate Data</h4> <p id="eip-idm74600944"><span data-type="title">Introduction</span>____State the bivariate data your group is going to study.</p> <div id="eip-idp126031264" data-type="note" data-label="Examples" data-element-type="Examples"><p id="eip-idp41011600">Here are two examples, but you may <strong>NOT</strong> use them: height vs. weight and age vs. running distance.</p> </div> <p><span data-type="newline"><br /> </span>____Describe your sampling technique in detail. 
Use cluster, stratified, systematic, or simple random sampling (using a random number generator). Convenience sampling is <strong>NOT</strong> acceptable. <span data-type="newline"><br /> </span>____Conduct your survey. Your number of pairs must be at least 30. <span data-type="newline"><br /> </span>____Print out a copy of your data.</p> <p id="eip-idp338032"><span data-type="title">Analysis</span> ____On a separate sheet of paper, construct a scatter plot of the data. Label and scale both axes. <span data-type="newline"><br /> </span>____State the least squares line and the correlation coefficient. <span data-type="newline"><br /> </span>____On your scatter plot, in a different color, construct the least squares line. <span data-type="newline"><br /> </span>____Is the correlation coefficient significant? Explain and show how you determined this. <span data-type="newline"><br /> </span>____Interpret the slope of the linear regression line in the context of the data in your project. Relate the explanation to your data, and quantify what the slope tells you. <span data-type="newline"><br /> </span>____Does the regression line seem to fit the data? Why or why not? If the data do not seem to be linear, explain whether any other model seems to fit the data better. <span data-type="newline"><br /> </span>____Are there any outliers? If so, what are they? Show your work in how you used the potential outlier formula in the Linear Regression and Correlation chapter (since you have bivariate data) to determine whether or not any pairs might be outliers.</p> </div> <div id="element-645" class="bc-section section" data-depth="2"><h4 data-type="title">Part II: Univariate Data</h4> <p>In this section, you will use the data for <strong>ONE</strong> variable only. Pick the variable that is more interesting to analyze. 
For example: if your independent variable is sequential data such as year with 30 years and one piece of data per year, your <em data-effect="italics">x</em>-values might be 1971, 1972, 1973, 1974, …, 2000. This would not be interesting to analyze. In that case, choose to use the dependent variable to analyze for this part of the project. <span data-type="newline"><br /> </span>_____Summarize your data in a chart with columns showing data value, frequency, relative frequency, and cumulative relative frequency. <span data-type="newline"><br /> </span>_____Answer the following, rounded to two decimal places:</p> <ol id="eip-idp79349232" type="a"><li>Sample mean = ______</li> <li>Sample standard deviation = ______</li> <li>First quartile = ______</li> <li>Third quartile = ______</li> <li>Median = ______</li> <li>70th percentile = ______</li> <li>Value that is 2 standard deviations above the mean = ______</li> <li>Value that is 1.5 standard deviations below the mean = ______</li> </ol> <p>_____Construct a histogram displaying your data. Group your data into six to ten intervals of equal width. Pick regularly spaced intervals that make sense in relation to your data. For example, do NOT group data by age as 20-26, 27-33, 34-40, 41-47, 48-54, 55-61 . . . Instead, maybe use age groups 19.5-24.5, 24.5-29.5, . . . or 19.5-29.5, 29.5-39.5, 39.5-49.5, . . . <span data-type="newline"><br /> </span>_____In complete sentences, describe the shape of your histogram. <span data-type="newline"><br /> </span>_____Are there any potential outliers? Which values are they? Show your work and calculations as to how you used the potential outlier formula in <a href="/contents/67ff0f10-8867-4852-85f6-0f0be2257ed4">Descriptive Statistics</a> (since you are now using univariate data) to determine which values might be outliers. <span data-type="newline"><br /> </span>_____Construct a box plot of your data. 
<span data-type="newline"><br /> </span>_____Does the middle 50% of your data appear to be concentrated together or spread out? Explain how you determined this. <span data-type="newline"><br /> </span>_____Looking at both the histogram AND the box plot, discuss the distribution of your data. For example: how does the spread of the middle 50% of your data compare to the spread of the rest of the data represented in the box plot; how does this correspond to your description of the shape of the histogram; how does the graphical display show any outliers you may have found; does the histogram show any gaps in the data that are not visible in the box plot; are there any interesting features of your data that you should point out.</p> </div> <div id="element-460" class="bc-section section" data-depth="2"><h4 data-type="title">Due Dates</h4> <ul id="eip-idm37795808"><li>Part I, Intro: __________ (keep a copy for your records)</li> <li>Part I, Analysis: __________ (keep a copy for your records)</li> <li><p id="eip-idp72137232">Entire Project, typed and stapled: __________</p> <p id="eip-idp63390944">____ Cover sheet: names, class time, and name of your study</p> <p id="eip-idp75829792">____ Part I: label the sections “Intro” and “Analysis.”</p> <p id="eip-idp27788944">____ Part II:</p> <p id="eip-idm94529296">____ Summary page containing several paragraphs written in complete sentences describing the experiment, including what you studied and how you collected your data. 
The summary page should also include answers to ALL the questions asked above.</p> <p id="eip-idp59858688">____ All graphs requested in the project</p> <p id="eip-idm78775584">____ All calculations requested to support questions in data</p> <p id="eip-idp46295168">____ Description: what you learned by doing this project, what challenges you had, how you overcame the challenges</p> </li> </ul> <div id="id41138276" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idp153219504">Include answers to ALL questions asked, even if not explicitly repeated in the items above.</p> </div> </div> </div> </div></div>
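<p>The Part III steps of the Continuous Distributions and Central Limit Theorem project (40 random samples of size five, their averages, and a comparison of relative frequencies with theoretical probabilities) can be sketched in Python as follows. This is only an illustration under stated assumptions: the 150 data values are randomly generated placeholders rather than survey results, the function name <code>clt_experiment</code> is hypothetical, and \(\overline{k}\) is assumed to be measured in standard deviations of \(\overline{X}\), that is \(s/\sqrt{5}\). Follow your instructor's definition if it differs.</p>

```python
import random
import statistics
from statistics import NormalDist


def clt_experiment(data, n_samples=40, sample_size=5, z=1.75, seed=1):
    """Sketch of Part III: sample averages versus the normal model for X-bar."""
    rng = random.Random(seed)
    x_bar = statistics.mean(data)   # mean of the original data
    s = statistics.stdev(data)      # sample standard deviation of the original data

    # Pick n_samples random samples of the given size and average each one.
    averages = sorted(
        statistics.mean(rng.sample(data, sample_size)) for _ in range(n_samples)
    )

    # Assumption: the standard deviation of X-bar is estimated by s / sqrt(n).
    std_error = s / sample_size ** 0.5
    k_bar = x_bar + z * std_error
    model = NormalDist(mu=x_bar, sigma=std_error)

    return {
        "averages": averages,
        "k_bar": k_bar,
        # Relative frequency = frequency / total number of averages.
        "rf_below": sum(a < k_bar for a in averages) / n_samples,
        "rf_above": sum(a > k_bar for a in averages) / n_samples,
        "rf_equal": sum(a == k_bar for a in averages) / n_samples,
        # Theoretical probabilities under the normal model.
        "p_below": model.cdf(k_bar),
        "p_above": 1 - model.cdf(k_bar),
        "p_equal": 0.0,  # a continuous variable takes any single value with probability 0
    }


# Placeholder data only (hypothetical): replace with your own 150 measurements.
placeholder_rng = random.Random(2022)
data = [round(placeholder_rng.uniform(0, 60), 2) for _ in range(150)]

result = clt_experiment(data)
print(f"k-bar = {result['k_bar']:.2f}")
print(f"RF(X-bar < k-bar) = {result['rf_below']:.4f}, P(X-bar < k-bar) = {result['p_below']:.4f}")
print(f"RF(X-bar > k-bar) = {result['rf_above']:.4f}, P(X-bar > k-bar) = {result['p_above']:.4f}")
```

<p>Because \(\overline{k}\) sits a fixed 1.75 standard deviations above the mean of the model, the theoretical probability below it is the same for any data set; only the relative frequencies change from sample to sample.</p>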
<div class="back-matter miscellaneous" id="back-matter-data-sets" title="Data Sets"><div class="back-matter-title-wrap"><h3 class="back-matter-number">3</h3><h1 class="back-matter-title"><span class="display-none">Data Sets</span></h1></div><div class="ugc back-matter-ugc"> <div id="element-244" class="bc-section section" data-depth="1"><h3 data-type="title">Lap Times</h3> <p id="element-336">The following tables provide lap times from Terri Vogel&#8217;s log book. Times are recorded in seconds for 2.5-mile laps completed in a series of races and practice runs.</p> <table id="id5853402" summary=""><caption><span data-type="title">Race Lap Times (in seconds)</span></caption> <thead><tr><th></th> <th>Lap 1</th> <th>Lap 2</th> <th>Lap 3</th> <th>Lap 4</th> <th>Lap 5</th> <th>Lap 6</th> <th>Lap 7</th> </tr> </thead> <tbody><tr><td>Race 1</td> <td>135</td> <td>130</td> <td>131</td> <td>132</td> <td>130</td> <td>131</td> <td>133</td> </tr> <tr><td>Race 2</td> <td>134</td> <td>131</td> <td>131</td> <td>129</td> <td>128</td> <td>128</td> <td>129</td> </tr> <tr><td>Race 3</td> <td>129</td> <td>128</td> <td>127</td> <td>127</td> <td>130</td> <td>127</td> <td>129</td> </tr> <tr><td>Race 4</td> <td>125</td> <td>125</td> <td>126</td> <td>125</td> <td>124</td> <td>125</td> <td>125</td> </tr> <tr><td>Race 5</td> <td>133</td> <td>132</td> <td>132</td> <td>132</td> <td>131</td> <td>130</td> <td>132</td> </tr> <tr><td>Race 6</td> <td>130</td> <td>130</td> <td>130</td> <td>129</td> <td>129</td> <td>130</td> <td>129</td> </tr> <tr><td>Race 7</td> <td>132</td> <td>131</td> <td>133</td> <td>131</td> <td>134</td> <td>134</td> <td>131</td> </tr> <tr><td>Race 8</td> <td>127</td> <td>128</td> <td>127</td> <td>130</td> <td>128</td> <td>126</td> <td>128</td> </tr> <tr><td>Race 9</td> <td>132</td> <td>130</td> <td>127</td> <td>128</td> <td>126</td> <td>127</td> <td>124</td> </tr> <tr><td>Race 10</td> <td>135</td> <td>131</td> <td>131</td> <td>132</td> <td>130</td> <td>131</td> <td>130</td> 
</tr> <tr><td>Race 11</td> <td>132</td> <td>131</td> <td>132</td> <td>131</td> <td>130</td> <td>129</td> <td>129</td> </tr> <tr><td>Race 12</td> <td>134</td> <td>130</td> <td>130</td> <td>130</td> <td>131</td> <td>130</td> <td>130</td> </tr> <tr><td>Race 13</td> <td>128</td> <td>127</td> <td>128</td> <td>128</td> <td>128</td> <td>129</td> <td>128</td> </tr> <tr><td>Race 14</td> <td>132</td> <td>131</td> <td>131</td> <td>131</td> <td>132</td> <td>130</td> <td>130</td> </tr> <tr><td>Race 15</td> <td>136</td> <td>129</td> <td>129</td> <td>129</td> <td>129</td> <td>129</td> <td>129</td> </tr> <tr><td>Race 16</td> <td>129</td> <td>129</td> <td>129</td> <td>128</td> <td>128</td> <td>129</td> <td>129</td> </tr> <tr><td>Race 17</td> <td>134</td> <td>131</td> <td>132</td> <td>131</td> <td>132</td> <td>132</td> <td>132</td> </tr> <tr><td>Race 18</td> <td>129</td> <td>129</td> <td>130</td> <td>130</td> <td>133</td> <td>133</td> <td>127</td> </tr> <tr><td>Race 19</td> <td>130</td> <td>129</td> <td>129</td> <td>129</td> <td>129</td> <td>129</td> <td>128</td> </tr> <tr><td>Race 20</td> <td>131</td> <td>128</td> <td>130</td> <td>128</td> <td>129</td> <td>130</td> <td>130</td> </tr> </tbody> </table> <table id="id11030993" summary=""><caption><span data-type="title">Practice Lap Times (in seconds)</span></caption> <thead><tr><th></th> <th>Lap 1</th> <th>Lap 2</th> <th>Lap 3</th> <th>Lap 4</th> <th>Lap 5</th> <th>Lap 6</th> <th>Lap 7</th> </tr> </thead> <tbody><tr><td>Practice 1</td> <td>142</td> <td>143</td> <td>180</td> <td>137</td> <td>134</td> <td>134</td> <td>172</td> </tr> <tr><td>Practice 2</td> <td>140</td> <td>135</td> <td>134</td> <td>133</td> <td>128</td> <td>128</td> <td>131</td> </tr> <tr><td>Practice 3</td> <td>130</td> <td>133</td> <td>130</td> <td>128</td> <td>135</td> <td>133</td> <td>133</td> </tr> <tr><td>Practice 4</td> <td>141</td> <td>136</td> <td>137</td> <td>136</td> <td>136</td> <td>136</td> <td>145</td> </tr> <tr><td>Practice 5</td> <td>140</td> 
<td>138</td> <td>136</td> <td>137</td> <td>135</td> <td>134</td> <td>134</td> </tr> <tr><td>Practice 6</td> <td>142</td> <td>142</td> <td>139</td> <td>138</td> <td>129</td> <td>129</td> <td>127</td> </tr> <tr><td>Practice 7</td> <td>139</td> <td>137</td> <td>135</td> <td>135</td> <td>137</td> <td>134</td> <td>135</td> </tr> <tr><td>Practice 8</td> <td>143</td> <td>136</td> <td>134</td> <td>133</td> <td>134</td> <td>133</td> <td>132</td> </tr> <tr><td>Practice 9</td> <td>135</td> <td>134</td> <td>133</td> <td>133</td> <td>132</td> <td>132</td> <td>133</td> </tr> <tr><td>Practice 10</td> <td>131</td> <td>130</td> <td>128</td> <td>129</td> <td>127</td> <td>128</td> <td>127</td> </tr> <tr><td>Practice 11</td> <td>143</td> <td>139</td> <td>139</td> <td>138</td> <td>138</td> <td>137</td> <td>138</td> </tr> <tr><td>Practice 12</td> <td>132</td> <td>133</td> <td>131</td> <td>129</td> <td>128</td> <td>127</td> <td>126</td> </tr> <tr><td>Practice 13</td> <td>149</td> <td>144</td> <td>144</td> <td>139</td> <td>138</td> <td>138</td> <td>137</td> </tr> <tr><td>Practice 14</td> <td>133</td> <td>132</td> <td>137</td> <td>133</td> <td>134</td> <td>130</td> <td>131</td> </tr> <tr><td>Practice 15</td> <td>138</td> <td>136</td> <td>133</td> <td>133</td> <td>132</td> <td>131</td> <td>131</td> </tr> </tbody> </table> </div> <div id="element-949" class="bc-section section" data-depth="1"><h3 data-type="title">Stock Prices</h3> <p id="element-492">The following table lists initial public offering (IPO) stock prices for all 1999 stocks that at least doubled in value during the first day of trading.</p> <table id="id10255660" summary=""><caption><span data-type="title">IPO Offer Prices</span></caption> <tbody><tr><td>\$17.00</td> <td>\$23.00</td> <td>\$14.00</td> <td>\$16.00</td> <td>\$12.00</td> <td>\$26.00</td> </tr> <tr><td>\$20.00</td> <td>\$22.00</td> <td>\$14.00</td> <td>\$15.00</td> <td>\$22.00</td> <td>\$18.00</td> </tr> <tr><td>\$18.00</td> <td>\$21.00</td> <td>\$21.00</td> 
<td>\$19.00</td> <td>\$15.00</td> <td>\$21.00</td> </tr> <tr><td>\$18.00</td> <td>\$17.00</td> <td>\$15.00</td> <td>\$25.00</td> <td>\$14.00</td> <td>\$30.00</td> </tr> <tr><td>\$16.00</td> <td>\$10.00</td> <td>\$20.00</td> <td>\$12.00</td> <td>\$16.00</td> <td>\$17.44</td> </tr> <tr><td>\$16.00</td> <td>\$14.00</td> <td>\$15.00</td> <td>\$20.00</td> <td>\$20.00</td> <td>\$16.00</td> </tr> <tr><td>\$17.00</td> <td>\$16.00</td> <td>\$15.00</td> <td>\$15.00</td> <td>\$19.00</td> <td>\$48.00</td> </tr> <tr><td>\$16.00</td> <td>\$18.00</td> <td>\$9.00</td> <td>\$18.00</td> <td>\$18.00</td> <td>\$20.00</td> </tr> <tr><td>\$8.00</td> <td>\$20.00</td> <td>\$17.00</td> <td>\$14.00</td> <td>\$11.00</td> <td>\$16.00</td> </tr> <tr><td>\$19.00</td> <td>\$15.00</td> <td>\$21.00</td> <td>\$12.00</td> <td>\$8.00</td> <td>\$16.00</td> </tr> <tr><td>\$13.00</td> <td>\$14.00</td> <td>\$15.00</td> <td>\$14.00</td> <td>\$13.41</td> <td>\$28.00</td> </tr> <tr><td>\$21.00</td> <td>\$17.00</td> <td>\$28.00</td> <td>\$17.00</td> <td>\$19.00</td> <td>\$16.00</td> </tr> <tr><td>\$17.00</td> <td>\$19.00</td> <td>\$18.00</td> <td>\$17.00</td> <td>\$15.00</td> <td></td> </tr> <tr><td>\$14.00</td> <td>\$21.00</td> <td>\$12.00</td> <td>\$18.00</td> <td>\$24.00</td> <td></td> </tr> <tr><td>\$15.00</td> <td>\$23.00</td> <td>\$14.00</td> <td>\$16.00</td> <td>\$12.00</td> <td></td> </tr> <tr><td>\$24.00</td> <td>\$20.00</td> <td>\$14.00</td> <td>\$14.00</td> <td>\$15.00</td> <td></td> </tr> <tr><td>\$14.00</td> <td>\$19.00</td> <td>\$16.00</td> <td>\$38.00</td> <td>\$20.00</td> <td></td> </tr> <tr><td>\$24.00</td> <td>\$16.00</td> <td>\$8.00</td> <td>\$18.00</td> <td>\$17.00</td> <td></td> </tr> <tr><td>\$16.00</td> <td>\$15.00</td> <td>\$7.00</td> <td>\$19.00</td> <td>\$12.00</td> <td></td> </tr> <tr><td>\$8.00</td> <td>\$23.00</td> <td>\$12.00</td> <td>\$18.00</td> <td>\$20.00</td> <td></td> </tr> <tr><td>\$21.00</td> <td>\$34.00</td> <td>\$16.00</td> <td>\$26.00</td> <td>\$14.00</td> <td></td> 
</tr> </tbody> </table> </div> <div id="fs-idp192653104" class="footnotes" data-depth="1"><h3 data-type="title">References</h3> <p id="fs-idp188238272">Data compiled by Jay R. Ritter of the University of Florida using data from <em data-effect="italics">Securities Data Co.</em> and <em data-effect="italics">Bloomberg</em>.</p> </div> </div></div>
<div class="back-matter miscellaneous" id="back-matter-solution-sheets" title="Solution Sheets"><div class="back-matter-title-wrap"><h3 class="back-matter-number">4</h3><h1 class="back-matter-title"><span class="display-none">Solution Sheets</span></h1></div><div class="ugc back-matter-ugc"><p>&nbsp;</p> <div id="fs-id1168979367091" class="bc-section section" data-depth="1"><h3 data-type="title">Hypothesis Testing with One Sample</h3> <p id="id46712476">Class Time: __________________________<span data-type="newline"><br /> </span> Name: _____________________________________</p> <ol id="id46712495" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: _______</li> <li><em data-effect="italics">H<sub>a</sub></em>: _______</li> <li>In words, <strong>CLEARLY</strong> state what your random variable \(\overline{X}\) or \({P}^{\prime }\) represents.</li> <li>State the distribution to use for the test.</li> <li>What is the test statistic?</li> <li>What is the <em data-effect="italics">p</em>-value? In one or two complete sentences, explain what the <em data-effect="italics">p</em>-value means for this problem.</li> <li>Use the previous information to sketch a picture of this situation. <strong>CLEARLY</strong> label and scale the horizontal axis and shade the region(s) corresponding to the <em data-effect="italics">p</em>-value. <div id="id13552252" class="bc-figure figure"><span id="id13552256" data-type="media" data-alt="This is the frequency curve of a normal distribution with blank horizontal and vertical axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_Appendix_ART_Figure_14.1-1.jpg" alt="This is the frequency curve of a normal distribution with blank horizontal and vertical axes." 
width="380" data-media-type="image/png" /></span></div> </li> <li>Indicate the correct decision (“reject” or “do not reject” the null hypothesis), the reason for it, and write an appropriate conclusion, using <strong>complete sentences</strong>. <ol id="list1" type="i"><li>Alpha: _______</li> <li>Decision: _______</li> <li>Reason for decision: _______</li> <li>Conclusion: _______</li> </ol> </li> <li>Construct a 95% confidence interval for the true mean or proportion. Include a sketch of the graph of the situation. Label the point estimate and the lower and upper bounds of the confidence interval. <div id="id13552369" class="bc-figure figure"><span id="id13552374" data-type="media" data-alt="This is the frequency curve of a normal distribution with blank horizontal and vertical axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_Appendix_ART_Figure_14.2-1.jpg" alt="This is the frequency curve of a normal distribution with blank horizontal and vertical axes." width="380" data-media-type="image/png" /></span></div> </li> </ol> </div> <div id="eip-422" class="bc-section section" data-depth="1"><h3 data-type="title">Hypothesis Testing with Two Samples</h3> <p id="eip-id1165747001060">Class Time: __________________________<span data-type="newline"><br /> </span> Name: _____________________________________</p> <ol id="eip-id1165746505302" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: _______</li> <li><em data-effect="italics">H<sub>a</sub></em>: _______</li> <li>In words, <strong>clearly</strong> state what your random variable \({\overline{X}}_{1}-{\overline{X}}_{2}\), \({{P}^{\prime }}_{1}-{{P}^{\prime }}_{2}\) or \({\overline{X}}_{d}\) represents.</li> <li>State the distribution to use for the test.</li> <li>What is the test statistic?</li> <li>What is the <em data-effect="italics">p</em>-value? 
In one to two complete sentences, explain what the <em data-effect="italics">p</em>-value means for this problem.</li> <li>Use the previous information to sketch a picture of this situation. <strong>CLEARLY</strong> label and scale the horizontal axis and shade the region(s) corresponding to the <em data-effect="italics">p</em>-value. <div id="id13017561" class="bc-figure figure"><span id="id13017566" data-type="media" data-alt="This is the frequency curve of a normal distribution with blank horizontal and vertical axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_Appendix_ART_Figure_14.3-1.jpg" alt="This is the frequency curve of a normal distribution with blank horizontal and vertical axes." width="380" data-media-type="image/png" /></span></div> </li> <li>Indicate the correct decision (“reject” or “do not reject” the null hypothesis), the reason for it, and write an appropriate conclusion, using <strong>complete sentences</strong>. <ol id="eip-id1165749453228" type="a"><li>Alpha: _______</li> <li>Decision: _______</li> <li>Reason for decision: _______</li> <li>Conclusion: _______</li> </ol> </li> <li>In complete sentences, explain how you determined which distribution to use.</li> </ol> </div> <div id="eip-494" class="bc-section section" data-depth="1"><h3 data-type="title">The Chi-Square Distribution</h3> <p id="id4294030">Class Time: __________________________<span data-type="newline"><br /> </span> Name: ____________________________________</p> <ol id="id8612793" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: _______</li> <li><em data-effect="italics">H<sub>a</sub></em>: _______</li> <li>What are the degrees of freedom?</li> <li>State the distribution to use for the test.</li> <li>What is the test statistic?</li> <li>What is the <em data-effect="italics">p</em>-value? 
In one to two complete sentences, explain what the <em data-effect="italics">p</em>-value means for this problem.</li> <li>Use the previous information to sketch a picture of this situation. <strong>Clearly</strong> label and scale the horizontal axis and shade the region(s) corresponding to the <em data-effect="italics">p</em>-value. <div id="id10146455" class="bc-figure figure"><span id="id10146460" data-type="media" data-alt="This is a right-skewed frequency curve with blank horizontal and vertical axes."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_Appendix_ART_Figure_14.4-1.jpg" alt="This is a right-skewed frequency curve with blank horizontal and vertical axes." width="380" data-media-type="image/png" /></span></div> </li> <li>Indicate the correct decision (“reject” or “do not reject” the null hypothesis), the reason for it, and write an appropriate conclusion, using <strong>complete sentences</strong>. <ol id="list1232523432" type="i"><li>Alpha: _______</li> <li>Decision: _______</li> <li>Reason for decision: _______</li> <li>Conclusion: _______</li> </ol> </li> </ol> </div> <div id="eip-893" class="bc-section section" data-depth="1"><h3 data-type="title">F Distribution and One-Way ANOVA</h3> <p id="eip-id1172364029192">Class Time: __________________________<span data-type="newline"><br /> </span> Name: ____________________________________</p> <ol id="id3582263" type="a"><li><em data-effect="italics">H<sub>0</sub></em>: _______</li> <li><em data-effect="italics">H<sub>a</sub></em>: _______</li> <li><em data-effect="italics">df</em>(<em data-effect="italics">n</em>) = ______ <em data-effect="italics">df</em>(<em data-effect="italics">d</em>) = _______</li> <li>State the distribution to use for the test.</li> <li>What is the test statistic?</li> <li>What is the <em data-effect="italics">p</em>-value?</li> <li>Use the previous information to sketch a picture of this situation. 
<strong>Clearly</strong> label and scale the horizontal axis and shade the region(s) corresponding to the <em data-effect="italics">p</em>-value. <div id="element-123123" class="bc-figure figure"><span id="id27331291" data-type="media" data-alt="This is an unlabeled number line."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/CNX_Stats_Appendix_ART_Figure_14.5-1.jpg" alt="This is an unlabeled number line." width="350" data-media-type="image/png" /></span></div> </li> <li>Indicate the correct decision (“reject” or “do not reject” the null hypothesis) and write appropriate conclusions, using <strong>complete sentences</strong>. <ol id="list12345" type="a"><li>Alpha: _______</li> <li>Decision: _______</li> <li>Reason for decision: _______</li> <li>Conclusion: _______</li> </ol> </li> </ol> </div> </div></div>
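<p>The steps on the one-sample solution sheet (test statistic, <em data-effect="italics">p</em>-value, decision, and 95% confidence interval) can be sketched in Python for the simplest case: a two-tailed test of a mean with a known population standard deviation, so the standard normal distribution applies. This is a minimal illustration, not the text's own method. The function name <code>one_sample_z_test</code> and every numeric input below are hypothetical values chosen only to show the arithmetic.</p>

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu_0, sigma, n, alpha=0.05):
    """Two-tailed z-test of H0: mu = mu_0 against Ha: mu != mu_0.

    Assumes the population standard deviation sigma is known,
    so the test statistic follows the standard normal distribution.
    """
    std_normal = NormalDist()  # N(0, 1)
    standard_error = sigma / sqrt(n)

    # Test statistic: how many standard errors the sample mean is from mu_0.
    z = (sample_mean - mu_0) / standard_error

    # Two-tailed p-value: area in both tails beyond |z|.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))

    # Decision rule: reject H0 when the p-value is less than alpha.
    decision = "reject" if p_value < alpha else "do not reject"

    # (1 - alpha) confidence interval for the true mean.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    ci = (sample_mean - z_crit * standard_error,
          sample_mean + z_crit * standard_error)

    return {"z": z, "p_value": p_value, "decision": decision, "ci": ci}


# Hypothetical inputs, for illustration only.
result = one_sample_z_test(sample_mean=103, mu_0=100, sigma=10, n=25)
print(result)
```

<p>With these made-up inputs the test statistic is 1.5 and the <em data-effect="italics">p</em>-value is larger than 0.05, so the decision is "do not reject" the null hypothesis. The written conclusion and the sketch of the shaded region still have to be completed by hand on the sheet.</p>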
<div class="back-matter miscellaneous" id="back-matter-mathematical-phrases-symbols-and-formulas" title="Mathematical Phrases, Symbols, and Formulas"><div class="back-matter-title-wrap"><h3 class="back-matter-number">5</h3><h1 class="back-matter-title"><span class="display-none">Mathematical Phrases, Symbols, and Formulas</span></h1></div><div class="ugc back-matter-ugc"><p>&nbsp;</p> <div id="fs-id1168975964661" class="bc-section section" data-depth="1"><h3 data-type="title">English Phrases Written Mathematically</h3> <table id="id7514633" summary="A mathematical translation of English phrases (for example, X is at least 4)."><thead><tr><th>When the English says:</th> <th>Interpret this as:</th> </tr> </thead> <tbody><tr><td><em data-effect="italics">X</em> is at least 4.</td> <td><em data-effect="italics">X</em> ≥ 4</td> </tr> <tr><td>The minimum of <em data-effect="italics">X</em> is 4.</td> <td><em data-effect="italics">X</em> ≥ 4</td> </tr> <tr><td><em data-effect="italics">X</em> is no less than 4.</td> <td><em data-effect="italics">X</em> ≥ 4</td> </tr> <tr><td><em data-effect="italics">X</em> is greater than or equal to 4.</td> <td><em data-effect="italics">X</em> ≥ 4</td> </tr> <tr><td><em data-effect="italics">X</em> is at most 4.</td> <td><em data-effect="italics">X</em> ≤ 4</td> </tr> <tr><td>The maximum of <em data-effect="italics">X</em> is 4.</td> <td><em data-effect="italics">X</em> ≤ 4</td> </tr> <tr><td><em data-effect="italics">X</em> is no more than 4.</td> <td><em data-effect="italics">X</em> ≤ 4</td> </tr> <tr><td><em data-effect="italics">X</em> is less than or equal to 4.</td> <td><em data-effect="italics">X</em> ≤ 4</td> </tr> <tr><td><em data-effect="italics">X</em> does not exceed 4.</td> <td><em data-effect="italics">X</em> ≤ 4</td> </tr> <tr><td><em data-effect="italics">X</em> is greater than 4.</td> <td><em data-effect="italics">X</em> &gt; 4</td> </tr> <tr><td><em data-effect="italics">X</em> is more than 4.</td> <td><em 
data-effect="italics">X</em> &gt; 4</td> </tr> <tr><td><em data-effect="italics">X</em> exceeds 4.</td> <td><em data-effect="italics">X</em> &gt; 4</td> </tr> <tr><td><em data-effect="italics">X</em> is less than 4.</td> <td><em data-effect="italics">X</em> &lt; 4</td> </tr> <tr><td>There are fewer <em data-effect="italics">X</em> than 4.</td> <td><em data-effect="italics">X</em> &lt; 4</td> </tr> <tr><td><em data-effect="italics">X</em> is 4.</td> <td><em data-effect="italics">X</em> = 4</td> </tr> <tr><td><em data-effect="italics">X</em> is equal to 4.</td> <td><em data-effect="italics">X</em> = 4</td> </tr> <tr><td><em data-effect="italics">X</em> is the same as 4.</td> <td><em data-effect="italics">X</em> = 4</td> </tr> <tr><td><em data-effect="italics">X</em> is not 4.</td> <td><em data-effect="italics">X</em> ≠ 4</td> </tr> <tr><td><em data-effect="italics">X</em> is not equal to 4.</td> <td><em data-effect="italics">X</em> ≠ 4</td> </tr> <tr><td><em data-effect="italics">X</em> is not the same as 4.</td> <td><em data-effect="italics">X</em> ≠ 4</td> </tr> <tr><td><em data-effect="italics">X</em> is different than 4.</td> <td><em data-effect="italics">X</em> ≠ 4</td> </tr> </tbody> </table> </div> <div id="eip-732" class="bc-section section" data-depth="1"><h3 data-type="title">Formulas</h3> <div id="eip-idm676519360" class="bc-section section" data-depth="2"><h4 data-type="title">Formula 1: Factorial</h4> <p id="fs-idp61451504">\(n!=n\left(n-1\right)\left(n-2\right)&#8230;\left(1\right)\text{}\)</p> <p id="fact2">\(0!=1\text{}\)<span data-type="newline" data-count="2"></span></p> <p></p> </div> <div id="eip-idm1323712048" class="bc-section section" data-depth="2"><h4 data-type="title">Formula 2: Combinations</h4> <p id="combin">\(\left(\begin{array}{l}n\\ r\end{array}\right)=\frac{n!}{\left(n-r\right)!r!}\)<span data-type="newline" data-count="2"></span></p> <p></p> </div> <div id="eip-idm1272586464" class="bc-section section" data-depth="2"><h4 
data-type="title">Formula 3: Binomial Distribution</h4> <p id="ruleexp1">\(X\sim B\left(n,p\right)\)</p> <p id="bindist1">\(P\left(X=x\right)=\left(\begin{array}{c}n\\ x\end{array}\right){p}^{x}{q}^{n-x}\), for \(x=0,1,2,\dots ,n\), where \(q=1-p\)<span data-type="newline" data-count="2"></span></p> <p></p> </div> <div id="eip-idm772198512" class="bc-section section" data-depth="2"><h4 data-type="title">Formula 4: Geometric Distribution</h4> <p id="geodist1">\(X\sim G\left(p\right)\)</p> <p id="geodist2">\(P\left(X=x\right)={q}^{x-1}p\), for \(x=1,2,3,\dots \)<span data-type="newline" data-count="2"></span></p> <p></p> </div> <div id="eip-idm731367056" class="bc-section section" data-depth="2"><h4 data-type="title">Formula 5: Hypergeometric Distribution</h4> <p id="hypgeodist1">\(X\sim H\left(r,b,n\right)\)</p> <p id="element-527">\(P\left(X=x\right)=\frac{\left(\begin{array}{c}r\\ x\end{array}\right)\left(\begin{array}{c}b\\ n-x\end{array}\right)}{\left(\begin{array}{c}r+b\\ n\end{array}\right)}\)<span data-type="newline" data-count="2"></span></p> <p></p> </div> <div id="eip-idm701382416" class="bc-section section" data-depth="2"><h4 data-type="title">Formula 6: Poisson Distribution</h4> <p id="psndist1">\(X\sim P\left(\mu \right)\)</p> <p id="psndist2">\(P\left(X=x\right)=\frac{{\mu }^{x}{e}^{-\mu }}{x!}\)<span data-type="newline" data-count="2"></span></p> <p></p> </div> <div id="eip-idm742753328" class="bc-section section" data-depth="2"><h4 data-type="title">Formula 7: Uniform Distribution</h4> <p id="unidist1">\(X\sim U\left(a,b\right)\)</p> <p id="unidist2">\(f\left(x\right)=\frac{1}{b-a}\), \(a&lt;x&lt;b\)<span data-type="newline" data-count="2"></span></p> <p></p> </div> <div id="eip-idm1453705808" class="bc-section section" data-depth="2"><h4 
data-type="title">Formula 8: Exponential Distribution</h4> <p id="expdist1">\(X\sim \mathrm{Exp}\left(m\right)\)</p> <p id="expdist2">\(f\left(x\right)=m{e}^{-mx}\), \(m&gt;0\), \(x\ge 0\)<span data-type="newline" data-count="2"></span></p> <p></p> </div> <div class="bc-section section" data-depth="2"><h4 data-type="title">Formula 9: Normal Distribution</h4> <p id="normdist1">\(X\sim N\left(\mu ,{\sigma }^{2}\right)\)</p> <p id="normdist2">\(f\left(x\right)=\frac{1}{\sigma \sqrt{2\pi }}{e}^{-\frac{{\left(x-\mu \right)}^{2}}{2{\sigma }^{2}}}\), \(-\infty &lt;x&lt;\infty \)<span data-type="newline" data-count="2"></span></p> <p></p> </div> <div id="eip-idm1165720912" class="bc-section section" data-depth="2"><h4 data-type="title">Formula 10: Gamma Function</h4> <p id="gammafn1">\(\Gamma \left(z\right)={\int }_{0}^{\infty }{x}^{z-1}{e}^{-x}\,dx\), \(z&gt;0\)</p> <p id="gammafn2">\(\Gamma \left(\frac{1}{2}\right)=\sqrt{\pi }\)</p> <p id="gammafn3">\(\Gamma \left(m+1\right)=m!\) for \(m\), a nonnegative integer</p> <p id="gammafn4">otherwise: \(\Gamma \left(a+1\right)=a\Gamma \left(a\right)\) <span data-type="newline" data-count="2"></span></p> <p></p> </div> <div id="eip-idm682236624" class="bc-section section" data-depth="2"><h4 data-type="title">Formula 11: Student&#8217;s <em data-effect="italics">t</em>-distribution</h4> <p id="stdtdist1">\(X\sim {t}_{df}\)</p> <p id="stdtdist2">\(f\left(x\right)=\frac{{\left(1+\frac{{x}^{2}}{n}\right)}^{\frac{-\left(n+1\right)}{2}}\Gamma \left(\frac{n+1}{2}\right)}{\sqrt{n\pi }\,\Gamma \left(\frac{n}{2}\right)}\)</p> <p id="stdtdist3">\(X=\frac{Z}{\sqrt{\frac{Y}{n}}}\)</p> <p id="stdtdist4">\(Z\sim N\left(0,1\right)\), \(Y\sim {\chi }_{df}^{2}\), \(n\) = degrees of freedom <span 
data-type="newline" data-count="2"></span></p> <p></p> </div> <div id="eip-idm1453739360" class="bc-section section" data-depth="2"><h4 data-type="title">Formula 12: Chi-Square Distribution</h4> <p id="chisq1">\(X\sim {\chi }_{df}^{2}\)</p> <p id="chisq2">\(f\left(x\right)=\frac{{x}^{\frac{n-2}{2}}{e}^{\frac{-x}{2}}}{{2}^{\frac{n}{2}}\Gamma \left(\frac{n}{2}\right)}\), \(x&gt;0\), \(n\) = positive integer and degrees of freedom <span data-type="newline" data-count="2"></span></p> <p></p> </div> <div id="eip-idm696624960" class="bc-section section" data-depth="2"><h4 data-type="title">Formula 13: F Distribution</h4> <p id="fdis1">\(X\sim {F}_{df\left(n\right),df\left(d\right)}\)</p> <p id="fdis2">\(df\left(n\right)=\) degrees of freedom for the numerator</p> <p id="fdis3">\(df\left(d\right)=\) degrees of freedom for the denominator</p> <p id="fdis4">\(f\left(x\right)=\frac{\Gamma \left(\frac{u+v}{2}\right)}{\Gamma \left(\frac{u}{2}\right)\Gamma \left(\frac{v}{2}\right)}{\left(\frac{u}{v}\right)}^{\frac{u}{2}}{x}^{\left(\frac{u}{2}-1\right)}{\left[1+\left(\frac{u}{v}\right)x\right]}^{-0.5\left(u+v\right)}\), \(x&gt;0\), where \(u=df\left(n\right)\) and \(v=df\left(d\right)\)</p> <p id="fdis5">\(X=\frac{Y/u}{W/v}\), where \(Y\) and \(W\) are independent chi-square random variables with \(u\) and \(v\) degrees of freedom</p> </div> </div> <div class="bc-section section" data-depth="1"><h3 data-type="title">Symbols and Their Meanings</h3> <table id="id7923354" summary="Symbols together with how they are pronounced are shown in a table."><caption><span data-type="title">Symbols and Their Meanings</span></caption> <thead><tr><th>Chapter (1st used)</th> <th>Symbol</th> <th>Spoken</th> <th>Meaning</th> </tr> </thead> <tbody><tr><td>Sampling and Data</td> <td>\(\sqrt{\phantom{x}}\)</td> <td>The square root of</td> <td>same</td> </tr> <tr><td>Sampling and Data</td> <td>\(\pi \)</td> 
<td>Pi</td> <td>3.14159… (a specific number)</td> </tr> <tr><td>Descriptive Statistics</td> <td><em data-effect="italics">Q</em><sub>1</sub></td> <td>Quartile one</td> <td>the first quartile</td> </tr> <tr><td>Descriptive Statistics</td> <td><em data-effect="italics">Q</em><sub>2</sub></td> <td>Quartile two</td> <td>the second quartile</td> </tr> <tr><td>Descriptive Statistics</td> <td><em data-effect="italics">Q</em><sub>3</sub></td> <td>Quartile three</td> <td>the third quartile</td> </tr> <tr><td>Descriptive Statistics</td> <td><em data-effect="italics">IQR</em></td> <td>interquartile range</td> <td><em data-effect="italics">Q</em><sub>3</sub> – <em data-effect="italics">Q</em><sub>1</sub> = <em data-effect="italics">IQR</em></td> </tr> <tr><td>Descriptive Statistics</td> <td>\(\overline{x}\)</td> <td>x-bar</td> <td>sample mean</td> </tr> <tr><td>Descriptive Statistics</td> <td>\(\mu \)</td> <td>mu</td> <td>population mean</td> </tr> <tr><td>Descriptive Statistics</td> <td><em data-effect="italics">s</em>; <em data-effect="italics">s<sub>x</sub></em>; <em data-effect="italics">sx</em></td> <td>s</td> <td>sample standard deviation</td> </tr> <tr><td>Descriptive Statistics</td> <td>\({s}^{2}\); \({s}_{x}^{2}\)</td> <td>s squared</td> <td>sample variance</td> </tr> <tr><td>Descriptive Statistics</td> <td>\(\sigma \); \({\sigma }_{x}\); <em data-effect="italics">σx</em></td> <td>sigma</td> <td>population standard deviation</td> </tr> <tr><td>Descriptive Statistics</td> <td>\({\sigma }^{2}\); \({\sigma }_{x}^{2}\)</td> <td>sigma squared</td> <td>population variance</td> </tr> <tr><td>Descriptive Statistics</td> <td>\(\Sigma \)</td> <td>capital sigma</td> <td>sum</td> </tr> <tr><td>Probability Topics</td> <td>\(\left\{\right\}\)</td> <td>brackets</td> <td>set notation</td> </tr> <tr><td>Probability Topics</td> <td>\(S\)</td> <td>S</td> <td>sample space</td> </tr> <tr><td>Probability Topics</td> <td>\(A\)</td> <td>Event A</td> <td>event A</td> </tr> <tr><td>Probability Topics</td> 
<td>\(P\left(A\right)\)</td> <td>probability of A</td> <td>probability of A occurring</td> </tr> <tr><td>Probability Topics</td> <td>\(P\left(A|B\right)\)</td> <td>probability of A given B</td> <td>prob. of A occurring given B has occurred</td> </tr> <tr><td>Probability Topics</td> <td>\(P\left(A\text{ OR }B\right)\)</td> <td>prob. of A or B</td> <td>prob. of A or B or both occurring</td> </tr> <tr><td>Probability Topics</td> <td>\(P\left(A\text{ AND }B\right)\)</td> <td>prob. of A and B</td> <td>prob. of both A and B occurring (same time)</td> </tr> <tr><td>Probability Topics</td> <td><em data-effect="italics">A</em>′</td> <td>A-prime, complement of A</td> <td>complement of A, not A</td> </tr> <tr><td>Probability Topics</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">A</em>′)</td> <td>prob. of complement of A</td> <td>same</td> </tr> <tr><td>Probability Topics</td> <td><em data-effect="italics">G</em><sub>1</sub></td> <td>green on first pick</td> <td>same</td> </tr> <tr><td>Probability Topics</td> <td><em data-effect="italics">P</em>(<em data-effect="italics">G</em><sub>1</sub>)</td> <td>prob. of green on first pick</td> <td>same</td> </tr> <tr><td>Discrete Random Variables</td> <td><em data-effect="italics">PDF</em></td> <td>prob. 
distribution function</td> <td>same</td> </tr> <tr><td>Discrete Random Variables</td> <td><em data-effect="italics">X</em></td> <td>X</td> <td>the random variable X</td> </tr> <tr><td>Discrete Random Variables</td> <td><em data-effect="italics">X</em> ~</td> <td>the distribution of X</td> <td>same</td> </tr> <tr><td>Discrete Random Variables</td> <td><em data-effect="italics">B</em></td> <td>binomial distribution</td> <td>same</td> </tr> <tr><td>Discrete Random Variables</td> <td><em data-effect="italics">G</em></td> <td>geometric distribution</td> <td>same</td> </tr> <tr><td>Discrete Random Variables</td> <td><em data-effect="italics">H</em></td> <td>hypergeometric dist.</td> <td>same</td> </tr> <tr><td>Discrete Random Variables</td> <td><em data-effect="italics">P</em></td> <td>Poisson dist.</td> <td>same</td> </tr> <tr><td>Discrete Random Variables</td> <td>\(\lambda \)</td> <td>Lambda</td> <td>average of Poisson distribution</td> </tr> <tr><td>Discrete Random Variables</td> <td>\(\ge \)</td> <td>greater than or equal to</td> <td>same</td> </tr> <tr><td>Discrete Random Variables</td> <td>\(\le \)</td> <td>less than or equal to</td> <td>same</td> </tr> <tr><td>Discrete Random Variables</td> <td>=</td> <td>equal to</td> <td>same</td> </tr> <tr><td>Discrete Random Variables</td> <td>≠</td> <td>not equal to</td> <td>same</td> </tr> <tr><td>Continuous Random Variables</td> <td><em data-effect="italics">f</em>(<em data-effect="italics">x</em>)</td> <td><em data-effect="italics">f</em> of <em data-effect="italics">x</em></td> <td>function of <em data-effect="italics">x</em></td> </tr> <tr><td>Continuous Random Variables</td> <td><em data-effect="italics">pdf</em></td> <td>prob. 
density function</td> <td>same</td> </tr> <tr><td>Continuous Random Variables</td> <td><em data-effect="italics">U</em></td> <td>uniform distribution</td> <td>same</td> </tr> <tr><td>Continuous Random Variables</td> <td><em data-effect="italics">Exp</em></td> <td>exponential distribution</td> <td>same</td> </tr> <tr><td>Continuous Random Variables</td> <td><em data-effect="italics">k</em></td> <td><em data-effect="italics">k</em></td> <td>critical value</td> </tr> <tr><td>Continuous Random Variables</td> <td><em data-effect="italics">f</em>(<em data-effect="italics">x</em>) =</td> <td><em data-effect="italics">f</em> of <em data-effect="italics">x</em> equals</td> <td>same</td> </tr> <tr><td>Continuous Random Variables</td> <td><em data-effect="italics">m</em></td> <td><em data-effect="italics">m</em></td> <td>decay rate (for exp. dist.)</td> </tr> <tr><td>The Normal Distribution</td> <td><em data-effect="italics">N</em></td> <td>normal distribution</td> <td>same</td> </tr> <tr><td>The Normal Distribution</td> <td><em data-effect="italics">z</em></td> <td><em data-effect="italics">z</em>-score</td> <td>same</td> </tr> <tr><td>The Normal Distribution</td> <td><em data-effect="italics">Z</em></td> <td>standard normal dist.</td> <td>same</td> </tr> <tr><td>The Central Limit Theorem</td> <td><em data-effect="italics">CLT</em></td> <td>Central Limit Theorem</td> <td>same</td> </tr> <tr><td>The Central Limit Theorem</td> <td>\(\overline{X}\)</td> <td><em data-effect="italics">X</em>-bar</td> <td>the random variable <em data-effect="italics">X</em>-bar</td> </tr> <tr><td>The Central Limit Theorem</td> <td>\({\mu }_{x}\)</td> <td>mean of <em data-effect="italics">X</em></td> <td>the average of <em data-effect="italics">X</em></td> </tr> <tr><td>The Central Limit Theorem</td> <td>\({\mu }_{\overline{x}}\)</td> <td>mean of <em data-effect="italics">X</em>-bar</td> <td>the average of <em data-effect="italics">X</em>-bar</td> </tr> <tr><td>The Central Limit Theorem</td> 
<td>\({\sigma }_{x}\)</td> <td>standard deviation of <em data-effect="italics">X</em></td> <td>same</td> </tr> <tr><td>The Central Limit Theorem</td> <td>\({\sigma }_{\overline{x}}\)</td> <td>standard deviation of <em data-effect="italics">X</em>-bar</td> <td>same</td> </tr> <tr><td>The Central Limit Theorem</td> <td>\(\Sigma X\)</td> <td>sum of <em data-effect="italics">X</em></td> <td>same</td> </tr> <tr><td>The Central Limit Theorem</td> <td>\(\Sigma x\)</td> <td>sum of <em data-effect="italics">x</em></td> <td>same</td> </tr> <tr><td>Confidence Intervals</td> <td><em data-effect="italics">CL</em></td> <td>confidence level</td> <td>same</td> </tr> <tr><td>Confidence Intervals</td> <td><em data-effect="italics">CI</em></td> <td>confidence interval</td> <td>same</td> </tr> <tr><td>Confidence Intervals</td> <td><em data-effect="italics">EBM</em></td> <td>error bound for a mean</td> <td>same</td> </tr> <tr><td>Confidence Intervals</td> <td><em data-effect="italics">EBP</em></td> <td>error bound for a proportion</td> <td>same</td> </tr> <tr><td>Confidence Intervals</td> <td><em data-effect="italics">t</em></td> <td>Student&#8217;s <em data-effect="italics">t</em>-distribution</td> <td>same</td> </tr> <tr><td>Confidence Intervals</td> <td><em data-effect="italics">df</em></td> <td>degrees of freedom</td> <td>same</td> </tr> <tr><td>Confidence Intervals</td> <td>\({t}_{\frac{\alpha }{2}}\)</td> <td>Student&#8217;s <em data-effect="italics">t</em> with <em data-effect="italics">α</em>/2 area in right tail</td> <td>same</td> </tr> <tr><td>Confidence Intervals</td> <td>\(p\prime \); \(\hat{p}\)</td> <td><em data-effect="italics">p</em>-prime; <em data-effect="italics">p</em>-hat</td> <td>sample proportion of success</td> </tr> <tr><td>Confidence Intervals</td> <td>\(q\prime \); \(\hat{q}\)</td> <td><em data-effect="italics">q</em>-prime; <em data-effect="italics">q</em>-hat</td> <td>sample proportion of failure</td> </tr> <tr><td>Hypothesis Testing</td> <td>\({H}_{0}\)</td> <td><em 
data-effect="italics">H</em>-naught, <em data-effect="italics">H</em>-sub 0</td> <td>null hypothesis</td> </tr> <tr><td>Hypothesis Testing</td> <td>\({H}_{a}\)</td> <td><em data-effect="italics">H-a</em>, <em data-effect="italics">H</em>-sub <em data-effect="italics">a</em></td> <td>alternate hypothesis</td> </tr> <tr><td>Hypothesis Testing</td> <td>\({H}_{1}\)</td> <td><em data-effect="italics">H</em>-1, <em data-effect="italics">H</em>-sub 1</td> <td>alternate hypothesis</td> </tr> <tr><td>Hypothesis Testing</td> <td>\(\alpha \)</td> <td>alpha</td> <td>probability of Type I error</td> </tr> <tr><td>Hypothesis Testing</td> <td>\(\beta \)</td> <td>beta</td> <td>probability of Type II error</td> </tr> <tr><td>Hypothesis Testing</td> <td>\({\overline{X}}_{1}-{\overline{X}}_{2}\)</td> <td><em data-effect="italics">X</em>1-bar minus <em data-effect="italics">X</em>2-bar</td> <td>difference in sample means</td> </tr> <tr><td>Hypothesis Testing</td> <td>\({\mu }_{1}-{\mu }_{2}\)</td> <td><em data-effect="italics">mu</em>-1 minus <em data-effect="italics">mu</em>-2</td> <td>difference in population means</td> </tr> <tr><td>Hypothesis Testing</td> <td>\({{P}^{\prime }}_{1}-{{P}^{\prime }}_{2}\)</td> <td><em data-effect="italics">P</em>1-prime minus <em data-effect="italics">P</em>2-prime</td> <td>difference in sample proportions</td> </tr> <tr><td>Hypothesis Testing</td> <td>\({p}_{1}-{p}_{2}\)</td> <td><em data-effect="italics">p</em>1 minus <em data-effect="italics">p</em>2</td> <td>difference in population proportions</td> </tr> <tr><td>Chi-Square Distribution</td> <td>\({\chi }^{2}\)</td> <td><em data-effect="italics">Ky</em>-square</td> <td>Chi-square</td> </tr> <tr><td>Chi-Square Distribution</td> <td>\(O\)</td> <td>Observed</td> <td>Observed frequency</td> </tr> <tr><td>Chi-Square Distribution</td> <td>\(E\)</td> <td>Expected</td> <td>Expected frequency</td> </tr> <tr><td>Linear Regression and Correlation</td> <td><em data-effect="italics">y</em> = <em 
data-effect="italics">a</em> + <em data-effect="italics">bx</em></td> <td><em data-effect="italics">y</em> equals <em data-effect="italics">a</em> plus <em data-effect="italics">b-x</em></td> <td>equation of a line</td> </tr> <tr><td>Linear Regression and Correlation</td> <td>\(\hat{y}\)</td> <td><em data-effect="italics">y</em>-hat</td> <td>estimated value of <em data-effect="italics">y</em></td> </tr> <tr><td>Linear Regression and Correlation</td> <td>\(r\)</td> <td>correlation coefficient</td> <td>same</td> </tr> <tr><td>Linear Regression and Correlation</td> <td>\(\epsilon \)</td> <td>error</td> <td>same</td> </tr> <tr><td>Linear Regression and Correlation</td> <td><em data-effect="italics">SSE</em></td> <td>Sum of Squared Errors</td> <td>same</td> </tr> <tr><td>Linear Regression and Correlation</td> <td>1.9<em data-effect="italics">s</em></td> <td>1.9 times <em data-effect="italics">s</em></td> <td>cut-off value for outliers</td> </tr> <tr><td><em data-effect="italics">F</em>-Distribution and ANOVA</td> <td><em data-effect="italics">F</em></td> <td><em data-effect="italics">F</em>-ratio</td> <td><em data-effect="italics">F</em>-ratio</td> </tr> </tbody> </table> </div> </div></div>
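<p>Several of the formulas in this appendix translate directly into code, which can be a convenient way to check a hand calculation. The following is a minimal sketch using only Python's standard <code>math</code> module. The function names are illustrative choices, not part of the text, and the parameter names follow the symbols used in Formulas 3 through 6, 8, 9, and 12.</p>

```python
import math


def binomial_pmf(x, n, p):
    """Formula 3: P(X = x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return math.comb(n, x) * p**x * q**(n - x)


def geometric_pmf(x, p):
    """Formula 4: P(X = x) = q^(x - 1) * p, for x = 1, 2, 3, ..."""
    return (1 - p)**(x - 1) * p


def hypergeometric_pmf(x, r, b, n):
    """Formula 5: P(X = x) = C(r, x) * C(b, n - x) / C(r + b, n)."""
    return math.comb(r, x) * math.comb(b, n - x) / math.comb(r + b, n)


def poisson_pmf(x, mu):
    """Formula 6: P(X = x) = mu^x * e^(-mu) / x!"""
    return mu**x * math.exp(-mu) / math.factorial(x)


def exponential_pdf(x, m):
    """Formula 8: f(x) = m * e^(-m x), for m > 0 and x >= 0."""
    return m * math.exp(-m * x)


def normal_pdf(x, mu, sigma):
    """Formula 9: f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))."""
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))


def chi_square_pdf(x, n):
    """Formula 12: f(x) = x^((n-2)/2) * e^(-x/2) / (2^(n/2) * Gamma(n/2)), x > 0."""
    return x**((n - 2) / 2) * math.exp(-x / 2) / (2**(n / 2) * math.gamma(n / 2))


# Quick checks against values that are easy to verify by hand.
print(binomial_pmf(2, 4, 0.5))         # C(4, 2) * 0.5^4
print(hypergeometric_pmf(1, 2, 3, 2))  # C(2, 1) * C(3, 1) / C(5, 2)
print(math.gamma(5), math.factorial(4))  # Formula 10: Gamma(m + 1) = m!
```

<p>The last line illustrates the identity from Formula 10: <code>math.gamma(5)</code> and <code>math.factorial(4)</code> both evaluate to 24.</p>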
<div class="back-matter miscellaneous" id="back-matter-notes-for-the-ti-83-83-84-84-calculators" title="Notes for the TI-83, 83+, 84, 84+ Calculators"><div class="back-matter-title-wrap"><h3 class="back-matter-number">6</h3><h1 class="back-matter-title"><span class="display-none">Notes for the TI-83, 83+, 84, 84+ Calculators</span></h1></div><div class="ugc back-matter-ugc"><p>&nbsp;</p> <div id="fs-id15632468" class="bc-section section" data-depth="1"><h3 data-type="title">Quick Tips</h3> <p id="fs-id14313746"><span data-type="title">Legend</span></p> <ul id="fs-id15693270"><li><span id="fs-id15740147" data-type="media" data-display="inline" data-alt="A blank calculator button"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-button-1.png" alt="A blank calculator button" data-media-type="image/png" /> represents a button press</span></li> <li><code>[ ]</code> represents yellow command or green letter behind a key</li> <li><code>&lt; &gt;</code> represents items on the screen</li> </ul> <p id="fs-id14248473"><span data-type="title">To adjust the contrast</span>Press <span id="fs-id14772600" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" />, then hold <span id="fs-id13797489" data-type="media" data-display="inline" data-alt="arrow up"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-up-1.png" alt="arrow up" data-media-type="image/png" /> to increase the contrast or <span id="fs-id12745418" data-type="media" data-display="inline" data-alt="arrow down"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-down-1.png" alt="arrow down" data-media-type="image/png" /> to decrease the contrast. 
</span></span></span></p> <p id="fs-id15738732"><span data-type="title">To capitalize letters and words</span>Press <span id="fs-id15336309" data-type="media" data-display="inline" data-alt="alpha key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-alpha-1.png" alt="alpha key" data-media-type="image/png" /> to get one capital letter, or press <span id="fs-id15256555" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" />, then <span id="fs-id14248475" data-type="media" data-display="inline" data-alt="alpha key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-alpha-1.png" alt="alpha key" data-media-type="image/png" /> to set all button presses to capital letters. You can return to the top-level button values by pressing <span id="fs-id13796242" data-type="media" data-display="inline" data-alt="alpha key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-alpha-1.png" alt="alpha key" data-media-type="image/png" /> again. </span></span></span></span></p> <p id="fs-id15448816"><span data-type="title">To correct a mistake</span>If you hit a wrong button, just hit <span id="fs-id15674844" data-type="media" data-display="inline" data-alt="clear key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-clear-1.png" alt="clear key" data-media-type="image/png" /> and start again. 
</span></p> <p id="fs-id8629783"><span data-type="title">To write in scientific notation</span>Numbers in scientific notation are expressed on the TI-83, 83+, 84, and 84+ using E notation, such that&#8230;</p> <ul id="fs-id12894103"><li>4.321 E 4 = \(4.321 \times {10}^{4}\)</li> <li>4.321 E –4 = \(4.321 \times {10}^{-4}\)</li> </ul> <p id="fs-id14327214"><span data-type="title">To transfer programs or equations from one calculator to another:</span><strong>Both calculators:</strong> Insert your respective end of the link cable and press <span id="fs-id15384266" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" />, then <code>[LINK]</code>. </span></p> <p id="eip-607"><span data-type="title">Calculator receiving information:</span></p> <ol id="fs-id15504626" class="stepwise" type="1"><li>Use the arrows to navigate to and select <code>&lt;RECEIVE&gt;</code></li> <li>Press <span id="fs-id14419024" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" />.</span></li> </ol> <p id="eip-442"><span data-type="title">Calculator sending information:</span></p> <ol id="fs-id13666196" class="stepwise" type="1"><li>Press the appropriate number or letter.</li> <li>Use the up and down arrows to access the appropriate item.</li> <li>Press <span id="fs-id14358160" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /> to select the item to transfer.</span></li> <li>Press the right arrow to navigate to and select <code>&lt;TRANSMIT&gt;</code>.</li> <li>Press <span id="fs-id15557080" 
data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" />.</span></li> </ol> <div id="eip-623" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idp123049456">ERROR 35 LINK generally means that the cables have not been inserted far enough.</p> </div> <p id="eip-432"><strong>Both calculators:</strong> Press <span id="fs-id4481246" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" />, then <code>[QUIT]</code> to exit when done.</span></p> </div> <div id="eip-367" class="bc-section section" data-depth="1"><h3 data-type="title">Manipulating One-Variable Statistics</h3> <div id="builtinprogram" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idp125210352">These directions are for entering data with the built-in statistical program.</p> </div> <table id="eip-886" summary="A table of sample data used throughout the rest of this section. 
The first column is data, while the second column represents the frequency of that data."><caption><span data-type="title">Sample Data We are manipulating one-variable statistics.</span></caption> <thead valign="top"><tr><th data-align="center">Data</th> <th data-align="center">Frequency</th> </tr> </thead> <tbody><tr><td>–2</td> <td>10</td> </tr> <tr><td>–1</td> <td>3</td> </tr> <tr><td>0</td> <td>4</td> </tr> <tr><td>1</td> <td>5</td> </tr> <tr><td>3</td> <td>8</td> </tr> </tbody> </table> <p id="eip-653"><span data-type="title">To begin:</span></p> <ol id="eip-idm14073664" type="1"><li><p id="fs-id15582062">Turn on the calculator.<span data-type="newline"><br /> </span> <span id="fs-id14420511" data-type="media" data-display="inline" data-alt="on key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-on-1.png" alt="on key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id10641750">Access statistics mode.<span data-type="newline"><br /> </span> <span id="fs-id13748217" data-type="media" data-display="inline" data-alt="stat key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-stat-1.png" alt="stat key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id15437009">Select <code>&lt;4:ClrList&gt;</code> to clear data from lists, if desired. 
<span data-type="newline"><br /> </span><span id="fs-id15649038" data-type="media" data-display="inline" data-alt="number 4 key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-four-1.png" alt="number 4 key" data-media-type="image/png" /></span> , <span id="fs-id9695112" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id13785018">Enter list <code>[L1]</code> to be cleared.<span data-type="newline"><br /> </span> <span id="fs-id13768754" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L1]</code> , <span id="fs-id14335493" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id14419501">Display last instruction.<span data-type="newline"><br /> </span> <span id="fs-id14831640" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[ENTRY]</code></p> </li> <li><p id="fs-id14326768">Continue clearing remaining lists in the same fashion, if desired.<span data-type="newline"><br /> </span> <span id="fs-id14329997" data-type="media" data-display="inline" data-alt="arrow left key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-left-1.png" alt="arrow left key" data-media-type="image/png" /></span> , <span 
id="fs-id15742818" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L2]</code> , <span id="fs-id15395962" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id14832244">Access statistics mode.<span data-type="newline"><br /> </span> <span id="fs-id14891229" data-type="media" data-display="inline" data-alt="stat key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-stat-1.png" alt="stat key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id10927723">Select <code>&lt;1:Edit . . .&gt;</code><span data-type="newline"><br /> </span> <span id="fs-id15712626" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="eip-idm96290832">Enter data. Data values go into <code>[L1]</code>. (You may need to arrow over to <code>[L1]</code>).</p> <ul id="fs-id15712194" data-bullet-style="bullet"><li><p id="fs-id14421999">Type in a data value and enter it. 
(For negative numbers, use the negate (-) key at the bottom of the keypad).<span data-type="newline"><br /> </span> <span id="fs-id15478563" data-type="media" data-display="inline" data-alt="negative sign key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-negate-1.png" alt="negative sign key" data-media-type="image/png" /></span> , <span id="fs-id15648786" data-type="media" data-display="inline" data-alt="number 9 key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-nine-1.png" alt="number 9 key" data-media-type="image/png" /></span> , <span id="fs-id14357766" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Continue in the same manner until all data values are entered.</li> </ul> </li> <li><p id="eip-idm74059056">In <code>[L2]</code>, enter the frequencies for each data value in <code>[L1]</code>.</p> <ul id="fs-id4481227" data-bullet-style="bullet"><li><p id="fs-id15505645">Type in a frequency and enter it. 
(If a data value appears only once, the frequency is &#8220;1&#8221;).<span data-type="newline"><br /> </span> <span id="fs-id15710714" data-type="media" data-display="inline" data-alt="number 4 key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-four-1.png" alt="number 4 key" data-media-type="image/png" /></span> , <span id="fs-id15650017" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Continue in the same manner until all data values are entered.</li> </ul> </li> <li><p id="fs-id15743532">Access statistics mode.<span data-type="newline"><br /> </span> <span id="fs-id14418553" data-type="media" data-display="inline" data-alt="stat key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-stat-1.png" alt="stat key" data-media-type="image/png" /></span></p> </li> <li>Navigate to <code>&lt;CALC&gt;</code>.</li> <li><p id="fs-id14357978">Access <code>&lt;1:1-var Stats&gt;</code>.<span data-type="newline"><br /> </span> <span id="fs-id13797091" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id13788979">Indicate that the data is in <code>[L1]</code>&#8230;<span data-type="newline"><br /> </span> <span id="fs-id15374940" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L1]</code> , <span id="fs-id13776903" data-type="media" data-display="inline" data-alt="comma key"><img 
src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-comma-1.png" alt="comma key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id15665979">&#8230;and indicate that the frequencies are in <code>[L2]</code>.<span data-type="newline"><br /> </span> <span id="fs-id15677570" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L2]</code> , <span id="fs-id15735160" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>The statistics should be displayed. You may arrow down to get remaining statistics. Repeat as necessary.</li> </ol> </div> <div id="fs-id15737311" class="bc-section section" data-depth="1"><h3 data-type="title">Drawing Histograms</h3> <div id="fs-id14426441" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idp35239360">We will assume that the data is already entered.</p> </div> <p id="fs-id15680349">We will construct two histograms with the built-in STATPLOT application. The first way will use the default ZOOM. 
The second way will involve customizing a new graph.</p> <ol id="fs-id15742651" type="1"><li><p id="fs-id15419737">Access graphing mode.<span data-type="newline"><br /> </span> <span id="fs-id15448652" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[STAT PLOT]</code></p> </li> <li><p id="fs-id15557620">Select <code>&lt;1:Plot 1&gt;</code> to access the first graph.<span data-type="newline"><br /> </span> <span id="fs-id13682805" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id15739558">Use the arrows to navigate to <code>&lt;ON&gt;</code> to turn on Plot 1.<span data-type="newline"><br /> </span> <code>&lt;ON&gt;</code> , <span id="fs-id10512380" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Use the arrows to go to the histogram picture and select the histogram. 
<span id="fs-id13783163" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></li> <li>Use the arrows to navigate to <code>&lt;Xlist&gt;</code>.</li> <li><p id="fs-id14329150">If &#8220;L1&#8221; is not selected, select it.<span data-type="newline"><br /> </span> <span id="fs-id15663891" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L1]</code> , <span id="fs-id14338110" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Use the arrows to navigate to <code>&lt;Freq&gt;</code>.</li> <li><p id="fs-id14831268">Assign the frequencies to <code>[L2]</code>.<span data-type="newline"><br /> </span> <span id="fs-id15664107" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L2]</code> , <span id="fs-id15681866" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id15772224">Go back to access other graphs.<span data-type="newline"><br /> </span> <span id="fs-id15861321" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" 
data-media-type="image/png" /></span> , <code>[STAT PLOT]</code></p> </li> <li>Use the arrows to turn off the remaining plots.</li> <li><strong>Be sure to deselect or clear all equations before graphing.</strong></li> </ol> <p id="eip-907"><span data-type="title">To deselect equations:</span></p> <ol id="eip-idm66550672" type="1"><li><p id="fs-id10677318">Access the list of equations.<span data-type="newline"><br /> </span> <span id="fs-id14831127" data-type="media" data-display="inline" data-alt="Y= key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-yequals-1.png" alt="Y= key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id14360577">Select each equal sign (=).<span data-type="newline"><br /> </span> <span id="fs-id15443354" data-type="media" data-display="inline" data-alt="arrow down key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-down-1.png" alt="arrow down key" data-media-type="image/png" /></span> <span id="fs-id15666036" data-type="media" data-display="inline" data-alt="arrow right key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-right-1.png" alt="arrow right key" data-media-type="image/png" /></span> <span id="fs-id15685297" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Continue, until all equations are deselected.</li> </ol> <p id="eip-556"><span data-type="title">To clear equations:</span></p> <ol id="eip-idp10993520" type="1"><li><p id="fs-id15384255">Access the list of equations.<span data-type="newline"><br /> </span> <span id="fs-id14831410" data-type="media" data-display="inline" data-alt="Y= key"><img 
src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-yequals-1.png" alt="Y= key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id4458420">Use the arrow keys to navigate to the right of each equal sign (=) and clear them.<span data-type="newline"><br /> </span> <span id="fs-id15399911" data-type="media" data-display="inline" data-alt="arrow down key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-down-1.png" alt="arrow down key" data-media-type="image/png" /></span> <span id="fs-id15664574" data-type="media" data-display="inline" data-alt="arrow right key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-right-1.png" alt="arrow right key" data-media-type="image/png" /></span> <span id="fs-id15392714" data-type="media" data-display="inline" data-alt="clear key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-clear-1.png" alt="clear key" data-media-type="image/png" /></span></p> </li> <li>Repeat until all equations are deleted.</li> </ol> <p id="eip-649"><span data-type="title">To draw default histogram:</span></p> <ol id="eip-idm164199760" type="1"><li><p id="fs-id12596788">Access the ZOOM menu.<span data-type="newline"><br /> </span> <span id="fs-id15558179" data-type="media" data-display="inline" data-alt="ZOOM key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-zoom-1.png" alt="ZOOM key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id15743779">Select <code>&lt;9:ZoomStat&gt;</code>.<span data-type="newline"><br /> </span> <span id="fs-id15676186" data-type="media" data-display="inline" data-alt="number 9 key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-nine-1.png" alt="number 9 key" data-media-type="image/png" 
/></span></p> </li> <li>The histogram will show with a window automatically set.</li> </ol> <p id="eip-296"><span data-type="title">To draw custom histogram:</span></p> <ol id="eip-idm110111776" type="1"><li>Access window mode to set the graph parameters.<span data-type="newline"><br /> </span> <span id="fs-id15413869" data-type="media" data-display="inline" data-alt="window key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-window-1.png" alt="window key" data-media-type="image/png" /></span></li> <li>Enter the following window settings: <ul id="fs-id15770367" data-bullet-style="bullet"><li>\({X}_{\mathrm{min}}=-2.5\)</li> <li>\({X}_{\mathrm{max}}=3.5\)</li> <li>\({X}_{\mathrm{scl}}=1\) (width of bars)</li> <li>\({Y}_{\mathrm{min}}=0\)</li> <li>\({Y}_{\mathrm{max}}=10\)</li> <li>\({Y}_{\mathrm{scl}}=1\) (spacing of tick marks on <em data-effect="italics">y</em>-axis)</li> <li>\({X}_{\mathrm{res}}=1\)</li> </ul> </li> <li>Access graphing mode to see the histogram.<span data-type="newline"><br /> </span> <span id="fs-id14338525" data-type="media" data-display="inline" data-alt="graph key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-graph-1.png" alt="graph key" data-media-type="image/png" /></span></li> </ol> <p id="eip-140"><span data-type="title">To draw box plots:</span></p> <ol id="eip-idp73874128" type="1"><li><p id="fs-id13795663">Access graphing mode.<span data-type="newline"><br /> </span> <span id="fs-id8283012" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[STAT PLOT]</code></p> </li> <li><p id="fs-id15609035">Select <code>&lt;1:Plot 1&gt;</code> to access the first graph.<span data-type="newline"><br /> </span> <span id="fs-id15436944" data-type="media" data-display="inline" data-alt="enter key"><img 
src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id10982525">Use the arrows to select <code>&lt;ON&gt;</code> and turn on Plot 1.<span data-type="newline"><br /> </span> <span id="fs-id15407549" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id14339453">Use the arrows to select the box plot picture and enable it.<span data-type="newline"><br /> </span> <span id="fs-id15682750" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Use the arrows to navigate to <code>&lt;Xlist&gt;</code>.</li> <li><p id="fs-id12116579">If &#8220;L1&#8221; is not selected, select it.<span data-type="newline"><br /> </span> <span id="fs-id13785172" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L1]</code> , <span id="fs-id13796670" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Use the arrows to navigate to <code>&lt;Freq&gt;</code>.</li> <li><p id="fs-id14792157">Indicate that the frequencies are in <code>[L2]</code>.<span data-type="newline"><br /> </span> <span id="fs-id14857230" data-type="media" data-display="inline" data-alt="2nd key"><img 
src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L2]</code> , <span id="fs-id15400636" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id15259023">Go back to access other graphs.<span data-type="newline"><br /> </span> <span id="fs-id3301526" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[STAT PLOT]</code></p> </li> <li><strong>Be sure to deselect or clear all equations before graphing</strong> using the method mentioned above.</li> <li><p id="fs-id15428121">View the box plot.<span data-type="newline"><br /> </span> <span id="fs-id15743723" data-type="media" data-display="inline" data-alt="graph key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-graph-1.png" alt="graph key" data-media-type="image/png" /></span></p> </li> </ol> </div> <div id="fs-id14569516" class="bc-section section" data-depth="1"><h3 data-type="title">Linear Regression</h3> <div id="fs-id15377984" class="bc-section section" data-depth="2"><h4 data-type="title">Sample Data</h4> <p id="fs-id15306460">The following data is real. 
The percentage of declared ethnic minority students at De Anza College for selected years from 1970–1995 was:</p> <table id="fs-id14336458" summary="The first column represents years, from 1970 to 1995, while the second column represents the percentage of declared ethnic minority students at De Anza College with respect to the entire student body for that year."><caption>The independent variable is &#8220;Year,&#8221; while the dependent variable is &#8220;Student Ethnic Minority Percentage.&#8221;</caption> <thead valign="top"><tr><th data-align="center">Year</th> <th data-align="center">Student Ethnic Minority Percentage</th> </tr> </thead> <tbody><tr><td>1970</td> <td>14.13</td> </tr> <tr><td>1973</td> <td>12.27</td> </tr> <tr><td>1976</td> <td>14.08</td> </tr> <tr><td>1979</td> <td>18.16</td> </tr> <tr><td>1982</td> <td>27.64</td> </tr> <tr><td>1983</td> <td>28.72</td> </tr> <tr><td>1986</td> <td>31.86</td> </tr> <tr><td>1989</td> <td>33.14</td> </tr> <tr><td>1992</td> <td>45.37</td> </tr> <tr><td>1995</td> <td>53.1</td> </tr> </tbody> </table> <div id="fs-id13815357" class="bc-figure figure"><div data-type="title">Student Ethnic Minority Percentage</div> <div class="bc-figcaption figcaption">By hand, verify the scatterplot above.</div> <p><span id="fs-id15377645" data-type="media" data-display="block" data-alt="This is a scatterplot for the data provided. Year is plotted on the horizontal axis and percent is plotted on the vertical axis. The points show a strong, curved, upward trend."><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/CNX_Stats_Appendix_ART_Figure_14.6-1.jpg" alt="This is a scatterplot for the data provided. Year is plotted on the horizontal axis and percent is plotted on the vertical axis. The points show a strong, curved, upward trend." 
width="350" data-media-type="image/png" /></span></p> </div> <div id="fs-id15648837" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idp56972304">The TI-83 has a built-in linear regression feature, which allows the data to be edited. The <em data-effect="italics">x</em>-values will be in <code>[L1]</code>; the <em data-effect="italics">y</em>-values in <code>[L2]</code>.</p> </div> <p id="eip-901"><span data-type="title">To enter data and do linear regression:</span></p> <ol id="eip-idp244900032" type="1"><li><p id="fs-id5077977">Turn on the calculator.<span data-type="newline"><br /> </span> <span id="fs-id7330302" data-type="media" data-display="inline" data-alt="on key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-on-1.png" alt="on key" data-media-type="image/png" /></span></p> </li> <li>Before accessing this program, be sure to turn off all plots. <ul id="fs-id14326364" data-bullet-style="bullet"><li><p id="fs-id14859632">Access graphing mode.<span data-type="newline"><br /> </span> <span id="fs-id13808044" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[STAT PLOT]</code></p> </li> <li><p id="fs-id10740237">Turn off all plots.<span data-type="newline"><br /> </span> <span id="fs-id14046599" data-type="media" data-display="inline" data-alt="number 4 key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-four-1.png" alt="number 4 key" data-media-type="image/png" /></span> , <span id="fs-id13782731" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> 
</ul> </li> <li>Round to three decimal places. To do so: <ul id="fs-id13796521" data-bullet-style="bullet"><li><p id="fs-id13815985">Access the mode menu.<span data-type="newline"><br /> </span> <span id="fs-id15436477" data-type="media" data-display="inline" data-alt="mode key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-mode-1.png" alt="mode key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id4472705">Navigate to <code>&lt;Float&gt;</code> and then to the right to <code>&lt;3&gt;</code>.<span data-type="newline"><br /> </span> <span id="fs-id12738760" data-type="media" data-display="inline" data-alt="arrow down key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-down-1.png" alt="arrow down key" data-media-type="image/png" /></span> <span id="fs-id14332544" data-type="media" data-display="inline" data-alt="arrow right key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-right-1.png" alt="arrow right key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id14770838">All numbers will be rounded to three decimal places until changed.<span data-type="newline"><br /> </span> <span id="fs-id14064149" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> </ul> </li> <li><p id="fs-id5168875">Enter statistics mode and clear lists <code>[L1]</code> and <code>[L2]</code>, as described previously.<span data-type="newline"><br /> </span> <span id="fs-id13746924" data-type="media" data-display="inline" data-alt="stat key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-stat-1.png" alt="stat key" data-media-type="image/png" 
/></span> , <span id="fs-id14325808" data-type="media" data-display="inline" data-alt="number 4 key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-four-1.png" alt="number 4 key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id15672446">Enter editing mode to insert values for <em data-effect="italics">x</em> and <em data-effect="italics">y</em>.<span data-type="newline"><br /> </span> <span id="fs-id15871012" data-type="media" data-display="inline" data-alt="stat key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-stat-1.png" alt="stat key" data-media-type="image/png" /></span> , <span id="fs-id15414493" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Enter each value. Press <span id="fs-id15320435" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span> to continue.</li> </ol> <p id="eip-909"><span data-type="title">To display the correlation coefficient:</span></p> <ol id="eip-idp108929536" type="1"><li><p id="fs-id15415238">Access the catalog.<span data-type="newline"><br /> </span> <span id="fs-id13748163" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[CATALOG]</code></p> </li> <li><p id="fs-id13412820">Arrow down and select <code>&lt;DiagnosticOn&gt;</code><span data-type="newline"><br /> </span> <span id="fs-id14419965" data-type="media" data-display="inline" data-alt="arrow down key"><img 
src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-down-1.png" alt="arrow down key" data-media-type="image/png" /></span>&#8230; , <span id="fs-id14758720" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span> , <span id="fs-id15298972" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>\(r\) and \({r}^{2}\) will be displayed during regression calculations.</li> <li><p id="fs-id15861386">Access linear regression.<span data-type="newline"><br /> </span> <span id="fs-id14358803" data-type="media" data-display="inline" data-alt="stat key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-stat-1.png" alt="stat key" data-media-type="image/png" /></span> <span id="fs-id15770182" data-type="media" data-display="inline" data-alt="arrow right key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-right-1.png" alt="arrow right key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id15677479">Select the form of <em data-effect="italics">y</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em>.<span data-type="newline"><br /> </span> <span id="fs-id15413761" data-type="media" data-display="inline" data-alt="number 8 key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-eight-1.png" alt="number 8 key" data-media-type="image/png" /></span> , <span id="fs-id14568976" data-type="media" data-display="inline" data-alt="enter key"><img 
src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> </ol> <p id="fs-id13602951"><span data-type="newline"><br /> </span>The display will show:</p> <p id="eip-878"><span data-type="title">LinReg</span></p> <ul id="fs-id13602953"><li><em data-effect="italics">y</em> = <em data-effect="italics">a</em> + <em data-effect="italics">bx</em></li> <li><em data-effect="italics">a</em> = –3176.909</li> <li><em data-effect="italics">b</em> = 1.617</li> <li><em data-effect="italics">r</em><sup>2</sup> = 0.924</li> <li><em data-effect="italics">r</em> = 0.961</li> </ul> <p id="fs-id4451506"><span data-type="newline"><br /> </span>This means the Line of Best Fit (Least Squares Line) is:</p> <ul id="fs-id10515055" data-bullet-style="bullet"><li><em data-effect="italics">y</em> = –3176.909 + 1.617<em data-effect="italics">x</em></li> <li>Percent = –3176.909 + 1.617 (year #)</li> </ul> <p>The correlation coefficient is <em data-effect="italics">r</em> = 0.961.</p> <p id="fs-idm68992784"><span data-type="title">To see the scatter plot:</span></p> <ol id="fs-id14860854" type="1"><li><p id="fs-id15375252">Access graphing mode.<span data-type="newline"><br /> </span> <span id="fs-id14743415" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[STAT PLOT]</code></p> </li> <li><p id="fs-id13782813">Select <code>&lt;1:plot 1&gt;</code> to access the settings for the first plot.<span data-type="newline"><br /> </span> <span id="fs-id13782817" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id9642211">Navigate and 
select <code>&lt;ON&gt;</code> to turn on Plot 1.<span data-type="newline"><br /> </span> <code>&lt;ON&gt;</code> <span id="fs-id15326624" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Navigate to the first picture.</li> <li><p id="fs-id14425261">Select the scatter plot.<span data-type="newline"><br /> </span> <span id="fs-id14425265" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Navigate to <code>&lt;Xlist&gt;</code>.</li> <li>If <code>[L1]</code> is not selected, press <span id="fs-id15586352" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L1]</code> to select it.</li> <li><p id="fs-id13616594">Confirm that the <em data-effect="italics">x</em>-values are in <code>[L1]</code>.<span data-type="newline"><br /> </span> <code>&lt;ON&gt;</code> <span id="fs-id15674802" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Navigate to <code>&lt;Ylist&gt;</code>.</li> <li><p id="fs-id12292882">Select <code>[L2]</code> as the list holding the <em data-effect="italics">y</em>-values.<span data-type="newline"><br /> </span> <span id="fs-id12292887" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L2]</code> , 
<span id="fs-id13027628" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id8794800">Go back to access other graphs.<span data-type="newline"><br /> </span> <span id="fs-id15374874" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[STAT PLOT]</code></p> </li> <li>Use the arrows to turn off the remaining plots.</li> <li>Access window mode to set the graph parameters.<span data-type="newline"><br /> </span> <span id="fs-id15380303" data-type="media" data-display="inline" data-alt="window key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-window-1.png" alt="window key" data-media-type="image/png" /></span> <ul id="fs-id4833843" data-bullet-style="bullet"><li>\({X}_{\mathrm{min}}=1970\)</li> <li>\({X}_{\mathrm{max}}=2000\)</li> <li>\({X}_{scl}=10\) (spacing of tick marks on <em data-effect="italics">x</em>-axis)</li> <li>\({Y}_{\mathrm{min}}=-0.05\)</li> <li>\({Y}_{\mathrm{max}}=60\)</li> <li>\({Y}_{scl}=10\) (spacing of tick marks on <em data-effect="italics">y</em>-axis)</li> <li>\({X}_{res}=1\)</li> </ul> </li> <li>Be sure to deselect or clear all equations before graphing, using the instructions above.</li> <li>Press the graph button to see the scatter plot. 
<span id="fs-id11867940" data-type="media" data-display="inline" data-alt="graph key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-graph-1.png" alt="graph key" data-media-type="image/png" /></span></li> </ol> <p id="fs-idm30312176"><span data-type="title">To see the regression graph:</span></p> <ol id="fs-id15636297" type="1"><li><p id="fs-id14418754">Access the equation menu. The regression equation will be put into Y1.<span data-type="newline"><br /> </span> <span id="fs-id12873704" data-type="media" data-display="inline" data-alt="Y= key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-yequals-1.png" alt="Y= key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id15318374">Access the vars menu and navigate to <code>&lt;5: Statistics&gt;</code>.<span data-type="newline"><br /> </span> <span id="fs-id10268916" data-type="media" data-display="inline" data-alt="vars key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-vars-1.png" alt="vars key" data-media-type="image/png" /></span> , <span id="fs-id14419539" data-type="media" data-display="inline" data-alt="number 5 key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-five-1.png" alt="number 5 key" data-media-type="image/png" /></span></p> </li> <li>Navigate to <code>&lt;EQ&gt;</code>.</li> <li><p id="fs-id14622106"><code>&lt;1: RegEQ&gt;</code> contains the regression equation which will be entered in Y1.<span data-type="newline"><br /> </span> <span id="fs-id15667622" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Press the graphing mode button. 
The regression line will be superimposed over the scatter plot.<span data-type="newline"><br /> </span> <span id="fs-id15399739" data-type="media" data-display="inline" data-alt="graph key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-graph-1.png" alt="graph key" data-media-type="image/png" /></span></li> </ol> <p id="eip-385"><span data-type="title">To see the residuals and use them to calculate the critical point for an outlier:</span></p> <ol id="eip-idp57821472" type="1"><li><p id="fs-id14885511">Access the list. RESID will be an item on the menu. Navigate to it.<span data-type="newline"><br /> </span> <span id="fs-id15318284" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span>, <code>[LIST]</code>, <code>&lt;RESID&gt;</code></p> </li> <li><p id="fs-id15771552">Confirm twice to view the list of residuals. 
Use the arrows to select them.<span data-type="newline"><br /> </span> <span id="fs-id13622621" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span> , <span id="fs-id15558165" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>The critical point for an outlier is: \(1.9\sqrt{\frac{\mathrm{SSE}}{n-2}}\) where: <ul id="fs-id13668874"><li>\(n\) = number of pairs of data</li> <li>\(\mathrm{SSE}\) = sum of the squared errors = \(\sum {\mathrm{residual}}^{2}\)</li> </ul> </li> <li><p id="fs-id13856017">Store the residuals in <code>[L3]</code>.<span data-type="newline"><br /> </span> <span id="fs-id13855071" data-type="media" data-display="inline" data-alt="store key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-sto-1.png" alt="store key" data-media-type="image/png" /></span> , <span id="fs-id15769062" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L3]</code> , <span id="fs-id15401211" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id15734182">Calculate the \(\frac{{\mathrm{\left(residual\right)}}^{2}}{n-2}\). 
Note that \(n-2=8\) <span data-type="newline"><br /> </span> <span id="fs-id15680372" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L3]</code> , <span id="fs-id15332654" data-type="media" data-display="inline" data-alt="x-squared key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-xsquared-1.png" alt="x-squared key" data-media-type="image/png" /></span> , <span id="fs-id15664227" data-type="media" data-display="inline" data-alt="division key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-divide-1.png" alt="division key" data-media-type="image/png" /></span> , <span id="fs-id15443531" data-type="media" data-display="inline" data-alt="number 8 key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-eight-1.png" alt="number 8 key" data-media-type="image/png" /></span></p> </li> <li><p id="fs-id15492779">Store this value in <code>[L4]</code>.<span data-type="newline"><br /> </span> <span id="fs-id12493313" data-type="media" data-display="inline" data-alt="store key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-sto-1.png" alt="store key" data-media-type="image/png" /></span> , <span id="fs-id14857761" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L4]</code> , <span id="fs-id15330584" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" 
/></span></p> </li> <li><p id="fs-id13799897">Calculate the critical value using the equation above.<span data-type="newline"><br /> </span> <span id="fs-id15553710" data-type="media" data-display="inline" data-alt="number 1 key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-one-1.png" alt="number 1 key" data-media-type="image/png" /></span> , <span id="fs-id13679088" data-type="media" data-display="inline" data-alt="decimal point key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-decimal-1.png" alt="decimal point key" data-media-type="image/png" /></span> , <span id="fs-id15413832" data-type="media" data-display="inline" data-alt="number 9 key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-nine-1.png" alt="number 9 key" data-media-type="image/png" /></span> , <span id="fs-id13632053" data-type="media" data-display="inline" data-alt="multiplication key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-multiply-1.png" alt="multiplication key" data-media-type="image/png" /></span> , <span id="fs-id15673395" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[√]</code> , <span id="fs-id15682092" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[LIST]</code> <span id="fs-id12746411" data-type="media" data-display="inline" data-alt="arrow right key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-right-1.png" alt="arrow right key" 
data-media-type="image/png" /></span> , <span id="fs-id15385301" data-type="media" data-display="inline" data-alt="arrow right key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-dir-right-1.png" alt="arrow right key" data-media-type="image/png" /></span> , <span id="fs-id14854303" data-type="media" data-display="inline" data-alt="number 5 key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-five-1.png" alt="number 5 key" data-media-type="image/png" /></span> , <span id="fs-id15331426" data-type="media" data-display="inline" data-alt="2nd key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-2nd-1.png" alt="2nd key" data-media-type="image/png" /></span> , <code>[L4]</code> , <span id="fs-id14040973" data-type="media" data-display="inline" data-alt="closing parenthesis key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-rightparenthesis-1.png" alt="closing parenthesis key" data-media-type="image/png" /></span> , <span id="fs-id12262413" data-type="media" data-display="inline" data-alt="closing parenthesis key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-rightparenthesis-1.png" alt="closing parenthesis key" data-media-type="image/png" /></span> , <span id="fs-id8733928" data-type="media" data-display="inline" data-alt="enter key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/05/ti83-enter-1.png" alt="enter key" data-media-type="image/png" /></span></p> </li> <li>Verify that the calculator displays: 7.642669563. This is the critical value.</li> <li>Compare the absolute value of each residual value in <code>[L3]</code> to 7.64. If the absolute value is greater than 7.64, then the corresponding (x, y) point is an outlier. 
In this case, none of the points is an outlier.</li> </ol> <p id="fs-id14035082"><span data-type="title">To obtain estimates of <em data-effect="italics">y</em> for various <em data-effect="italics">x</em>-values:</span> There are various ways to determine estimates for &#8220;<em data-effect="italics">y.</em>&#8221; One way is to substitute values for &#8220;<em data-effect="italics">x</em>&#8221; in the equation. Another way is to use the <span id="fs-id4485078" data-type="media" data-display="inline" data-alt="trace key"><img src="https://pressbooks.ccconline.org/acccomposition1/wp-content/uploads/sites/83/2022/08/ti83-trace-1.png" alt="trace key" data-media-type="image/png" /> on the graph of the regression line. </span></p> </div> </div> <div id="fs-id5169011" class="bc-section section" data-depth="1"><h3 data-type="title">TI-83, 83+, 84, 84+ instructions for distributions and tests</h3> <div id="fs-id15450346" class="bc-section section" data-depth="2"><h4 data-type="title">Distributions</h4> <p id="fs-id13791824">Access <code data-display="inline">DISTR</code> (for &#8220;Distributions&#8221;).</p> <p id="fs-id15526075">For technical assistance, visit the Texas Instruments website at <a href="http://www.ti.com" target="_window" rel="noopener">http://www.ti.com</a> and enter your calculator model into the &#8220;search&#8221; box.</p> <p id="fs-idm103615424"><span data-type="title">Binomial Distribution</span></p> <ul id="fs-id15384919"><li><code data-display="inline">binompdf(<em data-effect="italics">n</em>,<em data-effect="italics">p</em>,<em data-effect="italics">x</em>)</code> corresponds to <em data-effect="italics">P</em>(<em data-effect="italics">X</em> = <em data-effect="italics">x</em>)</li> <li><code data-display="inline">binomcdf(<em data-effect="italics">n</em>,<em data-effect="italics">p</em>,<em data-effect="italics">x</em>)</code> corresponds to <em data-effect="italics">P</em>(<em data-effect="italics">X</em> ≤ <em data-effect="italics">x</em>)</li> <li>To see a list of all probabilities for <em 
data-effect="italics">x</em>: 0, 1, . . . , <em data-effect="italics">n</em>, leave off the &#8220;<code data-display="inline"><em data-effect="italics">x</em></code>&#8221; parameter.</li> </ul> <p id="fs-idp1474048"><span data-type="title">Poisson Distribution</span></p> <ul id="fs-id13783374"><li><code data-display="inline">poissonpdf(λ,<em data-effect="italics">x</em>)</code> corresponds to <em data-effect="italics">P</em>(<em data-effect="italics">X</em> = <em data-effect="italics">x</em>)</li> <li><code data-display="inline">poissoncdf(λ,<em data-effect="italics">x</em>)</code> corresponds to <em data-effect="italics">P</em>(<em data-effect="italics">X</em> ≤ <em data-effect="italics">x</em>)</li> </ul> <p id="eip-283"><span data-type="title">Continuous Distributions (general)</span></p> <ul id="eip-idp115972320"><li>\(-\infty \) uses the value –1EE99 for left bound</li> <li>\(\infty \) uses the value 1EE99 for right bound</li> </ul> <p id="eip-402"><span data-type="title">Normal Distribution</span></p> <ul id="eip-idp89416480"><li><code data-display="inline">normalpdf(<em data-effect="italics">x</em>,<em data-effect="italics">μ</em>,<em data-effect="italics">σ</em>)</code> yields a probability density function value (only useful to plot the normal curve, in which case &#8220;<code data-display="inline"><em data-effect="italics">x</em></code>&#8221; is the variable)</li> <li><code data-display="inline">normalcdf(left bound, right bound, <em data-effect="italics">μ</em>, <em data-effect="italics">σ</em>)</code> corresponds to <em data-effect="italics">P</em>(left bound &lt; <em data-effect="italics">X</em> &lt; right bound)</li> <li><code data-display="inline">normalcdf(left bound, right bound)</code> corresponds to <em data-effect="italics">P</em>(left bound &lt; <em data-effect="italics">Z</em> &lt; right bound) – standard normal</li> <li><code data-display="inline">invNorm(<em data-effect="italics">p</em>,<em data-effect="italics">μ</em>,<em 
data-effect="italics">σ</em>)</code> yields the critical value, <em data-effect="italics">k</em>: <em data-effect="italics">P</em>(<em data-effect="italics">X</em> &lt; <em data-effect="italics">k</em>) = <em data-effect="italics">p</em></li> <li><code data-display="inline">invNorm(<em data-effect="italics">p</em>)</code> yields the critical value, <em data-effect="italics">k</em>: <em data-effect="italics">P</em>(<em data-effect="italics">Z</em> &lt; <em data-effect="italics">k</em>) = <em data-effect="italics">p</em> for the standard normal</li> </ul> <p id="eip-113"><span data-type="title">Student&#8217;s <em data-effect="italics">t</em>-Distribution</span></p> <ul id="eip-idm52146688"><li><code data-display="inline">tpdf(<em data-effect="italics">x</em>,<em data-effect="italics">df</em>)</code> yields the probability density function value (only useful to plot the student-<em data-effect="italics">t</em> curve, in which case &#8220;<code data-display="inline"><em data-effect="italics">x</em></code>&#8221; is the variable)</li> <li><code data-display="inline">tcdf(left bound, right bound, <em data-effect="italics">df</em>)</code> corresponds to <em data-effect="italics">P</em>(left bound &lt; <em data-effect="italics">t</em> &lt; right bound)</li> </ul> <p id="eip-771"><span data-type="title">Chi-square Distribution</span></p> <ul id="eip-idm44632080"><li><code data-display="inline">Χ<sup>2</sup>pdf(<em data-effect="italics">x</em>,<em data-effect="italics">df</em>)</code> yields the probability density function value (only useful to plot the chi<sup>2</sup> curve, in which case &#8220;<code data-display="inline"><em data-effect="italics">x</em></code>&#8221; is the variable)</li> <li><code data-display="inline">Χ<sup>2</sup>cdf(left bound, right bound, <em data-effect="italics">df</em>)</code> corresponds to <em data-effect="italics">P</em>(left bound &lt; <em data-effect="italics">Χ</em><sup>2</sup> &lt; right bound)</li> </ul> <p id="eip-380"><span 
data-type="title">F Distribution</span></p> <ul id="eip-idm166766896"><li><code data-display="inline">Fpdf(<em data-effect="italics">x</em>,<em data-effect="italics">dfnum</em>,<em data-effect="italics">dfdenom</em>)</code> yields the probability density function value (only useful to plot the <em data-effect="italics">F</em> curve, in which case &#8220;<code data-display="inline"><em data-effect="italics">x</em></code>&#8221; is the variable)</li> <li><code data-display="inline">Fcdf(left bound,right bound,<em data-effect="italics">dfnum</em>,<em data-effect="italics">dfdenom</em>)</code> corresponds to <em data-effect="italics">P</em>(left bound &lt; <em data-effect="italics">F</em> &lt; right bound)</li> </ul> </div> <div id="fs-id15677913" class="bc-section section" data-depth="2"><h4 data-type="title">Tests and Confidence Intervals</h4> <p id="fs-id15287935">Access <code data-display="inline">STAT</code> and <code data-display="inline">TESTS</code>.</p> <p id="fs-id14324569">For the confidence intervals and hypothesis tests, you may enter the data into the appropriate lists and press <code data-display="inline">DATA</code> to have the calculator find the sample means and standard deviations. 
Or, you may enter the sample means and sample standard deviations directly by pressing <code data-display="inline">STAT</code> once in the appropriate tests.</p> <p id="fs-idp96939616"><span data-type="title">Confidence Intervals</span></p> <ul id="fs-id14324572"><li><code data-display="inline">ZInterval</code> is the confidence interval for mean when σ is known.</li> <li><code data-display="inline">TInterval</code> is the confidence interval for mean when σ is unknown; <em data-effect="italics">s</em> estimates σ.</li> <li><code data-display="inline">1-PropZInt</code> is the confidence interval for proportion.</li> </ul> <div id="fs-id14799914" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idp1918880">The confidence level may be entered as a percent or as a decimal (e.g., enter &#8220;<code data-display="inline">95</code>&#8221; or &#8220;<code data-display="inline">.95</code>&#8221; for a 95% confidence level).</p> </div> <p id="fs-idp36667376"><span data-type="title">Hypothesis Tests</span></p> <ul id="fs-id14799919"><li><code data-display="inline">Z-Test</code> is the hypothesis test for single mean when σ is known.</li> <li><code data-display="inline">T-Test</code> is the hypothesis test for single mean when σ is unknown; <em data-effect="italics">s</em> estimates σ.</li> <li><code data-display="inline">2-SampZTest</code> is the hypothesis test for two independent means when both σ&#8217;s are known.</li> <li><code data-display="inline">2-SampTTest</code> is the hypothesis test for two independent means when both σ&#8217;s are unknown.</li> <li><code data-display="inline">1-PropZTest</code> is the hypothesis test for single proportion.</li> <li><code data-display="inline">2-PropZTest</code> is the hypothesis test for two proportions.</li> <li><code data-display="inline">Χ<sup>2</sup>-Test</code> is the hypothesis test for independence.</li> <li><code data-display="inline">Χ<sup>2</sup>GOF-Test</code> is the hypothesis test 
for goodness-of-fit (TI-84+ only).</li> <li><code data-display="inline">LinRegTTest</code> is the hypothesis test for linear regression (TI-84+ only).</li> </ul> <div id="fs-id15374930" data-type="note" data-has-label="true" data-label=""><div data-type="title">Note</div> <p id="eip-idp140444871127008">Input the null hypothesis value in the row below &#8220;<code data-display="inline">Inpt</code>.&#8221; For a test of a single mean, &#8220;<code data-display="inline">μ∅</code>&#8221; represents the null hypothesis. For a test of a single proportion, &#8220;<code data-display="inline">p∅</code>&#8221; represents the null hypothesis. Enter the alternative hypothesis on the bottom row.</p> </div> </div> </div> </div></div>
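<p>The binomial, Poisson, and normal functions listed above can be cross-checked away from the calculator. The following is a minimal Python sketch, assuming Python 3.8 or later and only its standard library. The function names mirror the calculator's for readability, but they are illustrative helpers and not part of any TI or Python API, and the value 1e99 stands in for the calculator's 1EE99.</p>

```python
import math
from statistics import NormalDist


def binompdf(n, p, x):
    """P(X = x) for a binomial with n trials and success probability p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def binomcdf(n, p, x):
    """P(X is at most x): add up binompdf for 0, 1, ..., x."""
    return sum(binompdf(n, p, k) for k in range(x + 1))


def poissonpdf(lam, x):
    """P(X = x) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)


def poissoncdf(lam, x):
    """P(X is at most x): add up poissonpdf for 0, 1, ..., x."""
    return sum(poissonpdf(lam, k) for k in range(x + 1))


def normalcdf(left, right, mu=0.0, sigma=1.0):
    """Area under the normal curve between left and right.

    With mu and sigma omitted this is the standard normal,
    matching the two-argument form of the calculator function.
    """
    dist = NormalDist(mu, sigma)
    return dist.cdf(right) - dist.cdf(left)


def invnorm(p, mu=0.0, sigma=1.0):
    """Value k such that the area to the left of k equals p."""
    return NormalDist(mu, sigma).inv_cdf(p)


if __name__ == "__main__":
    # Illustrative inputs only; they are not taken from the text.
    print(binompdf(10, 0.5, 5))
    print(binomcdf(10, 0.5, 5))
    print(poissoncdf(2, 1))
    print(normalcdf(-1e99, 0))   # left tail of the standard normal
    print(invnorm(0.975))
```

<p>The <em data-effect="italics">t</em>, chi-square, and <em data-effect="italics">F</em> functions are left out of this sketch because the Python standard library does not provide those distributions.</p>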
<div class="back-matter miscellaneous" id="back-matter-tables" title="Tables"><div class="back-matter-title-wrap"><h3 class="back-matter-number">7</h3><h1 class="back-matter-title"><span class="display-none">Tables</span></h1></div><div class="ugc back-matter-ugc"> <p id="eip-753">This module contains links to statistical tables hosted on a government website.</p> <div id="element-477" data-type="list"><div data-type="title">Tables (NIST/SEMATECH e-Handbook of Statistical Methods, http://www.itl.nist.gov/div898/handbook/, January 3, 2009)</div> <ul><li><a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3672.htm" data-url="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3672.htm">Student <em data-effect="italics">t</em> table</a></li> <li><a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3671.htm" data-url="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3671.htm">Normal table</a></li> <li><a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3674.htm" data-url="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3674.htm">Chi-Square table</a></li> <li><a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3673.htm" data-url="http://www.itl.nist.gov/div898/handbook/eda/section3/eda3673.htm">F-table</a></li> <li>All <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda367.htm" data-url="http://www.itl.nist.gov/div898/handbook/eda/section3/eda367.htm">four tables</a> can also be accessed from a single page.</li> </ul> </div> <div id="element-919" data-type="list"><div data-type="title">95% Critical Values of the Sample Correlation Coefficient Table</div> <ul><li><a href="#eip-idm31993488" data-url="/contents/d1c162aa-f63f-4466-9efd-27b902bf22d0#eip-idm31993488">95% Critical Values of the Sample Correlation Coefficient</a></li> </ul> </div> </div></div>

</body>
</html>