9.1: Point Estimates
Learning Objectives
- Determine point estimates in simple cases, and make the connection between the sampling distribution of a statistic and its properties as a point estimator.
Point estimation is the form of statistical inference in which, based on the sample data, we estimate the unknown parameter of interest using a single value (hence the name point estimation). As the following two examples illustrate, this form of inference is quite intuitive.
Example
Suppose that we are interested in studying the IQ levels of students at Smart University (SU). In particular (since IQ level is a quantitative variable), we are interested in estimating μ, the mean IQ level of all the students at SU.
A random sample of 100 SU students was chosen, and their (sample) mean IQ level was found to be ¯x = 115.
If we wanted to estimate μ, the population mean IQ level, by a single number based on the sample, it would make intuitive sense to use the corresponding quantity in the sample: the sample mean ¯x = 115. We say that 115 is the point estimate for μ, and in general, we’ll always use ¯x as the point estimator for μ. (Note that when we talk about the specific value, 115, we use the term estimate; when we talk in general about the statistic ¯x, we use the term estimator.) The following figure summarizes this example:
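To make the estimate/estimator distinction concrete, here is a minimal Python sketch. The example does not give the 100 raw IQ scores, so the sample below is simulated purely for illustration; the point is that the statistic ¯x, computed from whatever random sample we happen to draw, is the point estimator, and the number it produces is the point estimate.

```python
import random
import statistics

# Hypothetical data: the example doesn't list the 100 raw IQ scores,
# so we simulate a random sample for illustration only.
random.seed(1)
iq_sample = [random.gauss(115, 15) for _ in range(100)]

# The estimator is the statistic x-bar; the estimate is its computed value.
x_bar = statistics.mean(iq_sample)
print(round(x_bar, 1))
```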
Here is another example.
Example
Suppose that we are interested in the opinions of U.S. adults regarding legalizing the use of marijuana. In particular, we are interested in the parameter p, the proportion of U.S. adults who believe marijuana should be legalized.
Suppose a poll of 1,000 U.S. adults finds that 560 of them believe marijuana should be legalized. If we wanted to estimate p, the population proportion, using a single number based on the sample, it would make intuitive sense to use the corresponding quantity in the sample: the sample proportion ˆp = 560/1000 = .56. We say in this case that .56 is the point estimate for p, and in general, we’ll always use ˆp as the point estimator for p. (Note, again, that when we talk about the specific value, .56, we use the term estimate; when we talk in general about the statistic ˆp, we use the term estimator.) Here is a visual summary of this example:
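Since a sample proportion is just a sample mean of 0/1 responses, the point estimate from this poll can be sketched in a few lines (Python used for illustration):

```python
# Poll results from the example: 560 "yes" answers out of 1,000 adults.
n, successes = 1000, 560

# The sample proportion p-hat is the point estimate for p.
p_hat = successes / n

# Equivalently, p-hat is the sample mean of the 0/1 (no/yes) responses.
responses = [1] * successes + [0] * (n - successes)
assert sum(responses) / n == p_hat

print(p_hat)  # 0.56
```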
Comment 1
You may feel that since it is so intuitive, you could have figured out point estimation on your own, even without the benefit of an entire course in statistics. Certainly, our intuition tells us that the best estimator for μ should be ¯x, and the best estimator for p should be ˆp.
Probability theory does more than confirm this intuition; it actually gives an explanation (beyond intuition) of why ¯x and ˆp are good choices as point estimators for μ and p, respectively. In the Sampling Distributions module of the Probability unit, we learned about the sampling distribution of ¯X and found that as long as a sample is taken at random, the distribution of sample means is exactly centered at the value of the population mean.
¯X is therefore said to be an unbiased estimator for μ. Any particular sample mean might turn out to be less than the actual population mean, or it might turn out to be more. But in the long run, such sample means are “on target” in that they underestimate no more often than they overestimate, and vice versa.
Likewise, we learned that the sampling distribution of the sample proportion, ˆp, is centered at the population proportion p (as long as the sample is taken at random), thus making ˆp an unbiased estimator for p.
As stated in the introduction, probability theory plays an essential role as we establish results for statistical inference. Our assertion above that sample mean and sample proportion are unbiased estimators is the first such instance.
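The unbiasedness claim can be checked empirically: draw many random samples from a population with known μ and p, and the long-run average of the sample means and sample proportions should land on the parameters themselves. A small simulation sketch (the population values below are assumed, for illustration only):

```python
import random

random.seed(0)
mu, sigma = 115, 15   # assumed population mean and SD (illustrative)
p = 0.56              # assumed population proportion (illustrative)
n, reps = 100, 2000

means, props = [], []
for _ in range(reps):
    # One random sample of size n for each statistic.
    iqs = [random.gauss(mu, sigma) for _ in range(n)]
    votes = [1 if random.random() < p else 0 for _ in range(n)]
    means.append(sum(iqs) / n)
    props.append(sum(votes) / n)

# Individual samples miss high or low, but the long-run averages of the
# statistics sit at the parameters: no systematic bias in either direction.
print(round(sum(means) / reps, 1))
print(round(sum(props) / reps, 2))
```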
Comment 2
Notice how important the principles of sampling and design are for our above results: if the sample of U.S. adults in (example 2 on the previous page) was not random, but instead included predominantly college students, then .56 would be a biased estimate for p, the proportion of all U.S. adults who believe marijuana should be legalized. If the survey design were flawed, such as loading the question with a reminder about the dangers of marijuana leading to hard drugs, or a reminder about the benefits of marijuana for cancer patients, then .56 would be biased on the low or high side, respectively. Our point estimates are truly unbiased estimates for the population parameter only if the sample is random and the study design is not flawed.
Comment 3
Not only are sample mean and sample proportion on target as long as the samples are random, but their accuracy improves as sample size increases. Again, there are two “layers” here for explaining this.
Intuitively, larger sample sizes give us more information with which to pin down the true nature of the population. We can therefore expect the sample mean and sample proportion obtained from a larger sample to be closer to the population mean and proportion, respectively. In the extreme, when we sample the whole population (which is called a census), the sample mean and sample proportion will exactly coincide with the population mean and population proportion.
There is another layer here that, again, comes from what we learned about the sampling distributions of the sample mean and the sample proportion. Let’s use the sample mean for the explanation.
Recall that the sampling distribution of the sample mean ¯X is, as we mentioned before, centered at the population mean μ and has a standard deviation of σ/√n. As a result, as the sample size n increases, the sampling distribution of ¯X gets less spread out. This means that values of ¯X that are based on a larger sample are more likely to be closer to μ (as the figure below illustrates):
Similarly, since the sampling distribution of ˆp is centered at p and has a standard deviation of √(p(1−p)/n), which decreases as the sample size gets larger, values of ˆp are more likely to be closer to p when the sample size is larger.
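The shrinking spread is easy to see by simulation. The sketch below (with assumed, illustrative population values) estimates the standard deviation of ¯X at two sample sizes and compares it with the theoretical value σ/√n:

```python
import random
import statistics

random.seed(0)
mu, sigma = 115, 15  # assumed population parameters (illustrative)

def sd_of_sample_means(n, reps=2000):
    """Simulate the spread of the sampling distribution of x-bar for size n."""
    means = [statistics.mean(random.gauss(mu, sigma) for _ in range(n))
             for _ in range(reps)]
    return statistics.stdev(means)

# Theory: the SD of x-bar is sigma / sqrt(n), so quadrupling n halves the spread.
print(round(sd_of_sample_means(25), 2))   # close to 15 / sqrt(25)  = 3.0
print(round(sd_of_sample_means(100), 2))  # close to 15 / sqrt(100) = 1.5
```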
Comment 4
Another example of a point estimate is using the sample variance, s² = [(x₁ − ¯x)² + ... + (xₙ − ¯x)²] / (n − 1), to estimate the population variance, σ².
In this course, we will not be concerned with estimating σ² for its own sake, but since we will often substitute s for σ when standardizing the sample mean, it is worth pointing out that s² is an unbiased estimator for σ². If we had divided by n instead of n − 1 in our estimator for the population variance, then in the long run our sample variance would be guilty of a slight underestimation. Division by n − 1 accomplishes the goal of making this point estimator unbiased. Making unbiased estimators a top priority is, in fact, the reason that our formula for s, introduced in the Exploratory Data Analysis unit, involves division by n − 1 instead of by n.
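The effect of the divisor can also be checked by simulation: average many sample variances computed each way, and division by n falls short of σ² while division by n − 1 stays on target. A sketch with assumed, illustrative population values:

```python
import random

random.seed(0)
mu, sigma = 115, 15      # assumed population; true variance sigma^2 = 225
n, reps = 10, 20000

biased_total = 0.0       # running sum of variances computed with divisor n
unbiased_total = 0.0     # running sum of variances computed with divisor n - 1
for _ in range(reps):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(x) / n
    ss = sum((xi - x_bar) ** 2 for xi in x)
    biased_total += ss / n
    unbiased_total += ss / (n - 1)

# Long-run averages: divisor n underestimates 225 (by a factor of (n-1)/n),
# while divisor n - 1 is on target, i.e. unbiased.
print(round(biased_total / reps, 1))
print(round(unbiased_total / reps, 1))
```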
Let’s Summarize
We use ˆp (sample proportion) as a point estimator for p (population proportion). It is an unbiased estimator: its long-run distribution is centered at p as long as the sample is random.
We use ¯x (sample mean) as a point estimator for μ (population mean). It is an unbiased estimator: its long-run distribution is centered at μ as long as the sample is random.
In both cases, the larger the sample size, the more accurate the point estimator is. In other words, the larger the sample size, the more likely it is that the sample mean (proportion) is close to the unknown population mean (proportion).