§ Sampling Distributions and Confidence Intervals §
Lesson 1: Introduction
In prior lessons, you have been concerned with the distributions of raw data, x values. In this lesson, the focus switches to the distribution of sample averages, a sampling distribution. If one starts with raw data in a population of x values (e.g. a normal distribution of x values) and obtains a sample from that population, then the average of those x values is xbar, the sample mean. The best point estimate of the population mean (μ) is xbar. A confidence interval gives a range of plausible values for μ at a given level of confidence.
Sampling Distribution
A single sample with a single xbar
would not likely reflect much information about the population of x values.
If resampling were done such that many samples were taken, then the result would
be an xbar for each sample. The distribution of these xbar values is the sampling
distribution.
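The resampling idea can be tried out with a short simulation. This is only a sketch: the population (10,000 normal x values with mean 50 and standard deviation 10), the sample size of 25, the number of samples, and the helper name `sample_means` are all assumptions made for illustration, and sampling is done with replacement.

```python
import random
import statistics

def sample_means(population, n, num_samples, seed=0):
    """Draw num_samples samples of size n (with replacement); return each xbar."""
    rng = random.Random(seed)
    return [statistics.mean(rng.choices(population, k=n)) for _ in range(num_samples)]

# Hypothetical population of x values: normal, mean 50, standard deviation 10.
pop_rng = random.Random(1)
population = [pop_rng.gauss(50, 10) for _ in range(10_000)]

# Each entry of xbars is the mean of one sample; together they form
# a simulated sampling distribution.
xbars = sample_means(population, n=25, num_samples=2_000)

print(f"population mean   = {statistics.mean(population):.2f}")
print(f"mean of the xbars = {statistics.mean(xbars):.2f}")
print(f"sd of x values    = {statistics.pstdev(population):.2f}")
print(f"sd of the xbars   = {statistics.pstdev(xbars):.2f}")
```

The mean of the xbars should land close to the population mean, while the xbars themselves should vary much less than the individual x values do.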
Sometimes the sampling process actually destroys the items being sampled. True
or False?
Central Limit Theorem (CLT)
The CLT states that the distribution of xbars taken from a population with
mean, μ, and standard deviation, σ, will be approximately normally distributed,
with a mean of the xbars equal to the population mean, $\mu_{\bar{x}} = \mu$,
and a standard deviation (the standard error) of
$\sigma_{\bar{x}} = \sigma/\sqrt{n}$. The CLT applies to the distribution of
xbars from any population as long as the sample size is large (n > 30) and to
the distribution of xbars from a normal population at any sample size. Because
the standard error is σ divided by the square root of n, the distribution of
xbars is not as wide as the original distribution of xs.
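One way to see the CLT at work is to sample from a population that is clearly not normal. The sketch below assumes a uniform population on the interval 0 to 1 (mean 0.5, standard deviation equal to the square root of 1/12), a sample size of 36, and 5,000 samples; those numbers are illustrative choices, not values from the lesson.

```python
import math
import random
import statistics

rng = random.Random(42)
mu = 0.5                     # mean of a uniform(0, 1) population
sigma = math.sqrt(1 / 12)    # standard deviation of a uniform(0, 1) population
n = 36                       # sample size (large, n > 30)

# One xbar per sample of n uniform x values.
xbars = [statistics.mean([rng.random() for _ in range(n)]) for _ in range(5_000)]

standard_error = sigma / math.sqrt(n)   # what the CLT predicts for the sd of the xbars

# If the xbars are roughly normal, about 95% of them should fall
# within 1.96 standard errors of mu.
within = sum(abs(xb - mu) <= 1.96 * standard_error for xb in xbars) / len(xbars)

print(f"predicted standard error = {standard_error:.4f}")
print(f"observed sd of the xbars = {statistics.pstdev(xbars):.4f}")
print(f"share within 1.96 standard errors = {within:.3f}")
```

The observed standard deviation of the xbars should sit near the predicted standard error, and the share within 1.96 standard errors should be near 0.95, which is what a normal shape would give even though the x values are uniform.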
A uniform distribution of x values will generate a uniform distribution of xbars
if many samples were taken. True or False?
Confidence Intervals
Since the population mean, μ, is usually unknown, an estimate of it is desirable. The sample
mean, xbar, is used as a point estimate of μ. But a single xbar is very unlikely to equal μ
exactly. Thus, a range of possible values for μ is used, with the point estimate, xbar, in the
middle of the range. The amount added to xbar and subtracted from xbar is the maximum
error. The resulting range runs from xbar - maximum error, the lower limit of the
confidence interval, to xbar + maximum error, the upper limit of the confidence
interval. The maximum error is made up of two parts. First, a Z score (or t score)
counts the number of standard errors one is willing to go on each side of xbar;
second, the size of the standard error, $\sigma_{\bar{x}} = \sigma/\sqrt{n}$,
must be calculated. These two are multiplied to give the maximum error. A confidence
interval is two maximum errors wide with xbar in the middle. In selecting a
Z score or t score one must consider the sample size, n (and the resulting
degrees of freedom = n - 1), and the source of the standard deviation.
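The arithmetic of the interval can be written out as a small function. This is a sketch under assumed inputs: the function name `confidence_interval` and the numbers (xbar = 100, standard deviation = 15, n = 36, and a score of 1.96) are hypothetical values chosen for illustration.

```python
import math

def confidence_interval(xbar, std_dev, n, score):
    """Return (lower limit, upper limit) for xbar +/- score * standard error."""
    standard_error = std_dev / math.sqrt(n)   # standard deviation divided by sqrt(n)
    max_error = score * standard_error        # Z or t score times the standard error
    return xbar - max_error, xbar + max_error

# Hypothetical example: standard error = 15 / 6 = 2.5, maximum error = 1.96 * 2.5 = 4.9.
lower, upper = confidence_interval(xbar=100, std_dev=15, n=36, score=1.96)
print(f"lower limit = {lower:.2f}, upper limit = {upper:.2f}")
# prints: lower limit = 95.10, upper limit = 104.90
```

The interval is 9.8 wide, which is two maximum errors, with xbar in the middle.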
Source of Standard Deviation
| Sample Size | Population σ | Sample s |
|---|---|---|
| Small (df ≤ 30) | Z | t |
| Large (df > 30) | Z | Z or t |
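The table can also be read as a simple decision rule. The function below is a sketch of that rule only; the name `choose_score` and the string labels "population" and "sample" are assumptions made for illustration.

```python
def choose_score(n, sd_source):
    """Return 'Z', 't', or 'Z or t' following the table above.

    n         -- sample size (degrees of freedom = n - 1)
    sd_source -- 'population' if the population standard deviation is known,
                 'sample' if only the sample standard deviation is available
    """
    df = n - 1
    if sd_source == "population":
        return "Z"                              # Z at any sample size
    if sd_source == "sample":
        return "t" if df <= 30 else "Z or t"    # small df needs t
    raise ValueError("sd_source must be 'population' or 'sample'")

print(choose_score(n=10, sd_source="population"))  # Z
print(choose_score(n=10, sd_source="sample"))      # t
print(choose_score(n=50, sd_source="sample"))      # Z or t
```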
In addition, one must select an alpha value, α, in order to select a Z score or t score. Alpha is the proportion
of the time one is willing to not capture the true population mean, μ (a parameter), with the confidence interval constructed around the
sample mean, xbar (a statistic). The usual values for alpha are
0.01, 0.05 and 0.10, which correspond to confidence levels of 99%, 95% and 90%.
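The meaning of alpha as a miss rate can be checked by simulation. The sketch below assumes a normal population with a known mean of 50 and standard deviation of 10, samples of size 36, and 4,000 repeated samples; these values and the function name `miss_rate` are illustrative. The Z score is computed with `NormalDist().inv_cdf` from Python's standard library rather than read from a table.

```python
import math
import random
from statistics import NormalDist

def miss_rate(alpha, mu=50.0, sigma=10.0, n=36, trials=4_000, seed=7):
    """Fraction of simulated confidence intervals that fail to contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)    # Z score with alpha/2 in each tail
    max_error = z * sigma / math.sqrt(n)       # Z score times the standard error
    misses = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # The interval xbar +/- max_error misses mu when xbar is too far from mu.
        if abs(xbar - mu) > max_error:
            misses += 1
    return misses / trials

for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}: intervals missing mu = {miss_rate(alpha):.3f}")
```

Each printed proportion should be close to its alpha: a smaller alpha gives a wider interval that misses the population mean less often.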
When the decision has been made to use a Z score, $Z_{\alpha/2}$,
one must look up the probability 0.5 - α/2
in the body of the Z
table. Alpha is divided by two, since there are two ends of the confidence
interval (the upper limit and the lower limit). Pick the closest probability,
and then read the Z score from the edges of the table.
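The table lookup can be mirrored in code. This sketch assumes a Z table whose body holds the area between 0 and Z, as the lookup of 0.5 - α/2 implies; the function name `z_score` is hypothetical, and `NormalDist().inv_cdf` from the standard library stands in for the printed table.

```python
from statistics import NormalDist

def z_score(alpha):
    """Z score that leaves alpha/2 in each tail of the standard normal curve."""
    table_probability = 0.5 - alpha / 2    # area between 0 and Z (body of the table)
    # inv_cdf expects the total area to the left of Z, so add the 0.5 below zero.
    return NormalDist().inv_cdf(0.5 + table_probability)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: look up {0.5 - alpha / 2:.4f}, Z = {z_score(alpha):.3f}")
# prints:
# alpha = 0.10: look up 0.4500, Z = 1.645
# alpha = 0.05: look up 0.4750, Z = 1.960
# alpha = 0.01: look up 0.4950, Z = 2.576
```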
When the decision has been made to use a t score, $t_{\alpha/2,\,df}$,
one must simply go to the t table and
find α/2
at the top of a column and the degrees of freedom, n
- 1, along the left side of the table. The t score is found at the intersection
of that column and row. When the degrees of freedom is greater than 30, we will
use the convention of skipping to the infinity degrees of freedom row. That
t score at infinity degrees of freedom is equal to the Z score
for the same α/2: $t_{\alpha/2,\,\infty} = Z_{\alpha/2}$.
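Python's standard library has no t table, so the sketch below approximates one numerically: it integrates the t density from 0 up to a candidate t score with Simpson's rule and uses bisection to find the score whose area is 0.5 - α/2. The function names, the step count, and the search ceiling of 100 are assumptions made for illustration; this is a stand-in for the printed table, not the method the lesson uses.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def area_zero_to(t, df, steps=2_000):
    """Area under the t density between 0 and t (Simpson's rule, even steps)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return total * h / 3

def t_score(alpha, df):
    """t score that leaves alpha/2 in each tail (searches between 0 and 100)."""
    target = 0.5 - alpha / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):                    # bisection on the area between 0 and t
        mid = (lo + hi) / 2
        if area_zero_to(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (15, 30, 10_000):
    print(f"alpha/2 = 0.025, df = {df}: t = {t_score(0.05, df):.3f}")
```

For α/2 = 0.025 the results should agree with the usual table entries of 2.131 at 15 degrees of freedom and 2.042 at 30, and the value at very large degrees of freedom should approach the Z score of 1.960.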
If the sample size is 16 and the sample standard deviation is known, then which
of the following is used to construct a confidence interval? Z score or t score?
Please reference "BA501 (your last name) Assignment name and number" in the subject line of either below.
E-mail Dr. James V. Pinto at
BA501@mail.cba.nau.edu
or call (928) 523-7356. Use WebMail for attachments.
Copyright 2002 Northern
Arizona University
ALL RIGHTS RESERVED