More Stats Today
-
Confidence Intervals
-
Margin of Error
-
Hypothesis Testing
How many samples do I need?
-
the plus or minus margin m = z*(sigma/sqrt(n))
-
n = (z*sigma/m)^2
An Example: Sam's weight
-
Assume sigma is 3 for this example.
-
want to be accurate to 2 pounds (plus or minus 2) with 95% confidence
-
z=1.96
-
n = (1.96*3/2)^2 = 8.64, which rounds up to 9
-
Let's say he only wants to be within 3 pounds
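The sample-size formula can be checked with a short calculation. This is a minimal sketch, assuming sigma is known and the sampling distribution is normal; the function name required_sample_size is made up for illustration, and the result is rounded up because a fraction of an observation cannot be collected.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n for which z * sigma / sqrt(n) is at most the margin."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z * sigma / m)^2, rounded up to a whole observation.
    return math.ceil((z * sigma / margin) ** 2)


# Sam's weight: sigma = 3, accurate to within 2 pounds, then within 3 pounds.
n_within_2 = required_sample_size(sigma=3, margin=2)
n_within_3 = required_sample_size(sigma=3, margin=3)
print(n_within_2, n_within_3)
```

Loosening the margin from 2 pounds to 3 pounds lowers the number of weighings needed, because n depends on the square of sigma/m.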
Properties of Confidence Intervals
-
What is the relationship between
-
50% confidence interval
-
95% confidence interval
-
99% confidence interval
-
Which is widest?
-
Are they always symmetric?
-
Confidence interval and sample size
-
You can use your desired width of a confidence interval to
decide how many observations you need.
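One way to explore the questions on this slide is to compute the three intervals side by side. The sketch below assumes a known sigma and uses made-up numbers (a sample mean of 150, sigma of 3, n of 9); z_interval is a hypothetical helper, not something from the lecture.

```python
from statistics import NormalDist


def z_interval(mean, sigma, n, confidence):
    """Confidence interval for a mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / n ** 0.5
    return mean - margin, mean + margin


# Hypothetical data: sample mean 150, sigma 3, 9 observations.
intervals = {
    c: z_interval(mean=150.0, sigma=3.0, n=9, confidence=c)
    for c in (0.50, 0.95, 0.99)
}
widths = {c: hi - lo for c, (lo, hi) in intervals.items()}

for c, (lo, hi) in intervals.items():
    print(f"{c:.0%} interval: ({lo:.2f}, {hi:.2f}), width {widths[c]:.2f}")
```

Under these assumptions each interval is centered on the sample mean, and higher confidence gives a wider interval.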
How accurate is the confidence interval?
-
If data are well collected
-
The true mean will fall within the interval with probability
p.
-
For a 95% confidence interval, the true mean will fall in
the interval 95% of the time
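The "95% of the time" claim can be illustrated by simulation: draw many samples from a population whose mean is known, build an interval from each, and count how often the interval contains the true mean. This is only a sketch with made-up population values (mean 150, sigma 3) and a fixed random seed; coverage is a hypothetical helper name.

```python
import random
from statistics import NormalDist, fmean


def coverage(true_mean, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample_mean = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        # The interval sample_mean +/- margin contains true_mean exactly
        # when the sample mean is within one margin of the true mean.
        if abs(sample_mean - true_mean) <= margin:
            hits += 1
    return hits / trials


rate = coverage(true_mean=150.0, sigma=3.0, n=9, confidence=0.95, trials=2000)
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The simulated share should land close to 0.95, with some run-to-run variation that shrinks as the number of trials grows.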
Potential Problems
-
Bad Experiment or Survey
-
Data might come from a poor sample
-
Bias will influence the mean of the sample
-
Outliers in the data can disrupt estimate of the mean
-
The standard deviation of the population has to be known
-
We will talk about what to do when the standard deviation
is not known soon.
Hypothesis Testing
-
Often we want to know whether our data reveal a reliable
effect.
-
We can test this possibility using the techniques we have
described
-
This process is called statistical inference or hypothesis
testing
-
Suppose Kellogg's claimed that there is an average of 250
raisins in each box of Raisin Bran.
-
How could we test this claim?
Null Hypothesis
-
Start by determining the null hypothesis
-
A state of affairs against which we are comparing
-
In our case
-
H0: mu = 250
-
H1: mu ≠ 250
-
The alternative hypothesis (H1) is always stated
relative to the null hypothesis.
One Tail vs Two Tail
-
Two Tail
-
Preferable
-
H1 is simply different than H0
-
One Tail
-
Avoid in general
-
More power (but don't cheat)
-
can't look at the data before choosing one tail
-
can miss big unexpected effects
P-value
-
We can use the normal distribution to determine the likelihood
of an observed outcome.
-
Calculate the z-score of the observed mean relative to the
sampling distribution with the mean in the null hypothesis.
-
Find the probability of a point as extreme or more extreme
than that (in either direction)
Calculating the p-value
-
In our example:
-
mu0 = 250
-
z (263) = (263 - 250) / 2.6 = 5.00
-
2.6 is the standard error of the mean (from previous lecture)
-
probability associated with a z-score of 5.00 is less
than .0001
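The same calculation can be reproduced in a few lines. This sketch takes the values from the example (null mean 250, observed mean 263, standard error 2.6) and assumes a two-tailed test, so only the size of the z-score matters, not its sign.

```python
from statistics import NormalDist

mu0 = 250.0            # mean under the null hypothesis
observed_mean = 263.0  # mean observed in the sample
standard_error = 2.6   # standard error of the mean, from the example

# z-score of the observed mean relative to the null sampling distribution.
z = (observed_mean - mu0) / standard_error

# Two-tailed p-value: probability of a result this extreme in either direction.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, two-tailed p = {p_two_tailed:.2e}")
```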
Pick a significance level
-
How sure do you want to be in your answer?
-
The probability associated with your sureness is called the
alpha level
-
In science, the alpha level selected is usually .05
-
If the probability of observing a result as extreme or more
extreme than yours is smaller than the alpha level, then reject the null hypothesis
In our example
-
The probability associated with a mean of 263 is less than
.0001.
-
.0001 is smaller than .05
-
Reject the null hypothesis
-
That is, reject the statement that the population mean equals
250
Relationships to Confidence Intervals
-
Another way to do the same test
-
Formulate a confidence interval around the observed mean
-
If the mean from H0 is inside the interval, do not reject H0
-
If the mean from H0 is outside the interval, reject H0
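The confidence-interval version of the test can be sketched as follows, using the raisin example's numbers (observed mean 263, standard error 2.6, null mean 250). The usual rule is to reject H0 when the null mean lies outside the interval and to keep it otherwise; rejects_h0_by_interval is a hypothetical helper name.

```python
from statistics import NormalDist


def rejects_h0_by_interval(observed_mean, standard_error, mu0, alpha=0.05):
    """Two-tailed z-test done by checking whether mu0 lies in the interval."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = observed_mean - z_crit * standard_error
    upper = observed_mean + z_crit * standard_error
    # Reject H0 exactly when the null mean is outside the interval.
    return not (lower <= mu0 <= upper), (lower, upper)


reject, (lower, upper) = rejects_h0_by_interval(263.0, 2.6, 250.0)
print(f"95% interval: ({lower:.1f}, {upper:.1f}); reject H0: {reject}")
```

For a two-tailed test at alpha level .05 this gives the same decision as comparing the p-value with .05, since both use the same critical value of about 1.96.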
What about smaller alpha levels?
-
Why do we use an alpha level of .05?
-
There is a 5% chance that we will reject H0 when
we should not have (Type I error).
-
If we used an alpha level of .01 there would only be a 1% chance
that would happen.
-
As the alpha level gets smaller, there is a greater chance that
we will decide not to reject H0 when we should have
-
Type II error
-
It could be a problem to make a test too conservative
-
The alpha level of .05 is a good compromise
-
The Power of a test is 1 minus the probability of a Type
II error.
Demo on power and hypothesis testing.
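A small calculation can stand in for the trade-off between the alpha level and power. The sketch below assumes a two-tailed z-test with the raisin example's null mean (250) and standard error (2.6), plus a made-up true mean of 255 chosen only for illustration; power_two_tailed is a hypothetical helper name.

```python
from statistics import NormalDist


def power_two_tailed(mu0, mu_true, standard_error, alpha):
    """Probability of rejecting H0: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the true mean sits from mu0, in standard-error units.
    shift = (mu_true - mu0) / standard_error
    # Chance that the observed z lands above +z_crit or below -z_crit.
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Hypothetical true mean of 255 raisins per box.
power_05 = power_two_tailed(250.0, 255.0, 2.6, alpha=0.05)
power_01 = power_two_tailed(250.0, 255.0, 2.6, alpha=0.01)
print(f"power at alpha .05: {power_05:.3f}")
print(f"power at alpha .01: {power_01:.3f}")
print(f"Type II error rate at alpha .01: {1 - power_01:.3f}")
```

Under these assumptions the stricter alpha level gives lower power, which means a higher chance of a Type II error.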