As a follow-up, because I am dealing with this issue in my own work:
I have seen it said a number of times, in published articles and in
online "how to" guides, that a confidence interval spanning the null
value indicates a lack of statistical significance. Here is an
example from a website:
"When you see a confidence interval in a published medical report, you
should look for two things. First, does the interval contain a value
that implies no change or no effect? For example, with a confidence
interval for a difference look to see whether that interval includes
zero. With a confidence interval for a ratio, look to see whether that
interval contains one. The interval shown below implies no statistically
significant change."
The graphic shows a confidence interval spanning the zero point.
So my question is: when we have this case -- a confidence interval
spanning the zero point -- is it both technically correct and not unduly
misleading to the average reader to say that the effect is not
"statistically significant"?
Jason
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Michael
Blasnik
Sent: Friday, December 08, 2006 9:54 AM
To: [email protected]
Subject: st: Re: Statistical significance vs the 95% confidence intervals -- how should i interpret these
Neither of your approaches is correct. In your first approach, you are
being overly conservative by just checking whether the confidence intervals
overlap (but you aren't very far off). In your second approach, -ttesti- is
looking for standard deviations, not standard errors. Look at the output
and you will see how ttest has created a new standard error that is much
smaller (approximately s.d./sqrt(n)).
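As a rough sketch of what -ttesti- expects: if the two estimates were simple
means from samples of 2398 and 2399 observations (an assumption; the original
post does not say how they were estimated), the standard deviations can be
backed out as s.e.*sqrt(n) and passed in place of the standard errors. The
local macro names are only illustrative.

```stata
* Assumption: each estimate is a sample mean, so s.d. = s.e. * sqrt(n)
local sd1 = .015637  * sqrt(2398)
local sd2 = .0138994 * sqrt(2399)

* ttesti takes: n1 mean1 sd1 n2 mean2 sd2
ttesti 2398 .484721 `sd1' 2399 .4556235 `sd2', unequal
```

With the -unequal- option the standard error of the difference is
sqrt(se1^2 + se2^2), so under these assumptions the t-statistic should come
out near 1.39 rather than 68.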
Your question would be best answered by using whatever estimation method you
used to come up with the estimates and standard errors and then using
Stata's -test- or -lincom- commands to test hypotheses that properly account
for potential covariance. If you can't do that for some reason, then you
would need to make an assumption about the covariance. If you assume that
the two estimates are independent, then you can calculate a t-statistic on
the difference based on the difference in the estimates and the standard
error of this difference:
di (.484721-.4556235)/sqrt(.015637^2+.0138994^2)
1.3907943
So t=1.39. You can get a one-sided (upper-tail) p-value from:
di ttail(2398,(.484721-.4556235)/sqrt(.015637^2+.0138994^2))
.08220843
(or use whatever d.f. are appropriate if 2398 is not, but it won't make any
real difference here)
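Since ttail() returns the upper-tail probability, the p-value above is
one-sided. A minimal sketch of the two-sided version, and of a 95% confidence
interval for the difference, under the same independence assumption and with
2398 d.f. (the scalar names are illustrative):

```stata
scalar diff = .484721 - .4556235
scalar se   = sqrt(.015637^2 + .0138994^2)
scalar t    = diff/se

* two-sided p-value: double the upper-tail area beyond |t|
di 2*ttail(2398, abs(t))

* 95% confidence interval for the difference
di diff - invttail(2398, .025)*se
di diff + invttail(2398, .025)*se
```

Because |t| is below the critical value of about 1.96, the interval includes
zero and the two-sided p-value exceeds .05. The two criteria agree when both
are applied to the difference itself rather than to the two separate
intervals.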
So you could conclude that, if the two estimates are independent, then the
difference between year1 and year3 is not quite statistically significant at
the .05 level.
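The -test-/-lincom- route mentioned earlier could look something like the
sketch below. The variable names (outcome, year) and the choice of -regress-
are hypothetical, since the thread does not say how the estimates were
produced, and the factor-variable notation requires Stata 11 or later.

```stata
* hypothetical data: one row per observation, with outcome and year (1, 2, 3)
* ibn.year with noconstant gives one coefficient per year (the year means)
regress outcome ibn.year, noconstant

* difference between year 1 and year 3, with s.e., t, p-value and 95% CI
lincom 1.year - 3.year

* equivalent Wald test of equality
test 1.year = 3.year
```

If the same units appear in several years, a cluster-robust variance such as
vce(cluster id), with id a hypothetical unit identifier, is what would let the
covariance between the yearly estimates enter the test.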
Michael Blasnik
----- Original Message -----
From: "Columbia & Belmont Apt" <[email protected]>
To: <[email protected]>
Sent: Friday, December 08, 2006 12:25 PM
Subject: st: Statistical significance vs the 95% confidence intervals -- how should i interpret these
> Dear Statalist Colleagues,
> This might be a simple question but I cannot reconcile the following
> results.
>
> The question is "Are the estimates for year 1 and 3 statistically
> different?"
>
> year     estimate    st.error    95% conf interval
> year1    .484721     .015637     .454068     .5153741
> year2    .4128893    .0145616    .3843443    .4414343
> year3    .4556235    .0138994    .4283765    .4828704
>
> 1. Using the 95% confidence interval that I obtained:
>
> Year 1 and year 3 confidence intervals overlap, thus year 1&3 estimates are
> not statistically different at the 95% level.
>
> 2. Using ttesti they are statistically different with 99% confidence:
>
> . ttesti 2398 .484721 .015637 2399 .4556235 .0138994
>
> Two-sample t test with equal variances
> ------------------------------------------------------------------------------
>          |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
> ---------+--------------------------------------------------------------------
>        x |    2398     .484721    .0003193     .015637    .4840948    .4853472
>        y |    2399    .4556235    .0002838    .0138994     .455067      .45618
> ---------+--------------------------------------------------------------------
> combined |    4797    .4701692    .0002996    .0207488    .4695819    .4707565
> ---------+--------------------------------------------------------------------
>     diff |            .0290975    .0004272                  .02826     .029935
> ------------------------------------------------------------------------------
> Degrees of freedom: 4795
>
>                   Ho: mean(x) - mean(y) = diff = 0
>
>      Ha: diff < 0               Ha: diff != 0              Ha: diff > 0
>        t =  68.1143                t =  68.1143              t =  68.1143
>    P < t =   1.0000          P > |t| =   0.0000          P > t =   0.0000
>
>
> Which one is correct? What am I not interpreting right?
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/