At 06:18 PM 7/6/03, Neumayer,E wrote:
Dear all,
could I try again as I received no answer first time around? This time I
also attach a log to make things clearer. How come the estimated fixed
effects are often statistically significant if they express a difference
to the reference category (it does not really matter which one), but are
all highly insignificant if expressed as the difference to the average
effect (see below). Any help still highly welcome! Best, Eric Neumayer
Because the way you test the difference from the "average" effect is incorrect.
. * Difference from average effect (no fixed effects significant)
. capture tab destination, gen(cdum)
. quietly xi: reg dyadasylumcorrpcshare l52dyadasylumcorrpcshare destinationrwp
> vote destinationleftgs dyadcolony dyadlanguage dyaddistance destinationunempl
> oyment lndestinationgdp destinationgdpgrowth destinationrecognition schenge
> n cdum1-cdum17 if inc_highoecd==0, nocons robust cluster(originid)
. capture drop averageasylum
. ge averageasylum = (_b[cdum1]+_b[cdum2]+_b[cdum3]+_b[cdum4]+_b[cdum5]+_b[cdum
> 6]+_b[cdum7]+_b[cdum8]+_b[cdum9]+_b[cdum10]+_b[cdum11]+_b[cdum12]+_b[cdum13]+
> _b[cdum14]+_b[cdum15]+_b[cdum16]+_b[cdum17])/17
Here is where the problem with your approach starts. You compute the
average from the regression. Then, you simply subtract it from the actual
observations (Ys) and rerun the regression. But that treats the subtracted
quantity (average) as KNOWN, when it is ESTIMATED from the same data.
. capture drop dyadasylumcorrpcshare2
. ge dyadasylumcorrpcshare2=dyadasylumcorrpcshare-averageasylum
(41572 missing values generated)
This is the model that you fit to test deviations from the average:
. xi: reg dyadasylumcorrpcshare2 l52dyadasylumcorrpcshare destinationrwpvote de
> stinationleftgs dyadcolony dyadlanguage dyaddistance destinationunemployment
> lndestinationgdp destinationgdpgrowth destinationrecognition schengen cdum
> 1-cdum17 if inc_highoecd==0, nocons robust cluster(originid)
[snip]
       cdum1 |   .0082041   .1258245     0.07   0.948    -.2408377     .257246
...
      cdum17 |   -.031076   .1273527    -0.24   0.808    -.2831427    .2209907
------------------------------------------------------------------------------
What does this do? The p-value for cdum1 (0.948) tests whether
    cdum1 = 0
(and you would expect this to be zero if the cdum1 effect is exactly equal
to the average of all 17 group effects, which you subtracted from your data).
But the way you've set this up, the test uses only the observations in the
cdum1 group, because you've told the computer that the "average" is just an
arbitrary (fixed) value. The correct test, on the other hand, should use
all observations in the sample, since the average is estimated on the
basis of all observations.
You should not be subtracting the average from the original data. Instead,
you need to fit the model
. xi: reg dyadasylumcorrpcshare l52dyadasylumcorrpcshare destinationrwpvote des
> tinationleftgs dyadcolony dyadlanguage dyaddistance destinationunemployment
> lndestinationgdp destinationgdpgrowth destinationrecognition schengen cdum1
> -cdum17 if inc_highoecd==0, nocons robust cluster(originid)
This will give you the 17 estimated group effects (but no constant) and is
equivalent to the "reference category" model that you have already fit.
Now, you want to test
cdum1 = (cdum1+...+cdum17)/17
cdum2 = (cdum1+...+cdum17)/17 etc
This can be accomplished by a series of -lincom- commands:
. lincom cdum1-(cdum1+cdum2+...+cdum17)/17
. lincom cdum2-(cdum1+cdum2+...+cdum17)/17
etc
The p-values you will get will be based on the data from all groups (since
each contrast correctly involves all 17 groups).
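
Since the "..." in the -lincom- lines above is shorthand rather than literal syntax, here is a minimal sketch of one way to spell out all 17 contrasts with a loop. It assumes the no-constant regression with cdum1-cdum17 is the most recent estimation command; the local macro name `sum' is just an illustrative choice.

```stata
* Build the text "cdum1 + cdum2 + ... + cdum17" in a local macro
local sum "cdum1"
forvalues j = 2/17 {
    local sum "`sum' + cdum`j'"
}

* One -lincom- per group: group effect minus the average of all 17 effects
forvalues i = 1/17 {
    lincom cdum`i' - (`sum')/17
}
```

Each -lincom- call uses the full estimated covariance matrix of the 17 coefficients, so the standard error of every contrast reflects the fact that the average is itself estimated.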
PS There is a multiple-comparison issue here, whether you do pairwise
comparisons between groups, or deviations from the estimated average. You
should be considering an appropriate multiple-comparison procedure to
adjust your p-values.
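
As one possible illustration of such an adjustment, the sketch below applies a Bonferroni correction to the 17 deviation-from-average tests. Bonferroni is only an example picked here for simplicity (it is conservative), not a recommendation of a particular procedure. The sketch assumes the no-constant regression with cdum1-cdum17 is the active estimation result and relies on -lincom- leaving r(estimate), r(se) and r(df) behind after -regress-.

```stata
* Build the text "cdum1 + cdum2 + ... + cdum17" in a local macro
local sum "cdum1"
forvalues j = 2/17 {
    local sum "`sum' + cdum`j'"
}

forvalues i = 1/17 {
    quietly lincom cdum`i' - (`sum')/17
    * Two-sided p-value from the t statistic of the contrast
    local p = 2*ttail(r(df), abs(r(estimate)/r(se)))
    * Bonferroni: multiply by the number of tests (17), capped at 1
    local padj = min(1, 17*`p')
    display "cdum`i': diff = " %9.4f r(estimate) ///
        "  p = " %6.4f `p' "  Bonferroni p = " %6.4f `padj'
}
```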
--- NOTE NEW ADDRESS AS OF JULY 8, 2003 ---
________________________________________________________________
Constantine Daskalakis, ScD
Assistant Professor,
Biostatistics Section, Thomas Jefferson University,
211 S. 9th St. #602, Philadelphia, PA 19107
Tel: 215-955-5695
Fax: 215-503-3804
Email: [email protected]
Webpage: http://www.kcc.tju.edu/Science/SharedFacilities/Biostatistics