Jonathan DePeri wrote:
> I am running a regression using sample weights on a model which
> involves binary independent variables. The question may be
> elementary, but how should I interpret the parameter estimate of
> such a variable? Since the application of sample weights transforms
> the 0s and 1s of binary variables into 0s and values between 0 and
> 1, it is not clear to me that the interpretation should remain the
> same.
It's not clear from your post whether or not you're using
. reg y x1 x2 d1 [pw=weightvar] (or [aw=weightvar])
If you are, then it's more accurate to say that you are using weighted
_ordinary_ least squares (WOLS), rather than WLS, which is conceptually
(and statistically) different (see Winship and Radbill [1994: 241] for
more). In any case, the interpretation _is_ the same: the coefficient on
a dummy is still the expected difference in y between the two groups,
holding the other regressors constant. The weights change how much each
observation contributes to the fit; they do not change what the 0s and 1s
mean. Others may want to put this in more formal statistical language,
but fitting a regression model with sampling weights simply adjusts the
parameter estimates (and their standard errors) upwards or downwards for
_all_ the independent variables in that model, and not just the
continuous ones, in an effort to reduce the bias.
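One way to check this for a single dummy: with no other regressors, the
weighted coefficient on a 0/1 variable is just the difference between the
two weighted group means. The sketch below is only an illustration. The
-auto- dataset shipped with Stata and the made-up weight -w- are my
assumptions, not part of the question or of the example that follows.

```stata
* Sketch only: auto.dta and the weight w are illustrative assumptions
sysuse auto, clear
set seed 12345
* Hypothetical, strictly positive sampling weight
generate double w = 0.5 + runiform()

* Weighted regression on a single 0/1 regressor
regress price foreign [pw=w]
scalar b_dummy = _b[foreign]

* Weighted group means computed directly
* (-summarize- does not accept pweights; aweights give the same mean)
summarize price [aw=w] if foreign == 1, meanonly
scalar m1 = r(mean)
summarize price [aw=w] if foreign == 0, meanonly
scalar m0 = r(mean)

* The two quantities should agree up to rounding error
display b_dummy
display m1 - m0
```

Once other regressors are added, the coefficient becomes the weighted
difference adjusted for those regressors, but it is read in the same way.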
Fitting two models to the 'Garrett and Mitchell' dataset (available on
request) demonstrates this. I create -jobless- as a dummy variable from a
continuous variable recording the unemployment rate; the variable
-europe- is also a dummy:
. g jobless=1 if unem<=5
(293 missing values generated)
. recode jobless .=0
(jobless: 293 changes made)
. g weight=invnorm(uniform())
. reg spend trade jobless growthpc europe
      Source |       SS       df       MS              Number of obs =     571
-------------+------------------------------           F(  4,   566) =  125.92
       Model |  32430.9725     4  8107.74312           Prob > F      =  0.0000
    Residual |  36444.4887   566  64.3895561           R-squared     =  0.4709
-------------+------------------------------           Adj R-squared =  0.4671
       Total |  68875.4612   570  120.834142           Root MSE      =  8.0243
------------------------------------------------------------------------------
       spend |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       trade |   .1298481   .0147212     8.82   0.000     .1009332    .1587629
     jobless |  -5.381454   .7148419    -7.53   0.000    -6.785521   -3.977387
    growthpc |  -1.165842   .1431781    -8.14   0.000    -1.447067   -.8846165
      europe |   6.994002   .9683677     7.22   0.000     5.091969    8.896035
       _cons |   34.88938   .9811031    35.56   0.000     32.96233    36.81642
------------------------------------------------------------------------------
. reg spend trade jobless growthpc europe [pw=weight]
(sum of wgt is 2.2378e+02)
Linear regression                                      Number of obs =     282
                                                       F(  4,   277) =   94.57
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4746
                                                       Root MSE      =  7.3868
------------------------------------------------------------------------------
             |               Robust
       spend |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       trade |   .1277293   .0168331     7.59   0.000     .0945923    .1608663
     jobless |  -5.427786   1.377935    -3.94   0.000    -8.140341   -2.715231
    growthpc |   -1.13484   .2256215    -5.03   0.000    -1.578991   -.6906894
      europe |   6.744966    1.26138     5.35   0.000     4.261858    9.228074
       _cons |   35.44936   .9949323    35.63   0.000     33.49077    37.40795
------------------------------------------------------------------------------
The weight I generated is, of course, nonsensical since this is a panel
dataset of eighteen countries, but it illustrates my point. Several of
the standard errors change markedly (that on -jobless- nearly doubles),
but the parameter estimates barely move. The dummy variables are no
exception: -jobless- is still significant after weighting, but its
t-statistic is smaller, and that's the story for every slope in this
example. They've all simply been adjusted slightly after weighting. Note
too that the weighted model uses only 282 of the 571 observations,
presumably because about half of the standard normal draws are negative
and so cannot serve as weights; that alone will inflate the standard
errors.
If your post was really asking about WLS, then fitting an OLS
'between-effects' model (-xtreg, be-) both with and without the -wls-
option would show exactly the same thing.
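A rough sketch of that comparison follows. It is an illustration under my
own assumptions, not a rerun of the example above: the -nlswork- panel
(fetched with -webuse-, so it needs an internet connection) and the
regressors chosen here, including the dummy -south-, are stand-ins only.

```stata
* Sketch only: nlswork and the chosen regressors are illustrative assumptions
webuse nlswork, clear
xtset idcode

* Between-effects model, default estimator
xtreg ln_wage age tenure south, be
estimates store be_ols

* Same model with the -wls- option (weighted least squares,
* which matters when the panels are unbalanced)
xtreg ln_wage age tenure south, be wls
estimates store be_wls

* Coefficients and standard errors side by side
estimates table be_ols be_wls, b se
```

The coefficient on -south- is read the same way in both columns; only its
size and standard error shift.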
CLIVE NICHOLAS |t: 0(044)7903 397793
Politics |e: [email protected]
Newcastle University |http://www.ncl.ac.uk/geps
Reference:
Winship C and Radbill L (1994) "Sampling Weights and Regression Analysis",
SOCIOLOGICAL METHODS AND RESEARCH 23(2): 230-57.