Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Xtmixed, Multilevel models and design weights

From	"Antonio Rodriguez Andres" <[email protected]>
To	<[email protected]>
Subject	st: Xtmixed, Multilevel models and design weights
Date	Mon, 3 Feb 2014 17:13:20 +0200
Dear Stata users
I am working on a multilevel model using the third round of the European
Social Survey. It is a two-level model with individuals (level 1) nested in
countries (level 2). Based on a previous thread, I typed
xtset cntry idno

In the original dataset, there are two types of weights:
- Design weight: The design weights are based on the inverse of the
inclusion probabilities for individuals i in countries j. The design weight
corrects for slightly different probabilities of selection, thereby making
the sample more representative of a 'true' sample of individuals from each
country.

- Population size weights: The population size weight makes an
adjustment to ensure that each country is represented in proportion to its
population size. The population size weight is calculated as PWEIGHT=
[Population size]/[(Net sample size in data file)*10 000]
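
Written as a formula, with N_j and n_j as shorthand introduced here (not ESS notation) for the population size and the net sample size of country j:

```latex
\[
\mathrm{PWEIGHT}_j \;=\; \frac{N_j}{n_j \times 10\,000}
\]
% Hypothetical illustration (made-up figures, not ESS values):
% N_j = 40\,000\,000 and n_j = 2\,000 give
% PWEIGHT_j = 40\,000\,000 / (2\,000 \times 10\,000) = 2.
```

The weight is constant within a country, so it only changes the relative contribution of whole countries, not of individuals within a country.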

My question is: do I need to specify the population size weights when I run
the multilevel model? I get different results depending on whether I include
them. Below is the regression with design weights applied:

xtmixed dprt age age2 gender married separated divorced widowed seced terted
chldhm missinc medinc highinc ihealth iuemp5yr iuemp12m
gender_index06[pw=dweight]  || cntry: , mle
(32181 missing values generated)

Obtaining starting values by EM: 

Performing gradient-based optimization: 

Iteration 0:   log pseudolikelihood = -29698.399  
Iteration 1:   log pseudolikelihood = -29698.399  

Computing standard errors:

Mixed-effects regression                        Number of obs      =     10819
Group variable: cntry                           Number of groups   =        23

                                                Obs per group: min =       190
                                                               avg =     470.4
                                                               max =       879


                                                Wald chi2(17)      =   1667.56
Log pseudolikelihood = -29698.399               Prob > chi2        =    0.0000

                                   (Std. Err. adjusted for 23 clusters in cntry)
--------------------------------------------------------------------------------
               |               Robust
          dprt |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
           age |   .0531949   .0246902     2.15   0.031      .004803    .1015868
          age2 |  -.0006209   .0002598    -2.39   0.017    -.0011302   -.0001116
        gender |  -.4561274   .0618772    -7.37   0.000    -.5774044   -.3348504
       married |  -.7286765   .1096062    -6.65   0.000    -.9435007   -.5138522
     separated |   .9733665   .2900381     3.36   0.001     .4049023    1.541831
      divorced |   .2673798   .1585851     1.69   0.092    -.0434412    .5782009
       widowed |   1.378241   .2714682     5.08   0.000     .8461734    1.910309
         seced |  -.3752529    .096655    -3.88   0.000    -.5646931   -.1858126
        terted |  -.4058087   .1418846    -2.86   0.004    -.6838973     -.12772
        chldhm |   .0646216   .0830391     0.78   0.436    -.0981321    .2273752
       missinc |  -.5729247   .2264561    -2.53   0.011    -1.016771    -.129079
        medinc |  -.8394265   .2025874    -4.14   0.000    -1.236491   -.4423624
       highinc |  -1.333281   .2068405    -6.45   0.000    -1.738681   -.9278808
       ihealth |  -1.687627   .0660921   -25.53   0.000    -1.817165   -1.558089
      iuemp5yr |   .3617495   .0849113     4.26   0.000     .1953263    .5281726
      iuemp12m |   .4095986   .1104699     3.71   0.000     .1930816    .6261157
gender_index06 |  -5.556036    2.26839    -2.45   0.014      -10.002   -1.110074
         _cons |   16.54468   1.720549     9.62   0.000     13.17246    19.91689
--------------------------------------------------------------------------------

------------------------------------------------------------------------------
                             |               Robust           
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
cntry: Identity              |
                   sd(_cons) |   .4603558   .1069094      .2920231    .7257215
-----------------------------+------------------------------------------------
                sd(Residual) |   3.742975    .068211      3.611643    3.879082
------------------------------------------------------------------------------

Warning: Sampling weights were specified only at the first level in a
multilevel model. If these weights are indicative of overall and not
conditional inclusion probabilities, then results may be biased.

And here is the regression with design weights, population size weights,
and scaling applied:
. xtmixed dprt age age2 gender married separated divorced widowed seced
terted chldhm missinc medinc highinc ihealth iuemp5yr iuemp12m
gender_index06 [pw=dweight]  || cntry: , mle pweight(pweight) pwscale(size)

Obtaining starting values by EM: 

Performing gradient-based optimization: 

Iteration 0:   log pseudolikelihood = -36222.212  
Iteration 1:   log pseudolikelihood = -36219.813  
Iteration 2:   log pseudolikelihood = -36219.697  
Iteration 3:   log pseudolikelihood = -36219.693  
Iteration 4:   log pseudolikelihood = -36219.693  

Computing standard errors:

Mixed-effects regression                        Number of obs      =     10819
Group variable: cntry                           Number of groups   =        23

                                                Obs per group: min =       190
                                                               avg =     470.4
                                                               max =       879


                                                Wald chi2(17)      =  56602.23
Log pseudolikelihood = -36219.693               Prob > chi2        =    0.0000

                                   (Std. Err. adjusted for 23 clusters in cntry)
--------------------------------------------------------------------------------
               |               Robust
          dprt |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
           age |   .0625417   .0318887     1.96   0.050     .0000409    .1250425
          age2 |  -.0007252   .0003449    -2.10   0.035    -.0014012   -.0000493
        gender |  -.3903829   .0778959    -5.01   0.000    -.5430561   -.2377097
       married |  -.7816344   .0755469   -10.35   0.000    -.9297036   -.6335652
     separated |   1.393169   .3505622     3.97   0.000     .7060796    2.080258
      divorced |   .4096387   .1762837     2.32   0.020      .064129    .7551483
       widowed |    1.67463   .2974701     5.63   0.000     1.091599    2.257661
         seced |  -.4187948   .0915211    -4.58   0.000    -.5981729   -.2394167
        terted |  -.3965199   .1042436    -3.80   0.000    -.6008336   -.1922063
        chldhm |   .0937831   .1176234     0.80   0.425    -.1367546    .3243208
       missinc |  -.4813444   .3198694    -1.50   0.132    -1.108277    .1455882
        medinc |  -.8523854   .2845472    -3.00   0.003    -1.410088   -.2946831
       highinc |   -1.39177   .2832477    -4.91   0.000    -1.946925   -.8366147
       ihealth |  -1.807828    .086356   -20.93   0.000    -1.977083   -1.638573
      iuemp5yr |   .4177802   .0926448     4.51   0.000     .2361997    .5993607
      iuemp12m |   .5078261   .0996891     5.09   0.000     .3124391    .7032132
gender_index06 |  -1.416382   1.961742    -0.72   0.470    -5.261326    2.428561
         _cons |   13.66673   1.514102     9.03   0.000     10.69915    16.63432
--------------------------------------------------------------------------------

------------------------------------------------------------------------------
                             |               Robust           
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
cntry: Identity              |
                   sd(_cons) |   .3021236   .1031259      .1547527    .5898356
-----------------------------+------------------------------------------------
                sd(Residual) |   3.802415   .0817433      3.645529    3.966052
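
To see how the two specifications differ mechanically without the ESS files, here is a minimal sketch on simulated data. Everything in it is made up for illustration (the variables y, x, g, u, w1, w2 and the simulated weights); only the -xtmixed- syntax mirrors the two commands above. It shows how to put the two fits side by side, not which one is appropriate for the ESS.

```stata
* Sketch on simulated data; all variable names and values are hypothetical.
clear
set seed 12345
set obs 20                           // 20 level-2 groups ("countries")
generate g  = _n
generate u  = rnormal(0, 0.5)        // random intercept per group
generate w2 = 1 + 4*runiform()       // level-2 (group) weight
expand 100                           // 100 level-1 units per group
generate x  = rnormal()
generate w1 = 0.5 + 1.5*runiform()   // level-1 weight
generate y  = 1 + 0.5*x + u + rnormal(0, 2)

* (a) sampling weights at level 1 only
xtmixed y x [pw=w1] || g:, mle
estimates store lev1only

* (b) level-1 and level-2 weights; pwscale(size) rescales the level-1
*     weights so that they sum to the sample size of each group
xtmixed y x [pw=w1] || g:, mle pweight(w2) pwscale(size)
estimates store bothlev

* coefficients and standard errors of the two fits side by side
estimates table lev1only bothlev, b(%9.4f) se(%9.4f)
```

The last command lists the fixed-effects coefficients and the log standard deviations of the random effects for both fits in one table, which makes it easier to see which estimates move when the level-2 weights and the rescaling are added.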

Any suggestions?

Regards

Antonio

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Alfonso
Sánchez-Peñalver
Sent: Friday, January 31, 2014 6:24 PM
To: Stata List
Subject: Re: st: gllamm or xtmixed models?

Antonio,

before you do the fixed effects estimation you have to run -xtset- as

xtset country id

to follow the example you gave. Country is the grouping variable, and id is
the individual observation variable. You have changed the example in the
last email with respect to the previous ones. Going back to the previous
ones you can run the fixed effects estimation as

xtreg depression x1 x2 x3, fe

Now, in your last email you use -vce(cluster country)-. Using cluster robust
variance accounts for correlation of the errors within the clusters, in your
case the countries, but not across clusters. The question is: once you
have stripped the errors of the different intercepts by using fixed
effects, why would you expect the errors to be correlated within the countries?
Consider for example that there is an unobservable variable which measures
severity of winter. We expect that the more severe the winter is the more
cases of depression, so while controlling for everything else it would make
sense that the number of patients with depression in Sweden or Norway is
larger than the number of patients with depression in Spain or Italy. Since
we cannot control for the severity of the winter because we don't have a
measure for it, the errors would capture the effect of this variable on the
dependent variable, and thus you would expect the errors for Sweden to be
larger than the errors for Spain, which creates the correlation between the
errors in
Sweden, and the correlation between the errors in Spain, but not the
correlation between errors of Spain and Sweden. Now, when you use fixed
effects estimation you are in fact controlling for the average effect of all
unobserved characteristics of the countries, so you would be controlling for
the average effect of the severity of the winter (among other unobservables)
in the countries. Therefore, unless you think there is something else
causing the correlation between the errors within the different countries,
you don't need the -vce(cluster country)- option.
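
One way to check this is to fit the fixed effects model with and without the cluster-robust variance and compare the standard errors. A minimal sketch on simulated data follows. The variable winter, the single regressor x1, and all numbers are hypothetical, and -xtset- is called with only the panel variable, on the assumption that no time dimension is needed for a cross section.

```stata
* Sketch on simulated data; names and values are hypothetical.
clear
set seed 2014
set obs 25                           // 25 countries
generate country = _n
generate winter  = rnormal()         // unobserved country-level factor
expand 200                           // 200 individuals per country
generate x1 = rnormal() + 0.5*winter
generate depression = 2 + 0.8*x1 + 1.5*winter + rnormal()

xtset country                        // panel variable only

* fixed effects with conventional standard errors
xtreg depression x1, fe
scalar b_default  = _b[x1]
scalar se_default = _se[x1]

* fixed effects with standard errors clustered by country
xtreg depression x1, fe vce(cluster country)
scalar b_cluster  = _b[x1]
scalar se_cluster = _se[x1]

display "coef (default) = " b_default  "   se = " se_default
display "coef (cluster) = " b_cluster  "   se = " se_cluster
```

The coefficient on x1 is the same in both fits, because the variance option changes only the standard errors. If the two sets of standard errors are close, that is consistent with the country intercepts having absorbed the within-country correlation.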

Best,

Alfonso Sánchez-Peñalver, PhD

Visiting Assistant Professor
Suffolk University
Senior Instructor
UMass Boston



On Jan 31, 2014, at 4:26 AM, Antonio Rodriguez Andres
<[email protected]> wrote:

> Alfonso
> 
> Thank you for your answer. As far as I understand, because the observations 
> are clustered within countries, I have to account for this in my model and 
> use a two-level multilevel model. What I can try is a fixed effects model 
> with clustering at the country level
> 
> xtreg dv iv, fe vce(cluster country)
> 
> I should also use the xtset command but I do not have a real panel. 
> Usually we declare with xtset id year (both dimensions of the panel 
> data ) but here it is only a cross section
> 
> Can I type
> 
> xtset id  country  (1 level and second level)?
> 
> 
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Alfonso 
> Sánchez-Peñalver
> Sent: Thursday, January 30, 2014 10:31 PM
> To: Stata List
> Subject: Re: st: gllamm or xtmixed models?
> 
> Hi again Antonio,
> 
> I haven't used -gllamm- (SSC) but my understanding is that you will 
> also be able to estimate the random effects with it. The fixed effects 
> can be estimated in two different ways:
> 
> 1. Pooled OLS (-regress-) with a dummy variable for each country and 
> no constant (-nocons- option)
> 2. -xtreg- with fe option
> 
> For the second option you will have to first use -xtset- to identify 
> which is the level 2 (cluster) variable (country) and the level 1 
> variable (the individuals).
> 
> As for random slopes, consider the random effects model. The random 
> effects model assumes that the intercept is a random variable across 
> countries. What if the intercept is not the only thing that varies 
> across countries? What if the effect (slope) of a certain variable 
> (age let's say) also varies across countries? You can include that 
> variable in the random part of the command to let the slope be a 
> random variable as well. So for example, going back to your syntax, 
> assume that you believe the coefficient on x2 to be random as well, you
> can type:
> 
> xtmixed depression x1 x3 || country: x2
> 
> Best,
> 
> Alfonso Sánchez-Peñalver, PhD
> 
> Visiting Assistant Professor
> Suffolk University
> Senior Instructor
> UMass Boston
> 
> 
> 
> On Jan 30, 2014, at 3:09 PM, Antonio Rodriguez Andres 
> <[email protected]> wrote:
> 
>> Alfonso
>> 
>> Thank you for your answer. In this way, can I estimate the fixed 
>> effects for each country? What do they mean by random slopes for all
>> data? Can this be done using the xtmixed or gllamm command?
>> 
>> 
>> 
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Alfonso 
>> Sánchez-Peñalver
>> Sent: Thursday, January 30, 2014 9:58 PM
>> To: Stata List
>> Subject: Re: st: gllamm or xtmixed models?
>> 
>> Hola Antonio,
>> 
>> I believe the correct syntax for the random effects model estimated 
>> via maximum likelihood would be
>> 
>> xtmixed depression x1 x2 x3 || country:, mle
>> 
>> Alfonso Sánchez-Peñalver, PhD
>> 
>> Visiting Assistant Professor
>> Suffolk University
>> Senior Instructor
>> UMass Boston
>> 
>> 
>> 
>> On Jan 30, 2014, at 2:52 PM, Antonio Rodriguez Andres 
>> <[email protected]> wrote:
>> 
>>> Dear stata users
>>> 
>>> I want to estimate multilevel models as I have observations for 
>>> individuals across countries.  My dependent variable is a measure of 
>>> mental health ranging from 0 to 24. I want to use hierarchical 
>>> linear models with fixed effects and random effects for countries. 
>>> The correct syntax is:
>>> 
>>> xtmixed depression   x1 x2 x3   || i(country)
>>> 
>>> Any clue
>>> 
>>> Regards
>>> 
>>> Antonio
>>> 
>>> 
>>> 
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/