Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: svy + aweights
From: Stas Kolenikov <[email protected]>
To: [email protected]
Subject: Re: st: svy + aweights
Date: Thu, 10 Nov 2011 16:59:22 -0500
The -cluster()- variance estimators, by their nature, account for
any correlation pattern that might be observed within a
PSU. This is a non-parametric estimator, whereas you are probably thinking
along the lines of something like GEE.
Suppose you have a model
y = grand mean + m + u + e,
where m is the cluster (PSU) effect, u is the individual (subject)
effect, and e is the observation-level measurement error, with n
subjects per cluster and k observations per subject. Assume that u and e
are homoskedastic. If you have k=1 observation per subject, then you
cannot distinguish u and e, and have essentially one error term. The
within-cluster covariance matrix is then (Var[m]+Var[u]+Var[e]) times
an exchangeable correlation structure with
corr = Var[m]/(Var[m]+Var[u]+Var[e]). If you have
multiple observations, k>1, per subject, your covariance matrix is
J(kn,kn,Var[m]) + I(n) # J(k,k,Var[u]) + I(nk)*Var[e], which is a more
complicated pattern. In GEE, you have to put these structures into the
objective function as working correlation structures to get your
estimates. With -cluster()-, you don't have to, but you should expect
your estimates to be less efficient than they would be if the
above model were true and you ran a (feasible) GLS estimation. As
long as the number of clusters -> infinity, you can build a consistent
estimator of (within-cluster) Var[y], which will be accounted for in
-svy- commands.
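To make the two covariance patterns above concrete, here is a minimal sketch in plain Python (not Stata). The helper names and the variance values 1, 2 and 3 are made up purely for illustration. It assembles the covariance matrix of the observations within one PSU, with rows ordered subject by subject, following J(kn,kn,Var[m]) + I(n) # J(k,k,Var[u]) + I(nk)*Var[e].

```python
# Illustrative sketch: the variance components used below are assumed numbers.

def const(r, c, v):
    """r x c matrix filled with v, analogous to J(r, c, v)."""
    return [[v] * c for _ in range(r)]

def identity(n):
    """n x n identity matrix, analogous to I(n)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def kron(a, b):
    """Kronecker product of two list-of-lists matrices (the # operator)."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def scale(m, s):
    """Multiply every element of m by the scalar s."""
    return [[s * x for x in row] for row in m]

def add(*ms):
    """Elementwise sum of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*ms)]

def cluster_cov(n, k, var_m, var_u, var_e):
    """Covariance of the n*k observations in one PSU under y = mean + m + u + e."""
    return add(const(n * k, n * k, var_m),             # shared PSU effect
               kron(identity(n), const(k, k, var_u)),  # shared subject effect
               scale(identity(n * k), var_e))          # observation-level error

# k = 2 observations on each of n = 2 subjects:
# diagonal 6, same subject 3, different subjects in the same PSU 1.
V = cluster_cov(2, 2, 1.0, 2.0, 3.0)

# k = 1: the pattern collapses to exchangeable, with corr = 1/6 here.
V1 = cluster_cov(3, 1, 1.0, 2.0, 3.0)
rho = V1[0][1] / V1[0][0]
```

With k>1 there are two distinct off-diagonal values, which is the "more complicated pattern" a GEE working correlation would have to encode, and which -cluster()- handles without being told about it.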
Hope this helps.
On Thu, Nov 10, 2011 at 4:48 PM, Jeph Herrin <[email protected]> wrote:
> I'm not sure I get this. How can correlations at one level be "engulfed"
> by correlations at another? The PSUs account for subject level correlation,
> but for each subject I have multiple observations.
>
> Moreover, if I -reshape-, does it still make sense to -svyset psu-? I
> thought not.
>
> On 11/10/2011 4:40 PM, Stas Kolenikov wrote:
>>
>> On Thu, Nov 10, 2011 at 4:01 PM, JH<[email protected]> wrote:
>>>
>>> But doesn't your suggestion ignore the correlation of observations within
>>> subjects?
>>
>> No. Unless your current -svyset-ting is -svyset _n-... and frankly I
>> don't know how that would behave with -reshape-. If you have PSUs in
>> your -svyset- (and NHANES does have them), then the correlations of
>> observations within the subjects will be engulfed by the correlations
>> of observations within the PSUs that -svy:- controls for.
>>
>>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.