Statalist The Stata Listserver



RE: st: GEE and weighting


From   "Gwyneth Vance" <[email protected]>
To   <[email protected]>
Subject   RE: st: GEE and weighting
Date   Thu, 5 Apr 2007 13:28:00 -0400

Stas, 

Thank you for your help!  I think the -gllamm- approach may not be the
best option for this data, as the health facilities were not randomly
selected for participation.

The -svyset- approach may work, but I have one concern.  We know the
selection probabilities at post-test, but not at pre-test.  This is
because at pre-test providers were not trained yet, and they were not
selected for training until after the pre-test data was collected.  So,
basically, we don't know at pre-test who was going to be trained and who
was not going to be trained.  

It may be safe to assume that this lack of knowledge prevented the
pre-test sample from over-representing providers who would later be
trained.  The post-test sample is different: there, data collectors
purposely over-sampled trained providers in an effort to capture enough
cases for meaningful comparisons between trained and untrained
providers.

I do want to complete a pre- to post-test analysis.  Do you see any
problems with applying different sampling weights to the pre-test and
post-test data using the -svyset- approach?

gv



-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Stas
Kolenikov
Sent: Thursday, April 05, 2007 10:43 AM
To: [email protected]
Subject: Re: st: GEE and weighting

I guess if you are able to track the selection probabilities for
individuals really well, you could use

svyset health_center [pw=weight]
svy: logit whatever

Or, if you have selection probabilities for both health centers and
individuals within those, you could use -gllamm- for two-level modeling
with weights for both levels.

-xtgee- would give you some gain in precision over -logit- if the
correlation structure is specified correctly, but other than that,
-logit- should do fine, too. The -svyset- here is effectively analogous
to specifying -, cluster(cpihfano)- with the (quasi-)likelihood
commands, which is another way to correct for correlation/clustering in
your data.
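
For concreteness, the two routes described above can be sketched as
follows.  This is only a sketch in Stata 9 syntax, assuming the variable
names from the original question (cpi41 as the outcome, cpi2 and ce39 as
covariates, cpihfano as the health-center identifier, and weight as the
individual sampling weight); it has not been run against the actual
data.  Both fits should give the same point estimates, although the
reported test statistics and confidence intervals may differ slightly
because -svy- uses design-based degrees of freedom.

```stata
* Route 1: declare the design once, then use the svy: prefix.
* cpihfano (health center) is the PSU; weight may vary within a center.
svyset cpihfano [pweight = weight]
svy: logit cpi41 cpi2 ce39

* Route 2: the same weighted pseudo-likelihood fit, with
* cluster-robust standard errors requested directly on -logit-.
logit cpi41 cpi2 ce39 [pweight = weight], cluster(cpihfano)
```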

On 4/4/07, Gwyneth Vance <[email protected]> wrote:
> What does one do with binary, clustered data that must be weighted?
>
> I am working on a project using Stata 9.  The goal is to develop
> models of various binary outcome measures pertaining to improved
> counseling by health providers.  I am, however, running into several
> challenges.  The first is that the sample taken was a cluster sample.
> Individuals were interviewed at various health centers, so the health
> center was the primary sampling unit: health centers were selected for
> participation, and then the individual study participants.  Originally,
> I thought that I could use regression with GEE to account for the
> clustering in the data; however, I discovered a second problem that
> may limit my ability to do so.  Within each cluster, the individuals
> sampled were not sampled in equal proportion on an important variable,
> which was provider training.
> In other words, clients who received counseling from trained providers
> were over-represented in the sample.
>
> I thought the solution would be to apply a sampling weight (a pweight
> in Stata), but -xtgee- does not allow the pweight to vary by
> unit within a panel.  That is, the individuals within a cluster are
> not allowed to have their own weight; only the panel or cluster may be
> weighted.  Below are the commands I keyed in, and the error message
> that I received.
>
> Command:
> iis cpihfano
> xtgee cpi41 cpi2 ce39 [pweight = weight], family(binomial 1) link(logit) corr(exchangeable)
>
> Error Message:
> weight must be constant within cpihfano
> r(199);
>
> I have done a bit of research on the topic, but am getting nowhere
> other than to discover that this problem may be, as yet, unsolved
> (please refer to this link for further explanation:
> http://www.stata.com/support/faqs/stat/xtweight.html ).
>
> So, what can one do with binary, clustered data that should be
> weighted?  Does anyone know if progress has been made on this front?
> What solutions have others devised in similar situations?  I can
> provide more detail if necessary.
>
> Gwyneth
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


--
Stas Kolenikov
http://stas.kolenikov.name


