Re: st: GEE and weighting
Oh my God. That is a mess.
If selection of the health centers was not random, you might want to
treat them as strata rather than as clusters -- but then some of the
strata (the centers that were not selected into the survey) would have
zero weights, and you cannot generalize your results to the whole
population (sorry). You can also consider your weights to be
post-stratification weights -- you can post-stratify in Stata, as
well. I am pretty sure your weights would have to be the same for all
analyses you would be undertaking.
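
A minimal sketch of the two options mentioned above, assuming the variable names from the original post (cpihfano for the health center, weight for the sampling weight, cpi41 cpi2 ce39 for the model). The variables trained and poptotal are hypothetical and would have to be built from known population figures; this is an illustration, not a tested recipe for this data set.

```stata
* Sketch only: treat health centers as strata instead of clusters.
* Using the individual (_n) as the sampling unit is an assumption.
svyset _n [pweight = weight], strata(cpihfano)
svy: logit cpi41 cpi2 ce39

* Alternative sketch: post-stratify on training status.
* trained (post-stratum identifier) and poptotal (known population
* total for each post-stratum) are hypothetical variables.
svyset cpihfano [pweight = weight], poststrata(trained) postweight(poptotal)
svy: logit cpi41 cpi2 ce39
```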
If you have pre- and post-training results, you would want to have
something like -treatreg- where you would also need to model the
probability of getting trained, as a function of the center
characteristics. I don't think that -treatreg- allows for complex
surveys, but I guess you could hack it by adding -svy- to the options
of -program-.
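
For reference, the basic (non-survey) form of the command could look like the sketch below. All variable names are hypothetical: y is the outcome, x1 an outcome covariate, trained the training indicator, and z1 z2 the center characteristics that predict training. Note that -treatreg- assumes a continuous outcome, so it may not suit binary measures directly.

```stata
* Sketch only; y, x1, trained, z1, z2 are hypothetical names.
* The treat() option specifies the model for the probability
* of being trained.
treatreg y x1, treat(trained = z1 z2)
```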
On 4/5/07, Gwyneth Vance <[email protected]> wrote:
Stas,
Thank you for your help! I think the -gllamm- approach may not be the
best option for this data, as the health facilities were not randomly
selected for participation.
The -svyset- approach may work, but I have one concern. We know the
selection probabilities at post-test, but not at pre-test. This is
because at pre-test providers were not trained yet, and they were not
selected for training until after the pre-test data was collected. So,
basically, we don't know at pre-test who was going to be trained and who
was not going to be trained.
It may be safe to assume that this lack of knowledge prevented the
pre-test sample from over-representing providers who would be trained
(as was the case with the post-test sample where data collectors
purposely over-sampled trained providers in an effort to capture enough
cases for meaningful comparisons between trained and untrained
providers).
I do want to complete a pre to post-test analysis. Do you see any
problems with applying different sampling weights to pre and post data
using the svyset approach?
gv
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Stas Kolenikov
Sent: Thursday, April 05, 2007 10:43 AM
To: [email protected]
Subject: Re: st: GEE and weighting
I guess if you are able to track the selection probabilities for
individuals really well, you could use
svyset health_center [pw=weight]
svy: logit whatever
Or, if you have selection probabilities for both health centers and
individuals within those, you could use -gllamm- for two-level modeling
with weights for both levels.
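
A rough sketch of the two-level call, assuming the variable names from the original post. The weight variables wt1 and wt2 are hypothetical: -gllamm- takes a stub in pweight() and looks for the stub followed by the level number, so wt1 would hold the individual-within-center weights and wt2 the center weights.

```stata
* Sketch only. gllamm is user-written (ssc install gllamm).
* wt1 = inverse conditional selection probability of the individual
*       within the center (hypothetical variable)
* wt2 = inverse selection probability of the center (hypothetical)
gllamm cpi41 cpi2 ce39, i(cpihfano) family(binomial) link(logit) pweight(wt) adapt
```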
-xtgee- would give you some gain in precision over -logit- if the
correlation structure is specified correctly, but other than that,
-logit- should do fine, too. The -svyset- here is effectively analogous
to specifying -, cluster(cpihfano)- with the (quasi-)likelihood
commands, which is another way to correct for correlation/clustering in
your data.
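
As a sketch of that equivalence, using the variable names from the original post, the two specifications below should give the same point estimates and very similar standard errors:

```stata
* Design-based fit: centers as PSUs, individual-level pweights.
svyset cpihfano [pweight = weight]
svy: logit cpi41 cpi2 ce39

* Pseudo-likelihood fit with cluster-robust standard errors.
logit cpi41 cpi2 ce39 [pweight = weight], cluster(cpihfano)
```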
On 4/4/07, Gwyneth Vance <[email protected]> wrote:
> What does one do with binary, clustered data that must be weighted?
>
> I am working on a project using Stata 9. The goal is to develop
> models of various binary outcome measures pertaining to improved
> counseling by health providers. I am, however, running into several
> challenges. The first is that the sample taken was a cluster sample.
> Individuals were interviewed at various health centers, so the health
> center was the primary sampling unit: health centers were selected for
> participation and then the individual study participants. Originally,
> I thought that I could use regression with GEE to account for the
> clustering in the data; however, I discovered a second problem that
> may limit my ability to do so. Within each cluster, the individuals
> sampled were not sampled in equal proportion on an important variable,
> which was provider training.
> In other words, clients who received counseling from trained providers
> were over-represented in the sample.
>
> I thought the solution would be to apply a sample weight (pweight
> command in Stata), but Stata does not allow the pweight to vary by
> unit within a panel. That is, the individuals within a cluster are
> not allowed to have their own weight, only the panel or cluster may be
> weighted. Below are the commands I keyed in, and the error message
> that I received.
>
> Command:
> iis cpihfano
> xtgee cpi41 cpi2 ce39 [pweight = weight], family(binomial 1) link(logit) corr(exchangeable)
>
> Error Message:
> weight must be constant within cpihfano r(199);
>
> I have done a bit of research on the topic, but am getting nowhere
> other than to discover that this problem may be, as yet, unsolved (please
> refer to this link for further explanation
> http://www.stata.com/support/faqs/stat/xtweight.html ).
>
> So, what can one do with binary, clustered data that should be
> weighted?
> Does anyone know if progress has been made on this front? What
> solutions have others devised in similar situations? I can provide
> more detail if necessary.
>
> Gwyneth
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
--
Stas Kolenikov
http://stas.kolenikov.name
--
Stas Kolenikov
http://stas.kolenikov.name