Re: st: svylogitgof after logistic using subpop option
From: [email protected] (Jeff Pitblado, StataCorp LP)
To: [email protected]
Subject: Re: st: svylogitgof after logistic using subpop option
Date: Tue, 08 Mar 2011 18:27:47 -0600
Maria E. Montez Rath <[email protected]> is trying to perform a
goodness-of-fit test for -svy: logistic- with a subpopulation:
> I just found out that the -estat- Stata manual had been updated and
> now includes the goodness of fit test for binary data. I believe that
> -estat gof- is reporting the F-adjusted mean residual test according
> to Archer and Lemeshow (2006).
>
> Reference
> Archer, K. J., and S. Lemeshow. 2006. Goodness-of-fit test for a
> logistic regression model fitted using survey sample data. Stata
> Journal 6: 97--105.
-estat gof- after -svy: logistic- does in fact use the method referenced
above.
> But I still have a problem. I have 10 years of data and so I created a
> smaller dataset that includes my subpopulation augmented by one record
> for each PSU dropped when selecting the subpopulation. In theory this
> should work because the problem with selecting the subpopulation
> directly and doing a conditional analysis is that there is no way for
> the program to know how many PSUs were sampled. By augmenting my
> dataset with the PSUs dropped Stata can still compute n (total number
> of PSUs sampled). I tested that this would work by comparing the
> results from -svy: logistic- with -subpop()- option using 1) the
> complete one year of data and 2) my augmented data for that same year.
>
> The results from -svy: logistic- are identical using both methods
> (point estimates and SEs are equal), but the results from -estat gof-
> are very different: using the entire data the test indicates a lack
> of fit, while using my augmented data the test indicates good fit.
>
> So, I'm still wondering how -estat gof- uses the results from
> -svy: logistic- with the subpopulation option.
At present, neither -svylogitgof- nor -estat gof- does anything to account for
subpopulation estimation.
Since the original article does not specifically address subpopulation
estimation, it is not immediately clear how -estat gof- can be changed to
handle subpopulation estimation results. We will add this to our research and
development list.
In the short term, we will change -estat gof- to report a warning when it is
used with subpopulation estimation results.
--Jeff
[email protected]
> Using ALL data:
>
> . use pah08
> . svy, subpop(pah): logistic dead i.aki2 i.diabetes i.mec_vent i.fem
> . estat gof if newpah==1
>
> Logistic model for dead, goodness-of-fit test
>
> F(9,961) = 3126.59
> Prob > F = 0.0000
>
> Using AUGMENTED data:
>
> . use pahsubpop08, clear
> . svy, subpop(pah): logistic died i.aki2 i.diabetes i.mec_vent i.fem
> . estat gof
>
> Logistic model for died, goodness-of-fit test
>
> F(9,961) = 0.66
> Prob > F = 0.7500
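
For readers who want to reproduce the augmented dataset described in the quoted message, the construction might look roughly like the sketch below. It is only an illustration under assumptions: the design-variable names stratum, psu, and pwt are hypothetical (the original post does not show the -svyset- command), and it is not necessarily how the poster built pahsubpop08.

```stata
* Sketch only: stratum, psu, and pwt are hypothetical design-variable names.
use pah08, clear

* Flag PSUs that contain at least one subpopulation member.
bysort stratum psu: egen byte anypah = max(pah == 1)

* Keep every subpopulation record, plus one placeholder record
* from each PSU that has no subpopulation member.
bysort stratum psu: gen byte keepobs = (pah == 1) | (anypah == 0 & _n == 1)
keep if keepobs
drop anypah keepobs

* Declare the design; the placeholders preserve the count of sampled PSUs.
svyset psu [pweight = pwt], strata(stratum)
svy, subpop(pah): logistic dead i.aki2 i.diabetes i.mec_vent i.fem
```

Because the placeholder records fall outside the subpopulation, they should affect only the PSU count used for variance estimation. That is consistent with the identical point estimates and SEs reported above. As noted in the reply, -estat gof- does not account for subpopulation estimation, so its output after either approach should not be relied on.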