Ryan Wells wrote:
> I am running a logistic regression in Stata, but the dataset requires me
> to use weights and to account for a complex survey design (which I do with
> the "svy" command and its associated arguments...)
>
> I normally get the R2, -2LL, etc. with the "fitstat" command, but when
> weights and/or the "svy" command are used, fitstat does not report them.
> Is there another command that could be used, a workaround, or some other
> way to both use weights and still get goodness-of-fit statistics easily?
Yes, there is, but you won't necessarily get all the GOF statistics you're
seeking. Try Jeroen Weesie's -multgof- package, which appeared as -sg68-
in STB 36, and was reprinted in STB Reprints Vol 6, pp 183-186.
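If you don't already have -multgof- installed, -findit multgof- should turn
it up. If memory serves, the STB 36 materials can also still be net-installed
directly; treat the URL below as my best recollection rather than gospel:

. findit multgof
. net from http://www.stata.com/stb/stb36
. net install sg68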
An example of how to use -multgof- properly is displayed below:
. webuse union, clear
. g weight=invnorm(uniform())
. logit union year age grade south black [pw=weight], or
(sum of wgt is 1.0408e+04)
Iteration 0: log pseudolikelihood = -6861.8457
Iteration 1: log pseudolikelihood = -6575.2068
Iteration 2: log pseudolikelihood = -6568.7874
Iteration 3: log pseudolikelihood = -6568.7803
Logistic regression                               Number of obs   =     13075
                                                  Wald chi2(5)    =    349.58
                                                  Prob > chi2     =    0.0000
Log pseudolikelihood = -6568.7803                 Pseudo R2       =    0.0427

------------------------------------------------------------------------------
             |               Robust
       union | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        year |    .990755   .0095862    -0.96   0.337     .9721434    1.009723
         age |   1.020531   .0092317     2.25   0.025     1.002597    1.038786
       grade |   1.079468   .0134591     6.13   0.000     1.053408    1.106172
       south |   .4170912   .0254967   -14.30   0.000     .3699961    .4701808
       black |   2.477861   .1538027    14.62   0.000     2.194028    2.798412
------------------------------------------------------------------------------
. predict obs, pr
. g exp=.5
. multgof obs exp if e(sample), df(5) n(13075)
observed do not sum to 1
expected do not sum to 1
Cressie-Read multinomial goodness-of-fit (5 df; 13075 cells)
cells(exp<1) = 0.00% cells(exp<5) = 100.00%
cells(obs<1) = 53.97% cells(obs<5) = 100.00%
Lambda(-2.00) Neyman's X2 = 65.533 p = 0.0000
Lambda(-1.00) Kullbach's KL = 66.239 p = 0.0000
Lambda(-0.50) Freeman-Tukey = 66.662 p = 0.0000
Lambda( 0.00) LR = 67.132 p = 0.0000
Lambda( 0.67) Cressie-Read = 67.836 p = 0.0000
Lambda( 1.00) Pearson's X2 = 68.222 p = 0.0000
With -multgof-, you must first generate observed (here, -obs-) and expected
(here, -exp-) probabilities for each observation before running it. The
-n(#)- option is required when the inputs are proportions; you're not
restricted to using the sample N, but it makes sense to do so in the example
above.
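Incidentally, since you asked about -svy- specifically: the same recipe
ought to carry over to -svy- estimation, although I haven't tried it myself,
and the -svy: logit- prefix below assumes Stata 9 or later (under Stata 8
you would reach for -svylogit- instead). Something like this, after dropping
the earlier -obs- and -exp-:

. svyset [pweight=weight]
. svy: logit union year age grade south black
. predict obs, pr
. g exp=.5
. multgof obs exp if e(sample), df(5) n(13075)

The -predict- and -multgof- steps are unchanged; only the estimation command
differs. Adjust -n()- to whatever e(N) is after the -svy- run.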
I hope all this helps.
CLIVE NICHOLAS |t: 0(044)7903 397793
Politics |e: [email protected]
Newcastle University |http://www.ncl.ac.uk/geps
Wherever you go and whatever you do, just remember this. No matter how
many like you, admire you, love you or adore you, the number of people
turning up to your funeral will be largely determined by local weather
conditions.