Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: svylogitgof: changes dramatically across models using the same pooled sample


From   Steven Samuels <sjsamuels@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: svylogitgof: changes dramatically across models using the same pooled sample
Date   Tue, 12 Apr 2011 20:53:28 -0400

Eileen, I assume that you're on Stata 11, since you don't say otherwise.  -svylogitgof- doesn't work with subpop() options and has other problems in Stata 11.

See: http://www.stata.com/statalist/archive/2011-03/msg00442.html

Steve
sjsamuels@gmail.com


On Apr 12, 2011, at 1:11 PM, Eileen Diaz McConnell wrote:

Hi Statalist Users:

Hoping that you can offer some advice about the following issue.

I am doing a logistic regression using the svy: logistic command in Stata/SE 11. I am running the identical model several times, with the only change being the reference group.

Here is an example of the same model with the only change being the reference group. This first model leaves out race_whnb:

svy, subpop(finalp7):logistic hindpvt3 race_bknb ltcitizen ltauimm ltunauthmm x1 x2 x3 
svylogitgof

This second model leaves out ltunauthmm:

svy, subpop(finalp7):logistic hindpvt3 race_bknb race_whnb ltcitizen ltauimm x1 x2 x3
svylogitgof

As expected, the odds ratios and standard errors change across the two models for the contrast variables listed (race_bknb, etc.) but are the same for all the remaining independent variables (x1 x2 x3) in both models.

What seems strange to me, however, is that the F-adjusted test statistic (svylogitgof) is radically different for these two models.

With the first model:

F  adjusted test statistic= 1.0887
		  Pvalue=.386323

With the second model:

F  adjusted test statistic= 17.4193
		  Pvalue=4.629e-13


As I understand the interpretation of the F test statistic, these results suggest that the data are a good fit for the first model and not a good fit for the second model.

However, I'm concerned that the F statistic would change so dramatically when it's using the same pooled sample and simply changing the reference group.
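
The concern can be made precise with a short derivation. This is a sketch under one assumption that is not stated outright above: that race_whnb, race_bknb, ltcitizen, ltauimm and ltunauthmm are indicators for five mutually exclusive, exhaustive groups. The symbols (D, r, s, alpha, beta, gamma) are introduced here for illustration only.

```latex
% Model A omits group r (here race_whnb); Model B omits group s (here ltunauthmm).
% D_{ig} = 1 if observation i belongs to group g, and x_i holds x1, x2, x3.
\begin{align*}
\text{Model A:}\quad \operatorname{logit} p_i &= \alpha + \sum_{g \neq r} \beta_g D_{ig} + x_i^{\top}\gamma \\
\text{Model B:}\quad \operatorname{logit} p_i &= \alpha' + \sum_{g \neq s} \beta'_g D_{ig} + x_i^{\top}\gamma'
\end{align*}
% Setting
\begin{align*}
\alpha' &= \alpha + \beta_s, &
\beta'_r &= -\beta_s, &
\beta'_g &= \beta_g - \beta_s \quad (g \neq r, s), &
\gamma' &= \gamma
\end{align*}
% gives the same linear predictor for every observation:
\begin{align*}
i \in r:&\quad \alpha' + \beta'_r = \alpha \\
i \in s:&\quad \alpha' = \alpha + \beta_s \\
i \in g:&\quad \alpha' + \beta'_g = \alpha + \beta_g
\end{align*}
```

Under that assumption the two models are reparameterizations of each other, so their fitted probabilities are identical, and a goodness-of-fit statistic computed only from the outcomes, the weights and the fitted probabilities should not depend on which group is omitted.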

I wonder if this is somehow due to the sample size and different distributions of these groups on the dependent variable (hindpvt3).

The sample size is fairly small: the pooled sample is 1361, with n=350 for race_whnb and n=247 for ltunauthmm.
Descriptives of hindpvt3 for the 2 groups are: 33/350 in race_whnb have a value of 1 on hindpvt3; 117/247 in ltunauthmm have a value of 1 on hindpvt3.

I also just noticed that the number of missing values generated differs between the two models, but I am not sure whether that is normal, either.

For the first model:  
svylogitgof
(57 missing values generated)

And for the second model:
svylogitgof
(118 missing values generated).

Given this information, do the F statistic and its interpretation sound valid in this case? Or could the wide swing in the F statistic be due to very small sample sizes or something about the missing values generated?
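
One way to investigate is to compare the fitted probabilities from the two models directly, since they should agree if the models differ only in the reference group. The following is a minimal sketch, not a tested recipe: it reuses the two model statements from above, and the variable names p1, p2 and pdiff are hypothetical.

```stata
* Sketch only: p1, p2 and pdiff are hypothetical variable names.
* Fit the model that omits race_whnb and save its fitted probabilities.
svy, subpop(finalp7): logistic hindpvt3 race_bknb ltcitizen ltauimm ltunauthmm x1 x2 x3
predict double p1 if e(sample), pr

* Fit the model that omits ltunauthmm and save its fitted probabilities.
svy, subpop(finalp7): logistic hindpvt3 race_bknb race_whnb ltcitizen ltauimm x1 x2 x3
predict double p2 if e(sample), pr

* If only the reference group differs, pdiff should be zero up to rounding.
generate double pdiff = abs(p1 - p2)
summarize pdiff

* Count observations that have a prediction from one model but not the other.
count if missing(p1) != missing(p2)
```

If pdiff is essentially zero everywhere while svylogitgof still reports different statistics, the difference would point to how the test is being computed rather than to the fit of the models. A nonzero count on the last line would suggest the two estimation samples are not the same.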

Any suggestions about what to investigate further would be very much appreciated. Thanks for your consideration.

Eileen Diaz McConnell, Ph.D.
School of Transborder Studies
Arizona State University
Tempe, AZ 85287-3502


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

