Statalist



RE: st: how can I test my NBREG model?


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: how can I test my NBREG model?
Date   Wed, 18 Jun 2008 17:28:19 +0100

Thanks to Steven for the plug, but his recommendation of -cp- may be
problematic. My own -cp- was long since broken by Stata's own use of
-cp- as a synonym for -copy-. There is a -cpr- somewhere that gets round
that, but for the purposes here it may be better to look at my
-allpossible- or -selectvars-. Use -findit- for locations. 

Steven Samuels

Cynthia:

I advise you to look at examples of negative binomial regression in a  
good text. But, to briefly answer your questions:


1. Negative binomial regression -nbreg- and its extensions -zinb- (zero-inflated) and -ztnb- (positive counts only) fit a model to the log of the mean, not to the mean itself. So the signs and relative magnitudes of coefficients should be comparable. Wherever they differ, I would believe the count data model. Unlike multiple regression, -nbreg- accommodates differences in the potential size of observations through an "exposure" or "offset" variable; a more populous census tract would have more physicians than a smaller tract, for example, so one would standardize for population size.


2. In the learning sample, you can use all of Stata's facilities for choosing a subset of "best" models. Compare the fits of ordinary -nbreg- to -zinb-. Check goodness of fit by using -linktest- (especially useful with continuous variables) and by comparing observed to predicted counts. Use robust standard errors for inference. Choose transformations of continuous predictors with -fracpoly- or -mfp-. Select from "all possible combinations" of sets of predictors with a command like Nick Cox's -cp- (available from SSC). Compare alternative models on the BIC criterion with -estat ic-.
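
A rough, non-authoritative sketch of that sequence (y, x1, x2, x3 are placeholders; adapt to your own variables):

nbreg y x1 x2 x3                    // ordinary negative binomial
estat ic                            // AIC and BIC
linktest                            // rough check of the functional form
zinb y x1 x2 x3, inflate(x1 x2)     // zero-inflated alternative
estat ic                            // compare information criteria across models
nbreg y x1 x2 x3, vce(robust)       // refit with robust standard errors for inference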

3. To apply your best models to the validation sample, predict for observations not used to create the estimates. Here's an example:

sysuse auto, clear
reg mpg weight if foreign   // fit on the foreign cars only
predict yhat if !foreign    // predicts for the other observations

Search also for "esample" to see another way of getting out-of-sample  
predictions.
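
One such approach, sketched here under the assumption that you have a 0/1 indicator (call it learning) marking the 60% subsample, relies on -e(sample)-:

nbreg y x1 x2 if learning           // fit on the learning subsample only
predict yhat if !e(sample), n       // predicted counts for the remaining (validation) observations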

4. Compare observed counts for your validation sample to those predicted by the learning-sample model. As a measure of "closeness" you might use a chi-square statistic divided by sample size. A rank correlation could also work, but others may suggest better approaches.
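
For illustration only, with the same made-up names as above, a chi-square-type distance and a rank correlation in the validation sample could be computed like this:

gen double chi2part = (y - yhat)^2 / yhat if !learning
summarize chi2part
display "chi-square / n = " r(sum)/r(N)
spearman y yhat if !learning        // rank correlation of observed and predicted counts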

You don't say much about your data (whether they were weighted, clustered, or in panel form), so I haven't covered all bases. Still, I hope this gives you a start.


-Steve

On Jun 17, 2008, at 2:20 PM, Cynthia Lokker wrote:

> Hi,
> I have a set of data with my dependent variable being a count and with 19
> independent variables. I originally performed a multiple regression on a 60%
> subset (n=757) and validated the model on the remaining 40% (n=504).
> It has since been brought to my attention to use a negative binomial
> regression since this fits my data better. I would now like to repeat the
> analysis and compare the general findings of the nbreg with the former
> multiple regression (magnitude of coefficients etc.).
> I have the following questions:
> 1. Is it feasible to compare (generally) the 2 types of analysis?
> 2. Can I validate my nbreg model in the same way as I did with the multiple
> regression?
> 3. What Stata commands would I need to use to do #2?
>

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


