This raises several issues, most of them, in my view,
dealt with well by Maarten Buis. However, a GLM with a log
link should not be thought of as taking logs twice.
I think of it as doing calculations in terms of the
log of the mean response but reporting in terms of the response.
That is why "link" is not just an idiosyncratic
name for "transform": it is a different idea.
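
A small numerical sketch may make the distinction concrete. It is
written in Python with made-up numbers; the data and variable names
are illustrative assumptions, not anything from the question. It
covers only the intercept-only case, where a log-link fit reproduces
the sample mean, and compares that with averaging logged responses.

```python
import math
from statistics import mean

# Made-up positive responses, purely for illustration.
y = [1.0, 2.0, 4.0, 8.0, 16.0]

# Transform: take logs of each response first, then average.
mean_of_logs = mean(math.log(v) for v in y)

# Log link (intercept-only case): average the responses, then take the log.
log_of_mean = math.log(mean(y))

# Back on the response scale the two summaries are different quantities.
geometric_mean = math.exp(mean_of_logs)   # what averaging logs leads to
arithmetic_mean = math.exp(log_of_mean)   # what the log link leads to

print("mean of logs:", mean_of_logs, "-> back-transformed:", geometric_mean)
print("log of mean: ", log_of_mean, "-> back-transformed:", arithmetic_mean)
```

The link acts on the mean, not on each observation, so the results
come back on the scale of the response itself.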
Also, with a log link it is not strictly necessary for every
response to be positive: zeros are allowed so long as the mean is
positive. But that doesn't mean that -glm, log- is necessarily
a good solution to zero-inflation. It might be sensible
if your reported zeros were really all small positive quantities
too small to measure.
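
As a rough illustration of why zeros are not fatal under a log link
but are under a log transform, here is a short Python sketch. The
counts are invented for the example and the variable names are
hypothetical.

```python
import math
from statistics import mean

# Made-up counts that include zeros, purely for illustration.
counts = [0, 0, 3, 5, 12]

# A log link only needs the mean to be positive, which it is here.
log_mean = math.log(mean(counts))

# A log transform needs every single value to be positive, so it fails.
try:
    logged = [math.log(c) for c in counts]
    transform_ok = True
except ValueError:
    # math.log(0) is undefined and raises ValueError in Python.
    transform_ok = False

print("log of mean:", log_mean)
print("log transform of every value possible:", transform_ok)
```

This shows only that the arithmetic goes through; it says nothing
about whether such a model describes the zeros well.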
As you say nothing about your data, I cannot tell whether
this applies to you, but from the models you mention
it sounds as if you have count data, so I guess not.
Nick
Charles Goss
> I am having some issues analyzing data with glm. I have tried several
> methods to analyze my zero-inflated data set (zinb, hurdle and glm).
> The best model fit that I get is when I log transform the response
> variable prior to analysis with a glm model using a negative binomial
> distribution. The negative binomial uses a log link function, so I
> think that this analysis is essentially double log-transforming the
> data, once initially, and then when the response is linked to the
> predictors it is log-transformed again. I have not been able to find
> any literature regarding this, so I was wondering if anyone knows if
> this is an appropriate way to analyze these data? Does it violate
> assumptions of the glm?? Thanks for your time.