dear statalist users,
i am new to the list and have what is probably a rather simple question,
but it is puzzling me. i am currently finishing my thesis, which is
about media and corporate governance. for some
regression models, i have the number of news articles as the dependent
variable. this variable is highly skewed to the right and if i use OLS
regressions, the residuals are definitely not normally distributed. so
i use GLM. however, when using GLM i have to specify the correct
distribution of the dependent variable and the link function. after
consulting the ucla stata webpage, i think the right distributional
specification is negative binomial. my variable is overdispersed,
ruling out poisson, and as the sample is conditioned on at least one
article having been written, i do not have zero values.
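for concreteness, this is roughly what i am running. it is only a sketch:
narticles, x1 and x2 are placeholder names rather than my real variables,
and i am assuming family(nbinomial ml) is the appropriate way to let glm
estimate the dispersion parameter.

```stata
* sketch only: narticles, x1 and x2 are placeholder variable names

* compare the variance with the mean to gauge overdispersion
summarize narticles, detail

* negative binomial family with log link; ml asks glm to estimate
* the dispersion parameter instead of fixing it
glm narticles x1 x2, family(nbinomial ml) link(log)

* the same kind of model via the dedicated command, which also reports
* a likelihood-ratio test of alpha = 0 (i.e. against poisson)
nbreg narticles x1 x2
```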
can somebody tell me whether using glm instead of ols is a correct way
to deal with a non-normally distributed dependent variable (or, in the
end, with non-normally distributed residuals after the regression)?
also, with the skewed and overdispersed but not zero-inflated
variable, is the negative binomial the right choice? how should i
understand the link function? i have chosen log. is that correct?
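to make the link question concrete, this is how i currently read the log
link (my own understanding, so please correct me if it is wrong): the
linear predictor models the log of the expected count rather than the
count itself, so a one-unit change in a regressor multiplies the expected
number of articles by exp(beta). the notation below is mine, with y_i the
number of articles and x_i the regressors.

```latex
\log E[y_i \mid x_i] = x_i'\beta
\quad\Longleftrightarrow\quad
E[y_i \mid x_i] = \exp(x_i'\beta)
```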
sorry for the bluntness of the questions; it is the first time i am using glm.
thank you very much for your help in advance,
and have a great easter break
peter