Arne--
I'm not sure the concern about incidental parameters applies here. To
my mind, the question is whether anything is gained by estimating with
-glm- and indicator variables to capture the fixed effects, instead of
transforming y into a new variable such as lny=ln(y), logity=logit(y),
or invlogity=invlogit(y) and fitting a linear model. In this case, I'm
not sure there is. The poster specified that y measures proportions
strictly between 0 and 1, i.e. on the open interval. That is the crucial
point: there are no obs with y=0 or y=1, so ln(y) and logit(y) are
defined for every obs. In this case, you may be better off with -xtreg-
(or -xtivreg2- with more SE adjustments) than -glm-, if only because
estimation is so much faster! But you will get numerically different
answers, of course, since y=f(Xb+e) is not the same as y=f(Xb)+e.
For example:
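* esttab is part of the estout package (ssc install estout)
webuse psidextract, clear
tsset id t
* w: weeks worked rescaled to a proportion
gen w=wks/53
* ilw: transformed depvar for the linear models, rescaled to unit SD
gen ilw=invlogit(w)
qui su ilw
replace ilw=ilw/r(sd)
* pooled models: OLS on transformed y vs. GLM with logit link
qui reg ilw lwage union south smsa, cluster(id)
est sto reg
qui glm w lwage union south smsa, link(logit) family(gaussian) cluster(id)
est sto glm
* fixed-effects models: -xtreg, fe- vs. GLM with id indicators
qui xtreg ilw lwage union south smsa, fe cluster(id)
est sto xtreg
qui xi: glm w lwage union south smsa i.id, link(logit) family(gaussian) cluster(id)
est sto xtglm
esttab *, keep(lwage union south smsa) mtitles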
----------------------------------------------------------------------------
                      (1)             (2)             (3)             (4)
                      reg             glm           xtreg           xtglm
----------------------------------------------------------------------------
main
lwage               0.139*          0.127*         0.0598           0.162
                   (2.41)          (2.03)          (0.83)          (1.55)
union              -0.309***       -0.286***        0.158           0.171
                  (-6.09)         (-6.22)          (1.33)          (1.38)
south              0.0361          0.0404          -0.122          -0.275
                   (0.67)          (0.76)         (-0.66)         (-1.16)
smsa               0.0242          0.0176          0.0304          0.0468
                   (0.45)          (0.35)          (0.35)          (0.38)
----------------------------------------------------------------------------
N                    4165            4165            4165            4165
----------------------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
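The point that y=f(Xb+e) differs from y=f(Xb)+e can also be checked on
simulated data, where the true coefficient is known. The sketch below is
only an illustration: the sample size, seed, intercept of 0.5, and slope
of 1 are made-up values, and the data are generated with the error
inside the link, so the linear model on logit(y) is the correctly
specified one by construction.

```stata
* Simulated data (all values here are hypothetical, for illustration)
clear
set seed 12345
set obs 5000
gen x = rnormal()
gen e = rnormal()
* error inside the link: y = f(Xb + e), with f = invlogit and slope 1
gen y = invlogit(0.5 + x + e)
* transforming y gives logit(y) = Xb + e, a linear model in x
gen ly = logit(y)
regress ly x
* -glm- instead fits E(y|x) = f(Xb), i.e. an additive error outside the link
glm y x, link(logit) family(gaussian)
```

Comparing the coefficient on x across the two fits shows how far apart
the two specifications can be even with no fixed effects involved.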
Having accepted that you might transform y, the question then is which
transformation is appropriate, and for that you need some theory.
Absent theory, you might explore whether regressions using
lny=ln(y), logity=logit(y), or invlogity=invlogit(y) as the depvar
produce predictions that make more sense and residuals that look less
correlated with your transformed depvar. To compare the shapes of the
three transformations on (0,1):
tw function y=50*invlogit(x)-31||function y=logit(x)||function y=ln(x)
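A rough sketch of that exploration, using the same psidextract data and
regressors as above, might look like the following. The variable names
lnw, logitw, ilogitw, fit_* and res_* are made up for this example, and
the checks (summarizing fitted values, correlating residuals with the
transformed depvar) are just one way to eyeball the comparison, not a
formal test.

```stata
webuse psidextract, clear
tsset id t
gen w = wks/53
* three candidate transformations of the proportion w
gen lnw     = ln(w)
gen logitw  = logit(w)
gen ilogitw = invlogit(w)
foreach v of varlist lnw logitw ilogitw {
    quietly xtreg `v' lwage union south smsa, fe cluster(id)
    * fitted values including the estimated fixed effect
    predict double fit_`v', xbu
    * idiosyncratic residuals
    predict double res_`v', e
    display as text "depvar: `v'"
    summarize `v' fit_`v'
    correlate res_`v' `v'
}
```

The fitted values are on the scale of each transformed depvar, so they
need to be back-transformed before being compared with w itself.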
On 11/7/07, Arne Risa Hole wrote:
> There was an extremely useful discussion on the list recently about
> this issue in the context of fixed effects binary logit models. In
> short, adding the fixed effects 'by hand' results in biased estimates
> unless the number of time periods is large. See the thread starting
> with:
>
> http://www.stata.com/statalist/archive/2007-10/msg00935.html
>
> Arne
>