I have a data set in which counts were collected at 2 time points under 2
conditions for each subject. There are no missing data. Data for one
subject look like this:
  id   time   condit~n   diagno~s   count
   5      0          0          2       0
   5      0          1          2       8
   5      1          0          2       5
   5      1          1          2       7
Question 1.
I fitted a fixed effects negative binomial model including time,
condition, and their interaction:
. xi: xtnbreg count i.time*condition, i(id) fe irr nolog
i.time _Itime_0-1 (naturally coded; _Itime_0 omitted)
i.time*condit~n _ItimXcondi_# (coded as above)
Conditional FE negative binomial regression     Number of obs      =       544
Group variable (i): id                          Number of groups   =       136

                                                Obs per group: min =         4
                                                               avg =       4.0
                                                               max =         4

                                                Wald chi2(3)       =    154.25
Log likelihood  = -646.33027                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
       count |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    _Itime_1 |   2.069823   .1629906     9.24   0.000     1.773798    2.415249
   condition |   1.201935   .1071924     2.06   0.039     1.009179    1.431508
_ItimXcond~1 |   .3432684   .0418947    -8.76   0.000     .2702388    .4360336
------------------------------------------------------------------------------
From the IRRs I calculated a table of predicted relative cell counts, with
time = 0 and condition = 0 as the reference category. They are:
------------------------
          |  condition
     time |     0      1
----------+-------------
        0 |     1  1.202
        1 |  2.07   .854
------------------------
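The arithmetic behind this table can be sketched as follows: each cell is the product of the IRRs for the terms that are switched on in that cell. The IRR values are typed in by hand from the output above (they are not taken from stored results), so this is only a check on the hand calculation.

```stata
* Predicted relative counts from the reported IRRs.
* Reference cell: time = 0, condition = 0, fixed at 1.
* All values are copied by hand from the xtnbreg output above.

* time = 0, condition = 1: condition IRR only
display %6.3f 1.201935

* time = 1, condition = 0: time IRR only
display %6.3f 2.069823

* time = 1, condition = 1: time IRR x condition IRR x interaction IRR
display %6.3f 2.069823 * 1.201935 * .3432684
```

The last product should come out at about .854, the value shown in the time = 1, condition = 1 cell.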
The means of observed cell counts are below:
----------------------------
          |     condition
     time |       0        1
----------+-----------------
        0 | 2.09559  2.64706
        1 | 4.36765  1.91176
----------------------------
Relative to the time = 0, condition = 0 cell, these means are in the ratios
below, which differ from the predicted ratios:
------------------------
          |  condition
     time |     0      1
----------+-------------
        0 |     1  1.263
        1 | 2.084   .912
------------------------
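For completeness, the observed ratios are each cell mean divided by the mean of the reference cell. A minimal sketch, with the means typed in by hand from the table of observed means; the scalar names m00, m01, m10 and m11 are made up for this illustration.

```stata
* Observed cell means, copied by hand from the table of means above.
* Scalar names are hypothetical: m<time><condition>.
scalar m00 = 2.09559
scalar m01 = 2.64706
scalar m10 = 4.36765
scalar m11 = 1.91176

* Ratios relative to the time = 0, condition = 0 cell
display %6.3f m01/m00
display %6.3f m10/m00
display %6.3f m11/m00
```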
I had expected that the observed and predicted relative cell frequencies
would be the same for the fitted model. Can anyone explain why they are
not?
Question 2.
I believed that variables which are constant within ID could not be used
in a fixed effects model. Am I wrong? It seems that I am: the model below
includes diagnosis (which is constant within a subject), and xtnbreg
reports an estimate for it.
. xi: xtnbreg count diagnosis i.time*condition, i(id) fe irr nolog
i.time _Itime_0-1 (naturally coded; _Itime_0 omitted)
i.time*condit~n _ItimXcondi_# (coded as above)
Conditional FE negative binomial regression     Number of obs      =       544
Group variable (i): id                          Number of groups   =       136

                                                Obs per group: min =         4
                                                               avg =       4.0
                                                               max =         4

                                                Wald chi2(4)       =    172.78
Log likelihood  = -643.28819                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
       count |        IRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   diagnosis |   7.784788   5.952364     2.68   0.007     1.739424     34.8408
    _Itime_1 |   2.072968   .1601387     9.44   0.000     1.781708    2.411841
   condition |   1.180407   .1038162     1.89   0.059     .9935027    1.402473
_ItimXcond~1 |   .3444533   .0414192    -8.86   0.000     .2721301    .4359976
------------------------------------------------------------------------------
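The statement that diagnosis is constant within a subject can be checked along the following lines. This is a sketch that assumes the variable names id and diagnosis as used in the commands above and, as stated earlier, no missing data.

```stata
* Sort by diagnosis within each id; if the first and last values agree,
* diagnosis takes a single value for that subject.
* The assert stops with an error if any subject has more than one value.
bysort id (diagnosis): assert diagnosis[1] == diagnosis[_N]
```

If the assert passes silently, diagnosis does not vary within id.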
Question 3.
Can anyone offer advice on how to assess goodness of fit for xtnbreg
... , fe? Some of the models I want to fit are rather more complex than
the examples above.
Any advice or comments will be appreciated.
Thanks,
John Plummer