Re: st: xtreg - continuous or discrete time

From   José Maria Pacheco de Souza <[email protected]>
To   [email protected]
Subject   Re: st: xtreg - continuous or discrete time
Date   Tue, 16 Aug 2011 18:10:09 -0300

On 16/08/2011 15:45, Ricardo Ovaldia wrote:
I have longitudinal data on children measured at ages 5, 10, 15 and 20.
They were all measured within two weeks of their birthday.
When using -xtreg-, I get very different results depending on whether I use time as a continuous or a categorical variable.

I am tempted to use time as continuous, but I am not sure which to use. Any suggestions will be appreciated.

Below is my output from the two models. I am interested in the group differences:

Thank you,

Ricardo Ovaldia, MS
Oklahoma City, OK

. xtreg instad group##time ses

Random-effects GLS regression                   Number of obs      =      1413
Group variable: id                              Number of groups   =       360

R-sq:  within  = 0.1989                         Obs per group: min =         1
        between = 0.0435                                        avg =       3.9
        overall = 0.1426                                        max =         4

                                                 Wald chi2(12)      =    275.48
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

       instad |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        group |
           2  |  -.3593535   .8898889    -0.40   0.686    -2.103504    1.384797
           3  |  -1.664428   .8971943    -1.86   0.064    -3.422897    .0940402
         time |
          10  |   5.120189    .786916     6.51   0.000     3.577862    6.662516
          15  |   6.054063   .7869046     7.69   0.000     4.511758    7.596368
          20  |   .6104585   .7870224     0.78   0.438     -.932077    2.152994
   group#time |
        2 10  |  -1.245678   1.122178    -1.11   0.267    -3.445106    .9537501
        2 15  |  -1.581695   1.126637    -1.40   0.160    -3.789864    .6264734
        2 20  |  -2.830481    1.12774    -2.51   0.012     -5.04081   -.6201511
        3 10  |  -.3909519   1.135047    -0.34   0.731    -2.615604      1.8337
        3 15  |  -.7709906   1.134923    -0.68   0.497    -2.995398    1.453417
        3 20  |  -.5713752   1.135312    -0.50   0.615    -2.796547    1.653796
          ses |  -.0209192   .0203155    -1.03   0.303    -.0607368    .0188984
        _cons |   104.1393   1.187133    87.72   0.000     101.8125     106.466
      sigma_u |  3.1002125
      sigma_e |  6.1590537
          rho |  .20215091   (fraction of variance due to u_i)

. xtreg instad group##c.time ses

Random-effects GLS regression                   Number of obs      =      1413
Group variable: id                              Number of groups   =       360

R-sq:  within  = 0.0049                         Obs per group: min =         1
        between = 0.0414                                        avg =       3.9
        overall = 0.0193                                        max =         4

                                                 Wald chi2(6)       =     21.62
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0014

       instad |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
        group |
           2  |   .4061883   1.137796     0.36   0.721    -1.823851    2.636228
           3  |  -1.590677   1.146674    -1.39   0.165    -3.838116     .656763
         time |   .0580776   .0553659     1.05   0.294    -.0504374    .1665927
 group#c.time |
           2  |  -.1741696    .079296    -2.20   0.028     -.329587   -.0187523
           3  |  -.0427001    .079865    -0.53   0.593    -.1992325    .1138324
          ses |  -.0261362   .0206384    -1.27   0.205    -.0665867    .0143142
        _cons |    106.608   1.288649    82.73   0.000     104.0823    109.1337
      sigma_u |  2.6938033
      sigma_e |  6.8485734
          rho |   .1339852   (fraction of variance due to u_i)

Dear Ricardo:
Probably some other Statalister will explain this better than I can, but I hope I can give an initial explanation. In the first model, time is categorical, so the time coefficients are differences in mean instad between "category" 10 and "category" 5, between "category" 15 and "category" 5, and so on. The model does not need to use the equal 5-year intervals between the categories, because the variable is not treated as numeric. In the second model, time is continuous, and the coefficient says that instad increases by about .058 for each unit of time, where time could be 0, 1, 2, 3, ..., 20. The values are not exactly as I described, because you use an interaction: with the interaction in the model, the time coefficients refer to the base group (group 1), and the interaction terms shift them for groups 2 and 3. The data also suggest a possibly quadratic shape (the means rise at ages 10 and 15 and return close to the age 5 level at age 20), which a single linear term cannot capture.
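
To make the two readings concrete, here is a rough sketch in Stata. The first part only redoes arithmetic with the coefficients printed in the output above: the group 2 minus group 1 difference at each age implied by each model. The second part is a suggestion and was not run for this thread: it refits the model with a quadratic in time and tests the extra terms, assuming the same variables (instad, group, time, ses) and that the data are already -xtset- on id.

```stata
* Part 1: group 2 minus group 1 at each age, from the printed coefficients.
* Categorical time: group 2 main effect plus the matching group#time term.
display "categorical, age  5: " %7.3f (-.3593535)
display "categorical, age 10: " %7.3f (-.3593535 - 1.245678)
display "categorical, age 15: " %7.3f (-.3593535 - 1.581695)
display "categorical, age 20: " %7.3f (-.3593535 - 2.830481)

* Continuous time: group 2 main effect plus (interaction slope) x age.
foreach a of numlist 5 10 15 20 {
    display "continuous,  age `a': " %7.3f (.4061883 - .1741696*`a')
}

* Part 2 (assumption: not part of the original analysis).
* Add a quadratic in time, allowed to differ by group.
xtreg instad i.group##c.time##c.time ses

* Joint Wald test of all terms involving time squared.
testparm c.time#c.time i.group#c.time#c.time

* Group differences from the base group at the four measured ages.
margins, dydx(group) at(time=(5 10 15 20))
```

From the displayed arithmetic, both specifications put group 2 roughly 0.4 to 0.5 below group 1 at age 5 and roughly 3.1 to 3.2 below at age 20, so the group contrast looks less sensitive to the coding of time than the time coefficients themselves. This ignores the standard errors; -margins- after either model would give the same differences with confidence intervals.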
josé maria
Jose Maria Pacheco de Souza
Full Professor (retired), Senior Collaborator
Departamento de Epidemiologia/Faculdade de Saude Publica, USP
Av. Dr. Arnaldo, 715
01246-904  -  S. Paulo/SP - Brasil
phones (11)3061-7747; (11)3768-8612
