st: xtgee results depend on sort order
We recently encountered a surprising problem in which estimates from xtgee
depend on the sort order of the data: something that we had thought could 
not happen in a standard Stata estimation command. Stata technical support 
have confirmed this really happens in this dataset, but were not able to 
explain why. They recommended using the sortseed command before performing 
the analysis, but while that makes results reproducible it does not explain 
why the problem arises or how to derive the best estimate. The problem is 
particularly disconcerting because nothing in the output suggests there is 
an estimation problem.
We used xtgee with an exchangeable correlation structure. The correlation
between successive observations in the same person is negative but stable 
even when the estimation results change. Two examples of the output follow 
(the output varies every time the command is run). As you can see, exactly 
the same number of people is included in each analysis, with the same 
number of observations.
Has anyone encountered this problem before and, if so, what is the best 
method to overcome it?
Thanks
Jonathan Sterne
A. xtgee cd4_slope ON_ARTs, i(patient) fam(gaussian) link(id) corr(exc)
Iteration 1: tolerance = .21893738
Iteration 2: tolerance = .00579956
Iteration 3: tolerance = .00004564
Iteration 4: tolerance = 3.927e-07
GEE population-averaged model               Number of obs      = 4927
Group variable:                   patient   Number of groups   = 3131
Link:                            identity   Obs per group: min = 1
Family:                          Gaussian                  avg = 1.6
Correlation:                 exchangeable                  max = 11
                                            Wald chi2(1)       = 57.58
Scale parameter:                 12012.05   Prob > chi2        = 0.0000
--------------------------------------------------------------------------
cd4_slope |     Coef.   Std. Err.        z   P>|z|   [95% Conf. Interval]
----------+---------------------------------------------------------------
  ON_ARTs |  20.13073    2.652914     7.59   0.000    14.93111    25.33034
    _cons | -23.04211    2.331146    -9.88   0.000   -27.61107   -18.47315
--------------------------------------------------------------------------
B. xtgee cd4_slope ON_ARTs, i(patient) fam(gaussian) link(id) corr(exc)
Iteration 1: tolerance = .08330006
Iteration 2: tolerance = .00242865
Iteration 3: tolerance = .0000264
Iteration 4: tolerance = 2.804e-07
GEE population-averaged model               Number of obs      = 4927
Group variable:                   patient   Number of groups   = 3131
Link:                            identity   Obs per group: min = 1
Family:                          Gaussian                  avg = 1.6
Correlation:                 exchangeable                  max = 11
                                            Wald chi2(1)       = 44.53
Scale parameter:                 12005.77   Prob > chi2        = 0.0000
--------------------------------------------------------------------------
cd4_slope |     Coef.   Std. Err.        z   P>|z|   [95% Conf. Interval]
----------+---------------------------------------------------------------
  ON_ARTs |  17.71781    2.655046     6.67   0.000    12.51402    22.92161
    _cons | -22.26446    2.335275    -9.53   0.000   -26.84151    -17.6874
--------------------------------------------------------------------------
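
For anyone wishing to reproduce the comparison, the following is a minimal sketch of how runs such as A and B could be repeated under controlled sort orders. It assumes the analysis dataset is already in memory and that "the sortseed command" refers to -set sortseed-. The variable obs_id, the seed values, and the stored-estimate names run1 to run3 are illustrative only and are not part of the original analysis. The sketch only makes the variation visible side by side; it does not explain its cause.

```stata
* Sketch only: obs_id, the seeds and the names run1-run3 are hypothetical.
* Record the current row order so the data can be returned to it before each run.
gen long obs_id = _n

local i = 0
foreach seed in 1001 2002 1001 {
    local ++i
    * Fix how ties are broken in any subsequent sort.
    set sortseed `seed'
    * Return to the recorded order, then sort by patient so that
    * rows tied on patient are ordered according to the seed.
    sort obs_id
    sort patient
    * Same model as in runs A and B above.
    quietly xtgee cd4_slope ON_ARTs, i(patient) fam(gaussian) link(id) corr(exc)
    estimates store run`i'
}

* Show coefficients and standard errors for the three runs side by side.
estimates table run1 run2 run3, se
```

Here run1 and run3 share a seed, so they would be expected to agree if the seed alone determines the ordering, while run2 uses a different seed and may differ in the way A and B do.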
----------------------
Jonathan Sterne
Department of Social Medicine
University of Bristol
Canynge Hall
Whiteladies Road
Bristol BS8 2PR
UK
Tel:    0117 928 7396
Fax:    0117 928 7325
E-mail: [email protected]
web:    www.epi.bris.ac.uk