Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: mepoisson vs meqrpoisson


From   Steve Samuels <[email protected]>
To   [email protected]
Subject   Re: st: mepoisson vs meqrpoisson
Date   Fri, 2 Aug 2013 11:28:34 -0400

Correction: The differing signs for the pe & *meqr* adp coefficients are
likely an instance of Simpson's paradox.

SS

Phil:

1. I can't explain the difference between the -mepoisson and
-mepqrpoisson-. The near equivalence of the meqr1 and meqr2 estimates
gives me the impression that the QR results are more accurate, but I
really don't know. In any case, the comparisons between me and meqr are
premature if you haven't yet followed this advice from the ME manual
(page 10):

"When fitting mixed-effects models, you should always assess whether the
approximation is adequate by refitting the model with a larger number of
quadrature points."

Try this and report back.

2. The differing signs for the pe & me adp coefficients are an instance
of Simpson's paradox. To save Nick Cox a post, I'll add that this
phenomenon was recognized earlier by Yule and by Karl Pearson. See:
https://en.wikipedia.org/wiki/Simpson%27s_paradox


Steve




On Aug 1, 2013, at 9:05 PM, Phil Clayton wrote:

Dear Statalisters,

I have been experimenting with the new -mepoisson- and -meqrpoisson- commands and have found substantively different results that I can't explain.

I am analysing infections in patients undergoing dialysis. I have a 3-level model - episodes of dialysis are clustered within patients (n=6959), who are clustered within hospitals (n=56). The outcome is "inf" (number of infections) and my main exposure of interest is "apd" (type of dialysis) which can vary within patients. Exposure time is stored in "duration".

There is also a variable called "anyprevinf" which indicates if the patient has previously had an infection. So there can be up to 4 observations per patient, eg:
  hospid   patient   anypre~f   apd   duration   inf  
      13        49          0     0          3     0  
      13        49          0     1         49     1  
      13        49          1     0         89     0  
      13        49          1     1         20     0  

To illustrate the problem I am ignoring the other covariates and have run 3 models - simple Poisson model ignoring the hierarchical structure, -mepoisson- and -meqrpoisson-:
poisson inf apd, exp(duration)
est store p1
mepoisson inf apd, exp(duration) || hospid: || patient:
est store mep1
meqrpoisson inf apd, exp(duration) || hospid: || patient:
est store meqrp1

Since in this example I'm not using anyprevinf I then collapsed the dataset and re-ran the same 3 models:
collapse (sum) inf duration, by(hospid apd patient)
poisson inf apd, exp(duration)
est store p2
mepoisson inf apd, exp(duration) || hospid: || patient:
est store mep2
meqrpoisson inf apd, exp(duration) || hospid: || patient:
est store meqrp2

My results are summarised as follows:
. est table p1 p2 mep1 mep2 meqrp1 meqrp2, keep(apd) eq(1) b se p

--------------------------------------------------------------------------------------------
  Variable |     p1           p2          mep1         mep2        meqrp1       meqrp2    
-------------+------------------------------------------------------------------------------
       apd | -.11492828   -.11492828   -.07347119     .0277205    .04527607    .04511924  
           |  .02520192    .02520192    .03083056    .03562662    .03693156     .0369239  
           |     0.0000       0.0000       0.0172       0.4365       0.2202       0.2217  
--------------------------------------------------------------------------------------------
                                                                            legend: b/se/p

Comments:
- p1 and p2 are identical as expected
- mep1 and mep2 are quite different which I don't understand - I think they should be almost totally equivalent since the only difference was the -collapse- (see meqrp1 vs meqrp2)
- mep1 and meqr1 are quite different which I don't understand - I thought they were fitting the same model using a slightly different technique
- mep2 and meqr2 are pretty similar, but not as similar as I was expecting since my understanding is that they're fitting the same model

I would be grateful for any suggestions as to what's happening here. Below I've put a log of some descriptive data and the full output from each model.

One subtle potential problem is that if anyprevinf==1 there must be another observation for that patient with anyprevinf==0 & inf==1 - so in the uncollapsed dataset there isn't perfect conditional independence. Not sure if that's enough to explain the discrepancy though.

Kind regards,
Phil

. describe

Contains data
obs:        12,978                          
vars:             6                          
size:       272,538                          (_dta has notes)
-------------------------------------------------------------------------------------------------------------------
            storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------------------------------------------
patient         int     %9.0g                 
apd             byte    %8.0g                 
anyprevinf      byte    %8.0g                 
hospid          byte    %9.0g                 group(hospcode)
inf             double  %8.0g                 (sum) inf
duration        double  %9.0g                 (sum) duration
-------------------------------------------------------------------------------------------------------------------
Sorted by:  hospid  apd  patient  anyprevinf
   Note:  dataset has changed since last saved

. misstable summarize // no missing data
(variables nonmissing or string)

. codebook hospid patient

-------------------------------------------------------------------------------------------------------------------
hospid                                                                                              group(hospcode)
-------------------------------------------------------------------------------------------------------------------

                type:  numeric (byte)

               range:  [1,56]                       units:  1
       unique values:  56                       missing .:  0/12978

                mean:   33.0623
            std. dev:    15.064

         percentiles:        10%       25%       50%       75%       90%
                              10        20        35        47        53

-------------------------------------------------------------------------------------------------------------------
patient                                                                                                 (unlabeled)
-------------------------------------------------------------------------------------------------------------------

                type:  numeric (int)

               range:  [1,6959]                     units:  1
       unique values:  6959                     missing .:  0/12978

                mean:   3461.36
            std. dev:   2043.82

         percentiles:        10%       25%       50%       75%       90%
                             674      1650    3457.5      5258      6284

. bysort patient: assert hospid==hospid[_n-1] if _n>1

. tab apd

      apd |      Freq.     Percent        Cum.
------------+-----------------------------------
        0 |      7,275       56.06       56.06
        1 |      5,703       43.94      100.00
------------+-----------------------------------
    Total |     12,978      100.00

. tab anyprevinf

anyprevinf |      Freq.     Percent        Cum.
------------+-----------------------------------
        0 |      9,411       72.52       72.52
        1 |      3,567       27.48      100.00
------------+-----------------------------------
    Total |     12,978      100.00

. tab inf

(sum) inf |      Freq.     Percent        Cum.
------------+-----------------------------------
        0 |      8,231       63.42       63.42
        1 |      3,980       30.67       94.09
        2 |        388        2.99       97.08
        3 |        185        1.43       98.51
        4 |         89        0.69       99.19
        5 |         54        0.42       99.61
        6 |         25        0.19       99.80
        7 |         11        0.08       99.88
        8 |          7        0.05       99.94
        9 |          3        0.02       99.96
       10 |          4        0.03       99.99
       11 |          1        0.01      100.00
------------+-----------------------------------
    Total |     12,978      100.00

. sum duration

  Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
  duration |     12978    322.0892    389.4168         .5       2802

. 
. poisson inf apd, exp(duration)

Iteration 0:   log likelihood = -14529.051  
Iteration 1:   log likelihood = -14528.992  
Iteration 2:   log likelihood = -14528.992  

Poisson regression                                Number of obs   =      12978
                                                LR chi2(1)      =      20.79
                                                Prob > chi2     =     0.0000
Log likelihood = -14528.992                       Pseudo R2       =     0.0007

------------------------------------------------------------------------------
       inf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       apd |  -.1149283   .0252019    -4.56   0.000    -.1643231   -.0655334
     _cons |  -6.439011    .017778  -362.19   0.000    -6.473855   -6.404167
ln(duration) |          1  (exposure)
------------------------------------------------------------------------------

. est store p1

. mepoisson inf apd, exp(duration) || hospid: || patient:

Fitting fixed-effects model:

Iteration 0:   log likelihood =  -26408.57  
Iteration 1:   log likelihood =  -14532.71  
Iteration 2:   log likelihood = -14528.994  
Iteration 3:   log likelihood = -14528.992  
Iteration 4:   log likelihood = -14528.992  

Refining starting values:

Grid node 0:   log likelihood = -9069.4652

Fitting full model:

Iteration 0:   log likelihood = -9069.4652  
Iteration 1:   log likelihood = -9063.4762  
Iteration 2:   log likelihood =  -9059.664  
Iteration 3:   log likelihood =  -9058.714  
Iteration 4:   log likelihood =  -9058.715  
Iteration 5:   log likelihood =  -9058.715  

Mixed-effects Poisson regression                Number of obs      =     12978

-----------------------------------------------------------
              |   No. of       Observations per Group
Group Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
       hospid |       56          1      231.8        949
      patient |     6959          1        1.9          4
-----------------------------------------------------------

Integration method: mvaghermite                 Integration points =         7

                                              Wald chi2(1)       =      5.68
Log likelihood =  -9058.715                     Prob > chi2        =    0.0172
--------------------------------------------------------------------------------
         inf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
         apd |  -.0734712   .0308306    -2.38   0.017     -.133898   -.0130444
       _cons |  -6.505974   .0244448  -266.15   0.000    -6.553884   -6.458063
ln(duration) |          1  (exposure)
---------------+----------------------------------------------------------------
hospid         |
   var(_cons)|   .3186859   .0910349                      .1820586    .5578463
---------------+----------------------------------------------------------------
hospid>patient |
   var(_cons)|    .912644   .0565245                      .8083181    1.030435
--------------------------------------------------------------------------------
LR test vs. Poisson regression:      chi2(2) = 10940.55   Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

. est store mep1

. meqrpoisson inf apd, exp(duration) || hospid: || patient:

Refining starting values: 

Iteration 0:   log likelihood = -13532.955  
Iteration 1:   log likelihood = -13514.626  
Iteration 2:   log likelihood = -13503.088  

Performing gradient-based optimization: 

Iteration 0:   log likelihood = -13503.088  
Iteration 1:   log likelihood = -13502.469  
Iteration 2:   log likelihood = -13502.468  
Iteration 3:   log likelihood = -13502.468  

Mixed-effects Poisson regression                Number of obs      =     12978

--------------------------------------------------------------------------
              |   No. of       Observations per Group       Integration
Group Variable |   Groups    Minimum    Average    Maximum      Points
----------------+---------------------------------------------------------
       hospid |       56          1      231.8        949           7
      patient |     6959          1        1.9          4           7
--------------------------------------------------------------------------

                                              Wald chi2(1)       =      1.50
Log likelihood = -13502.468                     Prob > chi2        =    0.2202

------------------------------------------------------------------------------
       inf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       apd |   .0452761   .0369316     1.23   0.220    -.0271085    .1176606
     _cons |  -6.922041   .0735074   -94.17   0.000    -7.066113   -6.777969
ln(duration) |          1  (exposure)
------------------------------------------------------------------------------

------------------------------------------------------------------------------
Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
hospid: Identity             |
                var(_cons) |   .1840679   .0504352      .1075836    .3149271
-----------------------------+------------------------------------------------
patient: Identity            |
                var(_cons) |   1.010816   .0477642      .9214039    1.108904
------------------------------------------------------------------------------
LR test vs. Poisson regression:      chi2(2) =  2053.05   Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

. est store meqrp1

. 
. collapse (sum) inf duration, by(hospid apd patient)

. poisson inf apd, exp(duration)

Iteration 0:   log likelihood = -10969.711  
Iteration 1:   log likelihood = -10969.378  
Iteration 2:   log likelihood = -10969.378  

Poisson regression                                Number of obs   =       9756
                                                LR chi2(1)      =      20.79
                                                Prob > chi2     =     0.0000
Log likelihood = -10969.378                       Pseudo R2       =     0.0009

------------------------------------------------------------------------------
       inf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       apd |  -.1149283   .0252019    -4.56   0.000    -.1643231   -.0655334
     _cons |  -6.439011    .017778  -362.19   0.000    -6.473855   -6.404167
ln(duration) |          1  (exposure)
------------------------------------------------------------------------------

. est store p2

. mepoisson inf apd, exp(duration) || hospid: || patient:

Fitting fixed-effects model:

Iteration 0:   log likelihood = -18163.874  
Iteration 1:   log likelihood = -11003.634  
Iteration 2:   log likelihood = -10969.486  
Iteration 3:   log likelihood = -10969.378  
Iteration 4:   log likelihood = -10969.378  

Refining starting values:

Grid node 0:   log likelihood = -9019.1437

Fitting full model:

Iteration 0:   log likelihood = -9019.1437  
Iteration 1:   log likelihood = -9009.7733  
Iteration 2:   log likelihood = -8995.3399  
Iteration 3:   log likelihood = -8986.9478  
Iteration 4:   log likelihood = -8986.5939  
Iteration 5:   log likelihood = -8986.5907  
Iteration 6:   log likelihood =  -8986.591  

Mixed-effects Poisson regression                Number of obs      =      9756

-----------------------------------------------------------
              |   No. of       Observations per Group
Group Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
       hospid |       56          1      174.2        716
      patient |     6959          1        1.4          2
-----------------------------------------------------------

Integration method: mvaghermite                 Integration points =         7

                                              Wald chi2(1)       =      0.61
Log likelihood =  -8986.591                     Prob > chi2        =    0.4365
--------------------------------------------------------------------------------
         inf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
         apd |   .0277205   .0356266     0.78   0.437    -.0421064    .0975474
       _cons |  -6.738195   .0441765  -152.53   0.000    -6.824779   -6.651611
ln(duration) |          1  (exposure)
---------------+----------------------------------------------------------------
hospid         |
   var(_cons)|   .2065251   .0586674                      .1183512    .3603903
---------------+----------------------------------------------------------------
hospid>patient |
   var(_cons)|   .9873731   .0491168                        .89565    1.088489
--------------------------------------------------------------------------------
LR test vs. Poisson regression:      chi2(2) =  3965.57   Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

. est store mep2

. meqrpoisson inf apd, exp(duration) || hospid: || patient:

Refining starting values: 

Iteration 0:   log likelihood = -9972.7637  
Iteration 1:   log likelihood = -9949.2459  
Iteration 2:   log likelihood = -9943.3971  

Performing gradient-based optimization: 

Iteration 0:   log likelihood = -9943.3971  
Iteration 1:   log likelihood = -9942.2696  
Iteration 2:   log likelihood = -9942.2532  
Iteration 3:   log likelihood = -9942.2532  

Mixed-effects Poisson regression                Number of obs      =      9756

--------------------------------------------------------------------------
              |   No. of       Observations per Group       Integration
Group Variable |   Groups    Minimum    Average    Maximum      Points
----------------+---------------------------------------------------------
       hospid |       56          1      174.2        716           7
      patient |     6959          1        1.4          2           7
--------------------------------------------------------------------------

                                              Wald chi2(1)       =      1.49
Log likelihood = -9942.2532                     Prob > chi2        =    0.2217

------------------------------------------------------------------------------
       inf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       apd |   .0451192   .0369239     1.22   0.222    -.0272503    .1174888
     _cons |  -6.921458   .0734339   -94.25   0.000    -7.065386    -6.77753
ln(duration) |          1  (exposure)
------------------------------------------------------------------------------

------------------------------------------------------------------------------
Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
hospid: Identity             |
                var(_cons) |   .1835775   .0503448      .1072467    .3142355
-----------------------------+------------------------------------------------
patient: Identity            |
                var(_cons) |   1.009257   .0477619      .9198554    1.107347
------------------------------------------------------------------------------
LR test vs. Poisson regression:      chi2(2) =  2054.25   Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

. est store meqrp2

. 
. est table p1 p2 mep1 mep2 meqrp1 meqrp2, keep(apd) eq(1) b se p

--------------------------------------------------------------------------------------------
  Variable |     p1           p2          mep1         mep2        meqrp1       meqrp2    
-------------+------------------------------------------------------------------------------
       apd | -.11492828   -.11492828   -.07347119     .0277205    .04527607    .04511924  
           |  .02520192    .02520192    .03083056    .03562662    .03693156     .0369239  
           |     0.0000       0.0000       0.0172       0.4365       0.2202       0.2217  
--------------------------------------------------------------------------------------------
                                                                            legend: b/se/p



*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index