Dear Roland,
in addition to Maarten's wise insight, I wonder whether, in order to
deal with the skewed sampling distribution of LOS, bootstrapping your
raw data without log-transforming might be a good way to go (please see,
for instance, Glick HA, Doshi JA, Sonnad SS, Polsky D. Economic Evaluation
in Clinical Trials. Oxford: Oxford University Press, 2007: 89-113).
I tried to replicate an abridged version of your problem (no drugs and no
interactions among variables), first fitting a multiple linear
regression on the raw data and then repeating OLS on bootstrapped data
(10,000 random samples for each variable). Neither attempt reached
statistical significance.
. regress LOS age surgery

      Source |       SS       df       MS              Number of obs =      10
-------------+------------------------------           F(  2,     7) =    2.39
       Model |  2.79848368     2  1.39924184           Prob > F      =  0.1619
    Residual |  4.10151632     7  .585930904           R-squared     =  0.4056
-------------+------------------------------           Adj R-squared =  0.2357
       Total |         6.9     9  .766666667           Root MSE      =  .76546

------------------------------------------------------------------------------
         LOS |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0126865   .0177748    -0.71   0.498    -.0547173    .0293443
     surgery |   1.035422   .4866573     2.13   0.071    -.1153401    2.186183
       _cons |   4.697851   .5397554     8.70   0.000     3.421532    5.974169
------------------------------------------------------------------------------
. reg boot_los boot_surgery boot_age

      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  2,  9997) =    0.51
       Model |  .070269293     2  .035134647           Prob > F      =  0.5984
    Residual |  683.931319  9997  .068413656           R-squared     =  0.0001
-------------+------------------------------           Adj R-squared = -0.0001
       Total |  684.001589  9999     .068407           Root MSE      =  .26156

------------------------------------------------------------------------------
    boot_los |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
boot_surgery |  -.0053317   .0165804    -0.32   0.748    -.0378326    .0271692
    boot_age |  -.0005713    .000594    -0.96   0.336    -.0017356    .0005931
       _cons |   4.914887   .0171246   287.01   0.000     4.881319    4.948455
------------------------------------------------------------------------------
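
For comparison, here is a minimal sketch (not necessarily what was run
above) of the same idea with Stata's -bootstrap- prefix. It resamples
whole observations, so each patient's LOS, age and surgery stay together;
drawing each variable separately would break that link. The seed value is
an arbitrary assumption, and the raw data are assumed to be in memory.

*--------------- begin example --------------------------
* assumes the raw data (LOS, age, surgery) are in memory
* arbitrary seed, only so that the resampling is reproducible
set seed 12345
* refit the regression on 10,000 resamples of whole observations
bootstrap _b, reps(10000): regress LOS age surgery
* percentile and bias-corrected confidence intervals
estat bootstrap, percentile bc
*---------------- end example ---------------------------

With only 10 observations, some resamples may show no variation in
surgery, so a few replications could fail to estimate that coefficient.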
HTH and Kind Regards,
Carlo
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Maarten buis
Sent: Wednesday, 5 November 2008 10:49
To: [email protected]
Subject: Re: st: Interpretation of regressionmodel of ln-transformed
variable
--- roland andersson <[email protected]> wrote:
> It is also difficult to imagine that there should be censoring
> for conditions that normally need a hospital stay of 1 to 7 days.
Ok, sounds reasonable.
> Following your example I have made this model
>
> xi:regress lnLOS lapscopic i.appdgn age agesq cons, eform("exp(b)")
> nocons
>
> and get this result
>
> lnLOS            exp(b)     [95% Conf. Interval]
> lapscopic      1.018056     1.004532    1.031762
> _Iappdgn2_1    1.850726     1.824841    1.876978
> _Iappdgn2_3    1.174283     1.147247    1.201956
> age            .9852508     .9841405    .9863623
> agesq          1.000275     1.000261    1.000289
> cons           2.208685     2.168225      2.2499
>
> I now understand that exp(b) is a multiplier, i.e. that open
> appendectomy has a geometric mean LOS of 2.21 days whereas
> laparoscopic patients have 1.02*2.21=2.25 days, or 0.04 days longer
> geometric mean LOS. Is it correct to recalculate the CI of this
> difference as 1.0045*2.21-2.21=0.01 and 1.032*2.21-2.21=0.07?
In that case I would use -adjust- and -nlcom- like in the example
below:
*--------------- begin example --------------------------
sysuse cancer, clear
gen ln_t = ln(studytime)
gen cons = 1
xi: reg ln_t i.drug age cons, nocons eform("exp(b)")
adjust _Idrug_3=0 age, by(_Idrug_2) exp ci
sum age if e(sample)
nlcom exp((_b[cons] + _b[age]*`r(mean)')+ _b[_Idrug_2]) - ///
exp((_b[cons] + _b[age]*`r(mean)'))
*---------------- end example ---------------------------
Notice that the difference in LOS now depends on the values of the
other explanatory variables. These other variables define the baseline
LOS (in your case the LOS for someone who received an open
appendectomy). So if you haven't mean-centered age, then the difference
in geometric mean LOS you reported applies to newborn babies. You
can report the difference in geometric mean LOS for someone of average
age either by first mean centering age (subtract the mean age from the
variable age as I did in the example in my previous post), or take mean
age into account like in the example above.
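
The mean-centering route can be sketched as follows, on the same cancer
data. The variable name c_age is only illustrative, and the difference
should match the one from the -nlcom- call in the example above.

*--------------- begin example --------------------------
sysuse cancer, clear
gen ln_t = ln(studytime)
* c_age is a hypothetical name for mean-centered age
sum age, meanonly
gen c_age = age - r(mean)
xi: reg ln_t i.drug c_age
* with age centered, exp(_b[_cons]) is the geometric mean
* for the baseline drug group at average age
nlcom exp(_b[_cons] + _b[_Idrug_2]) - exp(_b[_cons])
*---------------- end example ---------------------------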
Hope this helps,
Maarten
-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands
visiting address:
Buitenveldertselaan 3 (Metropolitan), room N515
+31 20 5986715
http://home.fsw.vu.nl/m.buis/
-----------------------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/