Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Vaidyanathan Ganapathy <vaidyang@usc.edu>
To: statalist@hsphsun2.harvard.edu
Subject: st: Interpretation of Oaxaca decomposition results after re-transformation of log scale
Date: Wed, 12 Mar 2014 00:03:13 -0700
Dear Statalisters,

I am performing an Oaxaca-type decomposition to understand the healthcare cost differences between two groups: controls and premature infants. Here is my specification:

. oaxaca lnallhccx2 tpcat2-tpcat6 bpdx2 chdx2 asthbrdx2 resinfxdx2 cnsdx2 motordx2 physdevdx2 nddx2 chrnic1 period, by(premie_cat) pooled vce(cluster pcn) eform

The dependent variable is ln(healthcare costs), and the other variables are covariates, including poverty levels (tpcat2-tpcat6) and certain medical diagnoses. Since the dependent variable is on the log scale, I used the -eform- option to exponentiate the results and report the predicted costs and the decomposed cost differentials. While I am able to interpret the predicted values for the two groups, I have some trouble interpreting the overall, explained and unexplained differences. Here is the output:

Blinder-Oaxaca decomposition                    Number of obs   =     137972

          1: premie_cat = 0 (controls)
          2: premie_cat = 1 (premature infants)

                               (Std. Err. adjusted for 68994 clusters in pcn)
-------------------------------------------------------------------------------
              |               Robust
   lnallhccx2 |     exp(b)   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
Differential  |
 Prediction_1 |   348.9868   1.737476  1176.03   0.000      345.598    352.4089
 Prediction_2 |    956.743   75.23525    87.28   0.000     820.0862    1116.172
   Difference |   .3647655   .0287414   -12.80   0.000     .3125676    .4256803
--------------+----------------------------------------------------------------
Decomposition |
    Explained |   .5206098    .035001    -9.71   0.000     .4563368    .5939355
  Unexplained |   .7006504   .0489067    -5.10   0.000     .6110629    .8033722
-------------------------------------------------------------------------------

Using simple math, the results in panel 1 (Differential) show that predicted healthcare costs among controls are only about 36.5% of those among premature infants (348.9868 / 956.743 = .3648).
This led me to the following interpretation of the overall cost differential between premature and control infants: healthcare costs among premature infants are about 174% higher than (roughly 2.74 times) the costs among controls, as predicted by the group models. Is this interpretation correct?

The interpretation of the decomposition results (panel 2 above), that is, the explained and unexplained components, doesn't seem to be that straightforward. Any help in understanding these difference estimates would be much appreciated.

Thanks!
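For readers following the arithmetic, here is a minimal sketch that re-checks the figures quoted above. It uses Python purely as a calculator and is not part of the original analysis. The variable names are illustrative, and the numbers are copied by hand from the posted output. It relies on one assumption: that with -eform- the Explained and Unexplained entries are the exponentiated log-scale components, so they should multiply (rather than add) to the Difference entry. The posted numbers are consistent with this.

```python
import math

# Values copied from the posted -oaxaca ..., eform- output.
pred_controls = 348.9868   # Prediction_1, controls
pred_premies = 956.743     # Prediction_2, premature infants
difference = 0.3647655     # exp(b) for Difference
explained = 0.5206098      # exp(b) for Explained
unexplained = 0.7006504    # exp(b) for Unexplained

# The Difference row is the ratio of the two exponentiated predictions.
ratio = pred_controls / pred_premies

# Components add on the log scale, so they multiply after exponentiation.
product = explained * unexplained
log_sum = math.log(explained) + math.log(unexplained)

# The same gap viewed from the premature-infant side.
premie_vs_control = 1 / difference
pct_higher = (premie_vs_control - 1) * 100

# Share of the log-scale gap attributed to the explained component.
explained_share = math.log(explained) / math.log(difference)

print(f"controls / premature infants : {ratio:.4f}")
print(f"explained * unexplained      : {product:.4f}")
print(f"premature infants / controls : {premie_vs_control:.4f}")
print(f"percent higher for premies   : {pct_higher:.1f}")
print(f"explained share (log scale)  : {explained_share:.3f}")
```

If the multiplicative reading holds, each decomposition entry can be read as a ratio in the same way as the Difference row, while shares of the gap are only meaningful on the log scale.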