I have data on the "cost" (actually transformed hours) of various types of
caretaking for Alzheimer's patients. I'm interested in a regression model to
test treatment effects in a multisite study. As is usual for cost data, it
is positively skewed. So, I contemplated a log transform, either through a
direct transformation of the response, or through a log link in a glm, gee,
or something similar. I actually am using "xt" commands to allow for
nonindependence among caretakers treated at the same site.
The problem is that the modal cost is $0, so that the distribution is
bimodal. This, of course, remains true if I do a log transform. Any ideas on
how to analyze such data would be appreciated.

Log-transformed data can often be understood in terms of geometric means
and their ratios. If in Stata you type