Peter Lachenbruch wrote:
Perhaps a na�ve approach would be to bootstrap the cv with a few thousand
reps. An obvious problem arises if the mean is near 0. I suspect one
should restrict this to positive random variables, but I have no theory to
go on.
--------------------------------------------------------------------------------
Another naive approach would be using maximum likelihood estimation of mu
and sigma. This would be followed by -nlcom- to test the ratios (or, for
that matter, their inverses should they be deemed more suitable). An
example is in the do-file below, which uses -mynormal_d2- from W. Gould,
J. Pitblado & W. Sribney, _Maximum Likelihood Estimation with Stata. Third
Edition._ (College Station, Texas: Stata Press, 2006).
I'm not sure how well this approach compares to bootstrapping. The delta
method, upon which -nlcom- depends, as well as maximum likelihood
estimation, itself, might require larger samples than what is customarily
contemplated when coefficients of variance are used.
Coeffient of variation is popular in pharmacokinetics and biopharmaceutics.
It's also de rigeur in analytical (chemistry) methods development in the
pharmaceutical industry, where it's known as the relative standard deviation
(RSD). Practitioners of these professions avoid the "obvious problem" by
defining a so-called limit of quantitation (LOQ) at which they quit before
things get too close to zero. (Actually, increasing to a threshold RSD at
progressively lower analyte levels is used in establishing the LOQ for an
assay.)
Joseph Coveney
clear *
set more off
*
// mynormal_d2.ado from Gould, Pitblado & Sribney (2006), p. 267--68
program define mynormal_d2
version 9.2
args todo b lnf g negH g1 g2
tempvar mu lnsigma sigma
mleval `mu' = `b', eq(1)
mleval `lnsigma' = `b', eq(2)
quietly {
generate double `sigma' = exp(`lnsigma')
mlsum `lnf' = ln(normalden($ML_y1, `mu', `sigma'))
if (`todo' == 0 | missing(`lnf')) exit
tempvar z
tempname dmu dlnsigma
generate double `z' = ($ML_y1 - `mu') / `sigma'
replace `g1' = `z' / `sigma'
replace `g2' = `z' * `z' - 1
mlvecsum `lnf' `dmu' = `g1', eq(1)
mlvecsum `lnf' `dlnsigma' = `g2', eq(2)
matrix define `g' = (`dmu', `dlnsigma')
if (`todo' == 1 | missing(`lnf')) exit
tempname d11 d12 d22
mlmatsum `lnf' `d11' = 1 / `sigma' / `sigma', eq(1)
mlmatsum `lnf' `d12' = 2 * `z' / `sigma', eq(1,2)
mlmatsum `lnf' `d22' = 2 * `z' * `z', eq(2)
matrix define `negH' = (`d11', `d12' \ `d12', `d22')
}
end
*
set seed `=date("2008-07-11", "YMD")'
set obs 100
generate byte group = _n > _N / 2
generate double response = (1 + group) * (100 + invnormal(uniform()))
tabulate group, generate(group_)
ml model d2 mynormal_d2 (mu: response = group_1 group_2, ///
noconstant) (lnsigma: group_1 group_2, noconstant)
ml maximize, nolog
estimates store A
nlcom (CV1: exp(_coef[lnsigma:group_1]) / _coef[mu:group_1]) ///
(CV2: exp(_coef[lnsigma:group_2]) / _coef[mu:group_2]), post
test _b[CV1] = _b[CV2] // <-- Here
tabstat response, stat(cv) by(group)
// Inverse coefficient of variation
estimates restore A
nlcom (InvCV1: _coef[mu:group_1] / exp(_coef[lnsigma:group_1])) ///
(InvCV2: _coef[mu:group_2] / exp(_coef[lnsigma:group_2])), post
test _b[InvCV1] = _b[InvCV2]
// Cf. RMSE from OLS regression
estimates restore A
display %5.3f exp(_coef[lnsigma:group_1])
display %5.3f exp(_coef[lnsigma:group_2])
regress response if !group
regress response if group
exit
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/