Ricardo Ovaldia wrote:
--- Joseph Coveney wrote:
> That the medians of the pooled data are identical wouldn't bother me so
> much as the difference between the asymptotic and permutation p-values
> with 256 dams.
>
> Take a look at what -xtreg percent mm, i(dam) fe- gives you (and also
> take a look at, say, -pnorm- on the residuals, for starters). I'm
> guessing that -xtreg, fe- (which would have been my first choice with
> this design) gives you the same take-home message as what -vanelteren-
> does.
Yes, I first used -xtreg- and obtained a p-value of 0.018. Although the
residuals did not look too bad using -pnorm-, they looked bad on the
-qnorm- plot. I was concerned that I was not meeting the normality
assumption, so I opted to use a nonparametric test. I also tried to find
a transformation, but the ones I tried did not perform any better.
--------------------------------------------------------------------------------
Good enough. I'm glad that things worked out with -vanelteren-, then.
I wouldn't necessarily write off -xtreg, fe- completely, though. Some work by
Lisa Sullivan and Ralph D'Agostino Sr.* indicates that the power of a t-test on
differences of paired ordered-categorical data is still pretty good, even with
small samples. The normality assumption isn't very well met in their case.
The do-file below suggests that the findings of Sullivan and D'Agostino can be
extended beyond the paired t-test. It compares -vanelteren- and -xtreg, fe-
in a simulation with 40 clusters of variable size, each split between two
comparison groups. (The intent is two to twelve observations per cluster; as
coded, 2 + floor(uniform() * 10) yields two to eleven.) The do-file creates a
skewed distribution of ordered categorical data with five categories.
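As a quick check on how skewed the five categories are, the marginal category
probabilities implied by the do-file's cutpoints can be worked out directly.
This is a minimal sketch, assuming a latent variable with mean zero and unit
variance (so that norm() of it is uniform on 0 to 1); the local macro name is
only illustrative.

```stata
* Cutpoints sit at 1 - m/30 on the probability scale, m = 16, 8, 4, 2,
* so P(category >= k) is 16/30, 8/30, 4/30 and 2/30 for k = 2, 3, 4, 5
local lowest = 1 / (2 + 4 + 8 + 16)
display "P(1) = " %5.3f (1 - 16 * `lowest')
display "P(2) = " %5.3f ((16 - 8) * `lowest')
display "P(3) = " %5.3f ((8 - 4) * `lowest')
display "P(4) = " %5.3f ((4 - 2) * `lowest')
display "P(5) = " %5.3f (2 * `lowest')
```

That is 14/30, 8/30, 4/30, 2/30 and 2/30, so under the null nearly half of the
observations fall in the lowest category.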
The performance of -xtreg, fe- isn't too shabby for hypothesis testing with two
levels of the grouping variable. Under the null hypothesis, -vanelteren-
rejected at the nominal 5% level in 55 of 1000 replicates versus 45 of 1000
for -xtreg, fe-; under the alternative, the rejection counts were 222 of 1000
versus 216 of 1000. I would expect the findings to generally hold up with ten
times as many replicates.
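For a sense of how much the rejection counts above could move from simulation
noise alone, the Monte Carlo standard error of an estimated rejection rate can
be computed from the binomial formula sqrt(p * (1 - p) / R). This is a minimal
sketch, taking p = 0.05 (the nominal rate) and R = 1000 or 10000 replicates.

```stata
* Monte Carlo standard error of a rejection rate near the nominal 5% level
display %6.4f sqrt(0.05 * 0.95 / 1000)
display %6.4f sqrt(0.05 * 0.95 / 10000)
```

With 1000 replicates the standard error is about 0.007, or roughly 7
rejections, so 55 and 45 out of 1000 are each within one standard error of the
nominal 50. Ten times as many replicates would shrink it to about 0.002.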
It might be worthwhile to see how well -xtreg, fe- holds up with smaller
cluster sizes (we know what T = 2 is from Sullivan and D'Agostino), fewer
clusters (smaller samples) and fewer ordered categories (down to four or even
three). None of this is to suggest -xtreg, fe- for *estimation* here.
Joseph Coveney
*L. M. Sullivan & R. B. D'Agostino Sr., Robustness and power of analysis of
covariance applied to ordinal scaled data as arising in randomized controlled
trials. _Statistics in Medicine_ 22(8):1317-34, 2003.
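clear
set more off
* Seed is the Stata date value of 2 August 2005
set seed `=date("2005-08-02", "ymd")'
* Build a 12 x 12 correlation matrix: 1 on the diagonal, 0.5 elsewhere
set obs 12
forvalues i = 1/12 {
    generate float a`i' = 0.5 + 0.5 * (_n == `i')
    local varlist `varlist' latent_variable`i'
}
mkmat a*, matrix(A)
* Latent means alternate by observation: 0 0 under the null,
* 0 1/8 under the alternative
local one_eighth = 1/8
forvalues i = 1/6 {
    local null_means `null_means' 0 0
    local alternative_means `alternative_means' 0 `one_eighth'
}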
*
capture program drop simem
program define simem, rclass
    syntax namelist, MEANS(numlist)
    * One row per stratum, twelve correlated latent normal values each
    drawnorm `namelist', means(`means') corr(A) n(40) clear
    generate byte stratum = _n
    * Cluster size: floor(uniform() * 10) is 0..9, so this is 2..11
    generate byte number_of_replicates = 2 + floor(uniform() * 10)
    reshape long latent_variable, i(stratum) j(observation)
    drop if observation > number_of_replicates
    * Cut the latent variable into five ordered categories with skewed
    * probabilities 14/30, 8/30, 4/30, 2/30, 2/30 (for a zero latent mean)
    generate byte manifest_variable = 1
    scalar lowest_cutpoint = 1 / (2 + 4 + 8 + 16)
    foreach multiple in 2 4 8 16 {
        quietly replace manifest_variable = manifest_variable + ///
            (norm(latent_variable) > (1 - `multiple' * ///
            scalar(lowest_cutpoint)))
    }
    * Odd-numbered observations form group 1, even-numbered group 0
    generate byte grouping_variable = mod(observation, 2)
    vanelteren manifest_variable, by(grouping_variable) ///
        strata(stratum)
    return scalar vanelteren = r(p)
    xtreg manifest_variable grouping_variable, i(stratum) fe
    * p-value of the model F test
    return scalar xtregfe = Ftail(e(df_b), e(df_r), e(F))
end
*
* Type I error: proportion of replicates rejecting under the null
simulate vanelteren = r(vanelteren) xtregfe = r(xtregfe), ///
    reps(1000) nodots: simem `varlist', means(`null_means')
generate byte positive_vanelteren = vanelteren < 0.05
generate byte positive_xtregfe = xtregfe < 0.05
summarize positive_*
* Power: proportion of replicates rejecting under the alternative
simulate vanelteren = r(vanelteren) xtregfe = r(xtregfe), ///
    reps(1000) nodots: simem `varlist', means(`alternative_means')
generate byte positive_vanelteren = vanelteren < 0.05
generate byte positive_xtregfe = xtregfe < 0.05
summarize positive_*
* Residual diagnostics for one dataset drawn under the alternative
simem `varlist', means(`alternative_means')
predict residuals, e
pause on
version 7: kdensity residuals, norm
pause
version 7: pnorm residuals
pause
version 7: qnorm residuals
exit