Statalist



st: R: Can I repeatedly sample with constraints from an unbalanced data set to balance it?


From   "Carlo Lazzaro" <[email protected]>
To   <[email protected]>
Subject   st: R: Can I repeatedly sample with constraints from an unbalanced data set to balance it?
Date   Sat, 27 Oct 2007 18:35:08 +0200

Dear Paul,

If I have understood your research question correctly, you might find it
useful to run a permutation test (see - help permute -) on the two patient
samples you are comparing, as a sensitivity analysis of your base-case
effectiveness results.

As you are surely aware, the theory behind this test, which randomly
resamples without replacement, is well covered in:

Efron B, Tibshirani RJ. An Introduction to the Bootstrap. New York:
Chapman & Hall, 1993: 202-219 (particularly).
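
To make the suggestion above concrete, here is a minimal Python sketch of a
two-sample permutation test. It is not the Stata -permute- implementation:
the data are invented, the function names (risk_difference,
permutation_p_value) are hypothetical, and the risk difference is used as the
test statistic only because shuffling labels can never make it undefined. It
also ignores severity entirely.

```python
import random

def risk_difference(treat, admitted):
    """Admission risk in arm 1 minus admission risk in arm 0."""
    n1 = sum(treat)
    n0 = len(treat) - n1
    a1 = sum(y for t, y in zip(treat, admitted) if t == 1)
    a0 = sum(admitted) - a1
    return a1 / n1 - a0 / n0

def permutation_p_value(treat, admitted, reps=2000, seed=1):
    """Two-sided Monte Carlo permutation p-value for the risk difference."""
    rng = random.Random(seed)
    observed = risk_difference(treat, admitted)
    labels = list(treat)
    extreme = 0
    for _ in range(reps):
        # Reassign treatment labels without replacement; arm sizes stay fixed.
        rng.shuffle(labels)
        if abs(risk_difference(labels, admitted)) >= abs(observed) - 1e-12:
            extreme += 1
    # Count the observed arrangement as one of the permutations.
    return observed, (extreme + 1) / (reps + 1)

# Invented data: 40 patients per arm, 10 admissions in arm 1 and 20 in arm 0.
treat = [1] * 40 + [0] * 40
admitted = [1] * 10 + [0] * 30 + [1] * 20 + [0] * 20

observed, p = permutation_p_value(treat, admitted)
print(observed, p)
```

With these invented counts the observed risk difference is -0.25; the
p-value varies with the seed and the number of replications.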

Sorry I cannot be more helpful.

Kind Regards,

Carlo
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Paul Walsh
Sent: Saturday, 27 October 2007 17:47
To: [email protected]
Subject: st: Can I repeatedly sample with constraints from an unbalanced
data set to balance it?

I have a 700 subject data set of a clinical trial comparing two
treatments for outcome (hospital admission from an emergency room) for
a particular disease. I am also using a three point ordinal scale of
disease severity that strongly predicts hospital admission regardless
of treatment. Though the trial was designed to balance the two
treatment arms, calculating disease severity is too cumbersome to have
included it to balance each treatment arm with equal numbers of each
severity category in the emergency room. Thus there is an unequal
distribution of severity of cases in the two arms. When I calculate
the unadjusted risk ratio of admission for two treatments I obtain a
low, non-significant crude RR, similar to already published studies
that did not account for severity. When I model the treatments and
include the severity score, the adjusted RR increases and is
significant, demonstrating superiority of one treatment over the
other.
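
The pattern described above (a crude RR near 1 that grows once severity is
taken into account) can be reproduced with a small worked example. The counts
below are invented for illustration and are not the trial data, and the
Mantel-Haenszel stratified estimator stands in for the regression model
mentioned in the post, which is not specified. The function names are
hypothetical.

```python
# Invented counts per severity level (1 = mild, 2 = moderate, 3 = severe):
# (admissions arm 1, patients arm 1, admissions arm 0, patients arm 0)
strata = {
    1: (40, 200, 5, 50),
    2: (40, 100, 20, 100),
    3: (40, 50, 80, 200),
}

def crude_rr(strata):
    """Risk ratio (arm 1 vs arm 0) after pooling all severity levels."""
    a1 = sum(s[0] for s in strata.values())
    n1 = sum(s[1] for s in strata.values())
    a0 = sum(s[2] for s in strata.values())
    n0 = sum(s[3] for s in strata.values())
    return (a1 / n1) / (a0 / n0)

def mantel_haenszel_rr(strata):
    """Mantel-Haenszel risk ratio, combining the within-stratum comparisons."""
    num = 0.0
    den = 0.0
    for a1, n1, a0, n0 in strata.values():
        total = n1 + n0
        num += a1 * n0 / total
        den += a0 * n1 / total
    return num / den

crude = crude_rr(strata)
adjusted = mantel_haenszel_rr(strata)
print(round(crude, 3), round(adjusted, 3))
```

In this invented example arm 1 has twice the admission risk of arm 0 within
every severity level, but arm 1 holds mostly mild cases and arm 0 mostly
severe ones, so pooling the levels pulls the crude RR toward 1.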

The manuscript reviewers feel that the study should have balanced the
severity scores in both treatment arms instead of including severity
as a variable. I'd like to run jackknife or bootstrap estimations of
unadjusted RR by constraining each jackknife/bootstrap to select equal
numbers of patients receiving each treatment with each severity score.
The goal is to repeatedly select samples from the data set that
produce equal numbers of patients in each of the six groups (two
treatments, three severity classifications). Can someone comment on
the feasibility of doing this in the bootstrap/jackknife context?
Since this is not random sampling from the data set, how would this
procedure affect the interpretation of bootstrapped/jackknifed results?
If feasible and interpretable, can someone suggest some code that
would do this or suggest another way of achieving the same goals?
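
As a rough illustration of the mechanics only, the constrained resampling
asked about above might look like the following Python sketch. Everything in
it is an assumption: the data are invented, the function name
balanced_resample_rr and the per_cell parameter are hypothetical, and drawing
with replacement within each of the six treatment-by-severity cells is just
one possible reading of the proposal. It does not settle how the resulting
interval should be interpreted.

```python
import random
from collections import defaultdict

def balanced_resample_rr(records, per_cell, reps=1000, seed=1):
    """Draw per_cell patients with replacement from every (treatment,
    severity) cell, compute the crude RR of each balanced replicate, and
    return the median with a 95% percentile interval."""
    rng = random.Random(seed)
    cells = defaultdict(list)
    for treat, severity, admitted in records:
        cells[(treat, severity)].append(admitted)
    ratios = []
    for _ in range(reps):
        events = {0: 0, 1: 0}
        sizes = {0: 0, 1: 0}
        for (treat, _severity), outcomes in cells.items():
            draw = rng.choices(outcomes, k=per_cell)
            events[treat] += sum(draw)
            sizes[treat] += per_cell
        # Skip replicates where the RR is undefined (no admissions in arm 0).
        if events[0] > 0:
            ratios.append((events[1] / sizes[1]) / (events[0] / sizes[0]))
    ratios.sort()
    n = len(ratios)
    return ratios[n // 2], ratios[int(0.025 * n)], ratios[int(0.975 * n) - 1]

# Invented data: (treatment, severity) -> (admissions, patients), 700 in total.
counts = {
    (1, 1): (40, 200), (1, 2): (40, 100), (1, 3): (40, 50),
    (0, 1): (5, 50), (0, 2): (20, 100), (0, 3): (80, 200),
}
records = []
for (treat, severity), (admissions, n) in counts.items():
    records += [(treat, severity, 1)] * admissions
    records += [(treat, severity, 0)] * (n - admissions)

median, lo, hi = balanced_resample_rr(records, per_cell=50)
print(median, lo, hi)
```

Each replicate here has the same severity mix in both arms by construction,
so its crude RR is in effect standardized to an equal-weight severity
distribution rather than to the distribution actually seen in the trial.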

Paul Walsh

Bakersfield CA
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
