Re: st: impute with draws from random distribution
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: impute with draws from random distribution
Date: Wed, 22 Jun 2011 11:10:02 -0400
Joerg Luedicke <[email protected]>, D-Ta <[email protected]>,
Maarten, et al.--
I disagree that the proposal makes no sense at all. Suppose, for the
sake of argument, that the observables need to be controlled for only
through their effect on the duration of non-participation (time to
participation), or on the hazard of entering each program. In that
case, estimating a competing-risks hazard model, predicting durations
of non-participation (probably the median or other percentiles), and
then matching on predicted duration makes sense. What does not make
sense to me is matching observed durations to predicted durations,
though it is possible that such an approach is justifiable. I have not
read the cited papers, and I would appreciate complete references as
specified in the Statalist FAQ.
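
For reference, the purely mechanical step asked about in the original
post (giving each non-participant one random draw from each program's
observed times to participation) might be sketched as below. This is
only an illustration in Python, not Stata, using the seven example rows
from the post. The function name, the seed, and the choice of sampling
with replacement (a simple hot-deck draw) are my assumptions, and the
sketch says nothing about whether matching on such draws is justified.

```python
import random

# Example rows from the original post:
# (id, x1, x2, participant, program, time to participation).
# None marks values that are not observed for non-participants.
rows = [
    (1, 5, 23, 1, 1, 3.5),
    (2, 6, 42, 1, 2, 5.7),
    (3, 73, 7, 0, None, None),
    (4, 35, 2, 0, None, None),
    (5, 5, 6, 1, 1, 12.0),
    (6, 34, 34, 1, 1, 3.5),
    (7, 34, 34, 1, 2, 8.1),
]


def draw_hypothetical_times(rows, programs=(1, 2), seed=12345):
    """Return {id: {program: drawn_time}} for every non-participant.

    Each value is sampled with replacement from the observed times of
    the participants in that program (hypothetical helper, not a
    standard routine).
    """
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    # Donor pools: observed times, kept separately for each program.
    donors = {
        p: [t for (_, _, _, part, prog, t) in rows if part == 1 and prog == p]
        for p in programs
    }
    draws = {}
    for (pid, _, _, part, _, _) in rows:
        if part == 0:
            # One independent draw per program for this non-participant.
            draws[pid] = {p: rng.choice(donors[p]) for p in programs}
    return draws


hypothetical = draw_hypothetical_times(rows)
for pid, times in sorted(hypothetical.items()):
    print(pid, times)
```

Each non-participant ends up with two hypothetical values, one per
program, and none of the draws depends on x1 or x2, which is exactly
why the usefulness of this step for matching is in dispute.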
On Wed, Jun 22, 2011 at 10:30 AM, Joerg Luedicke
<[email protected]> wrote:
> On Wed, Jun 22, 2011 at 4:03 AM, D-Ta <[email protected]> wrote:
>> Dear All,
>>
>> I am looking for a convenient solution to the following problem. This is
>> the type of sample I am working with:
>>
>> id  x1  x2  participant  program  time to participation
>>  1   5  23            1        1                    3.5
>>  2   6  42            1        2                    5.7
>>  3  73   7            0        .                      .
>>  4  35   2            0        .                      .
>>  5   5   6            1        1                     12
>>  6  34  34            1        1                    3.5
>>  7  34  34            1        2                    8.1
>>
>>
>>
>> The sample consists of individuals (with covariates x1 and x2) who can
>> either be participating in program 1, program 2, or be non-participants.
>> The non-participants are my control group. One of the control variables
>> that I would like to condition on in a subsequent matching step is time to
>> participation. By definition, time to participation is not observed for the
>> non-participants. Hence, I would like to create hypothetical values of that
>> variable for the group of non-participants. It is standard in the literature
>> to randomly draw from the distribution of the participants.
>>
>> Since there are two groups of participants, there are also two different
>> distributions of start dates. I would like to assign two values of
>> time to participation to each non-participant (a hypothetical time to
>> participation in program 1 and a hypothetical time to participation in
>> program 2).
>>
>> Any suggestions on how to do this?
>>
>
> I agree with Maarten that this makes no sense at all. Let's say your
> aim is to create a balanced sample via the use of propensity scores.
> Now let's further assume that "time to participation" would be the
> only variable of concern. If you now impute some values and predict
> your propensity scores, any subsequent analysis would implicitly claim
> that treatment assignment is ignorable given the observed time
> that elapsed between time point x and the start of the program. But
> for the non-participants there was no program start in the first
> place, hence no elapsed time and thus nothing to balance.
>
> J.