Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Propensity Score Matching Between 3 Groups
From
Alfonso Sánchez-Peñalver <[email protected]>
To
Stata List <[email protected]>
Subject
Re: st: Propensity Score Matching Between 3 Groups
Date
Fri, 28 Feb 2014 08:25:10 -0500
Austin,
As I've said, I'm not an expert on propensity matching, so I don't know what the procedure is there. I agree with you that a model fitted with data from A and B cannot be applied to subgroup C; that is why I suggested the nested logit model, which uses data from groups A, B, and C. For group C you first need to know the probability that a person would live in a treatment area, and then, given that he/she does, the probability that he/she is treated. A model using A and B would only estimate the probability that a person who lives in a treatment area is treated, so it's not applicable to group C, since they don't live in a treatment area. The second email I sent only states that it all depends on the sample of people to which she is going to apply the model to calculate the probabilities. If it includes people who don't live in a treatment area (or you don't know where they live), then the nested logit model is the appropriate one. If the probabilities are going to be used only on people who live in a treatment area, then a simple binary logit or probit using groups A and B would be appropriate. In no case have I said that a model using data on groups A and B only can be used on subgroup C.
Best,
Alfonso
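
The simpler of the two cases described above (everyone's area of residence is known) might be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: the variable names treated, x1, x2, and p_treat are hypothetical, and group is assumed to be coded 1 = A, 2 = B, 3 = C as in Austin's earlier example. It does not cover the nested logit case, and it computes a probability of treatment, which is not the same thing as achieving covariate balance between A and C.

```stata
* Sketch only: treated, x1, x2 and p_treat are hypothetical names.
* group is assumed coded 1 = A, 2 = B, 3 = C.

* Fit the binary logit on residents of treatment areas (groups A and B).
logit treated x1 x2 if inlist(group, 1, 2)

* Predicted probability of treatment, computed for groups A and B only.
predict double p_treat if inlist(group, 1, 2), pr

* Residents of non-treatment areas (group C) cannot be treated.
replace p_treat = 0 if group == 3

* Inspect the range of the predicted probabilities by group.
tabstat p_treat, by(group) statistics(n min mean max)
```

Because p_treat is identically zero for group C, it cannot serve as a matching score between A and C, which is the positivity concern raised earlier in the thread.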
On Feb 28, 2014, at 7:40 AM, Austin Nichols <[email protected]> wrote:
> Alfonso--
>
> I gave the code for what Isobel said she wanted, but I think she
> should be using data from A and C directly to balance on covariates,
> not using data from A and B to develop a model that she then applies
> to A and C. The point is to achieve balance on the covariates, not to
> compute the actual probability of treatment for cases in C.
>
> On Fri, Feb 28, 2014 at 7:32 AM, Alfonso Sanchez-Penalver
> <[email protected]> wrote:
>> Hi Isobel,
>>
>> I've given this some further thought. The nested logit approach would make sense if you're going to predict the probability of being treated for data where you don't know whether a person resides in a treatment area. The probability of being treated would then be the probability of living in a treatment area times the probability of being treated given that you live in one.
>>
>> If, for all the data to which you're going to apply the model, you know whether a person lives in a treatment area, you already know that the probability for those who don't is zero. In that case, you can just fit a binary logit or probit with groups A and B, use that model to calculate the probability of treatment for the subsample that lives in a treatment area, and assign a probability of zero to those who don't live in a treatment area.
>>
>> As for the questions related to propensity matching I can't help you there because that is not an area I know much about.
>>
>> Best,
>>
>> Alfonso Sanchez-Penalver, PhD
>>
>>> On Feb 28, 2014, at 4:01 AM, Isobel Williams <[email protected]> wrote:
>>>
>>> Dear All,
>>>
>>> Thank you very much for all of the advice! The nested logit approach seems most suitable for my data. So does this mean applying the nested logit model to all 3 groups, and then doing propensity score matching just between groups A and C? How would I do this in Stata?
>>>
>>> Also, when I try to merge 2 datasets in Stata (I am merging many:1), some of the variables get really messed up: several consecutive variables in the dataset which is being merged onto the master dataset take on values from the master dataset, which are in no way related. For example, the question "do you speak an indigenous language" from the individual dataset has a "yes/no" binary answer. When it is merged onto the household dataset, it has answers "carbon, electric", which probably apply to another variable in the household dataset.
>>>
>>> Could this be due to the value labels? How can I ensure that this doesn't happen?
>>>
>>> Many thanks,
>>> Isobel Williams
>>>
>>>> Subject: Re: st: Propensity Score Matching Between 3 Groups
>>>> From: [email protected]
>>>> Date: Thu, 27 Feb 2014 19:23:55 -0500
>>>> To: [email protected]
>>>>
>>>>
>>>> I believe that Adam has misinterpreted the "positivity assumption". It
>>>> is, according to Stuart (2009):
>>>>
>>>> "2) there is a positive probability of receiving each treatment for all
>>>> values of X: 0 < P(T = 1|X) < 1 for all X."
>>>>
>>>> In other words, the conditioning event is not residence in the
>>>> non-intervention area (a "treatment"), but on the predictors used for
>>>> creating the propensity score.
>>>>
>>>> Reference: Stuart, EA. 2009. Matching Methods for Causal Inference: A
>>>> review and a look forward. Statistical Science
>>>> Author manuscript available at:
>>>> http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2943670/pdf/nihms200640.pdf
>>>>
>>>> Steve
>>>> [email protected]
>>>>
>>>>
>>>> On Feb 26, 2014, at 5:15 PM, Adam Olszewski <[email protected]> wrote:
>>>>
>>>> It may be worth noting however, that this procedure violates the
>>>> principles of causal inference. If Group C resides in a
>>>> non-intervention area, then their probability of receiving "treatment"
>>>> is zero, and the positivity assumption required by propensity score
>>>> analysis is not met. Perhaps this is somehow irrelevant to the study
>>>> subject, but if causal inference assumptions are not met, then why not
>>>> just use regular regression?
>>>> AO
>>>>
>>>>> On Wed, Feb 26, 2014 at 4:54 PM, Austin Nichols <[email protected]> wrote:
>>>>> Isobel Williams <[email protected]>:
>>>>>
>>>>> The practical implementation of Fernando's first suggestion depends on
>>>>> your data, but if you have exogenous treatment predictors in the local
>>>>> `x' and a treatment dummy t, plus a variable group with value labels
>>>>> 1="A", 2="B", 3="C" then you can:
>>>>>
>>>>> logit t `x' if inlist(group,1,2)
>>>>> predict double p if inlist(group,1,3)
>>>>> psmatch2 t, p(p) out(y) `options'
>>>>>
>>>>> But I am unclear on why you would want to do this, as there is no
>>>>> guarantee that this type of matching will produce appropriate balance,
>>>>> even in expectation, much less in practice.
>>>>>
>>>>>> On Wed, Feb 26, 2014 at 2:58 PM, Fernando Rios Avila <[email protected]> wrote:
>>>>>> Hi Isobel,
>>>>>> So here is what I know about this.
>>>>>> If what you want to do is to indeed apply the propensity scores from
>>>>>> the A vs B group for the A vs C group, I would run the logit between A
>>>>>> and B, and then predict the propensity score for all three groups.
>>>>>> Once the propensity score is estimated, you can indicate within the
>>>>>> -psmatch2- the specific propensity score you want to use, instead of
>>>>>> having it estimate a separate logit model.
>>>>>> The other alternative, given that there is nothing that would indicate
>>>>>> that people in group B are equal to people in group C, is to estimate
>>>>>> the propensity score using a multinomial logit for the three groups,
>>>>>> and then proceed with your analysis for each pair of groups of interest.
>>>>>> (for example C vs B with B as treated group) (C vs A) and (B vs A)
>>>>>> Hope this helps.
>>>>>> Fernando
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 26, 2014 at 2:01 PM, Isobel Williams
>>>>>> <[email protected]> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am trying to estimate a propensity score and then find the
>>>>>>> average treatment effect on the treated. However, my sample has 3
>>>>>>> groups:
>>>>>>>
>>>>>>> Group A: Resides in intervention area, receives treatment
>>>>>>> Group B: Resides in intervention area, doesn't receive treatment
>>>>>>> Group C: Resides in non-intervention area, doesn't receive treatment
>>>>>>>
>>>>>>> This is effectively like having 1 intervention group and 2 control groups: I
>>>>>>> want to calculate a propensity score for treatment between Groups A and
>>>>>>> B. I then want to apply this propensity score to Groups A and C to find
>>>>>>> the average treatment effect on the treated.
>>>>>>>
>>>>>>> I have seen this done in several economic papers, but never explained thoroughly. Is it
>>>>>>> possible to do this in Stata, and if so, how?
>>>>>>>
>>>>>>> If it is relevant, I am using a logit model and -psmatch2- (the user-written SSC programme) to estimate the propensity score.
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Isobel Williams
>>>>> *
>>>>> * For searches and help try:
>>>>> * http://www.stata.com/help.cgi?search
>>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>> * http://www.ats.ucla.edu/stat/stata/