Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Regression Discontinuity Design


From   Nyasha Tirivayi <[email protected]>
To   [email protected]
Subject   Re: st: Regression Discontinuity Design
Date   Fri, 7 Oct 2011 20:54:06 +0200

Hi Austin

Thanks so much for the advice

Regards

Nyasha Tirivayi
Maastricht University
Netherlands

On Fri, Oct 7, 2011 at 7:42 PM, Austin Nichols <[email protected]> wrote:
> Nyasha Tirivayi <[email protected]>
> I said in my first reply: "There are IV methods one might use,
> perhaps based on distance to clinic...." We would also need to know every
> bit of information in your data, and what other data might be matched onto it,
> to tell you what can be done.  Perhaps the best approach is to recruit
> a coauthor who can help you brainstorm another method.
>
> On Fri, Oct 7, 2011 at 11:28 AM, Nyasha Tirivayi <[email protected]> wrote:
>> Hi Austin
>>
>> I mean baseline employment rates (obtained retrospectively) are higher
>> for control individuals than for treated individuals. What we were
>> told by the program staff was that community selection was done based
>> on HIV rates of above 22%. But as you can see, one treated community
>> is below 22% and one control community is above 22%.
>>
>> If I cannot use RDD, what other methods can I use instead of
>> propensity score matching? My outcome is labour supply measured as
>> weekly hours, cross sectional data.
>>
>> May you kindly advise
>>
>> Nyasha Tirivayi
>> Maastricht University
>> Netherlands
>>
>>
>> On Fri, Oct 7, 2011 at 5:10 PM, Austin Nichols <[email protected]> wrote:
>>> Nyasha Tirivayi <[email protected]> :
>>> What do you mean, "baseline labour supply rates for the treated sample
>>> (68%) are lower than from the control group (57%)"
>>>
>>> fwiw, I see no evidence of a discontinuity:
>>>
>>> * community-level data: T = treated (1) or control (0), N = adults sampled
>>> clear
>>> input T Community N HIVrate
>>> 1   1    103     22.5
>>> 1   2    120     22.6
>>> 1   3    122     22.5
>>> 1   4    129     20.3
>>> 0   5    124     18.5
>>> 0   6    140     20.4
>>> 0   7    126     18.5
>>> 0   8    138     23.9
>>> end
>>> * plot treatment status against HIV rate, weighting markers by N
>>> scatter T HIVrate [aw=N]
>>>
>>>
>>> On Fri, Oct 7, 2011 at 10:15 AM, Nyasha Tirivayi <[email protected]> wrote:
>>>> Hi Austin
>>>>
>>>> Thank you so much for the response. I am trying to estimate the impact
>>>> of a social program on intrahousehold labour supply. Hence I have
>>>> labour supply data at individual level. In total I have 474
>>>> individuals from 200 treated households (residing in 4 treated
>>>> communities) and 532 individuals from 200 control households (residing
>>>> in 4 control communities).
>>>>
>>>> I had initially done propensity score matching. However baseline
>>>> labour supply rates for the treated sample (68%) are lower than from
>>>> the control group (57%). One comment I have received is that the
>>>> possibility of differential trends in labor market outcomes across
>>>> program and non-program communities implies that any observed
>>>> differences are not reliable measures of the effects of the food
>>>> program.   Hence journal reviewers are concerned about the possibility
>>>> of unobservables and suggested a regression discontinuity approach (if
>>>> possible) or within community estimates.
>>>>
>>>> Community   Households   Adult individuals   Community HIV rate (%)
>>>> Treated
>>>> 1           50           103                 22.5
>>>> 2           50           120                 22.6
>>>> 3           50           122                 22.5
>>>> 4           50           129                 20.3
>>>> Control
>>>> 1           50           124                 18.5
>>>> 2           50           140                 20.4
>>>> 3           50           126                 18.5
>>>> 4           50           138                 23.9
>>>>
>>>>
>>>>
>>>> On Fri, Oct 7, 2011 at 1:55 PM, Austin Nichols <[email protected]> wrote:
>>>>> Nyasha Tirivayi <[email protected]>
>>>>> You do not have a good RD design, partly because you do not appear to
>>>>> be confident of the existence of a discontinuity in treatment, but
>>>>> mainly because you do not have adequate sample size.  6 communities
>>>>> are hypothesized to lie on either side of the cutoff; if assumptions
>>>>> are correct, communities close to the cutoff can be treated as being
>>>>> randomly assigned treatment.  People in those communities can also be
>>>>> treated as being randomly assigned treatment under the stronger
>>>>> assumption that community is fixed and people do not change community.
>>>>>  But you do not have 400 observations on the assignment variable with
>>>>> which to construct a local linear regression of the effect of the
>>>>> assignment variable on treatment; you have 6. The problem here is that
>>>>> you will really want to cluster on community, but you cannot cluster
>>>>> when you have 6 clusters (and when you cluster in the first stage, you
>>>>> really only have 6 obs, not 400). Even 400 obs probably would not be
>>>>> enough to identify any reasonably small effect using an RD method,
>>>>> which needs a very large sample size to work well.  The first thing to
>>>>> do in such cases, if you are not sure how much power you might have,
>>>>> is to run a quick simulation. There are IV methods one might use,
>>>>> perhaps based on distance to clinic, but you are not really explicit
>>>>> about what your estimand is.  What are you trying to estimate?  What
>>>>> is the outcome variable?
>>>>>
>>>>> On Thu, Oct 6, 2011 at 6:39 PM, Nyasha Tirivayi <[email protected]> wrote:
>>>>>> Hello
>>>>>>
>>>>>> I have questions about implementing a regression discontinuity
>>>>>> approach. I have cross sectional data from 200 households on a social
>>>>>> program and 200 control households. The program was targeted at two
>>>>>> levels: geographically and at the household level.
>>>>>>
>>>>>> The geographic placement of the social program in communities appears
>>>>>> to have been done based on HIV prevalence rates of more than 20.5% for
>>>>>> 3 "treated" communities and less than 20.5% for 3 "control
>>>>>> communities". Two clinics do not follow this cutoff making it a fuzzy
>>>>>> discontinuity design at community level. After geographic placement,
>>>>>> households were then selected based on a means tested score. However
>>>>>> we do not have access to this data. We have data from 200 randomly
>>>>>> sampled households who are actually in the social program and residing
>>>>>> in the treated communities and from 200 control households with
>>>>>> similar household characteristics to the treated households but
>>>>>> residing in the control communities.
>>>>>>
>>>>>> My questions are as follows:
>>>>>> 1. Would it be valid to use the community level discontinuity for
>>>>>> impact evaluation? What software can I use in Stata?
>>>>>> 2. If so would an RD approach based on 8 communities be valid? Is the
>>>>>> sample of communities too small?
>>>>>> 3. If RD is not appropriate, what other methods besides propensity score
>>>>>> matching can I use, that can also take care of unobservables even with
>>>>>> cross sectional data?
>>>>>>
>>>>>> Kindly advise
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> N.Tirivayi
>>>>>> Maastricht University
>>>>>> Netherlands
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
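
On the suggestion quoted above to "run a quick simulation" before committing to a design, a minimal Stata sketch might look like the following. It is not from the thread. It reuses the eight community rows posted above, but the baseline of 30 weekly hours, the true effect of 2 hours, both standard deviations, the seed, and the 500 replications are all made-up assumptions to be replaced with plausible values. It also swaps in a simpler estimator than RD: a plain treated-versus-control regression with standard errors clustered on community, which is enough to see how little the 8 clusters can support.

```stata
* Sketch only: all numeric settings below except the community data are assumptions
clear all
set seed 20111007

* Community-level data as posted in this thread
input T Community N HIVrate
1 1 103 22.5
1 2 120 22.6
1 3 122 22.5
1 4 129 20.3
0 5 124 18.5
0 6 140 20.4
0 7 126 18.5
0 8 138 23.9
end
tempfile comm
save `comm'

local reps   500    // assumed number of replications
local effect 2      // assumed true effect on weekly hours (set to 0 to check size)
local sd_c   3      // assumed SD of the community-level shock
local sd_i   15     // assumed SD of the individual-level noise
local reject 0

forvalues r = 1/`reps' {
    use `comm', clear
    * one shock per community, drawn before expanding to individuals
    generate double u = rnormal(0, `sd_c')
    * one row per sampled adult in each community
    expand N
    generate double hours = 30 + `effect'*T + u + rnormal(0, `sd_i')
    quietly regress hours T, vce(cluster Community)
    * two-sided p-value using the cluster-based degrees of freedom
    local p = 2*ttail(e(df_r), abs(_b[T]/_se[T]))
    if `p' < 0.05 {
        local ++reject
    }
}

display "Share of replications rejecting H0 at 5%: " %5.3f `reject'/`reps'
```

Running it once with `effect` set to 0 and once with a nonzero value gives a rough sense of both the false-rejection rate and the power under these assumed settings.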


