Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Polynomial Fitting and RD Design
From
Alex Olssen <[email protected]>
To
[email protected]
Subject
Re: st: Polynomial Fitting and RD Design
Date
Sat, 10 Sep 2011 23:06:53 +1200
Works a charm.
Thanks Nick!
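
For anyone landing on this thread with the same error: the fix Nick describes in the quoted message below is to run the old-style command under version control. A minimal sketch on made-up data follows. The variables x and y are hypothetical stand-ins, and the twoway line is only a rough modern equivalent offered as an assumption, not something taken from the original do-files.

```stata
* Toy data; x and y are hypothetical stand-ins for the do-file's variables
clear
set obs 50
generate double x = _n/50
generate double y = x^2

* Old-style (Stata 7) graph syntax, run under version control
version 7: graph y x, c(l) s(i) xlabel(0,.5,1) xline(.5) sort

* Rough equivalent in the newer twoway syntax
twoway line y x, sort xlabel(0 .5 1) xline(.5)
```

The quoted do-file lines end in ";", so they presumably rely on "#delimit ;" having been set earlier in the file; pasted on their own without it, the multi-line command would likely fail for that reason as well.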
On 10 September 2011 22:57, Nick Cox <[email protected]> wrote:
> This is Stata 7 syntax. It will work in
> later versions with the prefix
>
> version 7:
>
> Nick
>
> On 10 Sep 2011, at 11:46, Alex Olssen <[email protected]> wrote:
>
>> Hi,
>>
>> I have a question. It's not quite on this topic, but it is related to
>> the replication of Lee, Moretti, and Butler.
>> In the do-files found in the links in the original post are the
>> following lines of code
>>
>> graph meanY100 fit1 fit2 int1U int1L int2U int2L dembin ,
>> l1(" ") l2("ADA Score, time t") b1(" ") t1(" ") t2(" ")
>> b2("Democrat Vote Share, time t-1") xlabel(0,.5,1) ylabel (0,.5,1)
>> title(" ") xline(.5)
>> c(.lll[-]l[-]l[-]l[-]) s(oiiii) sort saving(`1'_reduced.gph, replace);
>> translate `1'_reduced.gph `1'_reduced.eps, replace;
>>
>> I can't get this to work. I have never seen the graph command used
>> like this before - I always used graph twoway, etc.
>>
>> In any case, the error I get is:
>> "meanY100graph_g.new fit1 fit2 int1U int1L int2U int2L dembin ,: class
>> member function not found"
>>
>> Can anyone help me with this?
>>
>> Cheers,
>> Alex
>> On 6 September 2011 07:40, Patrick Button <[email protected]> wrote:
>>>
>>> Thank you for the feedback everyone. It has been extremely useful and now
>>> I am not freaking out as much.
>>>
>>> First, I've changed x to x - 0.5 as per Austin Nichols' suggestion. This
>>> makes interpretation easier. I should have done this earlier.
>>>
>>> I was thinking that my replication was going to involve a critique, Nick
>>> Cox, and I agree with you and others that the 4th order polynomials are
>>> somewhat fishy.
>>>
>>> The weird thing about the paper is that the authors say that they are
>>> using 4th degree polynomials on either side of the discontinuity, but
>>> their graphs and/or code indicate that they are just using one polynomial
>>> to fit the entire thing. Not sure why that is... So in trying to do the
>>> 4th degree polynomial for each side on my own, I've run into this issue
>>> of results being weird. Now that I understand why, it makes perfect sense.
>>>
>>> As for whether the 4th degree polynomial is ideal, I would agree with all
>>> of you that it probably is not. If one is going to go with polynomials,
>>> the ideal degree depends on the bandwidth you use. Ariel Linden described
>>> this really well earlier.
>>>
>>> Larger bandwidths mean more precision, but more bias. Smaller bandwidths
>>> (say only using data within +/- 2 percentage points of 50%) lead to the
>>> opposite. Lee and Lemieux (2010)
>>> (http://faculty.arts.ubc.ca/tlemieux/papers/RD_JEL.pdf) discuss that the
>>> optimal polynomial degree is a function of the bandwidth.
>>>
>>> The ideal degree can be chosen using the Akaike Information Criterion
>>> (AIC). I'm going to stick with the 4th degree polynomial (and the entire
>>> dataset), then I'll try other polynomials and bandwidths, and then kernel
>>> after that. I need to do the replication first, THEN I will critique that
>>> by going with something more realistic. The -rd- package should be really
>>> useful for that. Thanks so much for all the discussion about a more
>>> realistic model. The key thing is that results should be robust to
>>> several different types of fitting and bandwidths, so long as they are
>>> realistic in the first place.
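
To make the degree-versus-bandwidth comparison concrete, here is a minimal sketch of looping over polynomial degrees within a chosen window and reading the AIC from -estat ic-. The data are simulated, and the names y, x, D, xl*, xr* and the bandwidth h are all hypothetical; this is not the Lee, Moretti, and Butler code.

```stata
* Simulated data: running variable x centred at the cutoff, treatment D
clear
set seed 2011
set obs 2000
generate double x = runiform() - 0.5
generate byte D = x >= 0
generate double y = 0.3*D + x - 2*x^2 + rnormal(0, 0.5)

* Powers of x, separately for each side of the cutoff
forvalues k = 1/4 {
    generate double xl`k' = (1 - D) * x^`k'
    generate double xr`k' = D * x^`k'
}

* Hypothetical bandwidth: keep observations with |x| <= h
local h = 0.25

* Fit degrees 1 to 4 in turn and report the AIC for each
* (columns of r(S): N, ll(null), ll(model), df, AIC, BIC)
local rhs
forvalues p = 1/4 {
    local rhs `rhs' xl`p' xr`p'
    quietly regress y D `rhs' if abs(x) <= `h'
    quietly estat ic
    matrix S = r(S)
    display "degree `p': AIC = " %10.3f S[1,5]
}
```

Changing h and rerunning the loop gives a rough sense of how the preferred degree moves with the bandwidth.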
>>>
>>> As for using orthog/orthpoly to generate orthogonal polynomials, I gave
>>> that a shot. Thank you very much for the suggestion Martin Buis.
>>>
>>> I've done the orthogonalization two different ways. The two give
>>> different results, neither of which mirrors the results where I create
>>> the polynomials in the regular fashion. I'm not sure which method is
>>> "correct". I'm also unsure why the results are significantly different.
>>> Any suggestions would be very helpful.
>>>
>>> Orthpoly # 1 uses orthpoly separately on each side of the discontinuity.
>>> # 2 does it for all the data.
>>>
>>> The code and output are below:
>>>
>>> *****
>>>
>>> drop if demvoteshare==.
>>> keep if realincome~=.
>>> drop demvs2 demvs3 demvs4
>>>
>>> gen double x = demvoteshare - 0.5
>>>
>>> gen D = 1 if x >= 0
>>> replace D = 0 if x < 0
>>>
>>> *Orthpoly #1
>>>
>>> *Creating orthogonal polynomials separately for each side.
>>>
>>> orthpoly x if x < 0, deg(4) generate(demvsa demvs2a demvs3a demvs4a)
>>> orthpoly x if x >= 0, deg(4) generate(demvsb demvs2b demvs3b demvs4b)
>>> replace demvsa = 0 if demvsa==.
>>> replace demvsb = 0 if demvsb==.
>>> replace demvs2a = 0 if demvs2a==.
>>> replace demvs2b = 0 if demvs2b==.
>>> replace demvs3a = 0 if demvs3a==.
>>> replace demvs3b = 0 if demvs3b==.
>>> replace demvs4a = 0 if demvs4a==.
>>> replace demvs4b = 0 if demvs4b==.
>>>
>>> replace demvsa = (1-D)*demvsa
>>> replace demvs2a = (1-D)*demvs2a
>>> replace demvs3a = (1-D)*demvs3a
>>> replace demvs4a = (1-D)*demvs4a
>>>
>>> replace demvsb = D*demvsb
>>> replace demvs2b = D*demvs2b
>>> replace demvs3b = D*demvs3b
>>> replace demvs4b = D*demvs4b
>>>
>>> regress realincome D demvsa demvs2a demvs3a demvs4a ///
>>>     demvsb demvs2b demvs3b demvs4b
>>>
>>> *Orthpoly #2
>>>
>>> orthpoly x, deg(4) generate(demvs demvs2 demvs3 demvs4)
>>>
>>> replace demvsa = (1-D)*demvs
>>> replace demvs2a = (1-D)*demvs2
>>> replace demvs3a = (1-D)*demvs3
>>> replace demvs4a = (1-D)*demvs4
>>>
>>> replace demvsb = D*demvs
>>> replace demvs2b = D*demvs2
>>> replace demvs3b = D*demvs3
>>> replace demvs4b = D*demvs4
>>>
>>> regress realincome D demvsa demvs2a demvs3a demvs4a ///
>>>     demvsb demvs2b demvs3b demvs4b
>>>
>>> *****
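
One point that may help with the question above: orthogonal polynomials of degree 1 to 4, together with a constant, span the same space as x, x^2, x^3, x^4 and a constant. The fitted values should therefore coincide across parameterisations even though the individual coefficients do not, and the same argument applies when orthpoly is run once on the full sample and then interacted with D. In particular, the coefficient on D is the jump at the cutoff only when every polynomial term equals zero at x = 0, which holds for raw powers of the centred x but not for orthpoly's output. A minimal sketch on simulated data (all variable names hypothetical) that checks the fitted values agree:

```stata
* Simulated data with a true jump of 2 at x = 0
clear
set seed 12345
set obs 2000
generate double x = runiform() - 0.5
generate byte D = x >= 0
generate double y = 1 + 2*D + x - 3*x^2 + rnormal()

* Raw quartic on each side of the cutoff
forvalues k = 1/4 {
    generate double ra`k' = (1 - D) * x^`k'
    generate double rb`k' = D * x^`k'
}
regress y D ra1 ra2 ra3 ra4 rb1 rb2 rb3 rb4
predict double yhat_raw

* Orthogonal quartic built separately on each side, zero on the other side
orthpoly x if D == 0, degree(4) generate(oa1 oa2 oa3 oa4)
orthpoly x if D == 1, degree(4) generate(ob1 ob2 ob3 ob4)
foreach v of varlist oa1-oa4 ob1-ob4 {
    replace `v' = 0 if missing(`v')
}
regress y D oa1 oa2 oa3 oa4 ob1 ob2 ob3 ob4
predict double yhat_orth

* The two sets of fitted values should differ only by rounding error
generate double gap = abs(yhat_raw - yhat_orth)
summarize gap
```

In this sketch only the first regression's coefficient on D can be read directly as the estimated discontinuity; for the orthogonal version the jump would have to be recovered from the fitted values on either side of x = 0.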
>>>
>>> And the results are:
>>>
>>>
>>> Orthpoly # 1
>>>
>>>
>>> ------------------------------------------------------------------------------
>>>   realincome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>> -------------+----------------------------------------------------------------
>>>            D |  -2597.064   140.5829   -18.47   0.000    -2872.626   -2321.502
>>>       demvsa |  -853.4396   109.0927    -7.82   0.000    -1067.277   -639.6025
>>>      demvs2a |  -941.1276   109.0927    -8.63   0.000    -1154.965   -727.2905
>>>      demvs3a |   593.9881   109.0927     5.44   0.000      380.151    807.8252
>>>      demvs4a |   121.7433   109.0927     1.12   0.264    -92.09384    335.5804
>>>       demvsb |  -2006.552   88.66978   -22.63   0.000    -2180.357   -1832.747
>>>      demvs2b |  -620.1632   88.66978    -6.99   0.000    -793.9685   -446.3579
>>>      demvs3b |  -134.2237   88.66978    -1.51   0.130     -308.029    39.58156
>>>      demvs4b |   457.7355   88.66978     5.16   0.000     283.9302    631.5407
>>>        _cons |    32210.1   109.0927   295.25   0.000     31996.26    32423.93
>>> ------------------------------------------------------------------------------
>>>
>>>
>>> Orthpoly # 2
>>>
>>>
>>> ------------------------------------------------------------------------------
>>>   realincome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>> -------------+----------------------------------------------------------------
>>>            D |  -15904.18   22026.78    -0.72   0.470    -59079.79    27271.42
>>>       demvsa |   56141.35   33816.59     1.66   0.097    -10143.95    122426.6
>>>      demvs2a |   42328.68   25413.63     1.67   0.096    -7485.616    92142.98
>>>      demvs3a |   19367.81   11950.96     1.62   0.105    -4057.754    42793.37
>>>      demvs4a |   3038.492   2722.757     1.12   0.264    -2298.496    8375.481
>>>       demvsb |  -40636.36   7469.378    -5.44   0.000     -55277.4   -25995.32
>>>      demvs2b |   47190.86   9181.907     5.14   0.000     29193.03     65188.7
>>>      demvs3b |  -33596.74   6331.021    -5.31   0.000    -46006.43   -21187.04
>>>      demvs4b |   7983.823   1546.578     5.16   0.000      4952.31    11015.33
>>>        _cons |   68128.44   21623.63     3.15   0.002     25743.08    110513.8
>>> ------------------------------------------------------------------------------
>>>
>>> The results using the earlier method (generating polynomials normally)
>>> are as follows after I change x to x - 0.5:
>>>
>>>
>>> ------------------------------------------------------------------------------
>>>   realincome |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>> -------------+--------------------------------------------------
>>>
>>>
>>>
>>>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>