Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: lpoly and nonmissing fitted values where the dependent variable is missing

From	Alex Olssen <[email protected]>
To	[email protected]
Subject	Re: st: lpoly and nonmissing fitted values where the dependent variable is missing
Date	Tue, 10 Aug 2010 10:30:08 +1000

Dear Statalisters,

I have a follow-up question.

Does anybody know how -lpoly- decides how far to extend fitted values
beyond the range of values used for estimation?
I have a feeling this is related to the bandwidth, but it is not
clear how or why.
With the rectangle kernel and local linear regression, the following
code suggests that -lpoly- extends fitted values to bwidth - 5 units
outside the estimation values.
This seems arbitrary, though.  Is there a good reason?

sysuse auto, clear
sort length
lpoly price length if length<190, ker(rec) deg(1) bwidth(10) nogr ///
    gen(L10) at(length)
lpoly price length if length<190, ker(rec) deg(1) bwidth(20) nogr ///
    gen(L20) at(length)
lpoly price length if length<190, ker(rec) deg(1) bwidth(30) nogr ///
    gen(L30) at(length)
br L30 L20 L10 length
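
Rather than reading the extent off the browser, the same comparison can be
summarized numerically.  The following is only a sketch for a do-file: it
reuses the commands above in a loop and reports, for each bandwidth, the
largest length that received a fitted value.  The loop macro bw is a name
introduced here for illustration.

```stata
* Sketch: for each bandwidth, find the largest length with a fitted value.
sysuse auto, clear
foreach bw in 10 20 30 {
    * Same call as above; only the bandwidth and the new variable name change.
    lpoly price length if length<190, ker(rec) deg(1) bwidth(`bw') nogr ///
        gen(L`bw') at(length)
    * Largest grid point at which a smoothed value was produced.
    quietly summarize length if !missing(L`bw')
    display "bwidth = `bw': fitted values extend to length = " r(max)
}
* Largest length actually used in estimation, for comparison.
quietly summarize length if length<190
display "largest length in the estimation sample = " r(max)
```

Comparing each reported maximum with the largest estimation value shows how
far beyond the estimation sample the fitted values reach at each bandwidth.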

Kind regards,

Alex


On 10 August 2010 08:33, Alex Olssen <[email protected]> wrote:
> Thanks Austin and Yulia for your helpful responses.
>
> Sorry Austin, I was actually aware of your work and intended to
> mention it but forgot to when I sat down to write the email.  It is
> clear and very helpful.
>
> Kind regards,
>
> Alex
>
>
> On 10 August 2010 02:09, Yulia Marchenko, StataCorp LP
> <[email protected]> wrote:
>> Alex Olssen <[email protected]> asks why -lpoly- produces smoothed values
>> outside the range of <x>-values (the variable -length- below) as defined by an
>> -if- statement:
>>
>>> I am doing a regression discontinuity analysis and want to understand how
>>> -lpoly- is working.  I use the -lpoly- options -gen- and -at- to create
>>> fitted values for my local linear regression.  Due to the nature of
>>> regression discontinuity I look at two subgroups separately.  Fitted values
>>> are generated even for observations outside the subgroup.  I want to
>>> understand how it chooses where to fit them.
>>>
>>> For example,
>>>
>>> sysuse auto, clear
>>> lpoly price length if length<190, ker(rec) deg(1) bwidth(12) gen(L) at(length)
>>> sort length
>>> br L length
>>>
>>> Cars with lengths up to 212 in. have fitted values.  Does anyone know why?
>>>
>>> Note the if statement causes no problems.  If I gen lengthlt190=length if
>>> length<190 and then lpoly price lengthlt190 the results are identical.
>>
>> -lpoly- uses two notions of a sample: an estimation sample and a grid sample.
>> An estimation sample defines a set of observations to be used in local
>> weighted linear regression fits.  A grid sample defines a set of grid points
>> at which the smooth will be evaluated.  To link this to the documentation
>> (-[R] lpoly-, pp. 939-940), the estimation sample defines the set of x_i's
>> used to compute regression coefficients in formula (2) in the documentation
>> and the grid sample defines the set of grid points x_0.
>>
>> An -if- condition only affects the estimation sample and not the grid sample.
>> To restrict the range of grid points, Alex should create a new variable in the
>> desired range and use it in the -at()- option.  Continuing Alex's example, we
>> can use the -lengthlt190- variable in the -at()- option to restrict the range
>> of 'at' values to those less than 190:
>>
>>  . sysuse auto, clear
>>  . gen lengthlt190=length if length<190
>>  . lpoly price length if length<190,   ///
>>                        ker(rec) deg(1) bwidth(12) gen(L) at(lengthlt190)
>>
>>
>> -- Yulia
>> [email protected]
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>
>
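
To check that the -at()- approach quoted above does restrict the grid, one
can count the fitted values at lengths of 190 or more.  This is a minimal
sketch for a do-file, assuming the same variable names as in the quoted
example; the count should be zero if -lpoly- leaves the generated variable
missing wherever the -at()- variable is missing.

```stata
* Sketch: confirm that the restricted grid yields no fitted values
* at or beyond length 190.
sysuse auto, clear
gen lengthlt190 = length if length<190
lpoly price length if length<190, ker(rec) deg(1) bwidth(12) nogr ///
    gen(L) at(lengthlt190)
* Number of observations outside the subgroup that still have a fitted value.
count if !missing(L) & length>=190
```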

