Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Get fitted values after locpoly (follow-up)
From
Partho Sarkar <[email protected]>
To
[email protected]
Subject
Re: st: Get fitted values after locpoly (follow-up)
Date
Wed, 21 Sep 2011 20:55:04 +0530
Tania
I think I see where you are coming from, and so just a quick pointer:
You are probably thinking in terms of "kernel regression" (or local
polynomial regression) as usually understood in the machine learning
literature, in which the bandwidth is *optimally* selected (or
"tuned") from an available "training set" or "memory set" of (xi,yi)
points, and *this bandwidth, together with the training set data*, can
then be used to "predict" the y0 value at some previously unseen "query"
point x0 outside the training set. [In a sense, you could say that
the training set together with the bandwidth constitute the "model"].
But this is clearly not how locpoly is set up. The bandwidth is
fixed, either by default or by your choice. And I am not sure, having
only tried a canned example with the program once very briefly, if
there is any scope to meaningfully partition the data into training
and query sets, as I think you might have in mind. The user interface
certainly does not *explicitly* give the user such a choice. [But this
can be clarified by those more familiar with this command.] There may
possibly be a roundabout way to get an approximation to what I
think you have in mind. But if I wanted to do the kind of kernel
regression I mention above, I would (without knowing what other Stata
programs may be available for this) go to R's CRAN archives. I worked
on this a few years ago, so let me know and I could try to dig up
some of the sources, or just search CRAN.
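
To make the distinction concrete, here is a minimal sketch of the "machine learning" style of local polynomial regression described above, written in Python with NumPy rather than Stata. It is not how locpoly works internally. The function names, the Gaussian kernel, the leave-one-out criterion, the candidate bandwidths and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def local_poly_predict(x_train, y_train, x_query, bandwidth, degree=1):
    """Local polynomial fit (Gaussian kernel) evaluated at each query point."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    preds = np.empty(x_query.shape[0])
    for j, x0 in enumerate(x_query):
        d = x_train - x0                                # distances to the query point
        sw = np.sqrt(np.exp(-0.5 * (d / bandwidth) ** 2))  # sqrt of kernel weights
        X = np.vander(d, degree + 1, increasing=True)   # columns 1, d, d^2, ...
        # Weighted least squares; the intercept is the fitted value at x0.
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y_train * sw, rcond=None)
        preds[j] = beta[0]
    return preds

def loo_cv_bandwidth(x, y, candidates, degree=1):
    """Pick the candidate bandwidth with the smallest leave-one-out squared error."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    idx = np.arange(len(x))
    best_h, best_err = None, np.inf
    for h in candidates:
        err = 0.0
        for i in idx:
            keep = idx != i                             # drop observation i
            pred = local_poly_predict(x[keep], y[keep], x[i], h, degree)[0]
            err += (y[i] - pred) ** 2
        err /= len(x)
        if err < best_err:
            best_h, best_err = h, err
    return best_h

# Simulated "training set" (assumed data, for illustration only).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 10.0, 80))
y = np.sin(x) + rng.normal(0.0, 0.2, 80)

# The tuned bandwidth together with (x, y) acts as the "model".
h = loo_cv_bandwidth(x, y, candidates=[0.3, 0.6, 1.0, 2.0])
y0 = local_poly_predict(x, y, [4.2], h)                 # prediction at a query point

# Roundabout approximation: smooth on a 50-point grid, then interpolate
# linearly back to every observation.
grid = np.linspace(x.min(), x.max(), 50)
fitted_all = np.interp(x, grid, local_poly_predict(x, y, grid, h))
```

The last two lines are one possible reading of the "roundabout way": the smoother is evaluated on a coarse grid only, and values for all observations are obtained by linear interpolation between grid points. Whether this is an acceptable approximation depends on how quickly the fitted curve changes between neighbouring grid points.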
Hope this helps
Partho
On Wed, Sep 21, 2011 at 4:28 PM, Tania Treibich
<[email protected]> wrote:
> Dear Stata List users
>
> I could get fitted values for my kernel regression using the at()
> option of locpoly instead of the n() option:
>
> locpoly inv_rate l_kap, at(l_kap) generate(yfitted) degree(3) width(1.5) noscatter
>
> This indeed computes the smoothing and creates the fitted value
> yfitted for all the values of l_kap. However, it gives too much
> weight to outliers.
>
> Instead, I would like the kernel regression to be computed only on a
> limited number of points (as in the option n(50) ) BUT get the fitted
> (approximated) value for ALL my observations.
>
> Thanks again for your help!!
>
> Tania
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>