Eik Leong Swee
> Firstly, thanks so much for the reply. I'm not sure what is
> the difference between kernreg2 and
> locpoly.
I am not sure why you are not sure what the difference is,
as comparing the files and using the programs
should make this clear. For example, -kernreg2- is
a program for Stata 6, while -locpoly- is a program
for Stata 8, so the associated graphics are quite
different. As indicated earlier, I intended -kernreg2- as
a temporary fix to -kernreg-. That fix
was made in March 1999, but the authors of -kernreg-
have yet to get round to publishing a revised version
of their program, despite a variety of public and private
requests. For Stata 8 users, that is now immaterial,
as -locpoly- supersedes -kernreg-. For any Stata 6 and Stata 7
users, there remains an issue. I have been tempted to
withdraw -kernreg2-, but that would mean that -kernreg-
would remain in the public domain, although known to possess
bugs, yet without an alternative.
> My theoretical understanding of kernel estimation (y on x) is
> a locally weighted averaging (using
> a prespecified kernel function eg. normal or epanechnikov)
> method of fit where the bandwidth is
> simply a measure of applying weights to distant observations.
> The optimal bandwidth is chosen to
> minimise the mean integrated squared error or so-called cross
> validation (CV).
>
> Given the above, would you suggest I use kernreg2 or locpoly?
> Is the optimal bandwidth chosen in
> each case using CV?
I suggest neither. I think you should tell us what kind
of assumptions you are making about the error around
whatever smooth curve you are fitting or, more generally,
why you think a binary response is suitable for this
kind of application.
Neither program makes any use of cross-validation. If they
did, that would be clear in the documentation.
Cross-validation would require some extra programming
on somebody's part. But any kind of optimisation would seem
beside the point unless you can justify your application
as appropriate. Optimising a qualitatively incorrect model
would seem a somewhat bizarre exercise.
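For what it is worth, the extra programming need not be large. The following
is a minimal sketch of leave-one-out cross-validation for a bandwidth, written
in Python rather than Stata. It assumes a Gaussian kernel and a
Nadaraya-Watson (local constant) estimator. The function names are made up for
illustration and are not part of -kernreg2- or -locpoly-. It says nothing
about whether the model is appropriate for a binary response.

```python
import math

def gaussian_kernel(u):
    # Unnormalised Gaussian kernel; the constant cancels in the ratio below.
    return math.exp(-0.5 * u * u)

def nw_estimate(x0, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate at x0 with bandwidth h.

    If `skip` is an index, that observation is left out.
    Returns None when all weights underflow to zero.
    """
    num = 0.0
    den = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        w = gaussian_kernel((x0 - x) / h)
        num += w * y
        den += w
    return num / den if den > 0 else None

def loo_cv_score(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    total = 0.0
    for i in range(len(xs)):
        pred = nw_estimate(xs[i], xs, ys, h, skip=i)
        if pred is None:
            return float("inf")  # bandwidth too small to predict at all
        total += (ys[i] - pred) ** 2
    return total / len(xs)

def choose_bandwidth(xs, ys, candidates):
    """Pick the candidate bandwidth with the smallest leave-one-out score."""
    return min(candidates, key=lambda h: loo_cv_score(xs, ys, h))
```

Calling choose_bandwidth(x, y, [0.1, 0.2, 0.5, 1.0]) returns whichever of the
listed bandwidths gives the smallest leave-one-out error. The candidate grid
is the user's choice.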
This is not to say that some kind of kernel regression
might not provide a useful exploratory or heuristic
approach to smoothing your response as a function of
your predictor. In practice, it might work quite well.
But I am not clear that the idea of averaging across a
binary response is quite the best way to approach your problem. That's
a lack of clarity on my part, and open to correction
from people with stronger technical grasp of this
area.
> Another question is regarding graphing the kernel estimates
> and bootstrap confidence intervals. I
> have seen in some journals where kernel regressions (y on x)
> were used and bootstrap CI were
> plotted around the kernel estimates. I encountered 3 problems
> here. Firstly, I could not save
> kernreg graphs like I could with scatter plots. Secondly, I
> know how to calculate bootstrap CI but
> don't know how to plot them on a graph. Lastly, how do I plot
> both together on one graph?
Your problems here are not indicated precisely. Perhaps
you should start by stating which version of Stata you
are using. If you are using Stata 8, -kernreg*- is,
as stated, superseded. If you are using -kernreg2-,
you should indicate precisely what you did. If you are using
-kernreg-, that is against my strong advice, as indicated.
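In general terms, and independently of which program is used, a pointwise
bootstrap band can be built by resampling (x, y) pairs, re-estimating the
curve on a fixed grid each time, and taking percentiles at each grid point.
The following is a minimal sketch in Python rather than Stata, assuming a
Gaussian kernel, a Nadaraya-Watson estimator and a bandwidth h fixed in
advance. The function names are hypothetical. The band is pointwise, not
simultaneous.

```python
import math
import random

def nw_curve(grid, xs, ys, h):
    """Nadaraya-Watson estimates (Gaussian kernel) at each grid point."""
    out = []
    for g in grid:
        num = 0.0
        den = 0.0
        for x, y in zip(xs, ys):
            w = math.exp(-0.5 * ((g - x) / h) ** 2)
            num += w * y
            den += w
        out.append(num / den if den > 0 else float("nan"))
    return out

def bootstrap_band(xs, ys, grid, h, reps=200, level=0.95, seed=1):
    """Pointwise percentile band from resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(xs)
    curves = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        curves.append(nw_curve(grid, [xs[i] for i in idx],
                               [ys[i] for i in idx], h))
    alpha = (1.0 - level) / 2.0
    lower, upper = [], []
    for j in range(len(grid)):
        # Drop replicates where the estimate was undefined at this point.
        vals = sorted(c[j] for c in curves if not math.isnan(c[j]))
        if not vals:
            lower.append(float("nan"))
            upper.append(float("nan"))
            continue
        m = len(vals) - 1
        lower.append(vals[int(math.floor(alpha * m))])
        upper.append(vals[int(math.ceil((1.0 - alpha) * m))])
    return lower, upper
```

The original fit from nw_curve() and the two lists returned by
bootstrap_band() share the same grid, so all three can be drawn as lines
against the grid values on a single graph.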
Nick Cox
> > -kernreg2-, of which I am notionally first
> > author, was intended to be a temporary fix
> > of -kernreg-, written by other people.
> >
> > It didn't turn out that way, but no matter:
> > -locpoly- is now the recommended command,
> > in my view. In short, -kernreg2- is history,
> > except that it remains in the archives out
> > of inertia and for people still on earlier
> > versions of Stata.
> >
> > However, both of them stop a long way short
> > of offering this kind of functionality.
> >
> > Having said that, my own personal view is
> > that kernel regression is not obviously
> > the best thing for summarising how a
> > binary response varies with a predictor.
> > I can't offer more positive advice because
> > I am unclear on how far your problem is
> > tractable at all.
Eik Leong Swee
> > > I am trying to do a kernel density estimation of a y (a 0-1
> > > variable) on x1. This generates Graph1. I also did an estimation
> > > on y on x2 and generated graph2. I used kernreg2 for both these
> > > estimations.
> > >
> > > Now, I would also like to bootstrap confidence intervals around
> > > the graph and subsequently test the two distributions from graph 1
> > > and 2 (to see if they are statistically different in the relevant
> > > range).
> > > Unfortunately, kernreg2 does not give the non-parametric standard
> > > errors. I tried bootstrapping nevertheless, and this is the output
> > > that I get.
> > > Bootstrap statistics
> > >
> > > Variable | Reps  Observed      Bias  Std. Err.  [95% Conf. Interval]
> > > ---------+--------------------------------------------------------------
> > > klnpce   |  100  10.69125  .5342394   .9190264  8.867703   12.5148  (N)
> > >          |                                      9.449879   13.2954  (P)
> > >          |                                      9.095177  11.76517  (BC)
> > > ------------------------------------------------------------------------
> > > N = normal, P = percentile, BC = bias-corrected
> > >
> > >
> > > First I would like to draw confidence intervals for the entire
> > > function, and then bootstrap the confidence intervals and am not
> > > sure how to do it. I was wondering if anyone had faced this
> > > problem, and could help me out.