I am glad that you do understand that. But you did say that you were
doing kernel density estimation, so I was making clear that the two
procedures were different.
Nick
[email protected]
Benjamin Villena Roldan
Thanks for the information, Nick
By the way, I didn't mean that kernreg is a density estimation procedure.
Nadaraya-Watson regression, as implemented by kernreg, also needs a choice
of bandwidth. I just wanted to know what kind of default bandwidth kernreg
uses.
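To illustrate why Nadaraya-Watson regression needs a bandwidth at all, here is a minimal Python sketch (not Stata, and not the code of kernreg). The function name nadaraya_watson and the choice of a Gaussian kernel are assumptions made only for this example; the thread does not say which kernel or default bandwidth kernreg uses.

```python
import math

def nadaraya_watson(x0, xs, ys, h):
    """Kernel-weighted average of ys at the point x0 (hypothetical helper).

    Assumes a Gaussian kernel; h is the bandwidth and must be positive.
    """
    # Weight each observation by its kernel-scaled distance from x0.
    weights = [math.exp(-0.5 * ((x0 - xi) / h) ** 2) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: h is too small relative to the gaps in xs.
        raise ValueError("bandwidth too small for this evaluation point")
    return sum(w * yi for w, yi in zip(weights, ys)) / total

xs = [0.0, 1.0, 2.0]
ys = [1.0, 5.0, 3.0]
narrow = nadaraya_watson(1.0, xs, ys, 0.01)  # follows the nearest observation
wide = nadaraya_watson(1.0, xs, ys, 1e6)     # tends to the overall mean of ys
```

With a very small h the estimate reproduces the nearest observation, and with a very large h it flattens towards the mean of y, so some default or user-supplied bandwidth is unavoidable.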
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Monday, September 15, 2008 1:21 PM
To: [email protected]
Subject: RE: st: optimal bandwidth in Stata 8
I agree with Austin. Indeed I would underline his argument with further
points.
First, bandwidth means different things w.r.t. different kernels because
the parameter for each that tunes bandwidth does not have the same
effect, as the kernels are defined quite differently, a point that is at
least tacit in the official documentation.
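A small numerical sketch of this first point, in Python rather than Stata. It assumes the textbook forms of three kernels (Gaussian on the whole line, Epanechnikov and rectangular on [-1, 1]); these need not be the parameterizations used by any particular Stata command, and the helper kernel_sd is hypothetical. With the same nominal bandwidth of 1, the kernels have standard deviations of 1, 1/sqrt(5) (about 0.447) and 1/sqrt(3) (about 0.577), so equal bandwidths do not imply equal amounts of smoothing.

```python
import math

def kernel_sd(kernel, lo, hi, n=100_000):
    """Standard deviation of a symmetric kernel, by the midpoint rule."""
    step = (hi - lo) / n
    second_moment = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * step
        second_moment += u * u * kernel(u) * step
    return math.sqrt(second_moment)

def gaussian(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def epanechnikov(u):
    # Textbook form on [-1, 1]; an assumption, not any package's definition.
    return 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0

def rectangular(u):
    return 0.5 if abs(u) <= 1 else 0.0

sd_gauss = kernel_sd(gaussian, -10, 10)
sd_epan = kernel_sd(epanechnikov, -1, 1)
sd_rect = kernel_sd(rectangular, -1, 1)
```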
Second, -kernreg- and -kernreg1- are both buggy; -kernreg2- is, I believe,
better. Use -findit- for locations. But unless you are using some
ancient Stata there is no reason whatsoever to use any of them rather than
-lpoly- or some other similar method implemented as an official command.
I no longer have access to Stata 8 but I believe the same point applies
there too.
Third, -kernreg-, -kernreg1- and -kernreg2- are not density
estimation commands.
Nick
[email protected]
Austin Nichols
Benjamin Villena Roldan <[email protected]> :
The Methods and Formulas section of the manual entry [R] kdensity is
quite clear: the default bandwidth is .9*N^(-1/5)*min(sd(x),
IQR(x)/1.349) which is not globally optimal in any sense. The Methods
and Formulas section of the manual entry [R] lpoly shows the formula
for the "asymptotically optimal constant bandwidth" used there (also
not globally optimal). You mention kernreg and kernreg1 which are not
official Stata, so you would have to read their help files, read the
ado code, or contact their authors for more detail. The -locpoly-
command on SSC enjoys a special status; it is not official Stata but
is written by Stata staff (way before -lpoly- was introduced). If you
view "http://www.stata-journal.com/software/sj6-4/st0053_3/locpoly.hlp"
in Stata, you will find the hilarious throwaway line:
If width() is not specified, then the "default" width is used; see [R]
kdensity. This default is entirely inappropriate for local polynomial
smoothing. Roll your own.
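The default bandwidth formula quoted above from [R] kdensity, .9*N^(-1/5)*min(sd(x), IQR(x)/1.349), can be sketched in Python as follows. This is an illustration, not Stata's code: the function name default_kdensity_width is hypothetical, and Python's statistics.quantiles may use a different quartile rule from Stata's, so results can differ slightly in small samples.

```python
import statistics

def default_kdensity_width(x):
    """Rule-of-thumb width: 0.9 * min(sd, IQR / 1.349) * N^(-1/5)."""
    n = len(x)
    sd = statistics.stdev(x)  # sample standard deviation (n - 1 divisor)
    # Quartiles by Python's default "exclusive" rule; an assumption here.
    q1, _, q3 = statistics.quantiles(x, n=4)
    iqr = q3 - q1
    return 0.9 * min(sd, iqr / 1.349) * n ** (-1 / 5)

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
h = default_kdensity_width(sample)
```

The formula is a plug-in rule based only on the spread of x and the sample size; it involves no cross-validation, which is consistent with the remark that it is not globally optimal in any sense.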
On Mon, Sep 15, 2008 at 12:01 PM, Benjamin Villena Roldan
<[email protected]> wrote:
> I'm running some kernel density estimation in Stata 8.
> The help file asserts that the "optimal" width is used (bandwidth) to
> do the kernel estimation. I searched in the internet, but I haven't found
> a clear description of the procedure used. Does anyone know the precise
> meaning of "optimal" width? (Cross-validation perhaps?).
> Does anyone know if the same choice is used for kernreg and kernreg1?