I don't have any special expertise here, just
a bunch of prejudices based on reading, thinking
and practice with various datasets. As Bill
Gould reminded the list yesterday, I am a geographer,
not a statistician.
I think the main problem with smoothing is that
it is too easy, and so leaves many hard questions
unanswered. Of course, in many moods I go for "easy"
not "difficult" every time, especially if my
colleagues or students report themselves happy
that a successful smooth is capturing a pattern that
they can tell a story about in a presentation or
paper.
I like the idea that when fitting the data we
should listen to what the data say, especially
when existing theory is weak and/or we can't take
formal models too seriously, except as simplifications
or conveniences. Who doesn't? So we should be
open to indications of curvature, non-monotonicity,
etc., etc. -- even to kinks and jumps, although
smooths won't always capture those well.
But any kind of smooth -- bivariate or multivariate --
can leave awkward issues unanswered:
1. Sure, we got this nice smooth with this method
and these data. But what about a different degree
of smoothing or a different smoothing method?
As an old joke tells us, "The great thing
about standards is that there are so many to choose
from." Same story with smoothing methods; a small
sketch follows this list.
2. How do I report my results other than as a graph?
How are others in my field expected to relate my
findings to theirs? For all the rigidity of many
parametric models, the apparatus of estimates,
standard errors, figures of merit, etc., etc.,
does give us a basis for comparisons, and comparisons
are much of the stuff of data analysis if you go
beyond what John Nelder called the cult of the
isolated study.
J. A. Nelder (1986)
Statistics, science and technology.
Journal of the Royal Statistical Society A
149: 109-121.
3. Anyone can buy into some homespun philosophy
such as "Nature is wobbly and wiggly, not
straight". But being open to wobbles and
wiggles in your data doesn't necessarily
get you any closer to interesting generalisations
about nature (or society, or whatever). You
can end up reporting lots of idiosyncratic behaviour,
however well supported (pun intended) it is in
a particular dataset.
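By way of a sketch of point 1 (the auto data and
official -lowess- here are just stand-ins for
concreteness; -mlowess- and -mrunning- raise the
same issue through their own options):

    sysuse auto, clear
    * same data, same method, two bandwidths
    lowess mpg weight, bwidth(0.8) gen(sm_wide) nograph
    lowess mpg weight, bwidth(0.3) gen(sm_narrow) nograph
    scatter mpg weight || line sm_wide sm_narrow weight, sort

The two curves can differ noticeably, especially near
the extremes of weight, and nothing in the data by
itself tells you which bandwidth deserves to be believed.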
For those and other reasons, when I think about this
territory I am reminded of Nestroy's dark comment
that it is the nature of every advance to appear
greater than it really is.
Nick
[email protected]
[email protected]
> thanks indeed Nick for your responses to all these
> smoothing questions.
>
> i have noticed that you warn of the dangers of taking
> these Stata smoothers too seriously, or of using them
> for anything other than exploratory work. also, in the
> concurrent -fractileplot- thread you mention that
> S-plus and R offer a much richer modelling
> environment.
>
> for those of us without the expertise required to make
> these judgements, would you care to explain further
> why you stress that smoothers such as -mlowess- or
> -mrunning- should not be taken too seriously? which
> deficiencies concern you?
>
> consider a simple application where all the x
> variables are known a priori (so that no exploratory
> work is required to determine the appropriate x's),
> and where all the y's & the x's are measured without
> error: if theory does not offer any priors re the
> functional form (other than local smoothness) are
> there any other estimators that you would take
> seriously? or do you think that smoothing science has
> not yet advanced to that point?
>