See also
ssc inst locpr
help locpr
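
For the archive, here is a minimal sketch of the transform, smooth, back-transform approach discussed below, using the same auto-data example. The variable names lz, lz_smooth and z_smooth are only illustrative, and the sketch assumes every value lies strictly inside (0,1): logit() returns missing at exactly 0 or 1, so such values would need separate handling.

```stata
sysuse auto, clear
generate z = 1/headroom^16          // example response, strictly inside (0,1)

* Smooth on the logit scale, then map back to the original scale
generate lz = logit(z)              // missing if z is exactly 0 or 1
lowess lz mpg, generate(lz_smooth) nograph
generate z_smooth = invlogit(lz_smooth)

* invlogit() maps any real number into (0,1), so the smooth cannot go negative
summarize z_smooth
twoway (scatter z mpg) (line z_smooth mpg, sort)
```

The same pattern should work with an angular or folded root transformation in place of logit(), provided the matching inverse is applied after smoothing.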
On Fri, May 2, 2008 at 12:40 PM, Sergiy Radyakin <[email protected]> wrote:
> Hello Nick,
>
> I am sorry for being imprecise. Indeed, I smooth the rates (of e.g.
> unemployment) by groups defined by age (which is truncated to
> integers, and thus I consider it categorical).
>
> So I start with a table like the following:
>
> Age Unemployment rate
> 10 0.01
> 11 0.02
> ...
> 99 0.01
>
> Here unemployment rate is naturally between 0 and 1. It is the average
> of the 0/1-responses within the group, defined by age.
>
> If I just run lowess, it produces a picture similar to the one from:
> sysuse auto
> generate z=1/headroom^16
> lowess z mpg
> Note that the tails go below zero, and this is what I am trying to avoid.
>
> Your advice of logit transformation before/after smoothing worked.
>
> Thank you,
> Sergiy Radyakin
>
> On 5/2/08, Nick Cox <[email protected]> wrote:
> > Quite how to get useful results from smoothing a binary response is not
> > clear to me.
> >
> > If the data were proportions on (0,1) or even [0,1] I would suggest
> > some kind of transformation approach. -lowess, logit- is presumably
> > intended to help.
> > Otherwise consider something like an angular or folded root
> > transformation, applying -lowess- and then transforming back.
> >
> > But for binary data any transformation just maps two distinct values to
> > two other distinct values and so cannot help, so far as I can see.
> >
> > In the case of unemployment data, presumably you are dealing with
> > individuals? If they are aggregate data for lots of individuals I would
> > collapse by age to get proportion of unemployed, and then smooth if
> > necessary. It sounds as if you want something quite different, however.
> > Also, as you regard -age- as categorical I probably don't understand
> > what you are trying to do.
> >
> > Nick
> > [email protected]
> >
> > Sergiy Radyakin wrote:
> >
> > I am plotting a smoothed graph (-lowess-) of a binary variable (e.g.
> > unemployed) by a categorical variable (e.g. age). However, the smoothed
> > values are not necessarily in the [0;1] range, where unemployment must
> > lie by definition. I can save the smoothed values into a new variable
> > with the option -generate(newvar)- and then truncate the negatives and
> > values larger than one, but I believe the smooth would look different
> > if I could tell -lowess- to look for such a constrained value in the
> > first place. As follows from the description of -lowess-, it doesn't
> > have such a feature. Is there any user-written command or simple
> > algorithm for this purpose?