The bigger issue is that if you have lots of ties,
it is most unlikely that your problem really is
suitable for -ipolate-, or indeed any other
interpolation method. In fact, users applying
-ipolate- to data with many ties are probably
confused about what interpolation is for.

In the case of -ipolate-, the averaging is documented
in the manual at [D] ipolate. This therefore
is one of many cases in which reliance on
the on-line help would lead you to miss
a notable detail of the process. Nevertheless,
it would do no harm for a line to be added to
the help file explaining this detail.

It is open to anyone to clone -ipolate- and
modify the clone. The conservative behaviour
in which ties are not averaged can be achieved
without any programming whatsoever:

    ipolate yvar xvar, gen(yvar2)
    replace yvar2 = yvar if !mi(yvar)
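
As a rough sketch of how those two lines play out, here they are
applied to the auto data used in the example quoted below. The
random -drop- from that example is left out, and the variable name
rep78i is simply borrowed from it. The final -assert- is only a
check that observed values were restored; it is not part of the fix.

```stata
* Sketch only: uses the auto data from the example quoted below.
sysuse auto, clear

* Linear interpolation; rep78 values at tied mpg values are averaged.
ipolate rep78 mpg, gen(rep78i)

* Restore the observed values wherever rep78 is not missing.
replace rep78i = rep78 if !mi(rep78)

* Check: the new variable now agrees with every nonmissing rep78.
assert rep78i == rep78 if !mi(rep78)
```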

-ipolate- is an official Stata command and
outwith my control. However, I looked at my
-cipolate- command for cubic interpolation
on SSC, which behaves in exactly the same
way in averaging ties. That is no surprise
as I stole the -ipolate- code and made the minimum changes
necessary. I am not inclined to change the code
but I will add a word of explanation to the help
file.

Nick

Austin Nichols wrote:
> While I can see Nick's point about interpolation in general, I agree
> with Ben Jann that -ipolate- should not "replace" nonmissing values of
> the y variable when it interpolates using multiple y values per value
> of x, since according to its own help file it does not. The missing
> value of rep78 for mpg==19 should be filled in, but not the four
> nonmissing values of rep78 for mpg==15 or 17.
>
> I would add the following to line 75 of ipolate.ado
> (currently a blank line):
>
>     qui replace `z'=`usery' if `usery'<.
>
> and alter the first two lines to read
>
>     *! version 1.0 6sep2005 based on ipolate, version 1.3.3 21sep2004
>     program define nipolate, byable(onecall) sort
>
> then save the revised program as nipolate.ado, which produces
>
>     rep78   mpg   rep78i
>         .    14        .
>         4    15        4
>         3    15        3
>         3    16        3
>         5    17        5
>         2    17        2
>         3    18        3
>         3    19        3
>         .    19        3
>         3    19        3
>         3    21        3
>         .    22      3.5
>         4    23        4
>         4    25        4
>         4    25        4
>         .    26      4.1
>         .    26      4.1
>         5    35        5
>
> in Ben's example.
>
> Alternatively, you could add an option to line 5, making it e.g.
>
>     */ [ BY(varlist) Epolate noavg]
>
> and make line 75:
>
>     if "`avg'"!="" qui replace `z'=`usery' if `usery'<.
>
> so nipolate would behave as ipolate does unless you specify noavg.
>
>
> On 9/5/05, Nick Cox <[email protected]> wrote:
> > -ipolate- interpolates linearly within gaps. That
> > is, it is assumed that the y variable varies linearly with
> > the x variable within any gaps. This is best seen
> > geometrically, as the last (x,y) pair before any gap
> > and the first (x,y) pair after any gap are just joined
> > by a straight line and intermediate results read
> > off directly.
> >
> > In addition, a consequence of the assumption that
> > y is piecewise linear in x is
> > that repeated y's at any x are just averaged.
> >
> > Plotting your results will make it easier to
> > see what is going on.
> >
> > The help file is somewhat elliptical here.
> >
> > Nick
> >
> > > Ben Jann wrote:
> > >
> > > Stata's -ipolate- command produces results I don't
> > > understand. Here is an example:
> > >
> > >     . set seed 2346
> > >     . sysuse auto
> > >     . drop if rep78<. & uniform()<.8
> > >     . ipolate rep78 mpg, g(rep78i)
> > >     . sort mpg
> > >     . list rep78 mpg rep78i, clean
> > >
> > >         rep78   mpg   rep78i
> > >    1.       .    14        .
> > >    2.       4    15      3.5
> > >    3.       3    15      3.5
> > >    4.       3    16        3
> > >    5.       2    17      3.5
> > >    6.       5    17      3.5
> > >    7.       3    18        3
> > >    8.       3    19        3
> > >    9.       3    19        3
> > >   10.       .    19        3
> > >   11.       3    21        3
> > >   12.       .    22      3.5
> > >   13.       4    23        4
> > >   14.       4    25        4
> > >   15.       4    25        4
> > >   16.       .    26      4.1
> > >   17.       .    26      4.1
> > >   18.       5    35        5
> > >
> > > -help ipolate- states that rep78i should equal rep78
> > > if rep78 is not missing ("ipolate creates newvar = yvar,
> > > where yvar is not missing"). This is certainly not the case
> > > in the above example. For some reason, "3.5" is stored
> > > for cases 2, 3, 5, and 6. Can someone explain this to me?