[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

st: RE: Re: Filling missing values. with averages

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: Re: Filling missing values. with averages
Date	Tue, 13 May 2003 09:24:22 +0100

Moon Joong
> >
> > I'm trying to find a smart way to fill missing values 
> with averages
> between
> > nearest values. The data looks like the following:
> >
> > id   year   rates
> > 1   1960  .35
> > 1   1961   missing
> > 1   1962  .45
> > 1   1963  .47
> > 1   1964   missing
> > 1   1965  .46
> > 2   1945  .2
> > 2   1946   missing
> > 2   1947  .35
> > 2   1948   missing
> > 3   1949   missing
> > 3   1950   .56
> > .....
> >
> > What I want to do is, take the example of id 1 of year 
> 1961 (which is
> > missing), to fill the average value between .35 and .45 (that is
> > (.35+.45)/2) into the missinge observation of year 1961. 
> Since I have a
> lot
> > of observations to fill in, using brute force by hands 
> takes forever and
> > would be inaccurate. I think it's very straightfoward, 
> which I don't know
> > since I'm a beginner.

In addition to other suggestions, you could use -ipolate-. 
However, you should only do this separately for each -id-. 
Nevertheless you can cycle over the -id-s with -foreach-. 

The following is a sketch. Also check out the -epolate- 
option. 
 
levels id, local(levels) 
gen rates2 = rates 

qui foreach l of local levels { 
	ipolate rates year if id == `l', gen(work) 
	replace rates2 = work if missing(rates2) & !missing(work) 
	drop work 
} 

This will fill gaps of more than one. 

That's linear interpolation. Other interpolation 
methods might also be tried. But linear is 
right for your examples. 

Nick 
[email protected] 

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

References:
- st: Re: Filling missing values. with averages
  - From: "Scott Merryman" <[email protected]>

Prev by Date: st: RE: RE: Tabulating extreme values (2)
Next by Date: Re: st: Tabulating extreme values
Previous by thread: st: Re: Filling missing values. with averages
Next by thread: Re: st: Filling missing values. with averages
Index(es):
- Date
- Thread