Mirko--
Bill Gould and Scott Merryman already made good suggestions (though I
think Scott's code does not work when age is repeated within
region/educ cells, e.g. after an -expand- command on the toy data, and
could tax system memory if age varies widely across cells and there is
a lot of data).
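
The repeated-age case can be checked directly. This is only a minimal sketch: it takes the region 2, educ 1 cell of the toy data given further down in this message, duplicates every observation with -expand-, and then runs the first two commands of Scott's suggestion (quoted at the end). With each age now occurring twice in the cell, -tsset- is expected to refuse with a repeated-time-values error.

```stata
* the region 2, educ 1 cell of the toy data from this message
clear
input id region educ age income
1 2 1 25 5
1 2 1 26 5
2 2 1 29 8
2 2 1 30 8
3 2 1 32 11
3 2 1 33 11
end
* duplicate every observation so each age occurs twice in the cell
expand 2
egen group = group(region educ)
* capture keeps a do-file running; noisily still shows any error message
cap noi tsset group age
di "tsset return code: " _rc
```

A nonzero return code here means the later -tsfill- and -tssmooth- steps would never be reached on such data.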
When I read the question, I thought age might not be measured as an
integer, i.e. as floor(elapsed_age), but as a float or double, as I
suspect it usually should be (though common practice dictates
integers), which led my thinking in another direction, shown below.
This solution is pretty fast and, I suspect, adaptable to a number of
related problems, but it is limited to problems where the "cohort" fuzzy
match variable (age, here) has few enough distinct values within each
cell that the list returned by -levelsof- fits in a local macro (though
this limit could be sidestepped in various ways). No doubt there is an
even better solution using Mata.
clear
input id region educ age income
1 2 1 25 5
1 2 1 26 5
2 2 1 29 8
2 2 1 30 8
3 2 1 32 11
3 2 1 33 11
4 1 1 40 5
4 1 1 41 5
5 1 2 37 8
5 1 2 38 8
6 1 2 42 9
6 1 2 43 9
end
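* yref will hold the mean over all ages within 5 years, within each cell
g double yref=.
cap prog drop mbyage
program mbyage, byable(recall)
    * exactly one variable to average; touse marks the current by-group
    syntax varname [if] [in]
    marksample touse
    * distinct ages in this by-group, collected in local a
    qui levelsof age if `touse', loc(a)
    foreach y of local a {
        * mean over observations whose age is within 5 years of this age
        su `varlist' if `touse' & age>=`y'-5 & age<=`y'+5, meanonly
        qui replace yref=r(mean) if `touse' & age==`y'
    }
end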
bys reg educ: mbyage inc
sort id age
li, noo sepby(id)
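
One way the local-macro limit mentioned above might be sidestepped is to loop over observations rather than over the distinct ages, so that nothing has to be stored in a local. The sketch below assumes the toy data and yref from the code above are still in memory. The name yref2 is made up for the illustration. The loop is quadratic in the number of observations, so it is likely to be much slower than the -levelsof- version on real data.

```stata
* yref2 is a hypothetical name, used only for this illustration
cap drop yref2
g double yref2=.
forvalues i = 1/`=_N' {
    * mean income over observations in the same region/educ cell
    * whose age is within 5 years of the age in observation i
    su income if region==region[`i'] & educ==educ[`i'] & abs(age-age[`i'])<=5, meanonly
    qui replace yref2=r(mean) in `i'
}
* on the toy data the two versions should agree
assert reldif(yref, yref2) < 1e-8
li id age income yref yref2, noo sepby(id)
```

As a hand check, for id 1 (ages 25 and 26 in region 2, educ 1) the window covers ages 25, 26, 29 and 30, so both versions should give (5+5+8+8)/4 = 6.5. The two can differ when income is missing: -marksample- leaves yref missing for those observations, while the loop still fills yref2 from their neighbours.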
On 5/15/07, Scott Merryman <[email protected]> wrote:
> An alternative:
> egen group = group(region educ)
> tsset group age
> tsfill, full
> tssmooth ma ref= income, w(5 1 5)
> keep if id !=.
>
> > -----Original Message-----
> > From: Mirko
> > William,
> > thanks a lot for all the work you have done on this.