I don't know what "did not work" means if you don't tell us.
In your code, you say -fiveyearaverage- but all calculations are based
on threes.
I guess that's just a typo.
In your code, you use a -while- loop when I would use -forvalues-, but
that's not a bug.
I presume that -year- and -yeara- are references to the same variable.
Otherwise, your code looks correct to me. However, I have not tested it.
Nick
[email protected]
Fabian Brenner
Thank you very much for your help, Nick. Besides "year" (1979-2006)
there is a variable "group" (from 1-38) in my dataset. I want to create
a three-year-average across groups and years. Variable "fiveyearaverage"
should be the average across years (the three years before) for the
group this observation belongs to.
My data look like the following:
"year"  "ROE"  "group"  "threeyearaverage" (for this group in the last three years)
1979    3      29       ?
1979    4      17       ?
1979    1      29       ?
1980    4      9        ?
...
I tried it like this but it did not work.
generate fiveyearaverage = .
local k = 1
while `k' <38 {
quietly forvalues y = 1979/2006 {
summarize ROE if yeara == `= `y' - 3' & group == `k', meanonly
local mean3 = return(mean)
summarize ROE if yeara == `= `y' - 2' & group == `k', meanonly
local mean2 = return(mean)
summarize ROE if yeara == `= `y' - 1' & group == `k', meanonly
replace fiveyearaverage = (`mean3' + `mean2' + r(mean)) / 3 if yeara == `y' & group == `k'
}
local k = `k' + 1
}
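
For reference, a minimal sketch of how the loop above might be repaired, not a tested solution. It assumes the year variable is named year (not yeara), that group is numbered 1 to 38, and that the result should be called threeyearaverage; adjust the names to your data. Two details differ from the code above: saved results from -summarize- are read with r(mean), not return(mean), and -forvalues k = 1/38- also covers group 38, which -while `k' < 38- skips.

```stata
* Sketch only: assumes variables year, group and ROE exist as described above.
generate threeyearaverage = .

quietly forvalues k = 1/38 {
    * Start in 1982 so that all three preceding years fall within 1979-2006
    forvalues y = 1982/2006 {
        summarize ROE if year == `y' - 3 & group == `k', meanonly
        local mean3 = r(mean)
        summarize ROE if year == `y' - 2 & group == `k', meanonly
        local mean2 = r(mean)
        summarize ROE if year == `y' - 1 & group == `k', meanonly
        * Average of the three yearly means for this group
        replace threeyearaverage = (`mean3' + `mean2' + r(mean)) / 3 if year == `y' & group == `k'
    }
}
```

If a group has no observations in one of the three preceding years, r(mean) is missing for that year, so threeyearaverage stays missing for that group and year.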
"Nick Cox" <[email protected]>
> I agree with Neil's suggestion of -egen, mean() by(year)- for yearly
> averages.
>
> For the average of the previous three years, there are several ways to
> do it. I don't see that Neil's solution takes into consideration that
> the periods should overlap.
>
> Here is one way to do it:
>
> gen threeyearaverage = .
>
> qui forval y = 1982/2006 {
> local y1 = `y' - 3
> local y2 = `y' - 1
> su ROE if inrange(year, `y1', `y2'), meanonly
> replace threeyearaverage = r(mean) if year == `y'
> }
>
> This is an average across observations, not years. If you want the
> latter, it would be
>
> gen threeyearaverage = .
>
> qui forval y = 1982/2006 {
> su ROE if year == `= `y' - 3', meanonly
> local mean3 = r(mean)
> su ROE if year == `= `y' - 2', meanonly
> local mean2 = r(mean)
> su ROE if year == `= `y' - 1', meanonly
> replace threeyearaverage = (`mean3' + `mean2' + r(mean)) / 3 if year == `y'
> }
>
> Another way to do it would be to -collapse-, work on the
> collapsed dataset, and -merge- back in again.
>
> I'll put in here that -round(,)- can be useful in similar problems.
>
> Nick
> [email protected]
>
> Neil Shephard
>
> Fabian Brenner wrote:
> >
> > I have several observations called "ROE" for the "years" (from 1979 to
> > 2006). There is a different number of observations for each year.
> >
> > My data look like this:
> > "year" "ROE" "Average" "threeyearaverage"
> >
> > 79 12 ? ?
> > 79 9 ? ?
> > 79 2 ? ?
> > 80 3 ? ?
> > 81 20 ? ?
> > 81 5 ? ?
> > 82 3 ? ?
> > 82 6 ? ?
> > 82 9 ? ?
> > 82 8 ? ?
> > . . .
> > . . .
> > . . .
> >
> > I want to compute the average of the observations for each year, e.g.
> > for 1979: (12+9+2)/3 (I tried to sort the observations and to divide the
> > sum by _n but it didn't work...)
> >
>
> bysort year : egen average = mean(ROE)
> > In a second step I want to get the average ROE for the past three
> > years ("threeyearaverage") (beginning in 1982), e.g. for 1982 it should
> > be the average of the ROE in 1979 plus the average ROE in 1980 plus
> > average ROE in 1981, divided by 3.
> >
>
> That's an inappropriate way of calculating the three-year average, as it
> fails to account for the fact that there are different numbers of
> observations from each year, thus the weights aren't equal. This is
> covered in most basic statistics books. You therefore have two options,
>
> a) use weights; b) use the raw data. Since -egen newvar = mean()-
> doesn't allow weights I'd be inclined to go with b).
>
> You therefore need to generate a variable that bins your data...
>
> gen year3 = .
> replace year3 = 1 if(year >= 79 & year <= 82)
> replace year3 = 2 if(year >= 83 & year <= 85)
> replace year3 = 3 if(year >= 86 & year <= 88)
> ....
> bysort year3 : egen threeyearaverage = mean(ROE)