Maarten's solution is good, but there is no need for the sorting within -yearofbirth-.
. bys yearofbirth: gen byte first = _n == 1
. egen meanwage = mean(wage), by(yearofbirth)
. scatter meanwage yearofbirth if first
It doesn't matter which observation you pick up within each group of -yearofbirth-, as -meanwage- will be the same for all within the same group.
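If you want to convince yourself of that, here is a minimal sketch using the auto data shipped with Stata. It is an illustration only: -rep78- and -price- are stand-ins I am assuming for -yearofbirth- and -wage-, and -meanprice- and -nfirst- are made-up names.

```stata
* Sketch on the shipped auto data: rep78 plays the role of yearofbirth,
* price the role of wage (assumed stand-ins, not the original data).
sysuse auto, clear
drop if missing(rep78)

* Flag one observation per group; no sorting within rep78 is needed
bysort rep78: gen byte first = _n == 1
egen meanprice = mean(price), by(rep78)

* The group mean is constant within each group ...
bysort rep78 (meanprice): assert meanprice[1] == meanprice[_N]

* ... and exactly one observation per group carries the flag
egen nfirst = total(first), by(rep78)
assert nfirst == 1

scatter meanprice rep78 if first
```

Both -assert- statements should pass silently, so whichever observation ends up flagged, the graph shows the same points.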
Note that this is also equivalent to what -egen, tag()- does. You could type
. egen tag = tag(yearofbirth)
. egen meanwage = mean(wage), by(yearofbirth)
. scatter meanwage yearofbirth if tag
The -egen, tag()- route is less efficient, though. It makes Stata do quite a lot of work when the -by:- one-liner above gets you there directly. Getting familiar with -by:- tricks is also a good long-term investment.
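One caveat on the equivalence, sketched on the auto data again (-rep78- is an assumed stand-in for -yearofbirth-, and the variable names are made up). As far as I know, -egen, tag()- leaves observations with missing values of the grouping variable untagged unless you add the -missing- option, whereas -by:- treats missing as a group like any other.

```stata
* Sketch: compare the by: flag with egen's tag() when the grouping
* variable has missing values (rep78 in the auto data has some).
sysuse auto, clear

bysort rep78: gen byte first = _n == 1
egen tag = tag(rep78)
egen tagmiss = tag(rep78), missing

* With the missing option, tag() flags as many observations as by: does
count if first
local nfirst = r(N)
count if tagmiss
assert r(N) == `nfirst'

* Without it, the group with missing rep78 gets no tagged observation
count if tag & missing(rep78)
assert r(N) == 0
```

If -yearofbirth- has no missing values, none of this matters and the two flags pick out one observation per group either way.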
Nick
Maarten Buis wrote:
An intermediate solution that will get the small graph while keeping the
data intact is:
. bys yearofbirth (wage): gen byte first = _n == 1
. egen meanwage = mean(wage), by(yearofbirth)
. scatter meanwage yearofbirth if first
--- Ulrich Kohler wrote:
> . egen meanwage = mean(wage), by(yearofbirth)
> . scatter meanwage yearofbirth
>
> is one possibility. The advantage of this solution is that it keeps your
> data as it is. The disadvantage is that the file size of the graph
> becomes arbitrarily large if you have many observations.
>
> A solution that destroys the data in memory but produces smaller graphs
> (in terms of bandwidth) is
>
> . collapse (mean) wage, by(yearofbirth)
> . scatter wage yearofbirth