> -----Original Message-----
> From: Theodoropoulos, N. [mailto:[email protected]]
> Sent: Monday, June 24, 2002 12:22 PM
> To: [email protected]
> Subject: st: matched data
>
>
> Dear Statalisters,
>
> I have a matched employer-employee dataset. Each firm is
> recognised by a unique identifier, and in each firm I have up
> to 25 employees. From the existing data I want to generate
> some new variables which capture only firms that employ a
> positive proportion of old and/or young workers. In other
> words I want to generate variables by selecting only the
> firms with some sampled old and/or young employees.
> Also, I have the variables that capture the proportion of old
> and young workers.
> Is there a quick way of doing this instead of going through
> the firms one by one.
>
> Any hints will be highly appreciated,
I'm not entirely clear what you want to do, but the -by varlist:- and/or
-egen ..., by()- constructs will probably be your friends here. For
example, to generate a variable tagging records based on the
proportion of old and young employees in a firm, you might do this:
. generate old = (age>55) if age!=.   /* or whatever age */
. egen tot_old = sum(old), by(firm)
. egen tot_yng = sum(1-old), by(firm)
. gen prop_old = tot_old / (tot_old+tot_yng)
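
If the aim is then to flag firms with at least one sampled old (or
young) worker, a rough sketch along the following lines might do it.
The names has_old, has_yng and firmtag are made up for illustration,
and the sketch assumes the variables created above, where "young"
simply means "not old":

. gen byte has_old = (tot_old > 0)
. gen byte has_yng = (tot_yng > 0)
. egen firmtag = tag(firm)      /* marks one record per firm */
. count if firmtag & has_old    /* firms with some old workers */

If you work from your existing proportion variables instead, keep in
mind that Stata treats missing as larger than any number, so a
condition like (prop_old > 0 & prop_old < .) is safer than
(prop_old > 0) alone.
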
--Nick W