st: RE: matched data

From	"Nick Winter" <[email protected]>
To	<[email protected]>
Subject	st: RE: matched data
Date	Mon, 24 Jun 2002 12:42:03 -0400

> -----Original Message-----
> From: Theodoropoulos, N. [mailto:[email protected]] 
> Sent: Monday, June 24, 2002 12:22 PM
> To: [email protected]
> Subject: st: matched data 
> 
> 
> Dear Statalisters,
> 
> I have a matched employer-employee dataset. Each firm is 
> recognised by a unique identifier, and in each firm I have up 
> to 25 employees. From the existing data I want to generate 
> some new variables which capture only firms that employ a 
> positive proportion of old and/or young workers. In other 
> words I want to generate variables by selecting only the 
> firms with some sampled old and/or young employees.
> Also, I have the variables that capture the proportion of old 
> and young workers. 
> Is there a quick way of doing this instead of going through 
> the firms one by one?
> 
> Any hints will be highly appreciated,

I'm not entirely clear what you want to do, but the -by varlist:- prefix
and/or the -egen ..., by()- construct will probably be your friends here.
For example, to generate a variable tagging records based on the
proportion of old and young employees in a firm, you might do this:

	. generate old = (age>55) if age!=.		/* or whatever age */
	. egen tot_old = sum(old), by(firm)
	. egen tot_yng = sum(1-old), by(firm)
	. gen prop_old = tot_old / (tot_old+tot_yng)
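
If the goal is then to pick out only the firms with at least one sampled
old or young employee, a minimal sketch along the following lines might
work.  It assumes the tot_old and tot_yng variables created above; the
names has_old, has_yng, and firmtag are only illustrative:

	. generate byte has_old = (tot_old > 0)		/* 1 if firm has any sampled old worker */
	. generate byte has_yng = (tot_yng > 0)		/* 1 if firm has any sampled young worker */
	. egen byte firmtag = tag(firm)			/* flags one record per firm */
	. tabulate has_old has_yng if firmtag		/* counts firms, not employees */

Adding -if has_old- (or -if has_old | has_yng-) to later commands would
then restrict them to those firms.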

--Nick W

