
Re: st: PLEASE HELP ME!

From	"Svend Juul" <[email protected]>
To	<[email protected]>
Subject	Re: st: PLEASE HELP ME!
Date	Sun, 20 May 2007 14:53:32 +0200

Frank wrote:
 
Can anyone help me to deal with the following dataset?
I have thought about it for the whole day.
 
Here is the simplified version of my dataset
 
    ID  wage  jobtype  period1  period2  period3
1.   1    30        1        1        1        0
2.   1    20        2        1        0        1
3.   2    40        1        1        1        0
4.   2    35        1        0        1        1
5.   2    10        2        0        0        1
 
That is, the dataset has N individuals and M periods.
In each period an individual can hold one or more jobs,
and each job has its own wage.
 
In the real dataset, both N and M are large. Here is a
simplified example with N=2 and M=3, where period1, period2 and
period3 are time dummies. The example says that in period 1,
individual 1 has two jobs (type 1 and type 2) with wages of 30
and 20 respectively. In period 2, individual 1 has only one job,
of type 1. In period 3, individual 1 has one job, of type 2.
The job and wage information for individual 2 in these three
periods can be read the same way.
 
My question is how to write code that generates variables
containing the following information: (1) the number of jobs
each individual has in each period; (2) the maximum wage for
each individual in each period. That is, I want to obtain the
following from the above dataset, where, for example,
(2, 30) means that in period 1 individual 1 has 2 jobs and the
maximum wage is 30.
 
    ID  period1  period2  period3
1.   1  (2, 30)  (1, 30)  (1, 20)
2.   2  (1, 40)  (2, 40)  (2, 35)
 
--------------------------------------------------------------
 
Here is an attempt. These are the test data:
     +---------------------------------------------------+
     | id   wage   jobtype   period1   period2   period3 |
     |---------------------------------------------------|
  1. |  1     30         1         1         1         0 |
  2. |  1     20         2         1         0         1 |
  3. |  2     40         1         1         1         0 |
  4. |  2     35         1         0         1         1 |
  5. |  2     10         2         0         0         1 |
     +---------------------------------------------------+
 
. // A long format is easier to work with. -reshape- needs
. // a unique identifier for each id-period combination
. gen id1 = _n
. reshape long period , i(id1) j(per)
. 
. // We can drop some observations and id1.
. drop if period==0
. drop id1
 
. // Here we go. 
. // NB! -nvals()- is an unofficial -egenmore- function, 
. // and you may need to:
. //    ssc install egenmore
. sort id per
. by id per: egen maxwage=max(wage)
. by id per: egen njobs=nvals(jobtype)
. by id per: keep if _n==1
. keep id per maxwage njobs
 
. // If you prefer the wide format:
. reshape wide maxwage njobs , i(id) j(per)
. list
     +----------------------------------------------------------------+
     | id   maxwage1   njobs1   maxwage2   njobs2   maxwage3   njobs3 |
     |----------------------------------------------------------------|
  1. |  1         30        2         30        1         20        1 |
  2. |  2         40        1         40        1         35        2 |
     +----------------------------------------------------------------+
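
A note on njobs: nvals(jobtype) counts distinct job types, so for
id 2 in period 2, where both jobs are of type 1, it gives 1 rather
than the 2 shown in the requested table. If every row is taken to
be a separate job (an assumption about the data, not something the
listing can confirm), a sketch that counts rows instead, using only
the official -reshape- and -collapse- commands, might look like this:

```stata
* Test data from the question, typed in so the example is self-contained
clear
input id wage jobtype period1 period2 period3
1 30 1 1 1 0
1 20 2 1 0 1
2 40 1 1 1 0
2 35 1 0 1 1
2 10 2 0 0 1
end

* Long format: one row per job and period, keeping only periods held
gen id1 = _n
reshape long period, i(id1) j(per)
keep if period == 1
drop id1 period

* Assumption: each remaining row is one job, so counting the
* nonmissing wages within id and period counts the jobs
collapse (max) maxwage = wage (count) njobs = wage, by(id per)

* Back to one row per individual
reshape wide maxwage njobs, i(id) j(per)
list
```

Because count() only counts nonmissing values, a job with a missing
wage would not be counted here. If that can happen in the real data,
counting a variable that is never missing would be the safer choice.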

Hope this helps
Svend
 
P.S. Try to give an informative subject line; it is much
more likely to create interest among potential responders.
There is so much suffering in the world, and PLEASE HELP ME!
does not tell me whether it is something I can help with.
 
________________________________________________________ 
 
Svend Juul
Institut for Folkesundhed, Afdeling for Epidemiologi
(Institute of Public Health, Department of Epidemiology)
Vennelyst Boulevard 6 
DK-8000 Aarhus C,  Denmark 
Phone, work:  +45 8942 6090 
Phone, home:  +45 8693 7796 
Fax:          +45 8613 1580 
E-mail:       [email protected] 
_________________________________________________________ 

