The -reshape- idea is good. It can be slimmed
slightly:
reshape long w_job, i(ID) j(jobnum)
bysort ID (w_job): gen Number_job = jobnum[_N]
reshape wide w_job, i(ID) j(jobnum)
but this solution, like Kelvin's, will treat any
missing wage within an -ID- as the maximum.
There are various ways around that; just be warned.
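
One possible way around the missings, as a sketch only: sort on the negated wage, so that the highest wage comes first and missings still sort last. The variable -negw- and the rows for ID 4 and 5 are made up for illustration; they are not in the original example.

```stata
clear
input ID w_job1 w_job2 w_job3 w_job4
1  10 100  0  0
2 120  30 10  0
3  20  90 80 70
4   .  50  .  .
5   .   .  .  .
end

reshape long w_job, i(ID) j(jobnum)

* negw is a hypothetical helper: missing wages stay missing and sort last
gen negw = -w_job

* highest wage first; ties go to the lowest job number
bysort ID (negw jobnum): gen Number_job = jobnum[1] if !missing(negw[1])

* negw varies within ID, so drop it before going back to wide
drop negw
reshape wide w_job, i(ID) j(jobnum)
list
```

With these data ID 4 should get 2, and ID 5, with every wage missing, should get missing.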
The calculation can also be done without a
-reshape-. We can also trap any missings that way.
gen max = w_job1
gen Number_job = cond(missing(max), ., 1)
qui forval i = 2/4 {
    replace max = max(max, w_job`i')
    replace Number_job = `i' if max == w_job`i' & !missing(max)
}
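
A quick check of the loop approach, again only a sketch. It uses the example data plus two made-up rows (ID 4 and 5) with missings, names the running maximum -maxw- rather than -max-, and adds a !missing() guard so that a row with every wage missing keeps a missing Number_job. The result is compared with -egen, rowmax()-.

```stata
clear
input ID w_job1 w_job2 w_job3 w_job4
1  10 100  0  0
2 120  30 10  0
3  20  90 80 70
4   .  50  .  .
5   .   .  .  .
end

* running maximum and the job that attains it
gen maxw = w_job1
gen Number_job = cond(missing(maxw), ., 1)
quietly forvalues i = 2/4 {
    * max() ignores missing arguments unless all are missing
    replace maxw = max(maxw, w_job`i')
    * the guard stops . == . from counting as a match
    replace Number_job = `i' if maxw == w_job`i' & !missing(maxw)
}

* cross-check: the running maximum should equal the row maximum
egen check = rowmax(w_job1-w_job4)
assert maxw == check
list
```

Note that with the == test a later job overwrites an earlier one when wages tie for the maximum.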
Nick
Kelvin Foo
reshape long w_job, i(ID) j(jobnum)
gsort ID -w_job
by ID: gen Number_job=jobnum[1]
reshape wide w_job, i(ID) j(jobnum)
Hui Wang
> I have a dataset. Each individual can have at most 4 jobs
> (job1-job4). For each individual I know the hourly wage for
> each job (w_job1-w_job4). I wonder how I can create a variable
> indicating which job has the highest wage. Refer to the
> following example. The variable I want to create is
> 'Number_job'.
>
> ID  w_job1  w_job2  w_job3  w_job4  Number_job
>  1      10     100       0       0           2
>  2     120      30      10       0           1
>  3      20      90      80      70           2