Ronnie Babigumira wrote:
I have an interesting data management problem. My data look like this
hhid hi_educ
1 11
2 21
3 17
4 16
5 24
6 31
7 32
8 13
9 22
where hi_educ is the highest level of education in the household. From this I would like to extract the number of years of schooling.
For values up to 17, the years of schooling is the last digit;
for values between 21 and 24, it is 7 + the last digit;
for values between 31 and 35, it is 11 + the last digit.
What I would like to end up with is something like this
hhid hi_educ years
1 11 1
2 21 8
3 17 7
4 16 6
5 24 11
6 31 12
7 32 13
8 13 3
9 22 9
==============================================================
Joseph Coveney gave a couple of suggestions. Here is a third
suggestion:
. generate years = mod(hi_educ,10)
. replace years = years+7 if hi_educ>20 & hi_educ<25
. replace years = years+11 if hi_educ>30 & hi_educ<36
. list, clean
       hhid   hi_educ   years
  1.      1        11       1
  2.      2        21       8
  3.      3        17       7
  4.      4        16       6
  5.      5        24      11
  6.      6        31      12
  7.      7        32      13
  8.      8        13       3
  9.      9        22       9
It may look clumsier, but I believe it has the advantage of
being more transparent, especially to the less experienced user.
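
For anyone who wants to try this from scratch, the following is a minimal do-file sketch that rebuilds the example data and applies the same three steps. It uses inrange() in place of the paired inequalities, which selects the same values here. The last replace, which sets codes outside 11-17, 21-24 and 31-35 to missing, is an assumption that is not stated in the question; drop it if other codes are valid.

```stata
* Rebuild the nine example observations from the question.
clear
input hhid hi_educ
1 11
2 21
3 17
4 16
5 24
6 31
7 32
8 13
9 22
end

* Start from the last digit, then add the offset for each band.
generate years = mod(hi_educ, 10)
replace years = years + 7  if inrange(hi_educ, 21, 24)
replace years = years + 11 if inrange(hi_educ, 31, 35)

* Assumption (not in the original question): any other code is invalid.
replace years = . if !inrange(hi_educ, 11, 17) & ///
    !inrange(hi_educ, 21, 24) & !inrange(hi_educ, 31, 35)

* Spot checks against the result requested in the question;
* assert stops with an error if any of them fails.
assert years == 7  if hi_educ == 17
assert years == 11 if hi_educ == 24
assert years == 13 if hi_educ == 32

list, clean
```

The assert lines are only a safeguard: if a rule is mistyped, the do-file stops there instead of silently producing wrong values.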
Svend
________________________________________________________
Svend Juul
Institut for Folkesundhed, Afdeling for Epidemiologi
(Institute of Public Health, Department of Epidemiology)
Bartholins Allé 2
DK-8000 Aarhus C, Denmark
Phone, work: +45 8942 6090
Phone, home: +45 8693 7796
Fax: +45 8613 1580
E-mail: [email protected]
_________________________________________________________