Svend Juul also made a suggestion. The code will be a little messy
however this is done. Yet another possibility is
gen education_yrs = ///
    cond(hi_edu > 30, 11 + mod(hi_edu, 10), ///
    cond(hi_edu > 20, 7 + mod(hi_edu, 10), ///
    mod(hi_edu, 10)))
That is one line, syntactically. It may be parsed using this pseudocode
(which quite accidentally is rather Mata-like)
if (hi_edu > 30) y = 11 + mod(hi_edu, 10)
else if (hi_edu > 20) y = 7 + mod(hi_edu, 10)
else y = mod(hi_edu, 10)
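As a rough check, the nested cond() can be run against the example data from the original question. This is only a sketch: it assumes the variable is named hi_educ, as in Ronnie's listing (the code above writes hi_edu), and yrs_cond is a name made up here for illustration.

```stata
clear
input hhid hi_educ years
1 11 1
2 21 8
3 17 7
4 16 6
5 24 11
6 31 12
7 32 13
8 13 3
9 22 9
end

* nested cond(): the first condition that is true decides the result
generate byte yrs_cond = cond(hi_educ > 30, 11 + mod(hi_educ, 10), ///
    cond(hi_educ > 20, 7 + mod(hi_educ, 10), mod(hi_educ, 10)))

* assert stops with an error if any row differs from the wanted years
assert yrs_cond == years
```

If the assert passes silently, the cond() version reproduces the years column for these nine households.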
People can agree to disagree here, as the differences are ones of style.
I like all the solutions I've seen, ugly ducklings one and all.
Nick
Joseph Coveney
There are probably better ways, but something like that below should do
it.
(Note that I'd normally prefer something more like
generate byte education_yrs = mod(hi_edu, 10) + ///
7 * inrange(hi_edu, 21, 24) + ///
11 * inrange(hi_edu, 31, 35)
because it would be easier to maintain--more self-documenting--but there's an
outside chance that it is somewhat slower in execution, perhaps even
noticeably so if you've got a very large amount of data.)
. clear *
. set more off
. input hhid hi_educ years
hhid hi_educ years
1. 1 11 1
2. 2 21 8
3. 3 17 7
4. 4 16 6
5. 5 24 11
6. 6 31 12
7. 7 32 13
8. 8 13 3
9. 9 22 9
10. end
. generate byte education_yrs = mod(hi_educ, 10) + ///
> 7 * floor(hi_educ / 20) + ///
> 4 * floor(hi_educ / 30)
. list, noobs separator(0)
+-----------------------------------+
| hhid hi_educ years educat~s |
|-----------------------------------|
| 1 11 1 1 |
| 2 21 8 8 |
| 3 17 7 7 |
| 4 16 6 6 |
| 5 24 11 11 |
| 6 31 12 12 |
| 7 32 13 13 |
| 8 13 3 3 |
| 9 22 9 9 |
+-----------------------------------+
. exit
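For what it's worth, the floor() form in the log and the inrange() form in the parenthetical note can be compared over every code the question mentions (11-17, 21-24, 31-35). The sketch below assumes those are the only codes that occur; yrs_floor and yrs_range are illustrative names, not from the thread.

```stata
clear
set obs 35
generate byte hi_educ = _n

* keep only the codes described in the question
keep if inrange(hi_educ, 11, 17) | inrange(hi_educ, 21, 24) | ///
    inrange(hi_educ, 31, 35)

* arithmetic form used in the log
generate byte yrs_floor = mod(hi_educ, 10) + ///
    7 * floor(hi_educ / 20) + ///
    4 * floor(hi_educ / 30)

* self-documenting form from the parenthetical note
generate byte yrs_range = mod(hi_educ, 10) + ///
    7 * inrange(hi_educ, 21, 24) + ///
    11 * inrange(hi_educ, 31, 35)

* the two forms should agree on every valid code
assert yrs_floor == yrs_range
```

The two forms agree only inside those ranges. A stray code such as 25 or 40 would get different answers from each, so the inrange() form is arguably the safer one if the data might contain unexpected values.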
Ronnie Babigumira wrote:
> I have an interesting data management problem. My data look like this
> [see below]
> Where hi_educ is the highest level of education for household. From this I
> would like to extract the number of years of schooling.
>
> Now, for values below 17, the years of schooling is the last digit
> for values between 21 and 24, it is 7 + the last digit
> for values between 31 and 35 it is 11 + the last digit
>
> What I would like to end up with is something like this
>
> hhid hi_educ years
> 1 11 1
> 2 21 8
> 3 17 7
> 4 16 6
> 5 24 11
> 6 31 12
> 7 32 13
> 8 13 3
> 9 22 9
>
> I am stuck here
> gen str3 test = ""
> replace test = substr(string(hi_educ), -1,.) if inrange(hi_educ,11,17)