Statalist The Stata Listserver



Re: st: RE: < and > operand in recode


From   Jeph Herrin <[email protected]>
To   [email protected]
Subject   Re: st: RE: < and > operand in recode
Date   Sun, 20 Aug 2006 14:42:23 -0400

Agree about the start integer being non-intuitive, and I suppose
making it an argument to -irecode- would just mean looking it up
that much more often (to recall which argument it was...).



Nick Cox wrote:
This is indeed a further possibility. -irecode()- is a well-defined Stata function and this gives a concise one-line solution. And the definition is there in the help.
I'll declare prejudices, however. -irecode()- is a function I rarely use, so I would have to look at the help to check the definitions. (The results run 0 up; an equally defensible rule is that results run 1 up, and I would have to look up to see which was Stata's choice.) Also, this is to my mind less transparent than -cond()-.
But these prejudices will not be compelling for all readers, and are mentioned mostly to explain why I didn't think of that.
Nick [email protected]
Jeph Herrin

What about:

. gen newvar = irecode(var,1,2,5,10,.)+1

?
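
For readers who, like Nick, would have to look up the convention: as documented, -irecode(x, x1, ..., xn)- returns 0 if x <= x1, 1 if x1 < x <= x2, and so on up to n if x > xn, and missing if x is missing. The following is a minimal sketch on made-up test values (the data and the -assert- checks are illustrative, not from the thread). It uses the same cut points as above but omits the trailing `.` argument, on the assumption that it is not needed because values above the last cut point already get code n.

```stata
* Illustrative test data (not from the thread)
clear
input var
0.5
1
1.5
2
5
7
10
12
.
end

* irecode() counts from 0, so add 1 to get codes 1..5
gen newvar = irecode(var, 1, 2, 5, 10) + 1

* Check each interval against the intended coding
assert newvar == 1 if var <= 1
assert newvar == 2 if var > 1 & var <= 2
assert newvar == 3 if var > 2 & var <= 5
assert newvar == 4 if var > 5 & var <= 10
assert newvar == 5 if var > 10 & var < .
assert missing(newvar) if missing(var)
list
```

If any -assert- fails, Stata stops with an error, which makes the 0-up versus 1-up question easy to settle without rereading the help.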

Nick Cox wrote:
Terminology appears to be a small problem here.
I understand = to indicate equality and >, >=, < or <= to indicate inequality. Your contradictory usage is rather surprising.
That aside, the key point is that -recode- is announced as for recoding categorical variables, meaning in practice categorical variables coded as integers.
-recode- does allow many-to-one mappings, but it really is not a good idea to use it for re-coding a continuous variable. Even though your work-around apparently worked for you, it is no more than a work-around. Also, there are plenty of possible values between 0 and 0.0001, etc., and testing for equality and inequality with a decimal fraction is usually problematic.
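
The precision point can be sketched as follows. Most decimal fractions have no exact binary representation, and a value stored as float is rounded differently from the same literal evaluated in double precision, so a stored 0.1 need not equal the literal 0.1. The variable name x is hypothetical and the example is not from the thread.

```stata
clear
set obs 1

* Store 0.1 explicitly as a float
gen float x = 0.1

* The literal 0.1 is evaluated in double precision,
* so this comparison finds no matching observations
count if x == 0.1

* Rounding the literal to float precision first makes the comparison match
count if x == float(0.1)

assert x != 0.1
assert x == float(0.1)
```

The same rounding affects boundaries such as 1.0001 in a -recode- rule, which is one reason inequalities on the original cut points (1, 2, 5, 10) are safer.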
Now Stata as such doesn't really have any idea of what a categorical variable is, and thus does not declare your use to be an error, although there are several good arguments for strictness in such matters (or at least for a -force- option which shows that you realise exactly what you are doing).
For your coding a perfectly respectable approach is
gen newvar = 1 if var <= 1
replace newvar = 2 if var <= 2 & missing(newvar)
replace newvar = 3 if var <= 5 & missing(newvar)
replace newvar = 4 if var <= 10 & missing(newvar)
replace newvar = 5 if var < . & missing(newvar)
replace newvar = . if var == .
That may look long-winded, but it is perfectly explicit and easy to understand.
Another perfectly respectable approach is to make use of -inrange(,)-:

gen newvar = 1 if inrange(var,.,1)
replace newvar = 2 if inrange(var,1,2) & missing(newvar)
replace newvar = 3 if inrange(var,2,5) & missing(newvar)
replace newvar = 4 if inrange(var,5,10) & missing(newvar)
replace newvar = 5 if inrange(var,10,.) & missing(newvar)
replace newvar = . if var == .
although with -inrange()- it is not so transparent what happens in the case of equality with either argument. See the help for -inrange()-.
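
A small sketch of the boundary behaviour, based on the rules in the help for -inrange()- as I understand them: both endpoints are included, a missing bound leaves that side open, and a missing first argument is never in range. The test values are illustrative.

```stata
* Both endpoints are included: each of these should display 1
display inrange(1, 1, 2)
display inrange(2, 1, 2)

* A missing bound is treated as open-ended on that side: each should display 1
display inrange(0.5, ., 1)
display inrange(11, 10, .)

* A missing first argument is never in range: should display 0
display inrange(., 10, .)
```

Because both endpoints are included, a value of exactly 1 satisfies both inrange(var,.,1) and inrange(var,1,2). In the commands above it is the `& missing(newvar)` condition that settles this: the first rule to match assigns the code and later rules leave it alone.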
Yet another perfectly respectable approach is to make use of -cond()-.
gen newvar = cond(var <= 1, 1, cond(var <= 2, 2, cond(var <= 5, 3, cond(var <= 10, 4, cond(var < ., 5, .)))))

That is all one command. Careful layout and use
of a good text editor to check balanced parentheses are recommended.
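
One possible layout, as a sketch: in a do-file the command can be split with the `///` continuation marker so that each condition sits on its own line and the closing parentheses are easy to count. The test data are illustrative and not from the thread. Note that `///` works in do-files and ado-files, not when typed interactively.

```stata
* Illustrative test data (not from the thread)
clear
input var
0.5
1
2
5
10
12
.
end

* The same nested cond(), still a single command, one condition per line
gen newvar = cond(var <= 1,  1, ///
             cond(var <= 2,  2, ///
             cond(var <= 5,  3, ///
             cond(var <= 10, 4, ///
             cond(var < .,   5, .)))))

* Spot checks on the boundaries and on missing values
assert newvar == 1 if var <= 1
assert newvar == 4 if var > 5 & var <= 10
assert newvar == 5 if var > 10 & var < .
assert missing(newvar) if missing(var)
```

There are five opening -cond(- calls and five closing parentheses at the end, one per level of nesting.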
Personally, for your example problem, I like -cond()- best.
For a discursive tutorial see

SJ-5-3  pr0016 . . Depending on conditions: a tutorial on the cond() function
        D. Kantor and N. J. Cox
        Q3/05   SJ 5(3):413--420   (no commands)
        tutorial on the cond() function


Nick [email protected]
b. water

Stata 8.2,

i wanted to recode a variable, which consisted of continuous number, something to the effect of:

<=1 coded 1 (<= i.e. meaning less than or equal to)
1 to <=2 coded 2
2 to <= 5 coded 3
5 to <=10 coded 4
>10 coded 5
when i tried to use the equality operands (i.e. < or >) in my recode commands, it gave an error message ('unknown el <2 in rule'), so after consulting my manual on [R] recode, i managed by recoding:
0.0001/1 = 1
1.0001/2 = 2
.
.
10/1000 = 5
etc

being careful to make sure that the parameters included all the values.

i would appreciate if someone could confirm that equality sign cannot be used in recode. would appreciate it too if anyone can point out an alternative/better way to accomplish the recode.
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




