Naturally I support Svend's general -- and very Stataish -- stance that
Stata should make it difficult for you to stomp on your data.

But equally it seems to me that the whole purpose of -recode- is to
change your data! This is what it says:

"-recode- changes the values of numeric variables according to the rules
specified."

I am tempted to say -- which part of "changes the values" is unclear
here?

Otherwise put, -recode- is already a sibling of -replace-. That is its
job.

My guess is that -recode- divides the Stata world. There are probably
many users -- for example, many sociologists using survey data -- for
whom it is in their top 10 commands. Having read in their data and had
a quick look around, just about the next thing is to get into recoding.
Presumably they get to learn exactly what -recode- does and internalise
most of its somewhat idiosyncratic syntax. Many probably grew up on
similar commands in other packages. There are immigrants to Stataland
whose third question is probably "How do I recode?".

There are probably many other users who use it only occasionally, and
depending on their preconceptions about what it should do, they may be
surprised at what it actually does.

Chris seems surprised that the command does what it claims. Normally,
that is regarded as a feature, not a bug or limitation.

For various reasons, principally my tendency to use continuous variables
much more, I rarely use -recode-. I would rather spell out a sequence of
commands using -generate-, -replace- and -label-. That is just a
question of taste.
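
As a rough illustration of the -generate-, -replace- and -label- route,
here is a minimal sketch applied to Chris's variable. It assumes that the
underlying codes are 1 (shown as G) and 2 (shown as T), as his -recode-
rules suggest; the names z2 and z2lab are simply borrowed from Svend's
example and are not anything Chris has defined.

```stata
* Sketch only: assumes z1_gene_x holds codes 1 and 2 and nothing else.
* Build a new variable rather than overwriting the original.
generate byte z2 = .
replace z2 = 2 if z1_gene_x == 1
replace z2 = 1 if z1_gene_x == 2

* Attach labels that follow the new codes (1 is now T, 2 is now G).
label define z2lab 1 "T" 2 "G"
label values z2 z2lab

* Cross-check old against new before relying on z2.
tabulate z1_gene_x z2, missing
```

This is more typing than one -recode- line, but each step is explicit
and the original variable is left untouched.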

An analogy is with -egen-. Many users, led by Bill Gould himself,
seemingly would rather work out from first principles the manipulations,
typically involving a -sort-, a -by:- and some fancy footwork with _n
and _N, that are equivalent. Given fluency with basics, that is much
faster for them than finding out whether an appropriate -egen- function
exists and checking its precise syntax.

Here I can sympathise readily with both camps, because I sometimes
commend the one-line solutions using -egen- for their simplicity and
clarity and I sometimes commend the first principles route as essential
for generality and efficiency.
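
To make the comparison concrete, here is a small sketch of both routes
for one common task, the maximum of a variable within groups. The
variable names group and x are placeholders of my own, and the sketch
assumes x has no missing values, because missings sort last and would
otherwise be picked up by x[_N].

```stata
* Assumption: -group- and -x- are placeholder variables, and x has no
* missing values.

* One-line -egen- route
bysort group: egen double max_egen = max(x)

* First-principles route: sort x within group, then take the last value
bysort group (x): generate double max_by = x[_N]

* Under the assumption above the two results should agree
assert max_egen == max_by
```

The -egen- line says what is wanted; the -by:- line shows how it is
done, and the same idiom adapts readily to problems for which no -egen-
function exists.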
Nick
Svend Juul
Chris wrote:
In order to fit the distribution of wildtypes/genotypes in my
population, I want to change the order of the values coded in
z1_gene_x.
I'm using the recode command:
. tab z1_gene_x

group(z1_ge |
        ne) |      Freq.     Percent        Cum.
------------+-----------------------------------
          G |          8        0.17        0.17
          T |      4,684       99.83      100.00
------------+-----------------------------------
      Total |      4,692      100.00

. recode z1_gene_x 1=2 2=1
(z1_gene_x: 4692 changes made)

. tab z1_gene_x

group(z1_ge |
        ne) |      Freq.     Percent        Cum.
------------+-----------------------------------
          G |      4,684       99.83       99.83
          T |          8        0.17      100.00
------------+-----------------------------------
      Total |      4,692      100.00

It seems that recode actually does what it is supposed to do.
However, it does not change the label, which is somewhat confusing.
According to the manual, recode supports the label option,
but not for just keeping the label...
===============================================================
The manual and the online help say this about -recode-'s -label()-
option:
label(name) specifies a name for the value label defined
from the transformation rules. label() may be defined
only with generate() (or its synonym, into()) and
prefix()...
You can define value labels within the -recode- command:
. recode z1 (1=2 "G")(2=1 "T") , generate(z2)
The label name becomes -z2-, unless you specify the -label()-
option:
. recode z1 (1=2 "G")(2=1 "T") , generate(z2) label(z2lab)
Your example illustrates the danger with -recode- without the
-generate()- option: you may change the meaning of codes in a
way that may lead to serious mistakes. I wish that -recode-
required either a generate() option or a -replace- option;
it would be in line with the safety precautions built into
Stata with other commands.
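
Until then, one possible precaution when recoding in place is to keep a
copy of the original and compare raw codes afterwards. This is only a
sketch: the name z1_orig is hypothetical, and it assumes the variable
holds just the codes 1 and 2.

```stata
* Keep an exact copy (values and value label) before changing anything.
clonevar z1_orig = z1_gene_x

* Recode in place, as in Chris's example.
recode z1_gene_x (1=2) (2=1)

* Compare old and new codes without labels, so the swap is visible
* even though the value label still maps 1 to G and 2 to T.
tabulate z1_orig z1_gene_x, missing nolabel
```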