My impression is that you know all the pertinent
Stata commands in this area, so I am reduced
to focusing on "readability" itself.
The question in return has to be: readable by
whom, or more precisely by what class of program
reader? What can be assumed of their Stata
experience and competence?
Examples:
0. You don't say anything about comments.
If inscrutable code is your concern, tuning
a comment to give an explanation at the
right level is the simplest thing.
1. As a Stata user who also writes programs,
I find the use of a local macro and of an extended
macro function unthreatening, but I still have
to work a smidgen too much to realise that
`: val l y'
is a minimal abbreviation of
`: value label y'
Having also worked with highly concise
languages, such as J, in which almost
all primitives are one or two characters
long, I have to say that readability
does often hinge on avoiding minimal
abbreviations.
(That's why, I think, most of us do write -gen-
even though we would save many, many
keystrokes by writing -g-. Roger Newson
always writes -gene-, but then he was once
a real biologist, so that perhaps is why.)
2. The construct
"label":valuelabelname
is prominently documented, but my impression
is that it is fairly rarely used. (Shoot that
down, readers, if you use it all the time.)
So, what can be done here? Well, doing
a -decode- on the fly and testing a string
value might be much more readable
to some readers. Naturally, that is an
extra variable at least briefly, and so forth.
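To make points 0 to 2 concrete, here is a rough sketch of two
alternatives, meant for a do-file. It assumes, as in the question,
that y is numeric with a value label attached that maps some value
to "yes". The names ylab and ystr are mine, chosen for illustration
only.

```stata
* Sketch only: assumes y is numeric, has a value label attached,
* and that the label maps some value to "yes".

* Alternative A: keep the label lookup, but spell it out and comment it.
* ylab is a hypothetical local macro name.
* Fetch the name of the value label attached to y ...
local ylab : value label y
* ... and count cases whose value is the one that label maps to "yes".
count if y == "yes":`ylab'

* Alternative B: decode on the fly and test the string value.
* ystr is a hypothetical temporary variable; it is dropped
* automatically when the do-file or program ends.
tempvar ystr
decode y, generate(`ystr')
count if `ystr' == "yes"
```

Neither version depends on the numeric coding or on the name of the
value label. Both do depend on y having a label attached and on "yes"
still being the text used in it.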
Nick
Phil Schumm wrote:
> Suppose I have a numeric variable y, and suppose it has an attached
> value label which maps 1 -> "yes". Now, suppose I want to count
> those cases for which y "is" yes (I'm clearly using "is" in the loose
> sense here, but I think the meaning is clear from the context). I
> could of course type:
>
> count if y == 1
>
> however if the encoding of the variable changes (i.e., if the value 1
> no longer refers to "yes"), this will (silently) no longer give me
> what I want. Alternatively (and in many cases better), I could write:
>
> count if y == "yes":y_label
>
> where y_label is the name of the corresponding value label. In this
> case, if the encoding changes I'm still ok (assuming that "yes" is
> still used in the label). But what if the *name* of the value label
> changes? I could do this:
>
> count if y == "yes":`:val l y'
>
> but that's a bit ugly. So, my question is, is there a more readable
> alternative to the expression above?
>
> For those interested in the use case, I am working on a large project
> involving a complicated data set and a lot of code to manage the
> data. Right now, most variables have a dedicated value label with
> the same name as the variable itself. However, in the future, we may
> wish to economize by replacing multiple labels that have equivalent
> definitions with a single label. This new label will clearly have a
> different name than the previous label(s) for at least one of the
> variables involved. What I'm trying to do is to make sure that the
> code downstream will not break if such changes are made.