This procedure, and sequential application of -encode- as earlier
mentioned, leave open a small problem: the set of value labels produced
need not end up alphabetically ordered, which may offend some ideas of
tidiness.
Sorting that out is also programmable. The resulting utility can also be
used to clone a set of value labels that is already alphabetically
ordered; or to shift the origin to 0 or any other integer, as separately
requested by Moleps.
Here is a utility in that spirit.
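*! 1.0.0 NJC 20 Nov 2008
program labvalsort
    version 8
    syntax namelist(min=2 max=2) [, start(int 1) ]
    tokenize "`namelist'"
    args exist new

    // refuse to overwrite an existing set of value labels
    capture label list `new'
    if _rc == 0 {
        di as err "labels `new' already exist"
        exit 198
    }

    label list `exist'

    // write one -label define- command per label, in alphabetical order
    preserve
    // clear lets -uselabel- replace data with unsaved changes;
    // the data were preserved just above
    uselabel `exist', clear
    tempfile file
    tempname out
    sort label
    file open `out' using `"`file'"', w
    forval i = 1/`=_N' {
        local I = `start' + `i' - 1
        file write `out' "label define `new' `I' "
        local lbl = label[`i']
        // quote the label text so that labels containing spaces survive
        file write `out' `""`lbl'", modify"' _n
    }
    file close `out'
    restore

    // run the generated commands, then show the new set
    qui do `"`file'"'
    di
    label li `new'
end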
Here are a few examples:
. label def foo 1 "d" 2 "c" 3 "b" 4 "a"

. labvalsort foo bar
foo:
           1 d
           2 c
           3 b
           4 a

bar:
           1 a
           2 b
           3 c
           4 d

. labvalsort foo bar2 , start(0)
foo:
           1 d
           2 c
           3 b
           4 a

bar2:
           0 a
           1 b
           2 c
           3 d
Note that _all_ this does is to produce a new set of value labels that
then can be used in a subsequent -encode-.
To put it in a nutshell, -labvalsort- is a solution to the following
problem:
You apply -encode- to various string variables that have overlapping
values, and you want the same set of value labels to be used in
producing a consistent set of numeric variables. But somehow or other
that set of value labels ends up out of alphabetical sequence. Thus, you
need to sort the labels. Then you can use the new sorted set in an
-encode-. (They will only by accident apply to any existing numeric
variables.)
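
As a rough sketch of that workflow, assuming -labvalsort- as posted above has been defined: the data, the variable names, and the label names raw and sorted are made up for illustration. Saving to a temporary file before calling -labvalsort- is a precaution, on the assumption that -uselabel- (which it calls) may refuse to run while unsaved changes are pending.

```stata
* Hypothetical data: two string variables with overlapping values
clear
input str1 var1 str1 var2
"d" "b"
"c" "a"
"d" "c"
end

* Sequential -encode- with a shared label set: var1 gives c = 1, d = 2,
* then var2 adds a = 3, b = 4, so the set is out of alphabetical order
encode var1, generate(tmp1) label(raw)
encode var2, generate(tmp2) label(raw)
drop tmp1 tmp2

* Precaution (an assumption, see above): leave no unsaved changes pending
tempfile snap
save "`snap'"

* Sorted copy of the label set, with codes starting at 0
labvalsort raw sorted, start(0)

* Re-encode with the sorted set; noextend makes -encode- refuse
* if a value is missing from the label set
encode var1, generate(nvar1) label(sorted) noextend
encode var2, generate(nvar2) label(sorted) noextend
list, nolabel
```

The temporary variables exist only to build up the unsorted label set; the final -encode- calls are the ones that use the sorted codes.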
Nick
n.j.cox@durham.ac.uk
Nick Cox
I agree with Martin. Here's a quick stab at it.
*! 1.0.0 NJC 19 Nov 2008
program mencode
    version 8
    syntax varlist(string) [if] [in] , stub(str) [ label(str) ]

    if index("`stub'", "@") == 0 {
        di as err "stub must contain @"
        exit 198
    }

    // test the variable names
    foreach v of local varlist {
        local new : subinstr local stub "@" "`v'"
        confirm new var `new'
    }

    marksample touse, strok
    qui count if `touse'
    if r(N) == 0 error 2000

    if "`label'" == "" local label "`: word 1 of `varlist''"

    // do it
    foreach v of local varlist {
        local new : subinstr local stub "@" "`v'"
        encode `v' if `touse', gen(`new') label(`label')
        qui compress `new'
    }
end
Comments:
-mencode- works on a string varlist.
You may specify -if- or -in-.
You must specify a -stub()-. The stub must include the character @,
which means the present varname. You should add a prefix or suffix or
both. So if your stub is "n@", the new variable names will be prefixed
by "n". -mencode- checks first that the new names implied will be OK.
You may specify a name for the new value labels. If you don't, -mencode-
will use the name of the first variable you specify.
Here's an example:
. l var?

     +---------------------------+
     | var1   var2   var3   var4 |
     |---------------------------|
  1. |    a      b      c      d |
  2. |    a      b      c      d |
  3. |    a      b      c      d |
  4. |    a      b      c      d |
     +---------------------------+

. mencode var?, stub(n@)

. l var? nvar?

     +-----------------------------------------------------------+
     | var1   var2   var3   var4   nvar1   nvar2   nvar3   nvar4 |
     |-----------------------------------------------------------|
  1. |    a      b      c      d       a       b       c       d |
  2. |    a      b      c      d       a       b       c       d |
  3. |    a      b      c      d       a       b       c       d |
  4. |    a      b      c      d       a       b       c       d |
     +-----------------------------------------------------------+

. l var? nvar?, nola

     +-----------------------------------------------------------+
     | var1   var2   var3   var4   nvar1   nvar2   nvar3   nvar4 |
     |-----------------------------------------------------------|
  1. |    a      b      c      d       1       2       3       4 |
  2. |    a      b      c      d       1       2       3       4 |
  3. |    a      b      c      d       1       2       3       4 |
  4. |    a      b      c      d       1       2       3       4 |
     +-----------------------------------------------------------+
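
One possible way to get codes that start at 0 with -mencode- as posted above is to define the value labels yourself beforehand and pass their name in -label()-, since -encode- uses an existing set of value labels when one is named. This is only a sketch: the data, the variable names q1 and q2, and the label name yesno are made up for illustration.

```stata
* Hypothetical data: two string variables sharing the same possible values
clear
input str3 q1 str3 q2
"yes" "no"
"no"  "no"
"yes" "yes"
end

* Define the shared label set first, with 0 as the origin
label define yesno 0 "no" 1 "yes"

* -encode- reuses the existing label set named in label(),
* so every new variable gets the same codes
mencode q1 q2, stub(n@) label(yesno)

list q1 q2 nq1 nq2, nolabel
```

Any string value not already in the predefined set would be added by -encode- with a new code, so this works best when all possible values are known in advance.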
Martin Weiss
Nick's contribution makes me think that it is possible to automate this
in the fashion that you describe...
moleps islon
> So apparently no easy solution to this. The perfect solution would be
> a command that accepted a varlist, automatically generated new
> variables concatenating the old variable name with a user-specified
> _name_ and labeled the values according to a predefined label set...
> That would also let the user set the start number for the codes...
> Gotta learn programming:-)
>
>>>> _______________________
>>>> ----- Original Message -----
>>>> From: "moleps islon" <moleps2@gmail.com>
>>>> To: <statalist@hsphsun2.harvard.edu>
>>>> Sent: Wednesday, November 19, 2008 8:30 PM
>>>> Subject: st: -encode- help..
>>>>
>>>>
>>>>> I've got 30 different text variables that all have the same possible
>>>>> values. Is there an easy way to encode all 30 variables using the same
>>>>> label, or do I have to do it manually? Also, is it possible, somehow,
>>>>> to tell Stata to start encoding with the value 0 instead of 1?