In November Moleps Islon started a thread about -encode-. Martin Weiss,
Sergiy Radyakin and I contributed. The start was
<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.0811/date/article-859.html>
The thread raised two main questions:
1. How to -encode- several variables at once using the same set of value
labels.
2. How to start a set of value labels with a value of 0.
Although some code was posted, no really satisfactory solution was
offered. The key difficulty is that any set of value labels produced may
be untidy, that is, not in alphabetical order.
I remembered the problem again when working on a different one with some
similarities. I think I now have a better solution.
The previous code posted was called -mencode- but no help file was ever
written and the code was not posted on SSC. If anyone optimistically
copied that code into their own filespace in the thought that it might
come in useful, they are advised that I consider it superseded. There is
still no help file but that will follow. I post the code because people
may have improvements to suggest.
I am now using the name -multencode-. -mencode- is not sufficiently
self-explanatory and in any case too close to -mvencode-.
This program is intended to solve problem 1. Problem 2 is soluble by
tweaking the code but I have to say that it doesn't interest me.
The key idea is simply to use some Mata code to define a tidy set of
value labels ahead of the call to -encode-.
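To see why defining the labels first matters, here is a minimal sketch
done by hand on made-up data. The variable names (x, y, ux, uy, nx, ny)
and label names (untidy, tidy) are invented for illustration and appear
nowhere in the original thread.

```stata
* Hypothetical toy data: two string variables with overlapping values
clear
input str1 x str1 y
"b" "c"
"c" "a"
end

* Encoding one variable after another lets -encode- extend the label set
* as it meets new values, so the codes need not end up alphabetical
encode x, gen(ux) label(untidy)
encode y, gen(uy) label(untidy)
label list untidy

* Defining the complete sorted set first keeps the codes tidy
label define tidy 1 "a" 2 "b" 3 "c"
encode x, gen(nx) label(tidy)
encode y, gen(ny) label(tidy)
label list tidy
```

In the first listing "a" should come last, with code 3, because it is
only met when y is encoded. In the second, the codes follow alphabetical
order. The program automates the -label define- step.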
Nick
[email protected]
*! 1.0.0 NJC 16 Jan 2009
program multencode
        version 9
        syntax varlist(string) [if] [in] , Generate(str) [ label(str) FORCE ]

        marksample touse, novarlist
        qui count if `touse'
        if r(N) == 0 error 2000

        * default label name: the first variable name
        if "`label'" == "" local label : word 1 of `varlist'

        * refuse to touch existing value labels unless -force- is specified
        if "`force'" == "" {
                capture label list `label'
                if _rc == 0 {
                        di as err "{p}value labels `label' already exist; " ///
                        "specify -force- option to overwrite{p_end}"
                        exit 498
                }
        }

        * check that there is one new variable name for each variable
        local nvars : word count `varlist'
        local mylist "`varlist'"
        local 0 "`generate'"
        syntax newvarlist
        local generate "`varlist'"
        local ngen : word count `generate'

        if `nvars' != `ngen' {
                di as err "`nvars' variables, but `ngen' new " ///
                plural(`ngen', "name")
                exit 198
        }

        * one variable only: plain -encode- does the job
        if `nvars' == 1 {
                encode `mylist' if `touse', gen(`generate') label(`label')
                exit 0
        }

        * define labels for the sorted distinct values across all variables
        mata : get_distinct_vals("`mylist'", "`touse'")

        forval j = 1/`J' {
                label def `label' `j' `"`lbl`j''"', modify
        }

        * -encode- each variable using the common set of value labels
        tokenize "`generate'"
        local j = 1
        foreach v of local mylist {
                encode `v' if `touse', gen(``j'') label(`label')
                qui compress ``j''
                local ++j
        }
end
mata :

void get_distinct_vals(string scalar varnames, string scalar tousename)
{
        string matrix y
        string colvector vals
        real scalar j

        st_sview(y, ., tokens(varnames), tousename)
        vals = J(0, 1, "")

        for(j = 1; j <= cols(y); j++) {
                vals = vals \ uniqrows(select(y[,j], y[,j] :!= ""))
        }

        vals = uniqrows(vals)
        _sort(vals, 1)

        for(j = 1; j <= rows(vals); j++) {
                st_local("lbl" + strofreal(j), vals[j,])
        }

        st_local("J", strofreal(rows(vals)))
}

end
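
A short usage sketch, assuming the code above has been saved as
multencode.ado somewhere on the adopath. The data and the names before,
after, nbefore, nafter and level are made up for illustration.

```stata
* Hypothetical toy data: two string variables sharing one set of categories
clear
input str6 before str6 after
"low"    "medium"
"high"   "low"
"medium" "high"
"low"    ""
end

* Encode both variables with one alphabetically ordered set of value labels
multencode before after, generate(nbefore nafter) label(level)

label list level
list, nolabel
```

The label listing should show high, low and medium coded 1, 2 and 3, and
both new variables use those codes. The empty string in the last
observation is encoded as missing.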