I think this problem, or at least a relative of it, can be attacked
using Roger Newson's -sencode-. His solution includes a certain amount
of file manipulation. When I looked at my version of the problem two
years ago I didn't find any need for that, but I haven't looked closely
enough to work out which aspects of the problem Roger solves that I
don't, or indeed vice versa.

There doesn't seem to be a help file for my resulting program,
-seqencode-, but the code is a bit more general than yours.

program seqencode, sortpreserve
    *! NJC 1.0.0 1 May 2003
    version 8
    syntax varname(string) [if] [in], Generate(str) [ Label(str) Unique ]
    local limit = cond(c(flavor) == "Small", 1000, 65536)

    quietly {
        marksample touse, strok
        count if `touse'
        if r(N) == 0 error 2000

        // variable is new?
        confirm new variable `generate'

        // label is new?
        if "`label'" == "" local label "`generate'"
        capture label list `label'
        if _rc != 111 {
            di as err "label `label' already defined"
            exit 110
        }

        if "`unique'" != "" {
            // each value `touse' mapped to its own -label-
            replace `touse' = -`touse'
            sort `touse' `_sortindex'

            // define labels
            count if `touse'
            if `r(N)' > `limit' error 134
            forval i = 1 / `r(N)' {
                label def `label' `i' ///
                    `"`= `varlist'[`i']'"', modify
            }

            gen long `generate' = _n if `touse'
        }
        else {
            // get first occurrences
            tempvar first
            bysort `touse' `varlist' (`_sortindex') : ///
                gen byte `first' = -(_n == 1 & `touse')
            sort `first' `_sortindex'

            // define labels
            count if `first'
            if `r(N)' > `limit' error 134
            forval i = 1 / `r(N)' {
                label def `label' `i' ///
                    `"`= `varlist'[`i']'"', modify
            }

            // copy values from first occurrences
            gen long `generate' = _n if `touse'
            bysort `touse' `varlist' (`generate'): ///
                replace `generate' = `generate'[1]
        }

        compress `generate'

        // assign labels
        label val `generate' `label'
        label var `generate' `"`: variable label `varlist''"'
    }
end
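
A minimal sketch of how this might be applied to the example data
quoted below, assuming -seqencode- has been defined as above (say,
saved as seqencode.ado somewhere on the adopath). As far as I can
read the code, values are numbered in order of first occurrence in
the sort order at the time of the call, so the data are sorted on
mean first. The variable names are those in the question.

    clear
    input str1 var mean
    a 1.5
    b 1.2
    b 1.2
    b 1.2
    c 1.8
    c 1.8
    end

    * the sort order at the time of the call determines the numbering
    sort mean
    seqencode var, generate(newvar)

    * show the value labels, then the underlying integers
    list var mean newvar
    list var mean newvar, nolabel

With these data newvar should be 1 for b, 2 for a and 3 for c. As no
label() is given, the value labels go into a label set that is also
called newvar. With the unique option each observation would instead
get its own integer, 1 to 6 here, labelled with its own string value.
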
Nick

Friedrich Huebler wrote:

> When a string variable is converted to a numeric variable with
> -encode-, the numeric values follow the sort order of the string
> variable. I would like to -encode- a string variable based on the
> sort order of another variable. My original data is like this:
>
> var mean
> a 1.5
> b 1.2
> b 1.2
> b 1.2
> c 1.8
> c 1.8
>
> I would like to create the variable "newvar" like this, using the
> sort order of the variable "mean":
>
> var mean newvar (label for newvar)
> b 1.2 1 b
> b 1.2 1 b
> b 1.2 1 b
> a 1.5 2 a
> c 1.8 3 c
> c 1.8 3 c
>
> My solution is shown below. Creating "newvar" itself is simple but
> there must be a better way to assign the labels.
>
> sort mean
> egen newvar = group(mean)
> lab def newvar 1 "temp"
> levels(newvar), local(levels)
> foreach l of local levels {
> gen temp = ""
> replace temp = var if newvar==`l'
> levels(temp), local(templabel)
> lab def newvar `l' `templabel', modify
> drop temp
> }
> lab val newvar newvar
>
> How can this code be improved? Thank you for your suggestions.