Nick and Roger,
Thank you. With -sencode- my original code can be replaced by a
single line.
. sencode var, gen(newvar) gsort(mean)
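
Applied to the example data from the original question quoted below, the one-line solution can be sketched as follows. This is a minimal sketch, not a definitive recipe: it assumes -sencode- has been installed from SSC (for instance with -ssc install sencode-), and the -input- block simply retypes the six example observations.

```stata
* Sketch only: assumes -sencode- is installed from SSC
clear
input str1 var mean
"a" 1.5
"b" 1.2
"b" 1.2
"b" 1.2
"c" 1.8
"c" 1.8
end

* encode var in ascending order of mean; labels are taken from var itself
sencode var, gen(newvar) gsort(mean)

* inspect the numeric codes and the value labels that were created
sort mean
list var mean newvar, nolabel
label list
```

The -nolabel- option on -list- shows the underlying integer codes, and -label list- without arguments shows every value label in memory, so the mapping between codes and strings can be checked directly.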
Friedrich
--- Roger Newson wrote:
> -sencode- can indeed solve Friedrich's problem (using the -gsort()-
> option to encode in an arbitrary order). The current version of
> -sencode- (downloadable from SSC) uses file manipulation only in 2
> places:
>
> 1. There is an initial -preserve- and a final -restore, not-, in case
> the user presses -Break- in the middle of executing -sencode-.
>
> 2. In order for -sencode- to work if the -label()- option is given as
> an existing label, -sencode- uses -label save- to save the existing
> label to a temporary file, and then uses -file- to read that temporary
> file and find the highest integer with an existing label, so that any
> additional string values encoded are allocated integers even higher. I
> couldn't find a better way, at least in Stata 7 or 8, to obtain the
> highest labelled integer for an existing label.
>
> Roger
>
>
> At 15:46 25/04/2005, Nick Cox wrote (in reply to Friedrich Huebler):
> >This problem, or at least a relative of
> >it, can be attacked, I think, using Roger Newson's -sencode-.
> >
> >His solution includes a certain amount of file manipulation.
> >In my version of the problem when I looked
> >at it two years ago I didn't find any need
> >for that, but I haven't looked closely enough to work
> >out what aspects of the problem Roger solves that I
> >don't or indeed vice versa.
> >
> >There doesn't seem to be a help file for my resulting program,
> >but the code is a bit more general than yours.
> >
> >program seqencode, sortpreserve
> >*! NJC 1.0.0 1 May 2003
> >    version 8
> >    syntax varname(string) [if] [in], Generate(str) [ Label(str) Unique ]
> >
> >    local limit = cond(c(flavor) == "Small", 1000, 65536)
> >
> >    quietly {
> >        marksample touse, strok
> >        count if `touse'
> >        if r(N) == 0 error 2000
> >
> >        // variable is new?
> >        confirm new variable `generate'
> >
> >        // label is new?
> >        if "`label'" == "" local label "`generate'"
> >        capture label list `label'
> >        if _rc != 111 {
> >            di as err "label `label' already defined"
> >            exit 110
> >        }
> >
> >        if "`unique'" != "" {
> >            // each value `touse' mapped to its own -label-
> >            replace `touse' = -`touse'
> >            sort `touse' `_sortindex'
> >
> >            // define labels
> >            count if `touse'
> >            if `r(N)' > `limit' error 134
> >            forval i = 1 / `r(N)' {
> >                label def `label' `i' ///
> >                    `"`= `varlist'[`i']'"', modify
> >            }
> >
> >            gen long `generate' = _n if `touse'
> >        }
> >        else {
> >            // get first occurrences
> >            tempvar first
> >            bysort `touse' `varlist' (`_sortindex') : ///
> >                gen byte `first' = -(_n == 1 & `touse')
> >            sort `first' `_sortindex'
> >
> >            // define labels
> >            count if `first'
> >            if `r(N)' > `limit' error 134
> >            forval i = 1 / `r(N)' {
> >                label def `label' `i' ///
> >                    `"`= `varlist'[`i']'"', modify
> >            }
> >
> >            // copy values from first occurrences
> >            gen long `generate' = _n if `touse'
> >            bysort `touse' `varlist' (`generate'): ///
> >                replace `generate' = `generate'[1]
> >        }
> >
> >        compress `generate'
> >
> >        // assign labels
> >        label val `generate' `label'
> >        label var `generate' `"`: variable label `varlist''"'
> >    }
> >end
> >
> >Nick
> >
> >Friedrich Huebler
> >
> > > When a string variable is converted to a numeric variable with
> > > -encode-, the numeric values follow the sort order of the string
> > > variable. I would like to -encode- a string variable based on the
> > > sort order of another variable. My original data is like this:
> > >
> > > var  mean
> > > a    1.5
> > > b    1.2
> > > b    1.2
> > > b    1.2
> > > c    1.8
> > > c    1.8
> > > c 1.8
> > >
> > > I would like to create the variable "newvar" like this, using the
> > > sort order of the variable "mean":
> > >
> > > var  mean  newvar  (label for newvar)
> > > b    1.2   1       b
> > > b    1.2   1       b
> > > b    1.2   1       b
> > > a    1.5   2       a
> > > c    1.8   3       c
> > > c    1.8   3       c
> > >
> > > My solution is shown below. Creating "newvar" itself is simple but
> > > there must be a better way to assign the labels.
> > >
> > > sort mean
> > > egen newvar = group(mean)
> > > lab def newvar 1 "temp"
> > > levels(newvar), local(levels)
> > > foreach l of local levels {
> > >     gen temp = ""
> > >     replace temp = var if newvar==`l'
> > >     levels(temp), local(templabel)
> > >     lab def newvar `l' `templabel', modify
> > >     drop temp
> > > }
> > > lab val newvar newvar
> > >
> > > How can this code be improved? Thank you for your suggestions.