Patrick Joly responded to my suggestion:
> > [email protected] wrote
> > >
> > > I don't know of anything quite like this, but
> > > for once a loop over observations would seem
> > > to solve the problem:
> > >
> > >     local N = _N
> > >     forval i = 1/`N' {
> > >         local val = naics[`i']
> > >         local label = labelnaics[`i']
> > >         label def naicslab `val' "`label'" , modify
> > >     }
> > >
> > > Nick
> > > [email protected]
> >
> > This will work, no doubt. The reason I hadn't considered
> > -forvalues- for this purpose was that I wanted to avoid looping
> > over observations. Such looping is not so bad with my current
> > data which contains approx. 2000 observations but may be
> > computationally intensive and slow if I try to extend the
> > procedure to situations where labels may take up to 65,536
> > different coding values -- the Stata limit for value labels. I
> > tested the loop on a dataset of 30,000 observations and it took
> > 2 minutes to complete, which is not the end of the world for the
> > use I'll make of it.
> >
> > But what escaped me in my own proposed solution below is that
> > step 2 (where I would use -file- to substitute a space-character
> > for the first comma) would itself require looping over
> > observations (!). I will probably go with Nick's solution as I
> > don't see anything else for now.
> >
> >
and Michael Blasnik then added
> In trying to automate value label creation from a numeric and
> string var, [email protected] wants to avoid an explicit loop over
> n for large datasets. Here is one approach that I think works:
>
>     gen str1 cmd=""
>     replace cmd="label define mylab "+string(nvar)+char(34)+svar+char(34)+" ,modify"
>     outsheet cmd using cmd.do, nonames noquote
>     drop cmd
>
> Then you can -run cmd-
>
> This approach just builds a string variable that has the label
> define commands and uses char(34) to insert quotes. Of course, if
> you have repetitive values of nvar you should first collapse the
> dataset down to one obs for each nvar.
>
This is a nice trick, but it still looks like a loop
over observations to me. In the code suggested
first
    local N = _N
    forval i = 1/`N' {
        local val = naics[`i']
        local label = labelnaics[`i']
        label def naicslab `val' "`label'" , modify
    }
the problem for Stata includes the overhead of
interpreting and implementing (potentially) thousands of
-label- statements; and this is true for Michael's code too.
The trade-off between
(NJC)
    managing a -forvalues- loop
    putting individual values into -local-s
and
(MB)
    -generate- string variable
    -outsheet- a .do file
    -run- a .do file
is hard for me to foresee, and it will vary somewhat
between platforms as I/O is entailed. Any advice from
Stata Corp? Anyone interested enough to experiment and report
on timings?
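For anyone who does want to experiment, a comparison might be set up
along the following lines. This is only a sketch under assumptions that
are not from the posts above: a Stata version that has the -timer-
command, a made-up dataset of 2000 observations with one distinct
integer code per observation, and a second label name (naicslab2) so
that the two methods do not overwrite each other. The space placed
between the value and the opening quote in the generated command, and
the -replace- option on -outsheet-, are also additions for safety.

```stata
* Hypothetical test data (not from the thread): 2000 distinct codes
clear
set obs 2000
generate long naics = _n
generate str20 labelnaics = "industry " + string(_n)

timer clear

* (NJC) explicit loop over observations
timer on 1
local N = _N
forvalues i = 1/`N' {
    local val = naics[`i']
    local label = labelnaics[`i']
    label define naicslab `val' "`label'", modify
}
timer off 1

* (MB) write the -label define- commands to a do-file and run it
timer on 2
generate str1 cmd = ""
replace cmd = "label define naicslab2 " + string(naics) + " " + char(34) + labelnaics + char(34) + ", modify"
outsheet cmd using cmd.do, nonames noquote replace
drop cmd
run cmd.do
timer off 2

* Timer 1 is the loop, timer 2 is the do-file route
timer list
```

Raising the number in -set obs- toward the 65,536 limit mentioned
above should show how each route scales, and the do-file route will
also depend on disk speed.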
Nick
[email protected]