[email protected] wrote
>
> I don't know of anything quite like this, but
> for once looping over observations would seem
> to solve the problem:
>
> local N = _N
> forval i = 1/`N' {
>     local val = naics[`i']
>     local label = labelnaics[`i']
>     label def naicslab `val' "`label'" , modify
> }
>
> Nick
> [email protected]
This will work, no doubt. I hadn't considered -forvalues- for this purpose
because I wanted to avoid looping over observations. Such looping is not so
bad with my current data, which contain approx. 2000 observations, but it may
be computationally intensive and slow if I try to extend the procedure to
situations where labels may take up to 65,536 different coding values -- the
Stata limit for value labels. I tested the loop on a dataset of 30,000
observations and it took 2 minutes to complete, which is not the end of the
world for the use I'll make of it.

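To check how the loop scales on a given dataset, something like the following
could be wrapped around it. This is only a sketch: it assumes a Stata version
that provides -timer-, that naics holds nonmissing integer codes, and the
timer number 1 is arbitrary.

```stata
* Sketch only: time the observation loop with -timer-.
* Assumes naics (numeric, nonmissing integers) and labelnaics (string).
timer clear 1
timer on 1
forvalues i = 1/`=_N' {
    local val = naics[`i']
    local lab = labelnaics[`i']
    label define naicslab `val' `"`lab'"', modify
}
timer off 1
* Report the elapsed time, in seconds, recorded by timer 1
timer list 1
```
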
But what escaped me in my own proposed solution below is that step 2 (where
I would use -file- to substitute a space character for the first comma)
would itself require looping (!), since -file- processes a file one line at
a time and there is one line per observation. I will probably go with Nick's
solution as I don't see anything else for now.

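For the record, a minimal sketch of the do-file route might look like the
following. It skips -outsheet- and writes the -label define- lines directly
with -file write-, so it is a variant of the steps in my original posting
rather than a literal implementation of them. The file name
naicslab_define.do is just a placeholder, and the code assumes naics holds
nonmissing integer codes. It still loops over observations, so I would not
expect it to be faster than Nick's loop.

```stata
* Sketch only: write one -label define- line per observation to a do-file,
* then run that do-file. Assumes naics is numeric (nonmissing integers)
* and labelnaics is a string variable.
* The file name naicslab_define.do is a placeholder.
tempname fh
file open `fh' using "naicslab_define.do", write text replace
forvalues i = 1/`=_N' {
    local val = naics[`i']
    local lab = labelnaics[`i']
    * compound quotes keep labels containing double quotes intact
    file write `fh' `"label define naicslab `val' `"`lab'"', modify"' _n
}
file close `fh'
do "naicslab_define.do"
```

The resulting file has the same layout as one produced by -label save-, so it
can be kept and rerun after merging the codes into other datasets.
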
My initial posting was
> [email protected]
> >
> > Has anyone written a routine to define a value label mapping based on
> > two variables (one numeric, one string)? Several searches using findit
> > did not turn up anything. Normally, I wouldn't bother asking but I would
> > find it surprising if no one had tried to automate this type of task
> > before.
> >
> > For instance, I have many data files comprising industrial
> > classification codes which I routinely merge to data for various
> > projects. One such example is my file for the NAICS (North American
> > Industry Classification System) Codes, which contains the numerical
> > variable _naics_ and string variable _labelnaics_:
> >
> > . list
> >
> >        naics   labelnaics
> >   1.      11   Agriculture, Forestry, Fishing and Hunting
> >   2.     111   Crop Production
> >   3.    1111   Oilseed and Grain Farming
> >   4.   11111   Soybean Farming
> >   5.  111110   Soybean Farming
> >   6.   11112   Oilseed (except Soybean) Farming
> >   7.  111120   Oilseed (except Soybean) Farming
> >   8.   11113   Dry Pea and Bean Farming
> > ...
> >
> > I would like to define a value label such that each value of _naics_
> > would map to the corresponding value of _labelnaics_. Encode will not do
> > the trick here since it would be equivalent to:
> >
> > label define naicslab 1 "Agriculture, Forestry, Fishing and Hunting"
> > label define naicslab 2 "Crop Production", add
> > label define naicslab 3 "Oilseed and Grain Farming", add
> > ...
> >
> >
> > I could write my own .ado to create a do file similar to those generated
> > by -label save ...-. This could be done by:
> > - sending the data to a comma-delimited file via -outsheet-
> > - using the -file- command, replacing the first comma by a space
> >   character
> > - prefixing each line by -label define-
> > - suffixing by -, modify-
> >
> > But before I do that, has anyone seen such an .ado?