Brendan Halpin wrote:
> Is it normal that recode should be very slow with large numbers of
> "rules"? I find a recode statement with >400 value assignments adds
> something of the order of a minute to a job.
>
> N is moderately large (75k) but I wonder if recode is linear in N
> but non-linear in the number of rules or assignments.
>
> If so, any tips for efficiency? Break up the command into several
> smaller recodes? Ship out the equivalences to a lookup table and
> merge?
I don't know how the run time depends on N and the number of rules, but be
aware that -recode- is merely a wrapper for -generate- and -replace-. That
is, -recode- interprets what the user says, constructs the equivalent
-generate- and -replace- commands, and lets Stata process them. It is
therefore always faster in Stata to write down the equivalent -generate- and
-replace- commands yourself, but you will probably want to add the writing
time to the processing time.
Personally I never use -recode-. Instead I build -generate-/-replace-
statements with -inlist()- and -inrange()-, which are fast to write _and_
fast to process.
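A minimal sketch of that style, assuming a numeric source variable; the names
oldvar and newvar and the particular codes are made up for illustration. It
is meant to stand in for something like
-recode oldvar (1 2 5 = 1) (10/19 = 2) (20/29 99 = 3) (else = .), generate(newvar)-:

```stata
* oldvar and newvar are hypothetical names; the codes are illustrative only
generate byte newvar = .

* a list of single values maps to one new code
replace newvar = 1 if inlist(oldvar, 1, 2, 5)

* a closed range maps to one new code
replace newvar = 2 if inrange(oldvar, 10, 19)

* a range plus a stray single value
replace newvar = 3 if inrange(oldvar, 20, 29) | oldvar == 99
```

Observations that match none of the conditions, including those where oldvar
is missing, keep the missing value assigned in the first line.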
Also note Stata tip 16, "Using input to generate variables" (SJ 5,1 134-135),
and Kantor/Cox, "Depending on conditions: a tutorial on the cond()
function" (SJ 5,3 405-412).
Shipping out the equivalences to a lookup table and merging is always worth
thinking about.
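As a rough sketch of the lookup-table route: the file names lookup and
mydata, the variable names oldvar and newvar, and the three example rows are
all hypothetical, and the -merge m:1- syntax assumes Stata 11 or later (older
releases use a different -merge- syntax and need sorted data).

```stata
* Build the lookup table: one row per old code (example rows are made up)
clear
input oldvar newvar
1 1
2 1
5 1
end
save lookup, replace

* Attach the new codes to the main data (mydata is a hypothetical file)
use mydata, clear
merge m:1 oldvar using lookup, keepusing(newvar) keep(master match) nogenerate
```

Observations whose oldvar has no row in the lookup table are kept, with
newvar missing. With several hundred rules the table could also be typed in a
spreadsheet or text file and read in, rather than entered via -input-.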
Many regards
uli
--
[email protected]
+49 (030) 25491-361