I reported the 503-character ":word of" limit to Stata Tech Support in
Nov. 2005 and their answer was:
"It turns out that the -:word of- extended macro
function pulls 'tokens' off of the string as words,
and Stata's internal parser has a limit of 503
characters for a 'token'. StataCorp is looking into
changing the -:word of- extended macro function to
not be affected by this limit."
Also see:
http://www.stata.com/statalist/archive/2005-11/msg00643.html
http://www.stata.com/statalist/archive/2005-11/msg01031.html
I'm surprised that it is not fixed yet. However, there are easy
workarounds: It is always possible to use -tokenize- or -gettoken-
instead of -:word of-. (And since Stata 8 you can use the -:list
sizeof- extended macro function instead of -:word count-.)
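For example, a minimal sketch of the -gettoken- route (the macro names -rest-, -word-, -n- and -len- are only illustrative, and I am assuming a Stata version that has the -:length local- extended macro function):

```stata
* Build a 1040-character string without blanks, as in Roger's example
local head ""
forvalues i = 1/104 {
    local head "`head'1234567890"
}

* Peel off one word at a time with -gettoken- instead of -:word # of-
local rest `"`head'"'
local n 0
gettoken word rest : rest
while `"`word'"' != "" {
    local ++n
    * -:length local- measures the macro's contents directly
    local len : length local word
    disp "word `n': `len' characters"
    gettoken word rest : rest
}

* Word count without -:word count- (Stata 8 or newer)
local nword : list sizeof head
disp "`nword'"
```

If -gettoken- and -:list sizeof- are indeed unaffected by the token limit, the loop should see the string as a single long word rather than splitting it after 503 characters.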
Note that the -:subinstr- extended macro function has a similar limit,
502 characters this time (-:subinstr- aborts with error if the
replacement strings are longer than this). Unfortunately there seems
to be NO WORKAROUND in this case (apart from switching to Mata). And
this isn't fixed yet either...
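A rough sketch of the Mata route, assuming the strings are held in local macros (-old-, -repl- and -new- are made-up names for illustration):

```stata
* A replacement string well over 502 characters
local repl ""
forvalues i = 1/60 {
    local repl "`repl'1234567890"
}
local old "before X after"

* Mata's subinstr() does the substitution;
* st_local() reads the input macros and writes the result to -new-
mata: st_local("new", subinstr(st_local("old"), "X", st_local("repl")))
disp `"`new'"'
```

This needs Stata 9 or newer, since it relies on Mata.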
ben
On 2/18/07, Newson, Roger B wrote:
Thanks to Kit for demonstrating that the bug (or feature) also occurs
under Linux. I think I have now found the cause of the problem in a bug
(or feature) of the extended macro function
word # of string
(see online help for extended_fcn). For some reason, this indeed seems
to have a 503-character length limit for the output word, and to treat
characters 504 onwards of a word in a string as being part of subsequent
words of the string. The following code demonstrates this:
**** BEGINNING OF Stata CODE - CUT HERE
local head ""
forvalues i = 1/104 {
    local head "`head'1234567890"
}
disp `"`head'"'
local head1: word 1 of `head'
disp `"`head1'"'
local head2: word 2 of `head'
disp `"`head2'"'
local head3: word 3 of `head'
disp `"`head3'"'
local head4: word 4 of `head'
disp `"`head4'"'
local nword: word count `head'
disp "`nword'"
**** END OF Stata CODE - CUT HERE
This shows that Stata thinks that the local macro -head- contains 2
words with 503 characters each and 1 word with 34 characters, instead of
containing 1 word with 1040 characters. This seems to suggest that there
is a 503-character limit on the length of a string token. I cannot find
any documentation of this limit in -help limits- or in -[P] macro-.
I would definitely like to know whether this limit is a bug or a
feature, and if there are any plans to fix it in the near future.
-listtex- uses the extended macro functions
word count string
and
word # of string
to be able to produce multiple header or footer lines if the user
specifies these in the -headlines()- or -footlines()- option. It is a
pity that this ability currently comes at the price of limiting a header
or footer line length to 503 characters. I would like to be able to fix
this problem.
Best wishes
Roger