There are four new string functions: match(),
subinstr(), subinword(), and
reverse().
match(s1,s2)
returns 1 if string s1 "matches"
s2 and 0 otherwise. In s2, * means that zero or more
characters go here, and ? means that exactly one character goes
here. For instance, match("this","*hi*") is true. To match a
literal \, ?, or * character, code \\, \?, or
\* in s2.
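
The wildcard rules above can be checked interactively. The following is a minimal sketch for a do-file, assuming a Stata release in which the function is available under the name match() as written here; the test strings other than "this" and "*hi*" are illustrative and do not come from the text.

```stata
* Each display shows 1 (match) or 0 (no match).

* "*" stands for zero or more characters, so "*hi*" matches "this"
display match("this", "*hi*")
* -> 1

* "?" stands for exactly one character
display match("this", "th?s")
* -> 1

* The pattern "th?" describes a 3-character string; "this" has 4
display match("this", "th?")
* -> 0

* "\*" asks for a literal asterisk rather than a wildcard
display match("a*b", "a\*b")
* -> 1
```

Because the result is 0 or 1, the function can also be used directly in an if qualifier to select observations whose string variable fits a pattern.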
subinstr(s1,s2,s3,n)
and
subinword(s1,s2,s3,n)
replace the first n occurrences of s2 in
s1 with s3.
subinword() counts only occurrences that are whole
words. In either function, n may be specified as missing (.), meaning
that all occurrences are replaced. For instance,
subinword("measure me","me","you",.) returns "measure
you", and subinstr("measure me","me","you",.) returns
"youasure you".
reverse(s) returns s with its characters in reverse order.
For instance, reverse("string") returns "gnirts".
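
The difference between subinstr() and subinword(), and the effect of n, may be easier to see side by side. This sketch reuses the strings from the text; the call with n equal to 1 is an added illustration, and the function names are assumed to be available as written here.

```stata
* n is missing (.), so every occurrence is replaced
display subinstr("measure me", "me", "you", .)
* -> youasure you

* subinword() leaves the "me" inside "measure" alone
display subinword("measure me", "me", "you", .)
* -> measure you

* n = 1 replaces only the first occurrence
display subinstr("measure me", "me", "you", 1)
* -> youasure me

* reverse() returns the characters in the opposite order
display reverse("string")
* -> gnirts
```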
A fifth new string function is intended mainly for programmers:
abbrev(s,n) returns the
n-character ~-abbreviation of the variable name
s, that is, a shortened name in which ~ marks the omitted
characters. abbrev(s,12) is the function used
throughout Stata to make 32-character names fit into 12 spaces.
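
As a rough illustration of abbrev(), the sketch below shortens a made-up 32-character name to 12 characters. The variable name is hypothetical and chosen only for its length. The exact placement of the ~ is left for the function to decide, so no abbreviated string is shown here; only the resulting length is checked.

```stata
* A hypothetical 32-character variable name (not from the text)
local longname "a_very_long_variable_name_of_32c"

* Show the 12-character ~-abbreviation
display abbrev("`longname'", 12)

* The abbreviation occupies exactly 12 characters
display length(abbrev("`longname'", 12))
* -> 12
```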
The new functions inrange() and
inlist() make choosing the right observations easier.
inrange() handles missing values elegantly when
selecting subsamples such as a <= x <= b.
inrange(x,a,b) answers the
question, "Is x known to be in the range a to b?"
inrange(.,1000,2000) is false because a missing x is not
known to be in any range. Either a or
b may be missing: inrange(x,a,.)
answers whether it is known that x >= a, and
inrange(x,.,b) answers whether it is
known that x <= b. inrange(.,.,.)
returns 0, which, if you think about it, is inconsistent but is
probably what you want.
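
The cases described above can be laid out as a small truth table. This is a sketch with illustrative numbers; the last line is an added comparison showing why a plain inequality is risky with missing values, since Stata treats missing as larger than any number.

```stata
* x is inside the range
display inrange(1500, 1000, 2000)
* -> 1

* x is missing, so it is not known to be in the range
display inrange(., 1000, 2000)
* -> 0

* Upper bound missing: only x >= 1000 is checked
display inrange(1500, 1000, .)
* -> 1

* Lower bound missing: only x <= 2000 is checked
display inrange(500, ., 2000)
* -> 1

* Everything missing
display inrange(., ., .)
* -> 0

* For comparison: a plain inequality treats missing as very large
display (. >= 1000)
* -> 1
```

The final comparison is the case that inrange() protects against: a condition such as x >= 1000 silently includes observations where x is missing, whereas inrange(x,1000,.) excludes them.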
inlist(x,a,b,...) returns 1 if x equals any of
the remaining arguments and 0 otherwise; it selects
observations for which x=a or x=b or ....
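
A short sketch of inlist() as shorthand for a chain of equality tests follows. The numbers are illustrative, and rep78 in the final comment is a hypothetical variable name used only to show the typical if-qualifier form.

```stata
* 3 is one of the listed values
display inlist(3, 1, 3, 5)
* -> 1

* 4 is not
display inlist(4, 1, 3, 5)
* -> 0

* Typical use in an if qualifier (rep78 is a hypothetical variable):
*     list if inlist(rep78, 1, 2, 5)
* which is equivalent to
*     list if rep78==1 | rep78==2 | rep78==5
```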
Other functions have been added. _by(),
_bylastcall(), and _byindex() help
programs and ado-files support the by
varlist: prefix.