st: RE: strings
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: RE: strings
Date: Wed, 1 Feb 2012 12:23:08 +0000
Answers embedded below. In general, "forcing" Stata is not a good way to think!
Nick
[email protected]
KOTa
I have 2 questions:
1.
I am using the -split- command to divide my string variables into parts. Is
there any way to force the split only at the last occurrence of the split
sequence?
E.g. if strings are like "ABC BLINCAR COMPANY INC" and I want to remove
the "INC" from all the strings: if I use split, p(INC) I will get "ABC
BL" instead of "ABC BLINCAR COMPANY".
NJC>>> That's not really a -split- problem. "INC" is not a string separator here. I am credited as the original author of -split- so I can tell you that it was not designed for this.
The easiest recipe (!) I can think of is
gen reversed = reverse(company)
replace reversed = subinstr(reversed, "CNI ", "", 1) if substr(reversed, 1, 4) == "CNI "
replace company = reverse(reversed)
That zaps " INC" if and only if it is the last four characters of your variable.
The three commands above could be telescoped into one with some loss of clarity.
I can believe that this may not delete all you want to delete.
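As a rough sketch of how the three commands above might be telescoped into one, here are two alternative forms. Both assume the string variable is called company, as in the recipe, and neither is necessarily the form the author had in mind. The second avoids reversing altogether by testing the last four characters directly. Run one or the other, not both.

```stata
* Alternative 1: the reverse/subinstr recipe collapsed into one command.
* (/// continues a command across lines in a do-file.)
replace company = reverse(subinstr(reverse(company), "CNI ", "", 1)) ///
    if substr(reverse(company), 1, 4) == "CNI "

* Alternative 2: no reversing. substr() with a negative start counts from
* the end of the string, so substr(company, -4, 4) is the last 4 characters.
replace company = substr(company, 1, length(company) - 4) ///
    if substr(company, -4, 4) == " INC"
```

Either way, only a trailing " INC" is removed, so "ABC BLINCAR COMPANY INC" becomes "ABC BLINCAR COMPANY" and the "INC" inside "BLINCAR" is left alone. Variants such as "INC." or "Inc" would need their own conditions.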
2. Is there any way to force Stata to ignore letter case when
comparing strings?
E.g. if I merge 2 files by a string variable, I want the name "ROGER"
and the name "Roger" to be recognized as the same string.
NJC>>> In general, you have to clean up inconsistencies before -merge-. -merge- has a difficult enough job as it is!
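One way the clean-up before -merge- might look is sketched below. The file names file1 and file2, the variable name, and the new variable name_key are all hypothetical, and the 1:1 merge assumes that the cleaned key identifies observations uniquely in both files, which -isid- checks.

```stata
* Build a case-insensitive key in the using file (hypothetical file2.dta)
use file2, clear
gen name_key = upper(trim(name))   // "Roger " and "ROGER" both become "ROGER"
isid name_key                      // stops with an error if keys are not unique
tempfile using_clean
save `using_clean'

* Build the same key in the master file (hypothetical file1.dta), then merge
use file1, clear
gen name_key = upper(trim(name))
isid name_key
merge 1:1 name_key using `using_clean'
```

Keeping the original name variable and merging on a separate key means the original spelling is still available afterwards. If both files carry a variable called name, the master file's values are the ones kept by default.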