How about if you count the letters from the end instead of from the front?
The description for -substr- says:
substr(s,n1,n2) returns the substring of s starting at n1 for a length of n2.
If n1<0, the starting position is interpreted as distance from the end of the
string. If n2 is missing (.), the remaining portion of the string is
returned.
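In both of your examples "HMV" is followed only by the closing parenthesis, so it starts four characters from the end. It should therefore work if you write: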
gen str3 threeletters=substr(originalstring,-4,3)
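To check this on the two example values from the question, here is a minimal sketch. The variable names lpar, rpar and code are only illustrative, and the second approach assumes every value contains exactly one "(...)" pair; it uses strpos(), which older Stata releases call index(). The first approach assumes there are no trailing blanks after the closing parenthesis.

```stata
clear
input str20 originalstring
"CMLST(HMV)"
"COMSTL GTRI(HMV)"
end

* Approach 1: count back from the end.
* Position -4 is the "H" in both values; take 3 characters from there.
generate str3 threeletters = substr(originalstring, -4, 3)

* Approach 2 (assumption: the code always sits between "(" and ")").
* Locate the parentheses and take whatever lies between them.
generate lpar = strpos(originalstring, "(")
generate rpar = strpos(originalstring, ")")
generate code = substr(originalstring, lpar + 1, rpar - lpar - 1)

list originalstring threeletters code
```

The second approach does not depend on the code being three letters long or on it sitting at the very end of the string, so it may be the safer choice if the other observations are less regular than these two.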
Hope it helps.
Cheers,
Ada
Dev Vencappa wrote:
Dear users,
I have a dataset that contains several string variables. Suppose one string variable contains the following values in one of the observations:
CMLST(HMV)
Another string variable contains:
COMSTL GTRI(HMV)
I want Stata to create a new variable that retrieves only the three-letter word "HMV" from the string variable. I understand that the substr() function does that, but because the starting point of the word HMV is not the same for every variable, it would be difficult to identify the correct starting position from which to read the string and retrieve the three letters HMV.
Can somebody help, please? I have tried the other string functions but could not find one that does what I want.
Many thanks
Dev
--
Ada Ma
Research Assistant
Department of Economics
University of Aberdeen Business School
Edward Wright Building F55
http://www.abdn.ac.uk/economics/
http://www.abdn.ac.uk/~pec187/firstpage.htm
http://www.student.ncl.ac.uk/a.h.y.ma/firstpage.htm
Tel: +44 1224 273417
Fax: +44 1224 272181