Martyn, this may get you started.
-Steve
**************************CODE BEGINS**************************
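clear
drop _all
* Example IDs: one well-formed (LLLNNLLL) and two miscoded
input str8 id
INR80TMA
IR1NT
INR888
end
list
* Value label for the letter indicator: 1 = letter, 0 = not a letter
label define alpha 1 "str" 0 "num"
* Length of each ID; anything other than 8 is miscoded
gen id_l = length(id)
tab id_l
**** assumes IDs have length <= 8 ****
forvalues i = 1/8 {
    * one-character variable holding the i-th character of id (empty past the end)
    gen v`i' = substr(id, `i', 1)
    * 1 if that character is a letter, 0 if not, missing if there is no character
    gen alpha`i' = regexm(v`i', "[a-zA-Z]") if v`i' != ""
    label values alpha`i' alpha
}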
***************************CODE ENDS***************************
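If the miscoded IDs still follow the letters, digits, letters order, a regular expression with subexpressions may pull out the three parts directly. This is only a sketch: the variable names pre, digits, post and badform are made up for illustration, and the pattern assumes that every usable ID is one or more letters, then one or more digits, then one or more letters. IDs that do not fit that pattern (such as INR888, which has no trailing letters) are left empty and flagged so they can be edited by hand.

```stata
* Sketch only: pre, digits, post and badform are hypothetical variable names.
* Assumed pattern: letters, then digits, then letters, and nothing else.
gen str8 pre    = regexs(1) if regexm(id, "^([a-zA-Z]+)([0-9]+)([a-zA-Z]+)$")
gen str8 digits = regexs(2) if regexm(id, "^([a-zA-Z]+)([0-9]+)([a-zA-Z]+)$")
gen str8 post   = regexs(3) if regexm(id, "^([a-zA-Z]+)([0-9]+)([a-zA-Z]+)$")
* Flag IDs that do not match the assumed pattern
gen byte badform = !regexm(id, "^([a-zA-Z]+)([0-9]+)([a-zA-Z]+)$")
list id pre digits post badform
```

This does not check that the parts have the intended lengths (3, 2 and 3), so an ID like IR1NT is split but would still need checking against id_l.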
On Mar 3, 2009, at 7:58 AM, Sherriff, Martyn wrote:
I have a data set which should have a string identifier of the form
LLLNNLLL, such as INR80TMA, from which I can extract the first 3
letters, 2 numbers and last 3 letters as sub-identifiers.
Unfortunately some of the data have been miscoded, such as IR1NT.
How can I extract the letter, number, letter code from this, or is
it a case of editing all the codes to the correct format? I am
using Stata 10.
Many thanks,
Martyn