I think you want the -ignore()- option of -destring-. You could also transform the original variables:
* subinstr() returns a string, so wrap it in real() to get a numeric variable
gen score2 = real(subinstr(score, "%", "", 1))
gen samplesize2 = real(subinstr(samplesize, " patients", "", 1))
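For the -ignore()- route, here is a minimal sketch. It assumes the variables are named Score and SampleSize as in the question (Stata is case sensitive, so adjust if yours differ); the new variable names score_num and n_patients are just placeholders.

```stata
* remove the percent sign and store the numeric result in a new variable
destring Score, generate(score_num) ignore("%")

* ignore() strips the listed characters one at a time, so the string below
* removes the space and every letter that occurs in " patients"
destring SampleSize, generate(n_patients) ignore(" patients")
```

Because -ignore()- works character by character, it would also strip those letters from any other text in the variable, so it is worth checking the new variables with -list- or -summarize- afterwards.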
Hope this helps.
Howie
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Taylor Cook
Sent: Friday, August 21, 2009 11:04 AM
To: [email protected]
Subject: st: destring numeric and non-numeric data
I am working with CMS's Hospital Compare data for the first time. One
of the sets lists recommended treatment for a condition (e.g., aspirin
for heart attack), the percent of patients with the condition that
received the treatment (Score), and the total number of patients who
presented with the condition (SampleSize).
The variables I am interested in, Score and SampleSize, are both
string variables and, here is the tricky part, CMS recorded the data
with numeric and non-numeric symbols. For example, all of the scores
are "95%" and the sample size is "106 patients." These percent symbols
and the word "patient" have made it difficult to destring. Any
suggestions would be greatly appreciated.
Thanks,
Taylor
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/