Re: st: Re: removing characters from string-formatted variables mixed in with numeric-formatted variables
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Re: removing characters from string-formatted variables mixed in with numeric-formatted variables
Date: Fri, 22 Jun 2012 19:56:42 +0100

-destring- just ignores numeric variables anyway, so the extra code
to select only the string variables is not needed. That explains why
the difference between the two solutions makes no difference.
Nick
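
To illustrate the point about -destring- skipping numeric variables, here is a minimal sketch on a toy dataset. The values are made up for illustration; only the variable names are borrowed from the original question. It assumes the stray characters are plain single quotes, as described in the thread.

```stata
* Toy data (hypothetical values): two string variables with stray
* single quotes and one variable that is already numeric
clear
input str4 bedrms byte region str4 smsa
"'-9'" 1 "'2'"
"'1'"  2 "'3'"
"'2'"  3 "'-9'"
end

* One pass over every variable; destring reports that region is
* already numeric and leaves it untouched instead of stopping
destring _all, replace ignore("'")

* Checks: all three variables are numeric and the values survived
confirm numeric variable bedrms region smsa
assert bedrms == -9 in 1
assert smsa == -9 in 3
describe
```

If the checks pass silently, the mixed ordering of string and numeric variables did no harm, which is why selecting the string variables first with -ds- is optional here.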
On Fri, Jun 22, 2012 at 5:06 PM, Doug Hess <[email protected]> wrote:
> Somebody replied off list to me with the following code which worked
> wonderfully. The -ds- command is for listing "variables matching name
> patterns or other characteristics." (See -help ds- ). This is only a
> minor difference from the suggestion by Elan below, which also works.
>
> ds *, has(type string)
> display "`r(varlist)'"
> destring `r(varlist)', replace ignore("'")
>
> On Fri, Jun 22, 2012 at 11:28 AM, Cohen, Elan <[email protected]> wrote:
>> Doug,
>>
>> I believe the following one-liner should work for you:
>>
>> destring *, replace ignore("'")
>>
>> HTH,
>>
>> - Elan
>>
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Doug Hess
>> Sent: Friday, June 22, 2012 11:03
>> To: [email protected]
>> Cc: Doug Hess
>> Subject: st: removing characters from string-formatted variables mixed in with numeric-formatted variables
>>
>> Hello,
>>
>> I imported into Stata from text files a data set of survey responses
>> for a large national survey. Many of the variables have single quotes
>> around numeric values. For instance, a variable may include the values
>> '-9', '1', '2' instead of simply -9, 1, 2. However, not every
>> variable includes these characters for numeric values. (Not sure why!)
>> Thus, Stata formats some variables as string and some as numeric
>> during the import (using the import "text data from a spreadsheet"
>> menu). However, the order of the variables is not strings first,
>> numeric second. It's all hodgepodge.
>>
>> I want to remove all the stray single quote marks. So, after poking
>> around on Statalist I tried using the -replace- command, the
>> -subinstr- function, and a loop:
>>
>> local abc = "control bedrms region smsa metro3 lmed lmeda lmedb fmr"
>> /* Note I truncated this list, there are dozens of variables in the
>> dataset I wish to clean up. */
>> foreach varname of local abc {
>>     replace `varname'=subinstr(`varname',"'","",.)
>>     destring `varname', replace
>> }
>>
>> However, this loop stops when it runs into a variable formatted as
>> numeric. Given that there are dozens of these variables, I don't want
>> to use the -order- command one by one to put the string variables
>> first (or last). Is there a way to use the format of the variables
>> with -if- to limit the -order- command or -replace- command? Or other
>> ideas?
>>
>> Thank you. (Note: I subscribe to the list's digest mode, so cc'ing me
>> on any responses would be helpful.)
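
For readers who want to keep the loop from the original question, one possible way to make it skip numeric variables is to test each variable's storage type before touching it. The sketch below uses -capture confirm string variable- as that test. This is an assumption about how one might guard the loop, not something proposed in the thread, and the toy data are hypothetical.

```stata
* Toy data (hypothetical values) mixing string and numeric variables
clear
input str4 bedrms byte region str4 smsa
"'-9'" 1 "'2'"
"'1'"  2 "'3'"
"'2'"  3 "'-9'"
end

foreach varname of varlist _all {
    * confirm exits with a nonzero return code unless the variable
    * is a string; capture suppresses the error and stores it in _rc
    capture confirm string variable `varname'
    if _rc == 0 {
        replace `varname' = subinstr(`varname', "'", "", .)
        destring `varname', replace
    }
}

* Check: every variable is numeric after the loop
confirm numeric variable bedrms region smsa
```

The guard means -replace- with -subinstr()- is applied only to string variables, so the loop no longer stops at the first numeric one. The -destring- one-liner with ignore() from the replies remains the shorter route.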