The Stata listserver

st: RE: problem referencing certain characters


From   "Nick Winter" <[email protected]>
To   <[email protected]>
Subject   st: RE: problem referencing certain characters
Date   Thu, 21 Nov 2002 16:51:32 -0500

> -----Original Message-----
> From: [email protected] [mailto:[email protected]] 
> Sent: Thursday, November 21, 2002 4:30 PM
> To: [email protected]
> Subject: st: problem referencing certain characters
> 
> 
> I am writing a program to clean up some data and I wish to delete certain
> characters from my data.  I have been using the subinstr command with the
> following syntax:
>      replace `var' = subinstr(`var',"(removed character)","",.);
> This works exactly as I intend, except for the double quotes character (")
> and the question mark (?).  I referenced the Stata manual and although the
> issue is addressed, their solution does not seem to work for me.  They say
> to use the compound double quotes (`" "').  But doing so makes no
> difference for me.  Stata still is unable to correctly read the following
> code:
>      replace `var' = subinstr(`var',`"""',"",.);
>      replace `var' = subinstr(`var',"?","",.);
> I have no idea why it won't work for the question mark either.
> If anyone has experienced similar problems or knows of a solution, please
> inform.
> Thank you,
> Kyle

I don't have any trouble with the quote characters:


. clear

. set obs 1
obs was 0, now 1

. gen str10 junk=`"abc"abc"'

. list

           junk
  1.    abc"abc

. gen str20 j2=subinstr(junk,`"""',"",.)

. list

           junk                    j2
  1.    abc"abc                abcabc

. 
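If the compound quotes are awkward to type inside a do-file, one possible alternative is to avoid quoting altogether: char(34) is the double-quote character, so it can be handed to subinstr() directly. A minimal sketch using a literal test string rather than your data:

```stata
* char(34) returns the double-quote character, so no quoting of " is needed
* in the search argument; the test string itself still uses compound quotes
display subinstr(`"abc"abc"', char(34), "", .)
* should display: abcabc
```

The same second argument should work inside a replace, e.g. subinstr(`var', char(34), "", .).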

The question marks, however, may not be working for you because Stata
displays many unprintable characters as a question mark:

. replace junk=char(22)
(1 real changes made)

. list

           junk                    j2
  1.          ?                abcabc

. list if junk=="?"

           junk                    j2
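To find out which character is actually hiding behind a displayed "?", something along these lines might help. It is only a sketch: it assumes the string variable is called junk, as above, and it uses strpos(), which I believe is the name in recent Stata releases (older releases call the same function index()).

```stata
* Report which non-displayable character codes occur anywhere in junk.
* Codes 1-31 and 127-255 are checked; char(32), the space, is skipped.
foreach i of numlist 1/31 127/255 {
    quietly count if strpos(junk, char(`i')) > 0
    if r(N) > 0 {
        display "char(`i') occurs in " r(N) " observation(s)"
    }
}
```

If nothing is reported and you still see a "?", the data presumably contain a real question mark, char(63).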

I don't know if the character mappings vary across OSs, but on my Wintel
machine, the various non-displayable characters are numbered 1-31 and
127-255 (char(32) is the ordinary space, which you presumably want to
keep).  So you could do this:

foreach var in [VARIABLE LIST] {
	* control characters; stop at 31 so that spaces, char(32), survive
	forval i=1/31 {
		local x = char(`i')
		quietly replace `var' = subinstr(`var',`"`x'"',"",.)
	}
	* DEL and the upper half of the character table
	forval i=127/255 {
		local x = char(`i')
		quietly replace `var' = subinstr(`var',`"`x'"',"",.)
	}
}

This would strip all the unprintable characters out, one-by-one.
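For anyone who wants to try the idea on a throwaway dataset before running it on real data, here is a small self-contained check. It is a sketch under my own assumptions: the test value is made up, char() is passed straight to subinstr() instead of going through a local macro, and the loop stops at 31 so that the space, char(32), is left alone.

```stata
clear
set obs 1
* Hypothetical test value: "ab", one control character, then "c d"
generate str10 junk = "ab" + char(22) + "c d"

* One loop over both ranges of character codes
foreach i of numlist 1/31 127/255 {
    quietly replace junk = subinstr(junk, char(`i'), "", .)
}

* The control character is gone; the space is still there
assert junk == "abc d"
```

If the assert passes silently, the stripping behaved as intended on the test value.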

--Nick Winter


> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 


