Statalist The Stata Listserver



Re: st: RE: help cleaning string variable


From   Mario Macis <[email protected]>
To   [email protected]
Subject   Re: st: RE: help cleaning string variable
Date   Mon, 27 Feb 2006 19:21:38 -0600

Great! That's exactly what I needed. Thank you so much, Jennifer.
Best,
Mario

---- Original message ----
>Date: Mon, 27 Feb 2006 16:48:03 -0800
>From: "Marino, Jennifer" <[email protected]>  
>Subject: st: RE: help cleaning string variable  
>To: <[email protected]>
>
>I don't know if it's necessary in Stata 9 - might have been put into the
>official egen package if it was used enough - but for Stata 8 the
>fabulous ado package -egenmore-, by Dr. Cox, has a tailor-made option
>for egen called "sieve":
>
>Excerpt from the helpfile:
>
>sieve(strvar) , { keep(classes) | char(chars) | omit(chars) }
>    selects characters from strvar according to a specified criterion
>    and generates a new string variable containing only those characters.
>    This may be done in three ways. First, characters are classified using
>    the keywords alphabetic (any of a-z or A-Z), numeric (any of 0-9),
>    space or other. keep() specifies one or more of those classes:
>    keywords may be abbreviated by as little as one letter. Thus keep(a n)
>    selects alphabetic and numeric characters and omits spaces and other
>    characters. Note that keywords must be separated by spaces. Alternatively,
>    char() specifies each character to be selected or omit() specifies each
>    character to be omitted. Thus char(0123456789.) selects numeric
>    characters and the stop (presumably as decimal point); omit(" ") strips
>    spaces and omit(`"""') strips double quotes. (Stata 7 required.)
>
>Hope that helps.
>Jen
>
>
>-----Original Message-----
>From: [email protected]
>[mailto:[email protected]] On Behalf Of Mario Macis
>Sent: Monday, February 27, 2006 1:44 PM
>To: [email protected]
>Subject: st: help cleaning string variable
>
>
>Dear statalist users,
>I need to clean a string variable containing the names of a large number
>of firms (over 30,000). In many cases these names contain extra
>characters that I would like to eliminate, such as % or " or ^. These
>characters always come at the beginning of the name. I know that Stata
>has a command (trim) that eliminates leading and trailing blank spaces
>from string variables. Is there a similar command to eliminate leading
>"undesired" characters? Thank you so much for your help. Best, Mario
>
>--
>Mario Macis
>PhD Candidate
>Department of Economics
>University of Chicago
>*
>*   For searches and help try:
>*   http://www.stata.com/support/faqs/res/findit.html
>*   http://www.stata.com/support/statalist/faq
>*   http://www.ats.ucla.edu/stat/stata/

--
Mario Macis
PhD Candidate
Department of Economics
University of Chicago
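
For readers of the archive, here is a minimal sketch of how the sieve()
function quoted in this thread might be applied to the firm-name problem.
It is an illustration under assumptions, not code from the original posts:
the variable name firmname and the three sample values are hypothetical,
and egenmore is assumed to be installed from SSC. Note that sieve() filters
characters wherever they occur in the string, not only at the start. The
last step shows a different technique for the leading-only case, using
Stata's regexr() function, which I believe is documented from Stata 9
onward; treat its availability in older versions as an assumption.

```stata
* Hypothetical example data: firm names with unwanted leading characters.
clear
set obs 3
gen str30 firmname = ""
replace firmname = "%Acme Steel" in 1
replace firmname = "^Bolt & Nut Co." in 2
replace firmname = char(34) + "Delta Mills" in 3   // char(34) is a double quote

* Assumes egenmore is installed (one-time step):
* ssc install egenmore

* Approach 1: egen's sieve() from egenmore.
* keep(alphabetic numeric space) retains letters, digits and spaces and
* drops all "other" characters anywhere in the string, so the "&" and "."
* in the second name are removed too, not just the leading "^".
egen clean_all = sieve(firmname), keep(alphabetic numeric space)

* Approach 2 (a swapped-in alternative, not sieve): strip only a leading
* run of characters that are not letters or digits, leaving the rest of
* the name untouched. regexr() replaces the first match of the pattern.
gen clean_lead = regexr(firmname, "^[^A-Za-z0-9]+", "")

list firmname clean_all clean_lead
```

Which approach fits depends on whether punctuation inside the name (such
as "&" or ".") should survive. The first removes it everywhere; the second
touches only the start of the string, which matches the case described in
the original question.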


