Great! That's exactly what I needed. Thank you so much, Jennifer.
Best,
Mario
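
Note for the archive: below is a minimal sketch of how the sieve()
function from -egenmore- might be applied to this problem. The variable
names (firmname, clean_all, clean_lead) and the three sample firm names
are made up for illustration, and the sketch assumes -egenmore- has been
installed from SSC. Going by the help excerpt quoted below, sieve() drops
the listed characters wherever they occur, so the second half shows a
plain substr() loop that strips them only from the start of the name.
That loop is an alternative built from standard string functions, not
part of -egenmore-.

```stata
* Sketch only. Assumes -egenmore- is installed, e.g. via:
*   ssc install egenmore
* firmname, clean_all and clean_lead are hypothetical variable names.
clear
input str20 firmname
"%Acme Corp"
"^^Beta Ltd"
"Gamma 50% Inc"
end

* (1) sieve(): omit % and ^ wherever they occur in the string
egen clean_all = sieve(firmname), omit(%^)

* (2) built-in functions only: strip %, ^ or " from the start of the
*     name, one character per pass, until no name begins with them
generate str20 clean_lead = firmname
count if inlist(substr(clean_lead, 1, 1), "%", "^", `"""')
while r(N) > 0 {
    replace clean_lead = substr(clean_lead, 2, .) if inlist(substr(clean_lead, 1, 1), "%", "^", `"""')
    count if inlist(substr(clean_lead, 1, 1), "%", "^", `"""')
}

list firmname clean_all clean_lead
```

On the sample data, clean_all also loses the % inside "Gamma 50% Inc",
whereas clean_lead leaves it alone. If the unwanted characters really do
occur only at the beginning of the names, either approach should give the
same result.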
---- Original message ----
>Date: Mon, 27 Feb 2006 16:48:03 -0800
>From: "Marino, Jennifer" <[email protected]>
>Subject: st: RE: help cleaning string variable
>To: <[email protected]>
>
>I don't know if it's necessary in Stata 9 - might have been put into the
>official egen package if it was used enough - but for Stata 8 the
>fabulous ado package -egenmore-, by Dr. Cox, has a tailor-made option
>for egen called "sieve":
>
>Excerpt from the helpfile:
>
>sieve(strvar) , { keep(classes) | char(chars) | omit(chars) }
>    selects characters from strvar according to a specified criterion
>    and generates a new string variable containing only those characters.
>    This may be done in three ways. First, characters are classified using
>    the keywords alphabetic (any of a-z or A-Z), numeric (any of 0-9),
>    space or other. keep() specifies one or more of those classes:
>    keywords may be abbreviated by as little as one letter. Thus keep(a n)
>    selects alphabetic and numeric characters and omits spaces and other
>    characters. Note that keywords must be separated by spaces.
>    Alternatively, char() specifies each character to be selected or
>    omit() specifies each character to be omitted. Thus char(0123456789.)
>    selects numeric characters and the stop (presumably as decimal point);
>    omit(" ") strips spaces and omit(`"""') strips double quotes. (Stata 7
>    required.)
>
>Hope that helps.
>Jen
>
>
>-----Original Message-----
>From: [email protected]
>[mailto:[email protected]] On Behalf Of Mario Macis
>Sent: Monday, February 27, 2006 1:44 PM
>To: [email protected]
>Subject: st: help cleaning string variable
>
>
>Dear statalist users,
>I need to clean a string variable containing the names of a large number
>of firms (over 30,000). In many cases these names contain extra
>characters that I would like to eliminate, such as % or " or ^. These
>characters always come at the beginning of the name. I know that Stata
>has a command (trim) that eliminates leading and trailing blank spaces
>from string variables. Is there a similar command to eliminate leading
>"undesired" characters? Thank you so much for your help. Best, Mario
>
>--
>Mario Macis
>PhD Candidate
>Department of Economics
>University of Chicago
>*
>* For searches and help try:
>* http://www.stata.com/support/faqs/res/findit.html
>* http://www.stata.com/support/statalist/faq
>* http://www.ats.ucla.edu/stat/stata/
--
Mario Macis
PhD Candidate
Department of Economics
University of Chicago