
Re: st: Re: replace a string variable


From   "Eric G. Wruck" <[email protected]>
To   [email protected]
Subject   Re: st: Re: replace a string variable
Date   Mon, 9 May 2005 10:44:14 -0400

I think this will necessarily involve some manual work and some familiarity with Stata's string functions.  The word() function, which returns the nth word of a string, will get you part of the way there:


. gen vend = word(vendor,1)

. table vend

--------------------------
         vend |      Freq.
--------------+-----------
      STRYKER |          4
STRYKERITALIA |          1
       STYKER |          1
       SULZER |         10
 SULZERMEDICA |          1
       ZIMMER |          6
--------------------------

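But as you can see, taking the first word still leaves STYKER, STRYKERITALIA, and SULZERMEDICA as separate groups.  The STYKERs of the world (i.e., typos) are going to cause you the most trouble, because no general rule catches them; you have to find and fix them one by one.  I view this as a necessary part of data analysis.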
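
One way to finish the job might look like the sketch below. It is only an illustration: it assumes the raw names sit in a string variable called vendor (as in the -gen- line above), and firm is a made-up name for the new cleaned variable. It uses strpos(), which older Stata releases call index(); the run-together names are caught by matching on the start of the first word, and the typo is patched by hand.

```stata
* Sketch only: vendor is assumed to hold the raw names; firm is a
* hypothetical name for the cleaned variable.
gen str7 firm = ""

* Match on the start of the first word, so that STRYKERITALIA and
* SULZERMEDICA land in the right group.
replace firm = "STRYKER" if strpos(word(vendor, 1), "STRYKER") == 1
replace firm = "SULZER"  if strpos(word(vendor, 1), "SULZER")  == 1
replace firm = "ZIMMER"  if strpos(word(vendor, 1), "ZIMMER")  == 1

* Typos are fixed individually once you have spotted them.
replace firm = "STRYKER" if word(vendor, 1) == "STYKER"

* Whatever is still blank needs a manual look.
count if firm == ""
list vendor if firm == ""
```

The final -count- and -list- are the manual part: run them, read what is left, and add another -replace- line for each new variant you find.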


Eric




>Dear Statalist
>I have a dataset with a string variable named VEND. It contains many different company names, but often several variants refer to the same company.
>For example, for three different firms:
>
>STRYKER ITALIA SRL
>STRYKER ITALIA SRL -
>STRYKER ITALIA SRL S
>         STRYKER SRL
>       STRYKERITALIA
>   STYKER ITALIA SRL
>              SULZER
>       SULZER MEDICA
>SULZER OR ITALIA SPA
>  SULZER ORTHOPEDICS
>SULZER ORTHOPEDICS I
>   SULZER ORTHPEDICS
>    SULZER ORTOPEDIC
>SULZER ORTOPEDICA IT
>SULZER ORTOPEDICS IT
>       SULZER PROTEK
>        SULZERMEDICA
>              ZIMMER
>    ZIMMER - NEX GEN
>          ZIMMER ARL
>       ZIMMER S.R.L.
>ZIMMER S.R.L.     (C
>          ZIMMER SRL
>
>
>
>where the names are simply
>STRYKER
>SULZER
>ZIMMER
>
>How can I replace these strings with the same cluster name?
>Do you know if there is a command along the lines of
>. replace vend if  vend=="zimmer***"
>or do I have to build a do-file with a lot of -substr- and -index- commands?
>Thanks in Advance
>Paolo Grillo


-- 

===================================================

       Eric G. Wruck
       Econalytics
       2535 Sherwood Road
       Columbus, OH  43209

       ph:      614.231.5034
       cell:    614.330.8846
       eFax:    614.573.6639
       eMail:   [email protected]
       website: http://www.econalytics.com

====================================================