
st: rationalizing multiple ids for the same name

From	Dalhia <[email protected]>
To	[email protected]
Subject	st: rationalizing multiple ids for the same name
Date	Mon, 17 Aug 2009 21:16:32 -0700 (PDT)

Dear Statalist, I have a question and I am hoping for some help. 

I have a very large dataset of companies over time, and I have two different identifiers for these companies - name and ticker. The problem is that the two identifiers are not always consistent. For instance:

Name, Ticker

AOL Time Warner, AOL
AOL Time Warner, TW
AOL Time Warner, TWX
AOL Time Warner Inc, TWX
AOL Time Warner Inc, T
Microsoft, MS

The first five observations all describe the same entity, AOL Time Warner, and I need a way to recognize that they are the same company. What I think will work is this: for each name that has multiple tickers, take the ticker that appears most often in the dataset and store it in a new variable, New_Ticker. The data should then look like this:

Name, Ticker, New_Ticker

AOL Time Warner, AOL, TWX
AOL Time Warner, TW, TWX
AOL Time Warner, TWX, TWX
AOL Time Warner Inc, TWX, TWX
AOL Time Warner Inc, T, TWX
Microsoft, MS, MS
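
To pin down the intended rule, here is a minimal sketch of the logic in Python rather than Stata. It is only an illustration under stated assumptions: ticker frequency is counted over the whole dataset, ties are broken alphabetically, and the function name most_frequent_ticker is hypothetical. It is not a Stata solution.

```python
from collections import Counter

def most_frequent_ticker(rows):
    """rows: list of (name, ticker) pairs.
    Returns a list of (name, ticker, new_ticker) triples in the same order."""
    # How often each ticker occurs in the whole dataset (assumption:
    # "most frequent" means most frequent overall, not within a name).
    ticker_counts = Counter(ticker for _, ticker in rows)

    # Collect the distinct tickers attached to each name.
    tickers_by_name = {}
    for name, ticker in rows:
        tickers_by_name.setdefault(name, set()).add(ticker)

    # For each name, pick the ticker with the highest overall count;
    # ties go to the alphabetically first ticker (an assumption).
    best = {
        name: min(tickers, key=lambda t: (-ticker_counts[t], t))
        for name, tickers in tickers_by_name.items()
    }
    return [(name, ticker, best[name]) for name, ticker in rows]

# The six example observations from this post.
rows = [
    ("AOL Time Warner", "AOL"),
    ("AOL Time Warner", "TW"),
    ("AOL Time Warner", "TWX"),
    ("AOL Time Warner Inc", "TWX"),
    ("AOL Time Warner Inc", "T"),
    ("Microsoft", "MS"),
]
result = most_frequent_ticker(rows)
```

Note that this rule never compares the names themselves: "AOL Time Warner" and "AOL Time Warner Inc" end up with the same New_Ticker only because TWX is the most frequent ticker attached to each of them.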

I am unable to figure out how to create this new variable, New_Ticker, which holds the most frequently used ticker whenever the same name has multiple tickers. I would be very grateful for any help on how to create it.

Best
dalhia


      