In case Mark's command isn't exactly what you need (I think it is),
below is how I would go about it if I had to do it from scratch. It
won't be the most economical thing, but I like breaking up operations
in blocks that make things easy for people to follow. I would do two
separate merges.
Say I have a file of average daily stock prices, A, with three
variables: ticker symbol, stockprice, date. And I have a CRSP header
file, B, with four variables: company CRSP ID, ticker symbol, date
ticker symbol starts, date ticker symbol ends. That way the same
company name may be associated with several ticker symbols.
I would first collect a list of the ticker symbols in A:
keep ticker
duplicates drop
sort ticker
save Atickers, replace
Then I would merge it with the B file, and
keep if _merge==3
keep ticker crsp_id
sort ticker
save Atickers, replace
That would give me the list of crsp_id's in the A file, which is
shorter than or equal to the list of ticker symbols:
tab crsp_id ticker
At this point I would take each crsp_id separately and merge it again
to the A file by ticker. Then keep if _merge==3 would give me the
stock prices of only that company regardless of how its ticker symbol
may have changed over that time span.
I would then append these individual company files and drop the ticker
symbols because now I have company_id's instead, and that's all I
want.
If you have a reasonable number of companies you care about, this
merge-keep-merge-keep-append succession won't take Stata very long to
complete. A do-file that would provide a general solution -- in case
the list of companies in file A changes frequently -- would be easy to
write.
Gabi
On 4/17/08, Malcolm Wardlaw <[email protected]> wrote:
> I have a data manipulation question about 1-to-many merging based on tickers
> and date ranges. It's similar to a previous question, but a much simpler
> operation and a much more common operation for me. I thought I had solved
> this problem, but I can't for the life of me figure it out again.
> Basically I have lots of observations by date and ticker in Dataset(A) .
> Tickers are only unique for companies in the CRSP header file for specific
> date ranges. The CRSP header file provides for a completely unique company
> identifier, matching the ID up to the ticker symbol and providing a <start>
> and <end> date for the period that the ticker is a valid match for that ID.
> So, I need to merge using the ticker, where the date in Dataset(A) is in
> between the <start> and <end> dates.
>
> Someone had suggested -nearmrg-, which kind of works, but it seems a bit
> squirrely for what I'm doing. Plus, I'm unfortunately still on Stata 9. I
> think I read some comment on the archives somewhere about creating 'bins',
> but I couldn't tell what they were talking about.
>
> This seems like such a common problem, I figured there must be a stock way
> to handle this.
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/