Malcolm Wardlaw <[email protected]>:
Reading your post again, it seems clear to me that you should use some
variety of matching program; what do you gain by matching a company to
*all* other companies in the universe in some bin, as opposed to
matching to the nearest k neighbors using e.g. -nnmatch- or
-psmatch2-?
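
If you really do want every Compustat firm in the bin, -joinby- may let
you skip the loop entirely: form all pairs within SIC code and quarter,
then keep only the pairs inside the sales and asset bands. Below is a
minimal sketch on made-up data, not something I have run on your files.
The toy values, the variable -firm-, and the names -ev_sales- and
-ev_assets- (the event firm's sales and assets, renamed so they do not
clash with the Compustat variables) are my assumptions; the other
variable names follow your post.

```stata
* Toy stand-in for the events file (all values are made up)
clear
input code sic3 obsqtr ev_sales ev_assets
1 283 192 100  500
2 357 192 200 1000
end
* Events with missing sales or assets cannot define a band
drop if missing(ev_sales, ev_assets)
tempfile ev
save `ev'

* Toy stand-in for the Compustat universe (all values are made up)
clear
input firm sic3 obsqtr qsales qassets
11 283 192 110  450
12 283 192 120  500
13 283 192  90  610
14 357 192 180  900
15 357 193 200 1000
end

* Form every event-firm x candidate pair within the same SIC code and quarter
joinby sic3 obsqtr using `ev'

* Keep candidates within 15% of event sales and 20% of event assets
keep if inrange(qsales, .85*ev_sales, 1.15*ev_sales) ///
    & inrange(qassets, .8*ev_assets, 1.2*ev_assets)

* code marks which event firm each comparable belongs to
sort code firm
list code firm qsales qassets, sepby(code)
```

With real data you would load compustat, -joinby- against a renamed copy
of events, and save the result as comparables. The paired dataset can be
large before the -keep-, since it holds every event firm crossed with
every firm in the same industry and quarter. If you want an aggregate
instead, something like -collapse (mean) qassets, by(code)- afterwards
would give mean assets per event.
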
On Wed, Mar 5, 2008 at 8:40 PM, Austin Nichols <[email protected]> wrote:
> Malcolm Wardlaw <[email protected]>:
> Do you want to keep multiple observations per "event" company on the
> "matching" companies (with two ID variables defining companies and
> groups of companies)? Or do you want some aggregate measure of
> matching companies, such as mean assets? For the latter problem, I
> prefer to merge without a matching variable, then loop over
> observations as shown at
> http://www.stata.com/statalist/archive/2007-01/msg00079.html
>
> "each of these group of sample firms " sounds like the former problem;
> a many-to-one almost-nearest-neighbor problem.
>
> You may want -findit nearmrg- or -findit nnmatch-
>
>
>
> On Wed, Mar 5, 2008 at 3:13 PM, Malcolm Wardlaw <[email protected]> wrote:
> > I wanted to pose this question to Statalist regarding matching data to a
> > range of values instead of exact values. I kind of asked this question
> > before, but I realized from the response that my question was somewhat
> > ill-formed, so I'll try to be as explicit as possible. I will use an
> > example to illustrate the question.
> >
> > Let's say I want to do a long-run event study on the changes in real
> > growth of companies. In order to do this, I need to appropriately match
> > the company I am running the event study on to a group of comparable
> > companies. For this, I need a matched dataset of all companies that
> > match in a range of accounting variables.
> >
> > The match occurs as follows. I have a data set (1) containing all of
> > the companies I wish to perform the event study on. I need to then
> > create a dataset (2) that contains matching companies from a dataset of
> > the larger Compustat universe of all firms (3). To do this, I need to
> > gather all firms that have the same SIC code, sales within 15% of the
> > event company's sales, and assets within 20% of the event company's
> > assets, all in the quarter of the event. The new dataset must
> > also have a marker for each of these group of sample firms that
> > corresponds to the event firm.
> >
> > Here is how I originally dealt with the problem. In the program, Stata
> > is continually cycling through the data, loading part of another dataset
> > into memory, appending it to another dataset from disk, saving that
> > dataset to disk, and then reloading the original dataset from disk each
> > time. It works, but it seems very inefficient.
> >
> > Is there a best practice on how to do this, or is this basically as good
> > as it's going to get?
> >
> > ---------------------------------------
> > local num = _N
> > forval i = 1/`num' {
> >     /* Sales of event company i */
> >     local sales = sales[`i']
> >     /* Quarter of the event */
> >     local qtr = eventquarter[`i']
> >     /* SIC code */
> >     local sic = sic3[`i']
> >     /* Assets of the event company */
> >     local assets = qassets[`i']
> >     /* Code that uniquely tags the event */
> >     local code = code[`i']
> >     /* Load Compustat firms in the same quarter and SIC code whose sales
> >        are within 15% and assets within 20% of the event company */
> >     quietly use if obsqtr == `qtr' & sic3 == `sic' ///
> >         & qsales >= .85*`sales' & qsales <= 1.15*`sales' ///
> >         & qassets >= .8*`assets' & qassets <= 1.2*`assets' ///
> >         using compustat, clear
> >     gen code = `code'
> >     /* comparables.dta must already exist before the first pass */
> >     append using comparables
> >     quietly save comparables, replace
> >     use events, clear
> > }
> > ---------------------------------------
> > *
> > * For searches and help try:
> > * http://www.stata.com/support/faqs/res/findit.html
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/
> >
>