Dear statalist users,
I have two datasets, A and B.
A) number, sex, age, citizen
B) number, sex, civil status, children
I have to combine them into a new dataset, but number and sex do not
uniquely identify observations.
The number of observations in A can be greater than, less than, or equal
to the number of observations in B.
Which is better, merge or joinby?
Bye,
Sebastian.
2008/3/5, Nick Cox <[email protected]>:
> Two minute comments only:
>
> local assets=qassets
>
> this looks wrong: qassets[`i'] ?
>
> quietly:use compustat if `qtr'=obsqtr & `sic'=sic3 &
> qsales<=1.15*`sales'/*
> */ &
> qsales>=.85*`sales'&qassets<=1.2*`assets'&qassets>=.85*`assets', clear
>
> tests for equality are ==, not =
>
> Malcolm Wardlaw
>
> I wanted to pose this question to Statalist regarding matching data to a
> range of values instead of exact values. I kind of asked this question
> before, but I realized from the response that my question was somewhat
> ill formed, so I'll try to be as explicit as possible. I will use an
> example to illustrate the question.
>
> Let's say I want to do a long-run event study on the changes in real
> growth of companies. In order to do this, I need to appropriately match
> the company I am running the event study on to a group of comparable
> companies. For this, I need a matched dataset of all companies that
> match in a range of accounting variables.
>
> The match occurs as follows. I have a data set (1) containing all of
> the companies I wish to perform the event study on. I need to then
> create a dataset (2) that contains matching companies from a dataset of
> the larger Compustat universe of all firms (3). To do this, I need to
> gather all firms that have the same SIC code, sales within 15% of the
> event company's sales, and assets within 20% of the event company's
> assets in the quarter of the event. The new dataset must also have a
> marker for each of these groups of sample firms that corresponds to the
> event firm.
>
> Here is how I originally dealt with the problem. In the program, Stata
> is continually cycling through the data, loading part of another dataset
> into memory, appending it to another dataset from disk, saving that
> dataset to disk, and then reloading the original dataset from disk each
> time. It works, but it seems very inefficient.
>
> Is there a best practice on how to do this, or is this basically as good
> as it's going to get?
>
> ---------------------------------------
> local num = _N
> forval i = 1/`num' {
> /*The sales of Event Company i*/
> local sales=sales[`i']
> /*The quarter of the observation*/
> local qtr=eventquarter[`i']
> /*SIC code*/
> local sic=sic3[`i']
> /*Assets of the event company*/
> local assets=qassets
> /*A code that uniquely tags the event*/
> local code=code[`i']
> quietly:use compustat if `qtr'=obsqtr & `sic'=sic3 &
> qsales<=1.15*`sales'/*
> */ &
> qsales>=.85*`sales'&qassets<=1.2*`assets'&qassets>=.85*`assets', clear
> gen code=`code'
> append using comparables
> quietly:save comparables,replace
> use events
> }
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>