Pardon me if this a reposting.. I am not sure the last one made it to
the list as I posted that from the wrong email id.
Hope the following helps.
Step One
Duplicate match as control and rename only the gvkey as c_gvkey.
Step Two
Merge the match and control datasets.
merge 1:m sic2 year rroa using control
Step three
Drop cases where gvkey==c_gvkey (these are the discretionary accruals
of the same firm that you don’t want to include in calculating the
decile average or other decile stat)
drop if gvkey==c_gvkey
Padmakumar
On Tue, Jan 26, 2010 at 8:50 AM, Amy Dunbar
<[email protected]> wrote:
> I want to create a dataset, call it 'match-control' that groups together
> all the observations in the same industry-year-decile combination, but
> excludes the observation's own values for a particular variable. Thus if
> there is only one firm in a decile, that firm will not be in the
> combined sample. The resulting sample has obs= N * (N-1). If there are 3
> firms, there will 3 *2 = 6 in the resulting match_control group for the
> industry_year_decile.
>
> Assume I have a dataset (match) that includes a unique identifier
> (gvkey), industry (sic2), year, and decile (rroa), and the variable of
> interest (dacc); I duplicate the dataset (control) renaming the vars as
> c_gvkey, c_sic2, c_year, c_rroa, and c_dacc. For example, assume that
> for the first decile I have the following for match and the same for
> control, except that the var names have c_:
>
> data = match
> N gvkey year sic2 rroa dacc
> 1 1860 1991 13 0 0.11
> 2 2304 1991 13 0 -0.43
> 3 3949 1991 13 0 -0.31
> 4 5140 1991 13 0 -0.38
> 5 5188 1991 13 0 -0.28
> 6 7159 1991 13 0 -0.16
> 7 8104 1991 13 0 0.04
> 8 8838 1991 13 0 -0.07
> 9 8974 1991 13 0 0.08
> 10 12042 1991 13 0 -0.97
> 11 13183 1991 13 0 -0.28
> 12 15306 1991 13 0 -0.30
> 13 22398 1991 13 0 -0.21
> 14 23129 1991 13 0 -0.27
> 15 27199 1991 13 0 -0.17
>
>
> Consider the SIC2=13 group. The first decile (coded 0) in the sample)
> has 15 firms in 1991. The resulting group is 15 * 14 = 210 firms in the
> match-control sample. The merging for the first gvkey is the following:
>
> data = match_control
> gvkey year sic2 dacc rroa c_dacc
> 1860 1991 13 0.11 0 -0.43
> 1860 1991 13 0.11 0 -0.31
> 1860 1991 13 0.11 0 -0.38
> 1860 1991 13 0.11 0 -0.28
> 1860 1991 13 0.11 0 -0.16
> 1860 1991 13 0.11 0 0.04
> 1860 1991 13 0.11 0 -0.07
> 1860 1991 13 0.11 0 0.08
> 1860 1991 13 0.11 0 -0.97
> 1860 1991 13 0.11 0 -0.28
> 1860 1991 13 0.11 0 -0.30
> 1860 1991 13 0.11 0 -0.21
> 1860 1991 13 0.11 0 -0.27
> 1860 1991 13 0.11 0 -0.17
>
> In SAS, the following code does the trick.
> proc sql;
> create table match_control as select
> match.*, control.c_dacc
> from match, control
> where sic2 = c_sic2
> and year = c_year
> and rroa = c_rroa
> and gvkey ne c_gvkey; /* the magic words */
>
> I have tried merge and joinby, but I have had no success. Can someone
> tell me how to accomplish this one to many merge in Stata?
>
> Thank you.
>
> Amy Dunbar
> University of Connecticut
> School of Business
> Department of Accounting
> 2100 Hillside Road Unit 1041
> Storrs, CT 06269-1041
>
> [email protected]
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/