Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Merging annual data and monthly data few months after year end


From   Nick Cox <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: Merging annual data and monthly data few months after year end
Date   Sun, 19 Jan 2014 01:39:15 +0000

As I wrote in an earlier post

"In running a block of code, Stata should stop at the first error, so
where does that occur precisely?"

A further key point seems that to be that you have duplicates in the
-merge-d dataset that is what you need to look at carefully. Checking
that no duplicates occur in the datasets to be merged is not
sufficient.

Also, you need to be trying to understand why duplicates are occurring
in the individual datasets, and whether they are duplicates on all
variables. Otherwise you may be discarding important information.
Nick
[email protected]


On 19 January 2014 01:30, Abdalla, Ahmed <[email protected]> wrote:
> I read it again carefully and edited my code accordingly (below). Unfortunately, I get the same two error messages!!
> Can you help? Thanks in advance.
>
> use ccm.dta,clear
> format datadate %d
>   * yr and mth were not reported in the original data in both datasets, however I wrote them in my example above*
> gen yr=year(datadate)
> gen mth=month(datadate)
> rename lpermno permno
> drop if permno==.
> duplicates tag permno yr mth, gen(isdup)
> edit if isdup
> drop isdup
> sort permno yr mth
> save ccm_2.dta,replace
>
> use crsp.dta,clear
> format date %d
> gen yr=year(date)
> gen mth=month(date)
> drop if permno==.
> duplicates tag permno yr mth, gen(isdup)
> edit if isdup
> drop isdup
> sort permno yr mth
>
> merge 1:1 permno yr mth using ccm_2.dta
> gen firmid= group(permno)
> gen timeid = ym(yr, mth)
> tsset firmid timeid
> gen prc3=f3.prc
> keep if _merge==3
>
>
>
>
> ________________________________________
> From: [email protected] <[email protected]> on behalf of Nick Cox <[email protected]>
> Sent: 19 January 2014 01:02
> To: [email protected]
> Subject: Re: st: Merging annual data and monthly data few months after year end
>
> The FAQ does explain: given a report of repeated time values, you can
> use -duplicates- to see what they are.
>
> I too spent much time, writing -duplicates- and writing about it to
> try to make this easy for people.
> Nick
> [email protected]
>
>
> On 19 January 2014 00:55, Abdalla, Ahmed <[email protected]> wrote:
>> Thanks Nick, I followed your suggestions and run my code in pieces to see what error messages I get:
>> 1- I get the error message : variables permno yr mth do not uniquely identify observations in the master data, when I run the merge command
>> 2- Also, repeated time values within panel , when I try to -tsset- my panel !!
>>
>> I also used  gen timeid = ym(yr, mth)  as you suggests , but still get same error messages!
>> I read the FAQ but I really can't figure out how to debug the code. Do you have another code or any other suggestions?
>> I spent much time in this and can't really fix it!
>>
>> Thanks
>>
>>
>>
>> ________________________________________
>> From: [email protected] <[email protected]> on behalf of Nick Cox <[email protected]>
>> Sent: 18 January 2014 11:25
>> To: [email protected]
>> Subject: Re: st: Merging annual data and monthly data few months after year end
>>
>> Thanks for the extra details. The -merge- shouldn't produce duplicates
>> on -permno-, -yr- and -mth- because the whole point is to match
>> observations on those variables.
>>
>> In running a block of code, Stata should stop at the first error, so
>> where does that occur precisely? The -tsset-? Or before?
>>
>> Note that
>>
>> egen timeid = group(yr mth)
>>
>> won't work well with gaps and in any case is much less direct than
>>
>> gen timeid = ym(yr, mth)
>>
>> which will produce monthly dates.
>>
>> There is generic advice in the FAQ
>>
>> How do I deal with a report of repeated time values within panel?
>>
>> Dealing with reports of repeated time values within panel
>>
>> http://www.stata.com/support/faqs/data-management/repeated-time-values/index.html
>> Nick
>> [email protected]
>>
>>
>> On 18 January 2014 02:36, Abdalla, Ahmed <[email protected]> wrote:
>>> I get an error message that variable does not uniquely identify observations. I understand that the problem might be because in the ccm.dta (annual data) each observation id (permno) has accounting data for one month for each fiscal year, which is the month at which the fiscal year ends. In the other data set, crsp.dta , each observation id (permno) has stock price data for the 12 months in each year. I need to link the annual accounting data at the fiscal year end (in the month the fiscal year ends) to the stock price at the same months and three months thereafter.
>>> I read thread you mentioned before posting my question, but still can't figure out how to solve my problem?
>>> Any suggestions will be much appreciated!
>>>
>>>
>>>
>>> ________________________________________
>>> From: [email protected] <[email protected]> on behalf of Nick Cox <[email protected]>
>>> Sent: 17 January 2014 20:43
>>> To: [email protected]
>>> Subject: Re: st: Merging annual data and monthly data few months after year end
>>>
>>> "doesn't work" is not informative.
>>>
>>> However, the recent thread
>>> http://www.stata.com/statalist/archive/2014-01/msg00589.html raised a
>>> similar-sounding problem.
>>>
>>> Nick
>>> [email protected]
>>>
>>>
>>> On 17 January 2014 18:35, Abdalla, Ahmed <[email protected]> wrote:
>>>> Dear Statalist
>>>> I have two databases:
>>>> a) ccm.dta that includes "annual" accounting data at fiscal year end for a set of firms over 20 years (however, unbalanced).
>>>>
>>>> Permno (firm identifier)   datadate                          yr             mth               x1          x2
>>>>
>>>> 10001                                    30june1988                  1988            6
>>>> 10001                                    30june1989                  1989            6
>>>> 10001                                    31december1990        1990            12
>>>> 10002                                    30april1988                  1988             4
>>>> 10002                                    30april1989                  1989             4
>>>> 10002                                    31may                           1990            5
>>>>
>>>> Note that a company may change its fiscal year end. A company that reports accounting data at the end of June this year may report data at December in the next year (or sometimes at the same year).
>>>>
>>>> b) crsp.dta that includes "monthly"stock prices for the same set of firms over the same period (however, unbalanced)
>>>>
>>>> Permno (firm identifier)      datadate                                  yr       mth     x3(stockprice)
>>>> 10001                                    29january1988                      1988       1
>>>> 10001                                    29february1988                    1988       2
>>>> 10001                                    31march1988                        1988       3
>>>> 10001                                    29april1988                           1988       4
>>>> 10001                                    31may1988                            1988       5
>>>> 10001                                    30june1988                           1988       6
>>>> 10001                                    31july1988                             1988      7
>>>> ....                                           ....
>>>> ....                                           ...31dec1988                         1988      12
>>>>
>>>> 10001                                    29january1989                      1989     1
>>>> 10001                                    29february1989                     1989     2
>>>> 10001                                    31march1989                        1989      3
>>>> 10001                                    29april1989                           1989     4
>>>> 10001                                    31may1989                            1989     5
>>>> 10001                                    30june1989                            1989    6
>>>> 10001                                    31july1989                             1989     7
>>>> ....                                           ....
>>>> ....                                           ...31dec1989                          1989    8
>>>>
>>>> 10002                                    29january1988                       1988    1
>>>> 10002                                    29february1988                     1988     2
>>>> 10002                                   31march1988                          1988     3
>>>> 10002                                    29april1988                            1988     4
>>>> 10002                                    31may1988                             1988    5
>>>> 10002                                    30june1988                             1988    6
>>>> 10002                                    31july1988                               1988   7
>>>> ....                                           ....
>>>> ....                                           ...31dec1989                            1988 12
>>>>
>>>>
>>>> I want to merge both data sets and link annual accounting data in ccm.dta to stock prices reported at the end of fiscal year and three months after the end of the fiscal year. Here is the code I used:
>>>>
>>>> use ccm.dta
>>>> duplicates drop
>>>> format datadate %d
>>>>   * yr and mth were not reported in the original data in both datasets, however I wrote them in my example above*
>>>> gen yr=year(datadate)
>>>> gen mth=month(datadate)
>>>> drop if permno==.
>>>> sort permno yr mth
>>>> save ccm_2.dta
>>>>
>>>> use crsp.dta,clear
>>>> duplicates drop
>>>> format date %d
>>>> gen yr=year(date)
>>>> gen mth=month(date)
>>>> drop if permno==.
>>>> sort permno yr mth
>>>> merge 1:1 permno yr mth using ccm_2.dta
>>>> egen firmid= group(permno)
>>>> egen timeid=group(yr mth)
>>>> tsset firmid timeid
>>>> gen prc3=f3.prc
>>>> keep if _merge==3
>>>>
>>>>
>>>> The code didn't work and gives error messages that I couldn't fix and especially when I "tsset" my panel.
>>>>
>>>> Would you please suggest a proper code to merge these two datasets and link x1 x2 (annual data) to x3 (stock prices) at fiscal year end and three months after fiscal year end ?
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index