Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Identifying first observation in each panel after regression
From: Ivan Png <[email protected]>
To: [email protected]
Subject: Re: st: Identifying first observation in each panel after regression
Date: Mon, 4 Jun 2012 18:43:23 -0400
What I don't understand is why the command
. by gvkey , sort : gen flag = 1 if _n == 1
works when I invoke it before the regression (it then picks up the
first observation of each company), but not when I invoke it after the
regression (it misses many companies).
I used exactly the same command in both cases.
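
One possible explanation, which the thread does not confirm: under -by-, _n == 1 marks the first observation of each company in the data as currently held in memory. If that observation is not in the estimation sample (for example, because a regressor is missing in the company's first year), then restricting attention to e(sample) leaves that company with no flagged row. The sketch below illustrates this with the Grunfeld data used later in the thread rather than the original company panel. The missing values are introduced artificially, and the names regsample, first and firstreg are chosen here for illustration only.

```stata
* Sketch on the Grunfeld data, not the poster's company panel
webuse grunfeld, clear
xtset company year

* Assumption for illustration: blank out the first-year outcome for
* odd-numbered companies so that -xtreg- drops those rows
replace invest = . if year == 1935 & mod(company, 2)

xtreg invest mvalue kstock, fe

* Copy e(sample) into a variable so it survives later estimation commands
gen byte regsample = e(sample)

* First observation of each company in the full data
bysort company (year) : gen byte first = _n == 1

* First observation of each company within the estimation sample:
* group on regsample and company together, ordered by year
bysort regsample company (year) : gen byte firstreg = regsample & _n == 1

* Companies whose overall first row was used in the regression
count if first & regsample

* Companies with at least one row in the regression
count if firstreg
```

If the two counts differ, the gap is the number of companies whose earliest row was dropped from the regression. Comparing the second count with the number of groups reported by -xtreg- is a quick check on the logic.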
On 4 June 2012 18:31, Nick Cox <[email protected]> wrote:
> Which bit don't you understand?
>
> On Mon, Jun 4, 2012 at 11:16 PM, Ivan Png <[email protected]> wrote:
>> Dear Nick--
>>
>> Many thanks for your hint. I found the solution. I execute
>> . by gvkey , sort: gen flag = 1 if _n == 1
>> before the regression.
>>
>> Then, after the regression, I execute
>> . gen regsample = 1 if e(sample)
>>
>> And, to identify the first observation of each company in the
>> regression sample, I use
>> regsample == 1 & flag == 1
>>
>> However, I still don't understand the reason it works.
>>
>>
>> On 4 June 2012 14:24, Nick Cox <[email protected]> wrote:
>>> What code do you mean by "the code below"?
>>>
>>> I suspect there's something else up with your dataset that leads to
>>> what you see. Examine the data omitted by
>>>
>>> . edit if !e(sample)
>>>
>>> after your -xtreg- command.
>>>
>>> Nick
>>>
>>> On Mon, Jun 4, 2012 at 6:44 PM, Ivan Png <[email protected]> wrote:
>>>> Many thanks, Nick. Incidentally, thanks for the yeoman service to all
>>>> STATAlisters.
>>>>
>>>> The discrepancy I found was by using xtreg to run a fixed-effects
>>>> regression on the sample. xtreg reported 2773 companies. Yet, when I
>>>> used the code below on the regression sample, I got only 1048
>>>> companies. So, the only reason I could think of was that the flag
>>>> identified only companies that were present in year 1.
>>>
>>> On 4 June 2012 13:21, Nick Cox <[email protected]> wrote:
>>>
>>>>> Your code looks fine to me, so I have difficulty understanding why you think it doesn't work.
>>>>>
>>>>> The -sort- on the second command is unnecessary given the previous command, but I don't see that it will change the sort order.
>>>>>
>>>>> You can check logic in terms of this example:
>>>>>
>>>>> . webuse grunfeld
>>>>>
>>>>> . su year
>>>>>
>>>>>     Variable |       Obs        Mean    Std. Dev.       Min        Max
>>>>> -------------+--------------------------------------------------------
>>>>>         year |       200      1944.5    5.780751       1935       1954
>>>>>
>>>>> . drop if year == 1935 & mod(company, 2)
>>>>> (5 observations deleted)
>>>>>
>>>>> . tab year
>>>>>
>>>>>        year |      Freq.     Percent        Cum.
>>>>> ------------+-----------------------------------
>>>>>        1935 |          5        2.56        2.56
>>>>>        1936 |         10        5.13        7.69
>>>>>        1937 |         10        5.13       12.82
>>>>>        1938 |         10        5.13       17.95
>>>>>        1939 |         10        5.13       23.08
>>>>>        1940 |         10        5.13       28.21
>>>>>        1941 |         10        5.13       33.33
>>>>>        1942 |         10        5.13       38.46
>>>>>        1943 |         10        5.13       43.59
>>>>>        1944 |         10        5.13       48.72
>>>>>        1945 |         10        5.13       53.85
>>>>>        1946 |         10        5.13       58.97
>>>>>        1947 |         10        5.13       64.10
>>>>>        1948 |         10        5.13       69.23
>>>>>        1949 |         10        5.13       74.36
>>>>>        1950 |         10        5.13       79.49
>>>>>        1951 |         10        5.13       84.62
>>>>>        1952 |         10        5.13       89.74
>>>>>        1953 |         10        5.13       94.87
>>>>>        1954 |         10        5.13      100.00
>>>>> ------------+-----------------------------------
>>>>>       Total |        195      100.00
>>>>>
>>>>> . bysort company (year) : gen first = _n == 1
>>>>>
>>>>> . l company year if first
>>>>>
>>>>>      +----------------+
>>>>>      | company   year |
>>>>>      |----------------|
>>>>>   1. |       1   1936 |
>>>>>  20. |       2   1935 |
>>>>>  40. |       3   1936 |
>>>>>  59. |       4   1935 |
>>>>>  79. |       5   1936 |
>>>>>      |----------------|
>>>>>  98. |       6   1935 |
>>>>> 118. |       7   1936 |
>>>>> 137. |       8   1935 |
>>>>> 157. |       9   1936 |
>>>>> 176. |      10   1935 |
>>>>>      +----------------+
>>>>>
>>>>> Nick
>>>>> [email protected]
>>>>>
>>>>> Ivan Png
>>>>>
>>>>> I am analyzing an unbalanced panel of company data, organized by
>>>>> company (gvkey) and year. I want to create a flag for the first
>>>>> observation of each company in the panel. I tried
>>>>>
>>>>> . sort gvkey year
>>>>> . by gvkey , sort: gen flag = 1 if _n == 1
>>>>>
>>>>> However, this set flag = 1 only if a company was present in year 1
>>>>> of the panel. It missed any company that first appeared in later years.
>>>>>
>>>>> I searched statalist and found this:
>>>>> http://www.stata.com/statalist/archive/2005-04/msg00334.html
>>>>>
>>>>> But it doesn't work. I'd be grateful for any relevant help.
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
--
Best wishes
Ivan Png
Skype: ipng00