st: Identifying first observation in each panel of unbalanced panel
From: Ivan Png <[email protected]>
To: [email protected]
Subject: st: Identifying first observation in each panel of unbalanced panel
Date: Mon, 4 Jun 2012 13:44:35 -0400
Many thanks, Nick. Incidentally, thanks for the yeoman service to all
STATAlisters.
I found the discrepancy by running a fixed-effects regression with
xtreg on the sample. xtreg reported 2773 companies. Yet when I used
the code below on the regression sample, I got only 1048 companies.
The only reason I could think of was that the flag identified only
companies that were present in year 1.
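
A minimal sketch of one way to check such a count against xtreg, using the Grunfeld data from the example quoted below. The variables invest and mvalue and the artificial missing values are assumptions made purely for illustration; they are not the variables from the regression described above. The sketch flags the first year of each company within the estimation sample and compares the number of flags with e(N_g), the number of groups that xtreg stores. If a flag is instead created on the full data and then counted within the regression sample, any company whose first year falls outside the estimation sample goes uncounted. That is only one possible source of a mismatch, not something established in this thread.

```stata
* Illustration only: Grunfeld data, with hypothetical gaps in a regressor
webuse grunfeld, clear
drop if year == 1935 & mod(company, 2)
replace mvalue = . if year <= 1936 & company <= 3

xtset company year
xtreg invest mvalue, fe

* Mark the estimation sample, then flag the first in-sample year per company
gen byte insample = e(sample)
bysort company insample (year) : gen byte first = insample & _n == 1

* The number of flags should equal the number of groups xtreg used
count if first
assert r(N) == e(N_g)
```

Sorting on insample within company keeps observations outside the estimation sample in their own by-group, so _n == 1 picks out the earliest year that xtreg actually used.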
On 4 June 2012 13:21, Nick Cox <[email protected]> wrote:
> Your code looks fine to me, so I have difficulty understanding why you think it doesn't work.
>
> The -sort- on the second command is unnecessary given the previous command, but I don't see that it will change the sort order.
>
> You can check logic in terms of this example:
>
> . webuse grunfeld
>
> . su year
>
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>         year |       200      1944.5     5.780751      1935       1954
>
> . drop if year == 1935 & mod(company, 2)
> (5 observations deleted)
>
> . tab year
>
>        year |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>        1935 |          5        2.56        2.56
>        1936 |         10        5.13        7.69
>        1937 |         10        5.13       12.82
>        1938 |         10        5.13       17.95
>        1939 |         10        5.13       23.08
>        1940 |         10        5.13       28.21
>        1941 |         10        5.13       33.33
>        1942 |         10        5.13       38.46
>        1943 |         10        5.13       43.59
>        1944 |         10        5.13       48.72
>        1945 |         10        5.13       53.85
>        1946 |         10        5.13       58.97
>        1947 |         10        5.13       64.10
>        1948 |         10        5.13       69.23
>        1949 |         10        5.13       74.36
>        1950 |         10        5.13       79.49
>        1951 |         10        5.13       84.62
>        1952 |         10        5.13       89.74
>        1953 |         10        5.13       94.87
>        1954 |         10        5.13      100.00
> ------------+-----------------------------------
>       Total |        195      100.00
>
> . bysort company (year) : gen first = _n == 1
>
> . l company year if first
>
>      +----------------+
>      | company   year |
>      |----------------|
>   1. |       1   1936 |
>  20. |       2   1935 |
>  40. |       3   1936 |
>  59. |       4   1935 |
>  79. |       5   1936 |
>      |----------------|
>  98. |       6   1935 |
> 118. |       7   1936 |
> 137. |       8   1935 |
> 157. |       9   1936 |
> 176. |      10   1935 |
>      +----------------+
>
> Nick
> [email protected]
>
> Ivan Png
>
> I am analyzing an unbalanced panel of company data, organized by
> company (gvkey) and year. I want to create a flag for the first
> observation of each company in the panel. I tried
>
> . sort gvkey year
> . by gvkey , sort: gen flag = 1 if _n == 1
>
> However, this set flag = 1 only if a company was present in year 1
> of the panel. It missed any company that first appeared in later years.
>
> I searched statalist and found this:
> http://www.stata.com/statalist/archive/2005-04/msg00334.html
>
> But it doesn't work. I'd be grateful for any relevant help.
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/