Clive,
I believe the problem is not that your panels are unbalanced, but that your
time variable contains missing values.
Suppose we use the Grunfeld data set:
. webuse grunfeld, clear
. tsset
panel variable: company, 1 to 10
time variable: year, 1935 to 1954
. xtabond2 inv mval, gmm(kst) robust small
Building GMM instruments..
Estimating.
Performing specification tests.
Arellano-Bond dynamic panel-data estimation, one-step system GMM results
------------------------------------------------------------------------------
Group variable: company                         Number of obs      =       200
Time variable : year                            Number of groups   =        10
                                                Obs per group: min =        20
F(1, 9)       =    245.79                                      avg =     20.00
Prob > F      =     0.000                                      max =        20
------------------------------------------------------------------------------
<snip>
Now create an incomplete panel data set with gaps for two of the groups.
. drop in 185/195
(11 observations deleted)
. drop in 5/15
(11 observations deleted)
. tsset
panel variable: company, 1 to 10
time variable: year, 1935 to 1954, but with gaps
. xtabond2 inv mval, gmm(kst) robust small
Building GMM instruments..
Estimating.
Performing specification tests.
Arellano-Bond dynamic panel-data estimation, one-step system GMM results
------------------------------------------------------------------------------
Group variable: company                         Number of obs      =       178
Time variable : year                            Number of groups   =        10
                                                Obs per group: min =         9
F(1, 9)       =     65.62                                      avg =     17.80
Prob > F      =     0.000                                      max =        20
------------------------------------------------------------------------------
<snip>
If we instead set one of the year values to missing...
. webuse grunfeld, clear
. replace year = . in 2
(1 real change made, 1 to missing)
. tsset
panel variable: company, 1 to 10
time variable: year, 1935 to 1954, but with a gap
. xtabond2 inv mval, gmm(kst) robust small
Missing values in time variable (year).
r(459);
Note that deleting values of invest, mvalue, or kstock does not have
this effect.
If the observations with missing year values contain real data, the solution
is to fill in the missing years; if they contain no data, simply drop those
observations.
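As a rough sketch of both options, using the Grunfeld example above. The
fill-in rule is an assumption for illustration only: it presumes the rows are
still in their original order and that years run consecutively within each
company, which may not hold for other data.

```stata
webuse grunfeld, clear
* Recreate the problem from the example above.
replace year = . in 2

* How many observations have a missing time value?
count if missing(year)

* Option 1: the row holds real data, so restore the year.
* ASSUMPTION: rows are in their original order and years are consecutive
* within company, so the missing year is the previous row's year plus one.
replace year = year[_n-1] + 1 if missing(year) & company == company[_n-1]

* Option 2 (alternative): the row holds no usable data, so drop it.
* If Option 1 has already been applied, this drops nothing.
drop if missing(year)

tsset company year
```

Run only the option that matches your data. Option 1 should be applied before
-tsset- or any other sort, since sorting moves missing years to the end of each
panel and breaks the previous-row rule.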
If you take a look at lines 73 through 76 of the -xtabond2- code you will
see:
count if `t' >= .
if r(N) {
    di as err "Missing values in time variable (`t')."
    exit 459
...this is where you are getting tripped up.
Again, it is the missing values in the time variable that are causing the
problem, not the fact that the panel is incomplete or unbalanced.
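A minimal sketch of running the same kind of check yourself before calling
-xtabond2-, so that you can see which observations are affected. It uses the
Grunfeld variable names (company, year) as stand-ins; substitute your own panel
and time variables. For a numeric variable, missing(year) is true in the same
cases as year >= ., the condition tested in the -xtabond2- code.

```stata
webuse grunfeld, clear
* Recreate the problem so that the check has something to find.
replace year = . in 2

* Count observations whose time variable is missing.
quietly count if missing(year)
if r(N) > 0 {
    display as error "year is missing in " r(N) " observation(s)"
    list company year if missing(year)
}
else {
    display as text "no missing values in year"
}
```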
Hope this helps,
Scott
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Clive Nicholas
Sent: Saturday, July 03, 2004 8:21 PM
To: [email protected]
Subject: Re: st: RE: Why does -xtabond2- not work with unbalanced panels?
Thuy Le replied:
> I think your panel is not -tsset- properly, so there are missing
> values. Try generating a new time variable using -tsmktim- and -tsset-
> again.
I should point out here that -tsmktim- is a user-written package by Kit
Baum and Vince Wiggins, downloadable via SSC.
Thanks for the suggestion, but although -tsmktim- worked a treat (it now
works with panels: it didn't before), -xtabond2- still didn't work:
. tsmktim dcyear, start(1976) seq(edyear) i(pano)
panel variable: pano, 1 to 659
time variable: dcyear, 1976 to 1992, but with gaps
. tsset pano dcyear
panel variable: pano, 1 to 659
time variable: dcyear, 1976 to 1992, but with gaps
. xtabond2 edconpc ledconpc ed2-ed13 edpollch lagconch laglabch lagldmch
clmargin cdmargin conplace edenp class if edmarker==1, gmm(l3edconpc)
robust small
Missing values in time variable (dcyear).
r(459);
This simply makes no sense to me: there's _nothing_ in the help file stating
that -xtabond2- does not, or indeed should not, work with unbalanced panels
(see the evidence I submitted in an earlier post). I wish I knew why it
behaves like this.
However, I've found a solution, although it's not a satisfactory one:
. drop if edyear==.
(2838 observations deleted)
. xtabond2 edconpc ledconpc ed2-ed13 edpollch lagconch laglabch lagldmch
clmargin cdmargin conplace edenp class if edmarker==1, gmm(l3edconpc)
robust small
ed2 dropped because of collinearity.
ed3 dropped because of collinearity.
Building GMM instruments..
8 instruments dropped because of collinearity.
Estimating.
Performing specification tests.
Arellano-Bond dynamic panel-data estimation, one-step system GMM results
------------------------------------------------------------------------------
Group variable: pano                            Number of obs      =      1842
Time variable : edyear                          Number of groups   =       302
Number of instruments = 32                      Obs per group: min =         1
F(20, 301)    =      6.01                                      avg =      6.10
Prob > F      =     0.000                                      max =        11
------------------------------------------------------------------------------
             |                Robust
             |      Coef.  Std. Err.      t     P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ledconpc |  -.2848439   .3087844    -0.92   0.357    -.8924934    .3228057
         ed4 |  -159.8244   166.6444    -0.96   0.338      -487.76    168.1112
         ed5 |   8320.825   177418.4     0.05   0.963    -340816.6    357458.2
         ed6 |   8325.268   177418.9     0.05   0.963    -340813.2    357463.8
         ed7 |   8446.951   177421.6     0.05   0.962    -340696.8    357590.7
         ed8 |    13121.3   288135.6     0.05   0.964      -553894    580136.6
         ed9 |   13129.14   288135.9     0.05   0.964    -553886.7      580145
        ed10 |    13108.4   288134.5     0.05   0.964    -553904.8    580121.6
        ed11 |   8341.228   177400.4     0.05   0.963    -340760.8    357443.2
        ed12 |   8341.948     177400     0.05   0.963    -340759.3    357443.2
        ed13 |   8332.933   177399.2     0.05   0.963    -340766.7    357432.6
    edpollch |  -1.050461   1.017076    -1.03   0.303    -3.051941    .9510186
    lagconch |   3.718234   7.188094     0.52   0.605    -10.42705    17.86352
    laglabch |   2.686741   6.064912     0.44   0.658    -9.248257    14.62174
    lagldmch |   1.971574   6.519442     0.30   0.763    -10.85788    14.80103
    clmargin |  -1.538656   2.839115    -0.54   0.588    -7.125683    4.048371
    cdmargin |   .1944372   1.942892     0.10   0.920    -3.628934    4.017808
    conplace |  -44.53702   78.18992    -0.57   0.569    -198.4051    109.3311
       edenp |  -14.35173   6.621796    -2.17   0.031     -27.3826   -1.320851
       class |  -4.739345   6.573417    -0.72   0.471    -17.67502    8.196327
       _cons |  -8060.598   177387.4    -0.05   0.964    -357137.1    341015.9
------------------------------------------------------------------------------
Hansen test of overid. restrictions: chi2(11) =     5.72   Prob > chi2 = 0.891
Arellano-Bond test for AR(1) in first differences: z =  -1.85  Pr > z = 0.064
Arellano-Bond test for AR(2) in first differences: z =      .  Pr > z =     .
------------------------------------------------------------------------------
What I need is a way to automatically impute the years missing from EDYEAR,
so that I won't have to -drop- observations that I need to keep.
CLIVE NICHOLAS |t: 0(044)191 222 5969
Politics |e: [email protected]
Newcastle University |http://www.ncl.ac.uk/geps
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/