Dear Mark,
Thanks for the detailed thoughts on what is going on in xthtaylor and xtoverid. They allowed me to peer a bit more deeply into these commands' inner workings. In my case, though, I am not sure that unbalancedness is the sole problem. First of all, I'm not sure what you meant by "unbalanced". To be on the safe side, I interpreted it strictly and made sure that, among all variables in my regression, there is NO variable for which all years are missing for a given panel unit (in my case, a district), and likewise NO variable for which all districts are missing for a given year. [A nice side effect of checking this was that I discovered a small glitch in my data; thanks!]
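The check described above is weaker than balance in the usual sense (every district observed in every year with no missing regressors), and the xthtaylor header further down reports between 7 and 9 observations per group. A minimal sketch of a stricter check follows. It assumes the data are already xtset on code and year, that it is run from a do-file after the L_* variables are generated and the four years are dropped, and that nmiss is a hypothetical helper variable that does not appear in my do-file:

```stata
* Count missing values per row across all variables used in the model.
* nmiss is a hypothetical helper name, not part of the original do-file.
egen nmiss = rowmiss(rev_IGF_2avg popurb_share popdens pop p0 rain_av road_no ///
    literate rel_christ *akan *ewe L_rev_IGF_2avg L_rev_EXT L_exp_pers_act ///
    L_exp_NPR L_exp_cap_act)

* Show the participation pattern of districts over years, complete rows only.
xtdescribe if nmiss == 0

* After estimation, the same check on the sample actually used:
* xtdescribe if e(sample)
```

If every district shows the same pattern of years, the complete-case panel is balanced; districts with gaps appear as separate patterns.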
As you'll see from the output below, we still get the error, but this time there is more information, albeit very strange: a) -xtoverid- now drops a whole host of variables, and b) it reclassifies 12 variables as exogenous, even though I declared only 2 variables as endogenous in the first place!! I also realised (not shown in the output below) that -xtoverid- runs without problems if I drop all the regional dummies (called dumreg...) from the Hausman-Taylor regression. To find closure, I may run the model without the dummies, though this is somewhat unattractive from a specification perspective. It is also disconcerting to have to rely on -xtoverid- to detect excessive collinearity, since that is not the type of information -xtoverid- is mainly designed to offer...
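One way to look for collinearity without going through -xtoverid- might be to run Stata's _rmcoll on the time-invariant regressors. This is only a sketch, meant to be run from a do-file. It checks the untransformed variables, so it may find nothing even if the transformed variables that -xtoverid- builds internally are collinear:

```stata
* Sketch: look for exact collinearity among the time-invariant regressors.
* _rmcoll reports any variable it would omit and leaves the result in r(varlist).
_rmcoll popurb_share popdens pop p0 rain_av road_no literate rel_christ ///
    *akan *ewe dumreg1-dumreg7 dumreg9-dumreg10

* Display the variable list that _rmcoll returns.
display "`r(varlist)'"
```

If _rmcoll flags any of the dumreg variables, that would point to the regional dummies overlapping with another time-invariant regressor rather than to a problem specific to -xtoverid-.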
Here is the output for the balanced panel, together with the commands that balance it:
.. use "$f/fin & nonfin panel.dta", clear
..
.. * Since the direct use of lags may make analysing -xtoverid, noi- difficult, rename the variables to
> be lagged, before bringing them into the model:
..
.. for any rev_IGF_2avg rev_EXT exp_pers_act exp_NPR exp_cap_act: g L_X=L.X
-> g L_rev_IGF_2avg=L.rev_IGF_2avg
(330 missing values generated)
-> g L_rev_EXT=L.rev_EXT
(220 missing values generated)
-> g L_exp_pers_act=L.exp_pers_act
(231 missing values generated)
-> g L_exp_NPR=L.exp_NPR
(231 missing values generated)
-> g L_exp_cap_act=L.exp_cap_act
(232 missing values generated)
..
.. * Next, drop the years that could be causing panel imbalance: Drop the first year (1994) because the
> dependent variable is a variable that's averaged over two periods, i.e. Y_t = mean(Z_t Z_t-1);
.. * ... drop 1995, since the LDV becomes missing for this year;
.. * ... and drop the last two years, 2005 and 2006 for which several variables don't have data.
..
.. drop if year==1994 | year==1995 | year==2005 | year==2006
(440 observations deleted)
..
.. * Now run the main regression:
..
.. xthtaylor rev_IGF_2avg popurb_share popdens pop p0 rain_av road_no literate rel_christ *akan *ewe L_r
> ev_IGF_2avg L_rev_EXT L_exp_pers_act L_exp_NPR L_exp_cap_act dumreg1-dumreg7 dumreg9-dumreg10, endog(
> L_rev_EXT L_rev_IGF_2avg) varying(L_rev_IGF_2avg L_rev_EXT L_exp_pers_act L_exp_NPR L_exp_cap_act)
>
Hausman-Taylor estimation Number of obs = 978
Group variable: code Number of groups = 110
Obs per group: min = 7
avg = 8.9
max = 9
Random effects u_i ~ i.i.d. Wald chi2(24) = 1868.46
Prob > chi2 = 0.0000
------------------------------------------------------------------------------
rev_IGF_2avg | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
TVexogenous |
L_exp_pers~t | -.0125667 .0249512 -0.50 0.615 -.0614703 .0363368
L_exp_NPR | .2358651 .0326824 7.22 0.000 .1718088 .2999214
L_exp_cap_~t | -.0328466 .0170683 -1.92 0.054 -.0662999 .0006066
TVendogenous |
L_rev_EXT | -.0071075 .0141421 -0.50 0.615 -.0348255 .0206105
L_rev_IGF_~g | .4204889 .0358268 11.74 0.000 .3502697 .4907082
TIexogenous |
popurb_share | .0021465 .0010059 2.13 0.033 .000175 .0041181
popdens | -.0001024 .0000955 -1.07 0.283 -.0002895 .0000847
pop | -.0004385 .0002147 -2.04 0.041 -.0008593 -.0000178
p0 | -.7584724 .2114781 -3.59 0.000 -1.172962 -.3439829
rain_av | .0000893 .0001371 0.65 0.515 -.0001794 .0003579
road_no | -.0770933 .125977 -0.61 0.541 -.3240036 .169817
literate | .005749 .0029767 1.93 0.053 -.0000853 .0115833
rel_christ | -.0029042 .0021226 -1.37 0.171 -.0070644 .0012559
ethn_akan | -.0006008 .0012249 -0.49 0.624 -.0030015 .0017999
ethn_ewe | .0012169 .0015824 0.77 0.442 -.0018845 .0043183
dumreg1 | .2399481 .1435465 1.67 0.095 -.0413979 .521294
dumreg2 | .0407062 .1309883 0.31 0.756 -.2160262 .2974385
dumreg3 | .3356461 .1527193 2.20 0.028 .0363219 .6349704
dumreg4 | -.0132408 .1290034 -0.10 0.918 -.2660829 .2396012
dumreg5 | .0240335 .1252261 0.19 0.848 -.2214051 .2694721
dumreg6 | .1773062 .124622 1.42 0.155 -.0669484 .4215607
dumreg7 | .1753394 .1195659 1.47 0.143 -.0590055 .4096843
dumreg9 | .4189782 .0923807 4.54 0.000 .2379154 .6000411
dumreg10 | .4799591 .1026885 4.67 0.000 .2786934 .6812248
|
_cons | 3.24853 .3265345 9.95 0.000 2.608534 3.888526
-------------+----------------------------------------------------------------
sigma_u | .00654479
sigma_e | .45401058
rho | .00020776 (fraction of variance due to u_i)
------------------------------------------------------------------------------
Note: TV refers to time varying; TI refers to time invariant.
..
.. * But when doing the post-estimation test, it still gives the error message, and provides stranger in
> formation than even before:
..
.. xtoverid, noi
Warning - endogenous variable(s) collinear with instruments
Vars now exogenous: __00000X __000011 __000014 __000015 __000016 __000017
__000018 __000019 __00001A __00001B __00001C __00001D
Warning - collinearities detected
Vars dropped: road_no ethn_akan ethn_ewe dumreg1 dumreg2 dumreg4 dumreg5
dumreg6 dumreg7 dumreg9
Unable to display summary of first-stage estimates; macro e(first) is missing
IV (2SLS) estimation
--------------------
Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only
Number of obs = 978
F( 25, 953) = 11017.57
Prob > F = 0.0000
Total (centered) SS = 612.7773989 Centered R2 = 0.6767
Total (uncentered) SS = 57474.83461 Uncentered R2 = 0.9966
Residual SS = 198.082309 Root MSE = .4559
------------------------------------------------------------------------------
__00000F | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
__00000J | -.0125026 .0249504 -0.50 0.616 -.0614667 .0364614
__00000M | .2361092 .0326823 7.22 0.000 .1719715 .3002468
__00000P | -.0327462 .0170671 -1.92 0.055 -.0662397 .0007473
__00000S | -.0074428 .0141302 -0.53 0.599 -.0351727 .0202871
__00000V | .4201414 .0357976 11.74 0.000 .3498902 .4903926
__00000W | .0021458 .0010059 2.13 0.033 .0001717 .0041199
__00000Y | -.0004393 .0002147 -2.05 0.041 -.0008605 -.000018
__00000Z | -.7585726 .2114717 -3.59 0.000 -1.173577 -.3435687
__000010 | .0000892 .0001371 0.65 0.515 -.0001798 .0003582
__000012 | .0057551 .0029768 1.93 0.053 -.0000868 .011597
__000013 | -.0029059 .0021226 -1.37 0.171 -.0070714 .0012597
__00001E | .4799851 .1026759 4.67 0.000 .2784882 .681482
__00000E | 3.251012 .326437 9.96 0.000 2.610394 3.891631
__00000X | -.0001021 .0000955 -1.07 0.285 -.0002895 .0000852
__000011 | -.0770089 .1259801 -0.61 0.541 -.3242393 .1702214
__000014 | -.0006012 .0012249 -0.49 0.624 -.003005 .0018027
__000015 | .0012157 .0015824 0.77 0.443 -.0018897 .0043212
__000016 | .2401393 .1435471 1.67 0.095 -.0415656 .5218442
__000017 | .0407647 .13099 0.31 0.756 -.2162975 .2978269
__000018 | .3359035 .1527116 2.20 0.028 .0362137 .6355934
__000019 | -.0131465 .1290063 -0.10 0.919 -.2663158 .2400228
__00001A | .0241284 .1252282 0.19 0.847 -.2216266 .2698833
__00001B | .1774268 .1246225 1.42 0.155 -.0671393 .421993
__00001C | .1754524 .1195626 1.47 0.143 -.0591839 .4100888
__00001D | .4192401 .0923717 4.54 0.000 .2379647 .6005156
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments): 19.614
Chi-sq(5) P-val = 0.0015
------------------------------------------------------------------------------
Instrumented: __00000J __00000M __00000P __00000S __00000V __00000W
__00000Y __00000Z __000010 __000012 __000013 __00001E
Included instruments: __00000E __00000X __000011 __000014 __000015 __000016
__000017 __000018 __000019 __00001A __00001B __00001C
__00001D
Excluded instruments: __00000I __00000L __00000O __00000R __00000U __00000H
__00000K __00000N popurb_share popdens pop p0 rain_av
literate rel_christ dumreg3 dumreg10
Dropped collinear: road_no ethn_akan ethn_ewe dumreg1 dumreg2 dumreg4 dumreg5
dumreg6 dumreg7 dumreg9
Reclassified as exog: __00000X __000011 __000014 __000015 __000016 __000017
__000018 __000019 __00001A __00001B __00001C __00001D
------------------------------------------------------------------------------
xtoverid error: internal reestimation of eqn differs from original
r(198);