Ali Mehryar Karim asked why -xtreg, fe- includes the panels
with only one observation when calculating the number of observations.
-xtreg, fe- does not drop the panels with only one observation because they
provide information about the constant, the variance components, the between
R-sq, the overall R-sq, and the correlation between the u_i and Xb. In fact,
this can be seen from the output that he provided.
> . xtreg hiv_tot_cor year core_ad,i(newcs) fe
>
> Fixed-effects (within) regression Number of obs = 7237
> Group variable (i) : newcs Number of groups = 5015
>
> R-sq: within = 0.0332 Obs per group: min = 1
> between = 0.0017 avg = 1.4
> overall = 0.0081 max = 2
>
> F(2,2220) = 38.16
> corr(u_i, Xb) = -0.0424 Prob > F = 0.0000
>
> ------------------------------------------------------------------------
> hiv_tot_cor | Coef. Std. Err. t P>|t| [95% Conf. Interval]
> -------+----------------------------------------------------------------
> year | .1355214 .0191276 7.09 0.000 .0980115 .1730313
> core_ad | .1429555 .0335556 4.26 0.000 .077152 .2087591
> _cons | 1.763668 .0149903 117.65 0.000 1.734272 1.793065
> -------+----------------------------------------------------------------
> sigma_u | .68639679
> sigma_e | .63330582
> rho | .54016449 (fraction of variance due to u_i)
> ------------------------------------------------------------------------
> F test that all u_i=0: F(5014, 2220) = 1.54 Prob > F = 0.0000
>
> Therefore, I select the sample with observations for both the periods,
> then I get:-
>
> . xtreg hiv_tot_cor year core_ad if interview==2,i(newcs) fe
>
> Fixed-effects (within) regression Number of obs = 4444
> Group variable (i) : newcs Number of groups = 2222
>
> R-sq: within = 0.0332 Obs per group: min = 2
> between = 0.0080 avg = 2.0
> overall = 0.0175 max = 2
>
> F(2,2220) = 38.16
> corr(u_i, Xb) = 0.0057 Prob > F = 0.0000
>
> ------------------------------------------------------------------------
> hiv_tot_cor | Coef. Std. Err. t P>|t| [95% Conf. Interval]
> -------------+----------------------------------------------------------------
> year | .1355214 .0191276 7.09 0.000 .0980115 .1730313
> core_ad | .1429555 .0335556 4.26 0.000 .077152 .2087591
> _cons | 1.793538 .0150765 118.96 0.000 1.763972 1.823103
> -------------+----------------------------------------------------------
> sigma_u | .5792349
> sigma_e | .63330582
> rho | .45549543 (fraction of variance due to u_i)
> ------------------------------------------------------------------------------
> F test that all u_i=0: F(2221, 2220) = 1.67 Prob > F = 0.0000
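Comparing the two runs: the coefficients on year and core_ad, their standard
errors, sigma_e, the within R-sq, and F(2,2220) are identical, while _cons,
sigma_u, rho, the between R-sq, the overall R-sq, and corr(u_i, Xb) all change.
The algebra behind this can be sketched as follows. The notation is the usual
textbook within-estimator notation, not something taken from the output, and
the normalization shown for the constant (the u_i averaging to zero over the
observations used) is my understanding of what -xtreg, fe- reports, so treat
it as an assumption.

```latex
% Within (fixed-effects) slope estimator; T_i = number of observations in panel i
\[
\hat\beta = \Bigl[\sum_i \sum_{t=1}^{T_i} (x_{it}-\bar x_i)(x_{it}-\bar x_i)'\Bigr]^{-1}
            \sum_i \sum_{t=1}^{T_i} (x_{it}-\bar x_i)(y_{it}-\bar y_i)
\]
% If T_i = 1, then x_{i1} = \bar x_i and y_{i1} = \bar y_i, so the panel adds
% zero to both sums and cannot move \hat\beta.

% Residual degrees of freedom, N - n - k, are the same in both runs:
\[
7237 - 5015 - 2 = 2220, \qquad 4444 - 2222 - 2 = 2220
\]

% Constant (assumed normalization): grand means are taken over ALL observations
% used, so singleton panels do enter here.
\[
\hat\alpha = \bar{\bar y} - \bar{\bar x}'\hat\beta
\]
```

The 2793 singleton panels (5015 - 2222 groups, matching 7237 - 4444
observations) each add one observation and one panel mean to be removed, so
they leave the slopes and the residual degrees of freedom untouched, but they
still shift the grand means and the estimated u_i.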
Since these observations provide information that affects the estimates, they
should not be dropped by default. If one is interested in estimates that
exclude the information contained in singleton panels, then those panels should
be explicitly dropped.
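One way to drop them explicitly, as a minimal sketch: count the usable
observations in each panel and restrict the estimation sample to panels with
at least two. The variable names complete and n_used are made up for this
illustration; the model variables and the i(newcs) option are copied from the
output above. In older releases of Stata, the -egen- function total() may go
by the name sum(), so adjust if needed.

```stata
* Flag observations with no missing values on the model variables
* (complete is a hypothetical variable name)
generate byte complete = !missing(hiv_tot_cor, year, core_ad)

* Count the usable observations within each panel
* (n_used is a hypothetical variable name)
bysort newcs: egen n_used = total(complete)

* Fit the fixed-effects model on panels with two or more usable observations
xtreg hiv_tot_cor year core_ad if complete & n_used > 1, i(newcs) fe
```

Counting only the complete observations matters because a panel with two rows,
one of which has a missing value on a model variable, is still a singleton as
far as the estimation sample is concerned.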
I hope that this helps.
--David