Richard A. Forshee suggested the following as a solution to Nishant's
problem (see the end of this e-mail for the problem):
"Have you excluded a reference category? If not, your dummy variables
will be perfectly collinear with the constant."
It is a good idea to pick the reference category yourself: (i) you may
want to compare the point estimates of the dummies against a specific
group; (ii) you may want to pick a category that ensures "stable
results", in other words a category of reasonable size (a group that is
not too small).
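A rough sketch of how one might pick the reference category by hand.
The name "y" is only a placeholder for the dependent variable (its name
is abbreviated in the output), the choice of "dummy5" is arbitrary, and
the variable ranges assume that the variables are stored in this order
in the dataset:

```stata
* Share of observations in each category (the mean of a 0/1 dummy).
summarize dummy1-dummy10

* Leave out the dummy of the chosen reference group, here dummy5
* (an arbitrary choice for illustration). "y" is a placeholder.
regress y indvar1-indvar14 dummy1-dummy4 dummy6-dummy10, cluster(familyid)
```

The coefficients on the remaining dummies are then differences relative
to the omitted group.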
However, as I see it, excluding a reference category is not the solution
to Nishant's problem: Stata automatically drops variables that show very
high collinearity. In fact, Nishant's output suggests that this is what
happened: "dummy10" is reported as "(dropped)".
As far as I can tell from Nishant's output, the problem is likely to be
linked to the use of the -cluster()- option. I am inferring this from
the output, since Nishant does not report the code he typed. I do not
know what exactly the problem is, but Nishant should check the relations
among the cluster variable "familyid", the dummies ("dummy1"-"dummy10"),
and the dependent variable for collinearity, nested structures, and
group/cell sizes.
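The checks above could be started along these lines. This is only a
sketch, not a diagnosis: "y" is again a placeholder for the dependent
variable, "tagged" is a temporary helper variable made up for this
example, and the variable ranges assume the storage order shown:

```stata
* 1. Refit without the cluster adjustment. If standard errors show up
*    here, the problem is tied to -cluster()- rather than to the
*    dummies as such.
regress y indvar1-indvar14 dummy1-dummy10

* 2. For each dummy, count the clusters that contain at least one
*    observation with the dummy equal to 1.
foreach d of varlist dummy1-dummy10 {
    egen byte tagged = tag(familyid) if `d' == 1
    quietly count if tagged == 1
    display "`d': " r(N) " clusters"
    drop tagged
}

* 3. Cell sizes of a single dummy (repeat for the others as needed).
tabulate dummy1
```

A dummy that is switched on in only very few clusters, or that is
constant within every cluster, would be a candidate for a closer look.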
Hope this helps (somewhat),
Sebastian
> First of all, I am using Stata/SE 10.0 on Windows.
>
> My question is about missing standard errors. I am implementing a
> simple linear regression model with roughly 50 indicator/dummy variables
> on the right-hand side (besides a dozen other independent variables),
> and in the results generated, standard errors for the coefficients of
> all the dummy variables are not reported. In addition, the standard
> error for the constant term is also not reported.
>
> I thought it might be due to the skewed distribution of my observations
> across the 50 categories (represented by the 50 indicator/dummy
> variables), i.e., it might be that there are too many 1's or 0's in some
> of the categories. So I tried reducing the number of indicator/dummy
> variables by using much more coarsely-defined categories. This coarse
> categorization brings down the number of indicator/dummy variables to
> 10, but I still get the same problem! (Attached below is part of the
> output generated.)
>
> Any help would be much appreciated.
>
> Thanks,
>
> Nishant
>
>
> P.S. Here's a sample of what I see (using 10 indicator variables) in
> the output generated by Stata:
>
> Linear regression                               Number of obs =  226223
>                                                 F( 58,   454) =       .
>                                                 Prob > F      =       .
>                                                 R-squared     =  0.0750
>                                                 Root MSE      =  .02272
>
>                        (Std. Err. adjusted for 455 clusters in familyid)
> ------------------------------------------------------------------------------
>              |               Robust
> familyport~1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>      indvar1 |   .0002341   .0001428     1.64   0.102    -.0000465    .0005147
>
> ...
>
>     indvar14 |   .0002029   .0005647     0.36   0.720    -.0009069    .0013127
>       dummy1 |  -.0041449          .        .       .            .           .
>       dummy2 |  -.0039503          .        .       .            .           .
>       dummy3 |  -.0038193          .        .       .            .           .
>       dummy4 |   -.003429          .        .       .            .           .
>       dummy5 |  -.0034715          .        .       .            .           .
>       dummy6 |   -.003175          .        .       .            .           .
>       dummy7 |  -.0033819          .        .       .            .           .
>       dummy8 |   -.002303          .        .       .            .           .
>       dummy9 |  -.0022382          .        .       .            .           .
>      dummy10 |  (dropped)
>        _cons |   .0790628          .        .       .            .           .
> ------------------------------------------------------------------------------
>
> .
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/