Re: st: margins "not estimable" collinear variable
From
Jeff Pitblado, StataCorp LP
To
[email protected]
Subject
Re: st: margins "not estimable" collinear variable
Date
Tue, 15 May 2012 11:48:22 -0500
Robert Duval is using -margins- after -probit- with an
interesting model specification:
> I estimate a probit model with a set of (regional) dummies Z1,...,Zk,
> and the interaction between a categorical variable (3 levels of
> education at the individual level) with a continuous regressor x
> defined at the regional level.
>
> In particular the model is
>
> probit y i.region i.edu i.edu#c.x
>
> The estimation presents problems of collinearity and it drops the last
> interaction between the 3rd educational category and x:
>
> note: 3.edu#c.x omitted because of collinearity
>
> [Output Omitted] [...]
>
>          edu |
>            2 |   .2202739   .0785022     2.81   0.005     .0664123    .3741354
>            3 |    .284186   .0887165     3.20   0.001     .1103049    .4580672
>              |
>      edu#c.x |
>            1 |   .2472436    .224275     1.10   0.270    -.1923273    .6868146
>            2 |   .1672766    .241174     0.69   0.488    -.3054158    .6399691
>            3 |  (omitted)
>              |
>        _cons |   .3254296   .1255826     2.59   0.010     .0792922    .5715671
>
>
> Since I am most interested in comparing the coefficients for
> educ(1)#c.x with educ(3)#c.x I tried omitting the interaction
> edu(2)#c.x using
>
> probit y i.region ib2.edu##c.x
>
> This gives me coefficients for the dummies edu(1) and edu(3) and their
> respective interactions with x. Of course x on its own is dropped due
> to perfect collinearity with the regional dummies i.region.
>
> [Output Omitted] [...]
>
>          edu |
>            1 |  -.2202739   .0785022    -2.81   0.005    -.3741354   -.0664123
>            3 |   .0639122   .0935549     0.68   0.495    -.1194521    .2472764
>              |
>            x |  (omitted)
>              |
>      edu#c.x |
>            1 |    .079967   .1907966     0.42   0.675    -.2939875    .4539215
>            3 |  -.1672766    .241174    -0.69   0.488    -.6399691    .3054158
>              |
>        _cons |   .4591956   .0927699     4.95   0.000     .2773699    .6410213
>
> However, my problems begin when I try to estimate margins comparing
> marginal effects of edu(3) wrt edu(1) at different levels of x
>
> margins, dydx(3.edu) at(x=1)
>
> as Stata reports that the margin is not estimable. (Btw the margin at
> the mean IS estimable). Exploring the matrix H of estimability
>
> mat H = get(H)
> mat l H
>
> I indeed get that not all of its entries are -1,0,1 (some are +/-
> fractions between these numbers).
>
> I read in another post
> (http://www.stata.com/statalist/archive/2011-07/msg00514.html) that
> sometimes it is ok to ask Stata not to perform the estimability check
> as in
>
> margins, dydx(3.edu) at(x=1) noestimcheck
>
> Average marginal effects                          Number of obs   =       2153
> Model VCE    : OIM
>
> Expression   : Pr(y), predict()
> dy/dx w.r.t. : 3.edu
> at           : x               =           1
>
> ------------------------------------------------------------------------------
>              |            Delta-method
>              |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>        3.edu |  -.0378778   .1084452    -0.35   0.727    -.2504265    .1746709
> ------------------------------------------------------------------------------
>
>
> But I don't know if that same advice can be applied in my case here.
>
> Any advice on whether it is safe to estimate the effects using
> the noestimcheck option would be greatly appreciated.

I think the problem here is that Robert's 'x' variable is perfectly collinear
with 'i.region'. This particular kind of collinearity is producing a very
unstable H matrix, hence the entries that are not -1, 0, or 1.
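
A quick way to confirm this kind of collinearity is sketched below. It assumes
the variable names from Robert's post (region, x) and that x is meant to be
constant within each region; it is a check, not part of the original analysis.

```stata
* x is defined at the regional level, so it should not vary within region.
* Sort by region and x, then compare the smallest and largest x per region.
bysort region (x): assert x[1] == x[_N]

* Equivalently, the region dummies should explain x perfectly:
* the R-squared reported here should be 1 (up to rounding).
regress x i.region
display e(r2)
```

If the assertion passes, x lies in the span of the region indicators, which is
why -probit- omits the main effect of x and one of the edu#c.x terms.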

I believe a reasonable approximation to Robert's model is
. probit y region##edu
This model is a standard twoway fully factorial specification, and should only
yield non-estimable margins when there are empty cells.
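
Empty cells can be looked for directly with a cross-tabulation. This is a
minimal sketch using the variable names from Robert's post; the -if- condition
is only an approximation of the estimation sample.

```stata
* Count observations in every region-by-education cell.
* A cell with a zero count is an empty cell.
tabulate region edu if !missing(y, region, edu)
```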
In Stata 12, Robert can use -contrast- to test for an overall interaction
effect between these two variables via
. contrast edu#region
If this test is estimable, and it should be if Robert's data does not have any
empty cells, then I would propose that Robert's model specification is
reasonable. In that case, I believe Robert is justified in using the
-noestimcheck- option with his original specification.
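
Put together, the suggested sequence might look like the sketch below. It
reuses only commands that already appear in this thread, with the variable
names from Robert's post; whether the contrast turns out to be estimable
depends on the data.

```stata
* Step 1: fit the fully factorial approximation and test the
* overall region-by-education interaction.
probit y i.region##i.edu
contrast edu#region

* Step 2: if that test is estimable, refit the original specification
* and compute the marginal effect without the estimability check.
probit y i.region ib2.edu##c.x
margins, dydx(3.edu) at(x=1) noestimcheck
```

To compare the effect at several levels of x, at() also accepts a numlist,
for example at(x=(0 1 2)); those values are placeholders, not taken from
Robert's data.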
--Jeff
[email protected]