
Re: st: margins "not estimable" collinear variable


From   [email protected] (Jeff Pitblado, StataCorp LP)
To   [email protected]
Subject   Re: st: margins "not estimable" collinear variable
Date   Tue, 15 May 2012 11:48:22 -0500

Robert Duval <[email protected]> is using -margins- after -probit- with an
interesting model specification:

> I estimate a probit model with a set of (regional) dummies Z1,...,Zk,
> and the interaction between a categorical variable (3 levels of
> education at the individual level) with a continuous regressor x
> defined at the regional level.
> 
> In particular the model is
> 
> probit y i.region i.edu i.edu#c.x
> 
> The estimation presents problems of collinearity and it drops the last
> interaction between the 3rd educational category and x:
> 
> note: 3.edu#c.x omitted because of collinearity
> 
> [Output Omitted] [...]
> 
>        edu |
>           2  |   .2202739   .0785022     2.81   0.005     .0664123    .3741354
>           3  |    .284186   .0887165     3.20   0.001     .1103049    .4580672
>              |
>      edu#c.x |
>           1  |   .2472436    .224275     1.10   0.270    -.1923273    .6868146
>           2  |   .1672766    .241174     0.69   0.488    -.3054158    .6399691
>           3  |  (omitted)
>              |
>        _cons |   .3254296   .1255826     2.59   0.010     .0792922    .5715671
> 
> 
> Since I am most interested in comparing the coefficients for
> educ(1)#c.x with educ(3)#c.x I tried omitting the interaction
> edu(2)#c.x using
> 
> probit y i.region ib2.edu##c.x
> 
> This gives me coefficients for the dummies edu(1) and edu(3) and their
> respective interactions with x. Of course x on its own is dropped due
> to perfect collinearity with the regional dummies i.region.
> 
> [Output Omitted] [...]
> 
>          edu |
>           1  |  -.2202739   .0785022    -2.81   0.005    -.3741354   -.0664123
>           3  |   .0639122   .0935549     0.68   0.495    -.1194521    .2472764
>              |
>            x |  (omitted)
>              |
>      edu#c.x |
>           1  |    .079967   .1907966     0.42   0.675    -.2939875    .4539215
>           3  |  -.1672766    .241174    -0.69   0.488    -.6399691    .3054158
>              |
>        _cons |   .4591956   .0927699     4.95   0.000     .2773699    .6410213
> 
> However, my problems begin when I try to estimate margins comparing
> marginal effects of edu(3) wrt edu(1) at different levels of x
> 
> margins, dydx(3.edu) at(x=1)
> 
> which Stata reports as not estimable. (Btw, the margin at
> the mean IS estimable.) Exploring the matrix H of estimability
> 
> mat H = get(H)
> mat l H
> 
> I indeed find that not all of its entries are -1, 0, or 1 (some are
> positive or negative fractions between these values).
> 
> I read in another post
> (http://www.stata.com/statalist/archive/2011-07/msg00514.html) that
> sometimes it is ok to ask Stata not to perform the estimability check
> as in
> 
> margins, dydx(3.edu) at(x=1) noestimcheck
> 
> Average marginal effects                          Number of obs   =       2153
> Model VCE    : OIM
> 
> Expression   : Pr(y), predict()
> dy/dx w.r.t. : 3.edu
> at           : x               =           1
> 
> ------------------------------------------------------------------------------
>              |            Delta-method
>              |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>        3.edu |  -.0378778   .1084452    -0.35   0.727    -.2504265    .1746709
> ------------------------------------------------------------------------------
> 
> 
> But I don't know if that same advice can be applied in my case here.
> 
> Any advice on whether it is safe to estimate the effects using
> the noestimcheck option would be greatly appreciated.

I think the problem here is that Robert's 'x' variable is perfectly collinear
with 'i.region'.  This particular kind of collinearity is producing a very
unstable H matrix, hence the entries other than -1, 0, and 1.
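
To see the collinearity in isolation, here is a minimal sketch on simulated
data.  Everything in it is an assumption made for illustration (5 regions, 100
observations each, an arbitrary region-level 'x', a random outcome); it is not
Robert's data.  Because 'x' takes a single value within each region, it is an
exact linear combination of the region indicators.

```stata
* Hypothetical simulated data: 5 regions, x constant within each region
clear
set seed 12345
set obs 500
generate region = ceil(_n/100)       // regions 1,...,5
generate x = region^2/10             // any region-level function would do
generate y = runiform() < 0.5        // arbitrary binary outcome

* The region indicators reproduce x exactly, so the fit is perfect
regress x i.region

* With both in the model, Stata should flag a term as omitted
* because of collinearity
probit y i.region c.x
```

The same reasoning explains why the three edu#c.x interactions cannot all be
kept: they sum to 'x', which the region indicators already span.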

I believe a reasonable approximation to Robert's model is

        . probit y region##edu

This model is a standard two-way fully factorial specification, and should only
yield non-estimable margins when there are empty cells.

In Stata 12, Robert can use -contrast- to test for an overall interaction
effect between these two variables via

        . contrast edu#region

If this test is estimable, and it should be if Robert's data does not have any
empty cells, then I would propose that Robert's model specification is
reasonable. In that case, I believe Robert is justified in using the
-noestimcheck- option with his original specification.
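
The steps above could be sketched as follows.  The variable names (y, region,
edu, x) follow Robert's original post, but the data are simulated and purely
hypothetical: the number of regions, the form of 'x', and the outcome equation
are all assumptions, so the numbers it produces mean nothing.  It requires
Stata 12 or later for -contrast-.

```stata
* Hypothetical simulated data standing in for Robert's
clear
set seed 2012
set obs 2000
generate region = ceil(5*runiform())     // 5 regions (assumed)
generate edu    = ceil(3*runiform())     // 3 education levels
generate x      = region/5               // region-level regressor (assumed form)
generate y      = rnormal() + 0.2*edu + 0.3*x*(edu==1) > 0

* 1. Look for empty region-by-education cells
tabulate region edu

* 2. Fit the fully factorial model and test the overall interaction
probit y i.region##i.edu
contrast region#edu

* 3. If that test is estimable, return to the original specification
*    and skip the estimability check
probit y i.region ib2.edu##c.x
margins, dydx(3.edu) at(x=1) noestimcheck
```

The -tabulate- step is only a convenience; a zero count in any cell is what
would make the factorial model's margins, and the -contrast- test,
non-estimable.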

--Jeff
[email protected]

