Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | FU Youyan <s1150901@sms.ed.ac.uk> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | st: Multicollinearity Problem in Stata |
Date | Mon, 29 Jul 2013 17:10:34 +0100 |
Dear Statalist users,

I am encountering a strange multicollinearity problem when I run a regression in Stata. The problem is illustrated below. I would very much appreciate it if any of you could answer my question.

*****************************************************************************************************

note: r_ew omitted because of collinearity

Linear regression                                      Number of obs =     159
                                                       F(  3,   155) =   73.74
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4900
                                                       Root MSE      =  .88944

------------------------------------------------------------------------------
             |               Robust
       n2_ln |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        r_ow |  -6.150886   1.861984    -3.30   0.001    -9.829026   -2.472746
        r_ew |          0  (omitted)
        lnnc |   .1853104   .0502188     3.69   0.000     .0861089    .2845119
       n1_ln |   .2328174   .0912362     2.55   0.012     .0525905    .4130443
       _cons |   1.945399   .5489629     3.54   0.001     .8609843    3.029813
------------------------------------------------------------------------------

In the regression table above, r_ew is omitted because r_ow and r_ew are perfectly negatively correlated (the correlation table is shown below). The two variables satisfy r_ow + r_ew = 0.2407656, so there is perfect collinearity once a constant is in the model.

             |    n2_ln     r_ow     r_ew     lnnc    n1_ln
-------------+---------------------------------------------
       n2_ln |   1.0000
        r_ow |  -0.6565   1.0000
        r_ew |   0.6565  -1.0000   1.0000
        lnnc |   0.4587  -0.4285   0.4285   1.0000
       n1_ln |   0.6419  -0.8468   0.8468   0.4103   1.0000

However, r_ew is not omitted when I run exactly the same regression without an intercept.

Linear regression                                      Number of obs =     159
                                                       F(  4,   155) =  441.13
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.8909
                                                       Root MSE      =  .88944

------------------------------------------------------------------------------
             |               Robust
       n2_ln |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        r_ow |   1.929168   .8763971     2.20   0.029     .1979442    3.660391
        r_ew |   8.080053   2.280073     3.54   0.001     3.576027    12.58408
        lnnc |   .1853104   .0502188     3.69   0.000     .0861089    .2845119
       n1_ln |   .2328174   .0912363     2.55   0.012     .0525905    .4130443
------------------------------------------------------------------------------

My questions are: why does Stata not omit r_ew when the intercept term is excluded? And is the regression result without the intercept valid?

Thanks for your help.

Youyan
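
One way to see how the two outputs relate is a small arithmetic check. The sketch below is in Python rather than Stata and uses only the coefficients reported in the tables above. It assumes the identity r_ow + r_ew = 0.2407656 holds exactly for every observation; the variable names are illustrative. Under that assumption the constant equals (r_ow + r_ew)/0.2407656, so the intercept of the first model can be folded into the r_ow and r_ew coefficients of the second. With a constant in the model, r_ew is an exact linear combination of the constant and r_ow; without one, r_ow and r_ew are not proportional to each other, so nothing needs to be dropped.

```python
# Coefficients copied from the two regression tables in the post.
b_ow_with_cons = -6.150886  # r_ow, model with intercept
cons = 1.945399             # _cons, model with intercept
b_ow_nocons = 1.929168      # r_ow, model without intercept
b_ew_nocons = 8.080053      # r_ew, model without intercept

# Assumption: r_ow + r_ew equals this constant in every observation.
c = 0.2407656

# Because 1 = (r_ow + r_ew) / c, the intercept term cons * 1 can be
# rewritten as (cons / c) * r_ow + (cons / c) * r_ew.
implied_b_ew = cons / c
implied_b_ow = b_ow_with_cons + cons / c

print("implied r_ew coefficient:", round(implied_b_ew, 6))
print("reported r_ew coefficient:", b_ew_nocons)
print("implied r_ow coefficient:", round(implied_b_ow, 6))
print("reported r_ow coefficient:", b_ow_nocons)
```

If the implied and reported values agree up to rounding, the two regressions appear to be the same fit written in two parameterizations, which would also be consistent with the identical lnnc and n1_ln coefficients and the identical Root MSE in both tables.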