Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Multicollinearity Problem in Stata
From
FU Youyan <[email protected]>
To
"[email protected]" <[email protected]>
Subject
RE: st: Multicollinearity Problem in Stata
Date
Tue, 30 Jul 2013 15:40:56 +0100
I have double-checked my data, and I am sure that r_ew+r_ow equals a constant (0.2407656). The coefficients of lnnc and n1_ln are indeed the same in the two regressions, but the coefficient of r_ow does change. We can also see that the t-value and p-value of the constant in the first regression are exactly the same as the t-value and p-value of r_ew in the second regression, which is consistent with your explanation about the omitted variable. I am therefore still confused about the change in the coefficient of r_ow, and I wonder which result is more reliable.
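
The two tables can be reconciled by hand, which may help with the question of reliability. A minimal sketch, assuming r_ew = c - r_ow exactly, with c = 0.2407656 as stated above: substituting into the no-constant model gives b_ow*r_ow + b_ew*(c - r_ow) = c*b_ew + (b_ow - b_ew)*r_ow. So the constant in the first regression should equal c times the r_ew coefficient in the second, and the r_ow coefficient in the first should equal the difference of the two return coefficients in the second. The variable names in the snippet are made up for illustration; the numbers are copied from the regression tables quoted further down in this thread.

```python
# Numbers copied from the two regression tables quoted in this thread.
c = 0.2407656              # r_ow + r_ew, as reported by the poster

# Regression WITHOUT a constant (both returns included)
b_ow_nocons = 1.929168
b_ew_nocons = 8.080053
se_ew_nocons = 2.280073

# Regression WITH a constant (r_ew omitted by Stata)
b_ow_cons = -6.150886
b_cons = 1.945399
se_cons = 0.5489629

# Implied by substituting r_ew = c - r_ow into the no-constant model:
implied_cons = c * b_ew_nocons              # should match _cons
implied_b_ow = b_ow_nocons - b_ew_nocons    # should match r_ow (with constant)
implied_se_cons = c * se_ew_nocons          # rescaling by c leaves the t-value unchanged

print("constant:    implied", implied_cons, "reported", b_cons)
print("r_ow coef:   implied", implied_b_ow, "reported", b_ow_cons)
print("constant SE: implied", implied_se_cons, "reported", se_cons)
```

If the assumption holds, the implied and reported values agree up to rounding, which suggests the two regressions are the same fit written in two ways (both tables also report Root MSE = .88944). On that reading neither is "more reliable": the r_ow coefficient with a constant measures the effect of r_ow net of r_ew, because raising r_ow by one unit lowers r_ew by one unit, while the two coefficients without a constant are the separate weights on each return.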
________________________________________
From: [email protected] [[email protected]] On Behalf Of Yuval Arbel [[email protected]]
Sent: 30 July 2013 14:04
To: statalist
Subject: Re: st: Multicollinearity Problem in Stata
See the example below on wage and gender. The fact that these
variables are continuous rather than dummies is irrelevant here. If
indeed r_ew+r_ow equals a constant - the coefficients should be the
same in both regressions.
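
The wage and gender example quoted further down can be checked numerically. This is only a sketch: the six wages are made-up numbers chosen so that the group means are 1200 (male) and 1000 (female), and `ols_two_columns` is a hypothetical helper that solves the two-regressor normal equations directly, not a Stata or library routine.

```python
def ols_two_columns(x1, x2, y):
    """Least-squares coefficients for y = b1*x1 + b2*x2 (no extra intercept)."""
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    s1y = sum(a * v for a, v in zip(x1, y))
    s2y = sum(b * v for b, v in zip(x2, y))
    det = s11 * s22 - s12 * s12          # zero if x1 and x2 are collinear
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

# Made-up data: three men (mean wage 1200) and three women (mean wage 1000).
male = [1, 1, 1, 0, 0, 0]
female = [1 - m for m in male]           # male + female = 1 for every row
wage = [1100, 1200, 1300, 900, 1000, 1100]
ones = [1] * len(wage)                   # the constant term as a regressor

# No constant, both dummies: coefficients are the two group means.
no_constant = ols_two_columns(male, female, wage)

# Constant plus the female dummy: intercept is the male mean,
# and the female coefficient is the female-minus-male difference.
with_constant = ols_two_columns(ones, female, wage)

print("no constant   (male, female):", no_constant)
print("with constant (_cons, female):", with_constant)
```

The female coefficient differs between the two runs even though the fitted wages are identical, because in the second run it is measured relative to the omitted group.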
On Mon, Jul 29, 2013 at 1:14 PM, FU Youyan <[email protected]> wrote:
> Dear Yuval,
>
> Thank you very much for this answer; it is quite helpful. I have a follow-up question:
> r_ew and r_ow are two types of investment return in my research (they are continuous variables rather than dummies). What I want to test is the impact of these two returns on investors' future behavior; in other words, I want to know how investors weight the two types of return. Therefore, I have to include both returns in my regression. In the regression with a constant but omitting r_ew, the coefficient of r_ow is significantly negative (t-value=-3.30). However, in the regression without a constant but including r_ew, the coefficient of r_ow is significantly positive (t-value=2.20). So I would like to know which result is more reliable.
>
> Best wishes,
> Youyan
> ________________________________________
> From: [email protected] [[email protected]] On Behalf Of Yuval Arbel [[email protected]]
> Sent: 29 July 2013 17:58
> To: statalist
> Subject: Re: st: Multicollinearity Problem in Stata
>
> Dear FU,
>
> This outcome is not strange at all. I believe what you encountered is
> known in econometrics as "the dummy variable trap":
>
> I believe that r_ew+r_ow=constant. Consequently - when you run the
> model with a constant - you get perfect collinearity with the
> constant term. But when you omit the constant - the problem is solved.
>
> In fact you can make use of these two specifications. Consider the
> following exercise. Let's say that w is the wage, male=0 for female and
> 1 for male, and female=1 for female and 0 for male. If the average
> wage is 1200 for males and 1000 for females - and you run the model
> without the constant, you will get:
>
> w(hat)=1200*male+1000*female
>
> But if you omit male and use constant (in order to avoid the dummy
> variable trap), you get
>
> w(hat)=1200-200*female
>
> The second specification is more common because it permits you to test
> whether wage differences across gender are significant
>
> On Mon, Jul 29, 2013 at 9:10 AM, FU Youyan <[email protected]> wrote:
>> Dear Statalist users,
>>
>> I am encountering a strange multicollinearity problem when I run a regression in Stata. The problem is illustrated below. I would very much appreciate it if any of you could answer my question.
>>
>>
>> *****************************************************************************************************
>> note: r_ew omitted because of collinearity
>>
>> Linear regression                                      Number of obs =     159
>>                                                        F(  3,   155) =   73.74
>>                                                        Prob > F      =  0.0000
>>                                                        R-squared     =  0.4900
>>                                                        Root MSE      =  .88944
>>
>> ------------------------------------------------------------------------------
>>              |               Robust
>>        n2_ln |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>         r_ow |  -6.150886   1.861984    -3.30   0.001    -9.829026   -2.472746
>>         r_ew |          0  (omitted)
>>         lnnc |   .1853104   .0502188     3.69   0.000     .0861089    .2845119
>>        n1_ln |   .2328174   .0912362     2.55   0.012     .0525905    .4130443
>>        _cons |   1.945399   .5489629     3.54   0.001     .8609843    3.029813
>> ------------------------------------------------------------------------------
>>
>> In the above regression table, r_ew is omitted because of the perfect negative correlation between r_ow and r_ew.
>>
>> (The correlation table is shown below.) The relationship between these two variables is r_ow+r_ew=0.2407656, so there is perfect collinearity.
>>
>>
>>              |    n2_ln     r_ow     r_ew     lnnc    n1_ln
>> -------------+---------------------------------------------
>>        n2_ln |   1.0000
>>         r_ow |  -0.6565   1.0000
>>         r_ew |   0.6565  -1.0000   1.0000
>>         lnnc |   0.4587  -0.4285   0.4285   1.0000
>>        n1_ln |   0.6419  -0.8468   0.8468   0.4103   1.0000
>>
>> However, r_ew is not omitted when I run exactly the same regression without an intercept.
>>
>>
>> Linear regression                                      Number of obs =     159
>>                                                        F(  4,   155) =  441.13
>>                                                        Prob > F      =  0.0000
>>                                                        R-squared     =  0.8909
>>                                                        Root MSE      =  .88944
>>
>> ------------------------------------------------------------------------------
>>              |               Robust
>>        n2_ln |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>         r_ow |   1.929168   .8763971     2.20   0.029     .1979442    3.660391
>>         r_ew |   8.080053   2.280073     3.54   0.001     3.576027    12.58408
>>         lnnc |   .1853104   .0502188     3.69   0.000     .0861089    .2845119
>>        n1_ln |   .2328174   .0912363     2.55   0.012     .0525905    .4130443
>> ------------------------------------------------------------------------------
>>
>> My question is: why does Stata not omit r_ew when the intercept term is excluded? And is the regression result without an intercept valid?
>>
>>
>> Thanks for your help.
>> Youyan
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
>
>
>
> --
> Dr. Yuval Arbel
> School of Business
> Carmel Academic Center
> 4 Shaar Palmer Street,
> Haifa 33031, Israel
> e-mail1: [email protected]
> e-mail2: [email protected]
> You can access my latest paper on SSRN at: http://ssrn.com/abstract=2263398
> You can access previous papers on SSRN at: http://ssrn.com/author=1313670