st: RE: Beta coefficients are not equal to coefficients on standardized variables?
From: Kieran McCaul <[email protected]>
To: "[email protected]" <[email protected]>
Subject: st: RE: Beta coefficients are not equal to coefficients on standardized variables?
Date: Sat, 16 Jun 2012 14:03:24 +0800
...
Don't standardize the dependent variable.
clear *
sysuse auto
regress weight length turn displacement, beta
egen length_std = std(length)
egen turn_std = std(turn)
egen displacement_std = std(displacement)
regress weight length_std turn_std displacement_std , beta
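For reference, the Beta column can also be reproduced by hand from the raw coefficients, since beta_j = b_j*sd(x_j)/sd(y) computed over the estimation sample. A minimal sketch (the -e(sample)- restriction and the loop are only illustrative, not part of the commands above):

sysuse auto, clear
quietly regress weight length turn displacement
quietly summarize weight if e(sample)
local sdy = r(sd)
foreach v in length turn displacement {
    quietly summarize `v' if e(sample)
    display "beta for `v' = " _b[`v']*r(sd)/`sdy'
}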
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Roberto Liebscher
Sent: Friday, 15 June 2012 11:23 PM
To: [email protected]
Subject: st: Beta coefficients are not equal to coefficients on standardized variables?
There is one thing that puzzles me about the - beta - option in
regression commands. In a simple example using the lifeexp dataset, I
first used the built-in - beta - option:
. sysuse lifeexp
. regress lexp gnppc popgrowth, beta
      Source |       SS       df       MS              Number of obs =      63
-------------+------------------------------           F(  2,    60) =   36.20
       Model |  777.530873     2  388.765436           Prob > F      =  0.0000
    Residual |  644.405635    60  10.7400939           R-squared     =  0.5468
-------------+------------------------------           Adj R-squared =  0.5317
       Total |  1421.93651    62  22.9344598           Root MSE      =  3.2772

------------------------------------------------------------------------------
        lexp |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
       gnppc |    .000293   .0000419     6.99   0.000                 .6506803
   popgrowth |  -.9833919    .485387    -2.03   0.047                -.1885781
       _cons |   70.67366   .8071596    87.56   0.000                        .
------------------------------------------------------------------------------
Then I standardized the variables by hand and re-ran the regression with
the new variables:
. egen popgrowth_std = std(popgrowth)
. egen lexp_std = std(lexp)
. egen gnppc_std = std(gnppc)
(5 missing values generated)
. regress lexp_std gnppc_std popgrowth_std

      Source |       SS       df       MS              Number of obs =      63
-------------+------------------------------           F(  2,    60) =   36.20
       Model |  34.9700449     2  17.4850225           Prob > F      =  0.0000
    Residual |  28.9826364    60  .483043939           R-squared     =  0.5468
-------------+------------------------------           Adj R-squared =  0.5317
       Total |  63.9526813    62  1.03149486           Root MSE      =  .69501

------------------------------------------------------------------------------
    lexp_std |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   gnppc_std |   .6608475   .0945336     6.99   0.000     .4717521    .8499428
popgrowth_~d |  -.1942026   .0958554    -2.03   0.047    -.3859419   -.0024633
       _cons |  -.0042032   .0875655    -0.05   0.962    -.1793602    .1709538
------------------------------------------------------------------------------
Now the coefficients are slightly different. For example, the coefficient
on gnppc_std is 0.6608475, whereas it was 0.6506803 in the first
calculation.
Is this caused by rounding errors, or is there another explanation?
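One way to check this (a sketch, not part of the original question): -egen, std()- standardizes over all non-missing observations, while -regress- drops the 5 observations with missing gnppc, so the standard deviations used in the two calculations can differ. Assuming that is the source of the discrepancy, restricting the standardization to the estimation sample should reproduce the Beta column:

sysuse lifeexp, clear
quietly regress lexp gnppc popgrowth
foreach v in lexp gnppc popgrowth {
    quietly summarize `v' if e(sample)
    generate `v'_s = (`v' - r(mean)) / r(sd)
}
regress lexp_s gnppc_s popgrowth_s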
Thanks in advance.
Roberto
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*