I have to disagree with Martin here. I'm assuming you used standardised
variables in the regression. Standardised variables can be a bit
tricky. Here's a little simulation I did.
. clear
* a Q&D simulation
. set seed 123456
* suppose we want to examine the effect of sex and eating carrots on a
* particular outcome
. set obs 1000
. gen sex=(_n<=500) // sex is pretty evenly distributed
. gen osex=sex // keep the original sex because we're going to standardise it
. summ sex
. replace sex=(sex-r(mean))/r(sd)
. bysort osex: gen carrot=(_n<=50) // eating carrots is relatively rare
. gen ocarrot=carrot
. summ carrot
. replace carrot=(carrot-r(mean))/r(sd)
* let's suppose that after standardisation, sex and carrots have
* exactly the same effect
. gen y=2+1*sex+1*carrot+5*(runiform()-0.5)
. regress y sex carrot
. predict yhat, xb
. list osex sex ocarrot carrot yhat if inlist(_n,500,1,1000,501)
With this code, I get the following
      Source |       SS       df       MS              Number of obs =    1000
-------------+------------------------------           F(  2,   997) =  539.52
       Model |  2216.14901     2  1108.07451           Prob > F      =  0.0000
    Residual |  2047.66704   997  2.05382853           R-squared     =  0.5198
-------------+------------------------------           Adj R-squared =  0.5188
       Total |  4263.81606   999  4.26808414           Root MSE      =  1.4331
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sex |    1.01654   .0453419    22.42   0.000     .9275632    1.105516
      carrot |   1.088584   .0453419    24.01   0.000     .9996074     1.17756
       _cons |   1.982797   .0453192    43.75   0.000     1.893865    2.071729
------------------------------------------------------------------------------
      +---------------------------------------------------+
      | osex         sex   ocarrot      carrot       yhat |
      |---------------------------------------------------|
   1. |    0   -.9994999         1      2.9985   4.230884 |
 500. |    0   -.9994999         0   -.3331666   .6040861 |
 501. |    1    .9994999         1      2.9985   6.262946 |
1000. |    1    .9994999         0   -.3331666   2.636148 |
      +---------------------------------------------------+
The regression coefficients and t-values are pretty similar, but if
you compare a change from 0 to 1 in the original variables, the
effects are very different. Comparing rows 1 and 500 (a change in
carrots), we see a change of around 3.6. Comparing rows 1 and 501 (a
change in sex), we see a difference of around 2. A move from 0 to 1 is
about 3.3 SD for carrots but only about 2 SD for sex.
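To check those two numbers against the coefficients rather than the
listed predictions, something like the following should work if run
straight after the first regress. This is only a rough sketch: it
assumes _b[] still holds that regression's coefficients and uses r(sd)
left behind by summarize; the display labels are just mine.
. quietly summ ocarrot
. display "carrot, 0 to 1: " _b[carrot]/r(sd)
. quietly summ osex
. display "sex, 0 to 1: " _b[sex]/r(sd)
Dividing a standardised coefficient by the SD of the original 0/1
variable gives the effect of going from 0 to 1, so this should come
out at roughly 3.6 for carrots and 2 for sex.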
If we then replace 50 with 250 in
. bysort osex: gen carrot=(_n<=50) // eating carrots is relatively rare
we get
      Source |       SS       df       MS              Number of obs =    1000
-------------+------------------------------           F(  2,   997) =  475.93
       Model |  1961.23173     2  980.615863           Prob > F      =  0.0000
    Residual |  2054.23188   997  2.06041312           R-squared     =  0.4884
-------------+------------------------------           Adj R-squared =  0.4874
       Total |  4015.46361   999  4.01948309           Root MSE      =  1.4354
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         sex |    1.01654   .0454145    22.38   0.000     .9274206    1.105659
      carrot |   .9642833   .0454145    21.23   0.000     .8751643    1.053402
       _cons |   1.982797   .0453918    43.68   0.000     1.893723    2.071871
------------------------------------------------------------------------------
      +---------------------------------------------------+
      | osex         sex   ocarrot      carrot       yhat |
      |---------------------------------------------------|
   1. |    0   -.9994999         1    .9994999   1.930567 |
 500. |    0   -.9994999         0   -.9994999   .0029649 |
 501. |    1    .9994999         1    .9994999   3.962629 |
1000. |    1    .9994999         0   -.9994999   2.035027 |
      +---------------------------------------------------+
Again, the regression coefficients are pretty much what we would
expect, but now a change from 0 to 1 in either variable (about 2 SD in
both cases) leads to a change of around 2. Distribution is important.
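The reason, as far as I can see, is that the SD of a 0/1 variable
depends only on the share of ones: it is roughly sqrt(p*(1-p)). A
quick check with plain arithmetic (nothing from the data above is
needed, and the sample SD that summarize reports will differ slightly
because it divides by n-1):
. display sqrt(.1*.9)
. display sqrt(.5*.5)
With 10% carrot eaters the SD is about 0.3, so going from 0 to 1 is
more than 3 SD; with a 50/50 split the SD is 0.5 and the same move is
only 2 SD.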
Cheers
Joseph
On Thu, Jul 30, 2009 at 4:44 AM, Martin Weiss<[email protected]> wrote:
>
> What you are describing could mean either of two things: The underlying
> economic theory is wrong and should be replaced by one supported by the
> data. Or you are unlucky and have picked a very special dataset that is not
> representative of the population. You have to make this pick yourself, I am
> afraid...
>
> HTH
> Martin
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Erasmo Giambona
> Sent: Wednesday, 29 July 2009 10:09
> To: statalist
> Subject: st: How to Reconcile R2 with Economic Significance
>
> Dear Statalist,
>
> I am trying to understand how to reconcile statistical and economic
> significance.
>
> Consider a simple model: y = a + b1x1 + b2x2 + e, fitted for panel data
> and estimated via OLS. Suppose the t-values are respectively 10 and 2
> for x1 and x2, implying that x1 contributes more to the R2 for the
> model. Suppose also that a 1 standard deviation increase in x1 causes y
> to increase by 2% from its mean while a 1 standard deviation increase
> in x2 causes y to increase by 25% from its mean. Now, a simple
> interpretation of a model R2 is that it is the proportion of the
> variability of y that is accounted for by the model. Accordingly,
> because of its t-value (and its effect on the R2), x1 would seem to be
> one of the key drivers of this variability in y. However, from an
> economic point of view, x1 seems to have a very marginal ability in
> explaining this variation in y (while x2 seems to be very important).
>
> Statistical and economic significance would seem to lead to seemingly
> "contradicting" results. Can someone provide some suggestions that
> could help me reconcile statistical and economic significance?
>
> Thanks,
>
> Erasmo
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/