Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Adjusted R-squared comparison
From
Nick Cox <[email protected]>
To
[email protected]
Subject
Re: st: Adjusted R-squared comparison
Date
Wed, 6 Feb 2013 14:42:15 +0000
I am not providing either blessing or curse here. I am just saying
that you need to think about the implications of what you are doing.
You need to explain to your readership that your -bootstrap-ping is
perfectly consistent with what your -xtreg- call implies and that all
depends on what assumptions your model makes. There is panel structure
there, and perhaps some kind of dependence structure. (You will agree
that neither I nor anybody else can tell what lurks underneath your
multiple dots.)
-bootstrap- is just a robot. It does not look at your modelling
command and make independent decisions about precisely what kind of
bootstrapping makes sense for your model, and still less will it warn
if bootstrapping is not appropriate for your model.
Nick
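For panel data that are also time series, one option is to resample whole panels rather than single observations, so that within-panel dependence survives the resampling. A minimal sketch of a single replicate, using the Grunfeld data shipped with Stata rather than the poster's (undisclosed) model:

```stata
* Resample entire firms with -bsample-; -idcluster()- gives each
* resampled copy of a firm a fresh panel id so -xtset- still sees
* unique panels.
webuse grunfeld, clear
preserve
bsample, cluster(company) idcluster(newco)
xtset newco year
xtreg invest mvalue kstock, fe
display e(r2_w)    // within r-squared for this one replicate
restore
```

Wrapping the resampling and estimation in a small program and looping over replicates would yield a bootstrap distribution of e(r2_w); whether that distribution means anything still depends on the dependence assumptions Nick raises above.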
On Wed, Feb 6, 2013 at 1:20 PM, Panagiotis Manganaris
<[email protected]> wrote:
> Just to be sure John, you mean that the bootstrap st.err. is the standard
> deviation?
>
> And Nick, do you say that if I use the following command:
> bootstrap e(r2), seed(123) reps(50) : xtreg .......
> I won't have reliable results?
On Wed, Feb 6, 2013 at 12:35 PM, Nick Cox wrote:
>> There is an extra dimension here. John's bootstrap example is a nice
>> simple example of a model applied to non-panel, non-time-series data.
>> -bootstrap-ping panel data that are also time series is trickier, to
>> say the least.
>>
>> Nick
>
> On Wed, Feb 6, 2013 at 12:30 PM, John Antonakis <[email protected]>
> wrote:
>>
>> Hi Panagiotis:
>>
>> In fact, the standard error you get is the SD of the bootstrap replicates.
>>
>> Specifically:
>>
>> sysuse auto
>> bootstrap e(r2), seed(123) reps(50) : reg price mpg weight
>>
>> gives:
>>
>>
>> Bootstrap replications (50)
>> ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
>> .................................................. 50
>>
>> Linear regression Number of obs = 74
>> Replications = 50
>>
>> command: regress price mpg weight
>> _bs_1: e(r2)
>>
>>
>> ------------------------------------------------------------------------------
>>              |  Observed   Bootstrap                   Normal-based
>>              |     Coef.   Std. Err.      z    P>|z|   [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>        _bs_1 |  .2933891    .074451    3.94   0.000    .1474678    .4393104
>> ------------------------------------------------------------------------------
>>
>>
>> Here .2933891 is the r-squared computed from the original sample (the
>> "Observed Coef." column), and .074451 is the SD of the 50 bootstrapped
>> r-squares, reported as the bootstrap standard error.
>>
>> If you wish to check this, save the bootstrap estimates (using the
>> -saving()- option) and compute their mean and SD manually.
>>
>> So, with these two values from each of your two samples, I guess you
>> could do a t-test for the difference, if that is what you are looking
>> for.
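A minimal sketch of that check (the filename bs_r2 is illustrative):

```stata
sysuse auto, clear
bootstrap e(r2), seed(123) reps(50) saving(bs_r2, replace) : ///
    reg price mpg weight
use bs_r2, clear
summarize _bs_1    // SD here matches the reported bootstrap std. err.
```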
>>
>> Let's see what others might say.
>>
>>
>> Best,
>> J.
>>
>>
>> __________________________________________
>>
>> John Antonakis
>> Professor of Organizational Behavior
>> Director, Ph.D. Program in Management
>>
>> Faculty of Business and Economics
>> University of Lausanne
>> Internef #618
>> CH-1015 Lausanne-Dorigny
>> Switzerland
>> Tel ++41 (0)21 692-3438
>> Fax ++41 (0)21 692-3305
>> http://www.hec.unil.ch/people/jantonakis
>>
>> Associate Editor
>> The Leadership Quarterly
>> __________________________________________
>>
>> On 06.02.2013 12:57, Panagiotis Manganaris wrote:
>>>
>>> Unfortunately, Nick and John, I must use adjusted r-squared because it
>>> represents a specific metric in the field of accounting. More
>>> specifically, I use a model where returns are the dependent variable
>>> and earnings, along with the change in earnings, are the independent
>>> variables. In this model the adjusted r-squared represents the value
>>> relevance of the earnings (this is what I am trying to gauge).
>>> Therefore, I am obliged to use r2.
>>> Thank you for the procedure you mention, John, but I had already tried
>>> it in the past. It is helpful, but only in a vague way. It does not
>>> provide the mean and the variance of r2, which I could use to test for
>>> significance. For instance, the confidence intervals almost always
>>> overlap when I use this method, and that is not concrete evidence of
>>> statistical significance or non-significance. If I cannot show whether
>>> there is a statistically significant difference, I cannot show whether
>>> my metric (value relevance) has changed between the two periods.
>>>
>>>
>>>
>>> 2013/2/6 John Antonakis <[email protected]>
>>> I can't agree more with you, Nick. We should care more about having
>>> consistent estimators than high r-squares (i.e., Panagiotis, what I
>>> mean here is that we can still estimate the slope consistently even if
>>> we don't have a tight-fitting regression line). So, I don't know why
>>> you are interested in this comparison, Panagiotis. I would think you
>>> would be more interested in comparing estimates, as in a Chow test
>>> (Chow, G. C. (1960). Tests of equality between sets of coefficients in
>>> two linear regressions. Econometrica, 28(3), 591-605). If you are
>>> using fixed-effects models, you can model the fixed effects with
>>> dummies and then do a Chow test via -suest-; see -help suest-.
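A minimal sketch of that suggestion, with hypothetical variable names (y, x1, x2, a panel identifier firm, and a period indicator period; none of these come from the thread):

```stata
* Fixed effects as dummies (i.firm), the model fit separately per
* period, then a Chow-type test of coefficient equality via -suest-.
reg y x1 x2 i.firm if period == 1
estimates store m1
reg y x1 x2 i.firm if period == 2
estimates store m2
suest m1 m2
test [m1_mean]x1 = [m2_mean]x1    // equality of x1 across the two periods
```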
>>>
>>>
>>> Best,
>>> J.
>>>
>>> __________________________________________
>>>
>>> John Antonakis
>>> Professor of Organizational Behavior
>>> Director, Ph.D. Program in Management
>>>
>>> Faculty of Business and Economics
>>> University of Lausanne
>>> Internef #618
>>> CH-1015 Lausanne-Dorigny
>>> Switzerland
>>> Tel ++41 (0)21 692-3438
>>> Fax ++41 (0)21 692-3305
>>> http://www.hec.unil.ch/people/jantonakis
>>>
>>> Associate Editor
>>> The Leadership Quarterly
>>> __________________________________________
>>>
>>> On 06.02.2013 11:40, Nick Cox wrote:
>>> That's positive advice.
>>>
>>> My own other idea is that adjusted R-squares are a lousy basis to
>>> compare two models, even of the same kind. They leave out too much
>>> information.
>>>
>>> Nick
>>>
>>> On Wed, Feb 6, 2013 at 10:37 AM, John Antonakis <[email protected]>
>>> wrote:
>>> I think that the only thing you can do is to bootstrap the r-squares
>>> and see if their confidence intervals overlap.
>>>
>>> To bootstrap, you just do, e.g.:
>>>
>>> sysuse auto
>>> bootstrap e(r2), seed(123) reps(1000) : reg price mpg weight
>>>
>>> With -xtreg-, you will be interested in one of:
>>>
>>> e(r2_w)   R-squared, within model
>>> e(r2_o)   R-squared, overall model
>>> e(r2_b)   R-squared, between model
>>>
>>> See -help xtreg- with respect to saved results.
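A minimal sketch of where those saved results live, using the Grunfeld example data (variable names are from that dataset, not the poster's model):

```stata
webuse grunfeld, clear
xtset company year
quietly xtreg invest mvalue kstock, fe
ereturn list    // lists e(r2_w), e(r2_b), and e(r2_o) among the results
```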
>>>
>>> Let's see if others have other ideas.
>>> On 06.02.2013 10:22, Panagiotis Manganaris wrote:
>>>
>>> I need to compare two adjusted r-squareds from the same model over two
>>> different periods of time (each spanning several years). So far, I
>>> have split my data into two groups: observations from 1998-2004 and
>>> observations from 2005-2011. Then I ran -xtreg- on the same model for
>>> each group. I have derived their adjusted r-squareds, and I want to
>>> know whether those two adjusted r-squareds are significantly different
>>> from each other.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/