Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: RE: RE: Using ivhettest to test for heterogeneity
From
[email protected]
To
[email protected]
Subject
RE: st: RE: RE: Using ivhettest to test for heterogeneity
Date
Thu, 1 Mar 2012 22:36:58 +0100
Sorry, I did not mean to puzzle anybody by giving priority to theoretical arguments. I think you are completely right that, from a statistical viewpoint, I would have done better to omit some variables. I relied on standard heteroskedasticity tests in Stata because I did not anticipate that the problem might be serious. It would certainly be a good idea to plot residuals vs fitted to detect outliers in the data. Would you also recommend looking at specific measures like Cook's distance to decide which predictors are likely to cause heteroskedasticity and might therefore be omitted?
Andreas
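The residual checks discussed in this message could be run roughly as follows. This is only a sketch, assuming the OLS model quoted later in the thread; the new variable name cooksd and the 4/n cutoff (a common rule of thumb) are illustrative choices, not something taken from the thread. Note that Cook's distance flags influential observations rather than predictors.

```stata
* Refit the OLS model quoted later in this thread
regress COC DRANK D_LBAGE LBAGE LNMV LNBM LNLEV CapInt ROA AssTr LTG LVOL MAFE

* Residuals versus fitted values, with a reference line at zero
rvfplot, yline(0)

* Cook's distance for each observation (variable name is hypothetical)
predict cooksd, cooksd

* List observations above the rule-of-thumb cutoff 4/n
list COC cooksd if cooksd > 4/e(N) & !missing(cooksd)
```

A fan shape in the residual plot would point to heteroskedasticity, while isolated points with large Cook's distance would point to influential observations.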
[email protected] wrote: -----
To: "'[email protected]'" <[email protected]>
From: Nick Cox
Sent by: [email protected]
Date: 03/01/2012 04:37PM
Subject: RE: st: RE: RE: Using ivhettest to test for heterogeneity
It's your problem, entirely and obviously, but I find it puzzling that like many people you seem more concerned with secondary assumptions about errors than with simplifying the model to omit predictors.
Have you plotted residuals vs fitted?
Nick
[email protected]
[email protected]
This is a firm-specific regression of the cost of capital (COC) on different firm characteristics,
including the potentially endogenous fractional rank of disclosure quality (DRANK) and its linear
interaction with the exogenous firm age (LBAGE), D_LBAGE. LNMV, LNBM, LNLEV, CapInt, ROA, AssTr, LTG,
LVOL and MAFE are different firm characteristics proxying for size, leverage, capital intensity,
profitability, growth opportunities, stock volatility and estimation risk. There are indeed important
theoretical reasons for keeping those variables in the regression, otherwise I would have dropped
any insignificant predictors. Note that because of the endogeneity problem I have also run a sensitivity
analysis using 2SLS treating both disclosure quality (DRANK) and the interaction effect (D_LBAGE) as
endogenous. However, I decided to keep the simpler and more efficient OLS estimation because
the Hausman test statistic does not reject the null of exogeneity at the 5% level (even though my
results likely suffer from small-sample bias and the Hausman test statistic may thus not be very
conclusive).
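A 2SLS sensitivity analysis with an endogeneity test of the kind described above might look like the sketch below. The instruments are not named anywhere in the thread, so Z1, Z2 and Z3 are purely hypothetical placeholders; the test reported by -estat endogenous- (Durbin and Wu-Hausman) is also not necessarily the exact Hausman variant used here.

```stata
* Z1 Z2 Z3 are hypothetical placeholders for the excluded instruments
ivregress 2sls COC LBAGE LNMV LNBM LNLEV CapInt ROA AssTr LTG LVOL MAFE ///
    (DRANK D_LBAGE = Z1 Z2 Z3)

* Durbin and Wu-Hausman tests of H0: DRANK and D_LBAGE are exogenous
estat endogenous
```

With two endogenous regressors, at least two excluded instruments are needed for the model to be identified.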
This is all I can tell you about the background of these data. But if you still cannot make sense
of my results, it is not a serious problem, because all versions of the heteroskedasticity
test reject homoskedasticity at the 10% level anyway.
From: Nick Cox
I have no idea what these data are and even if I did I doubt I could add to your subject-matter expertise. Any kind of test of the results seems to me to be less important than simplifying your model by omitting some of the predictors. Conversely, if there are subject-matter reasons for keeping them in then you need to tell us, as we can hardly interpret your results otherwise.
Nick
[email protected]
[email protected]
As far as I understand, reducing the degrees of freedom may increase the
power of the heteroskedasticity test if the number of observations is small. With my
sample, the opposite is apparently the case because the p-value of the test
increases with decreasing degrees of freedom:
. reg COC DRANK D_LBAGE LBAGE LNMV LNBM LNLEV CapInt ROA AssTr LTG LVOL MAFE
      Source |       SS       df       MS              Number of obs =     130
-------------+------------------------------           F( 12,   117) =   27.58
       Model |  .185658991    12  .015471583           Prob > F      =  0.0000
    Residual |  .065629152   117  .000560933           R-squared     =  0.7388
-------------+------------------------------           Adj R-squared =  0.7120
       Total |  .251288143   129   .00194797           Root MSE      =  .02368

------------------------------------------------------------------------------
         COC |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       DRANK |  -.2427198    .101898    -2.38   0.019    -.4445234   -.0409162
     D_LBAGE |   .0284015   .0117121     2.42   0.017     .0052063    .0515967
       LBAGE |  -.0145643   .0073708    -1.98   0.051    -.0291619    .0000332
        LNMV |  -.0015457   .0018037    -0.86   0.393    -.0051178    .0020265
        LNBM |   .0096937   .0048182     2.01   0.047     .0001514     .019236
       LNLEV |   .0000791   .0060169     0.01   0.990     -.011837    .0119952
      CapInt |  -.0261037     .01005    -2.60   0.011    -.0460072   -.0062001
         ROA |   -.022799   .0294708    -0.77   0.441    -.0811644    .0355663
       AssTr |  -.0028776   .0033909    -0.85   0.398    -.0095932    .0038379
         LTG |   .0009762   .0000745    13.10   0.000     .0008286    .0011238
        LVOL |   .0168072   .0058712     2.86   0.005     .0051796    .0284348
        MAFE |   .0020086   .0004498     4.47   0.000     .0011178    .0028993
       _cons |   .2656843   .0619192     4.29   0.000     .1430566    .3883121
------------------------------------------------------------------------------
. imtest, white
White's test for Ho: homoskedasticity
         against Ha: unrestricted heteroskedasticity

         chi2(89)     =    120.87
         Prob > chi2  =    0.0139

Cameron & Trivedi's decomposition of IM-test

---------------------------------------------------
              Source |       chi2     df      p
---------------------+-----------------------------
  Heteroskedasticity |     120.87     89    0.0139
            Skewness |      17.08     12    0.1464
            Kurtosis |       1.14      1    0.2852
---------------------+-----------------------------
               Total |     139.10    102    0.0086
---------------------------------------------------
. ivhettest, ivcp nr2
OLS heteroskedasticity test(s) using levels and cross products of all IVs
Ho: Disturbance is homoskedastic
White/Koenker nR2 test statistic : 120.871 Chi-sq(89) P-value = 0.0139
. ivhettest, ivsq nr2
OLS heteroskedasticity test(s) using levels and squares of IVs
Ho: Disturbance is homoskedastic
White/Koenker nR2 test statistic : 39.427 Chi-sq(24) P-value = 0.0246
. ivhettest, nr2
OLS heteroskedasticity test(s) using levels of IVs only
Ho: Disturbance is homoskedastic
White/Koenker nR2 test statistic : 19.416 Chi-sq(12) P-value = 0.0790
I appreciate your help with interpreting this counterintuitive result.
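The degrees of freedom and p-values in the three test versions above can be checked by hand; the following is a minimal sketch. With k = 12 regressors, levels give 12 terms, levels and squares give 24, and adding the 66 pairwise cross products gives 90. The reported 89 is one fewer, presumably because the cross product of DRANK and LBAGE duplicates D_LBAGE and is dropped; that explanation is an assumption based on how D_LBAGE is described in the thread, not something the output confirms.

```stata
* Number of auxiliary regressors implied by k = 12 regressors
local k = 12
display "levels only:                " `k'
display "levels and squares:         " 2*`k'
display "levels, squares and cross:  " 2*`k' + comb(`k', 2)

* Upper-tail chi-squared p-values for the three reported statistics
display chi2tail(89, 120.871)
display chi2tail(24, 39.427)
display chi2tail(12, 19.416)
```

The three -chi2tail()- calls should reproduce the p-values reported by the tests. The statistic itself shrinks from 120.871 to 19.416 as squares and cross products are removed, so a larger p-value with fewer degrees of freedom is possible whenever the omitted terms carry much of the evidence of heteroskedasticity.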
From: Nick Cox
The motivation for power calculations seems to be compromised when the
hypotheses being tested are determined by looking at results.
I am a great fan of looking at results to see whether I should revise
my analysis. But then I don't ever do power calculations and I sit
loose to most significance tests.
On Wed, Feb 29, 2012 at 9:03 AM, <[email protected]> wrote:
> Yes, I see now that the two commands (-imtest, white- and -ivhettest, ivcp nr2-)
> produce equivalent results also for my sample. I think it is a good idea to
> reduce the degrees of freedom because I have only 98 observations in my sample.
> Maybe I could even drop the -ivsq- option (and hence ignoring also the squares
> of the instruments). By calling just -ivhettest, nr2- I could enhance the
> power of the test by decreasing the degrees of freedom to 12.
>
[very big snip]
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/