Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: inconsistent results for two-dimensions fixed effects regressions using xtreg reg areg ivreg2
From
Michael Barker <[email protected]>
To
[email protected]
Subject
Re: st: inconsistent results for two-dimensions fixed effects regressions using xtreg reg areg ivreg2
Date
Wed, 14 Aug 2013 14:57:30 -0400
Hi Nahla,
You are having trouble with the xt commands because you're not really
doing a panel-data analysis. Panel data implies one observation per
unit per year. You are analyzing these data by industry and year, so
you have many observations (firms) per unit (industry) per year. That
is why you got the error about repeated time values within panel. Your
data may actually be panel data at the firm-year level, but you are
analyzing them as clustered data, not panel data.
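To make the distinction concrete, here is a minimal sketch on made-up data. The dataset (3 industries, 4 firms each, 5 years) is purely hypothetical; only the variable names firm, industry and year follow this thread. It checks which pair of variables uniquely identifies the observations and shows xtset accepting one and rejecting the other.

```stata
* Hypothetical toy data: 3 industries, 4 firms each, 5 years per firm
clear
set seed 12345
set obs 12
generate firm = _n
generate industry = ceil(firm / 4)
expand 5
bysort firm: generate year = 2000 + _n

* One row per firm-year, so (firm, year) uniquely identifies observations
isid firm year

* Several firms share each industry-year cell, so (industry, year) does not
duplicates report industry year

* xtset accepts firm as the panel variable ...
xtset firm year

* ... but rejects industry because time values repeat within each industry
capture noisily xtset industry year
display "return code from xtset industry year: " _rc
```

A nonzero return code on the last xtset is the same failure described above: each industry has several firms, hence several rows, for every year.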
You said that you were using two-dimension fixed-effects, so I would
keep industry and year as separate groups of dummy variables, rather
than creating a single interaction. The results may come out the same,
I'm not sure about that, but I think it is easier conceptually.
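For what it's worth, the two specifications are different models, which a small simulation can illustrate. This is only a sketch on invented data (6 industries, 10 firms each, 8 years; the data-generating process is an assumption made up for the example), not your dataset.

```stata
* Hypothetical toy data: 6 industries x 10 firms x 8 years
clear
set seed 2013
set obs 60
generate firm = _n
generate industry = ceil(firm / 10)
expand 8
bysort firm: generate year = 2004 + _n
generate IV = rnormal() + 0.3 * industry
generate DV = 1 + 0.4 * IV + 0.2 * industry + 0.1 * (year - 2004) + rnormal()

* Additive two-way fixed effects: 7 year dummies plus 5 industry dummies
regress DV IV i.year i.industry
scalar r2_additive = e(r2)

* Single interaction: one fixed effect for each of the 48 industry-year cells
egen industry_year = group(industry year)
areg DV IV, absorb(industry_year)
scalar r2_cells = e(r2)

display "R-squared, additive: " r2_additive "   industry-year cells: " r2_cells
```

The industry-year specification nests the additive one, so its R-squared can only be equal or higher, and the coefficient on IV is usually close but not identical. That pattern is consistent with the small differences reported in the quoted message.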
Lastly, if you are including fixed effects at the industry-level, you
don't have to compute clustered standard errors at the same level. You
can just use the typical robust standard error estimator. The cluster
fixed effects will control for correlation of error-terms within
clusters.
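If you want to see how much the choice of standard errors matters in practice, one option is to fit the same model both ways and compare. The sketch below uses invented data (20 industries, 10 firms each, 5 years), so the numbers it prints say nothing about your data. With industry dummies plus clustering on industry, Stata may report the overall F statistic as missing, as you noticed, but the standard error on IV is still shown.

```stata
* Hypothetical toy data: 20 industries x 10 firms x 5 years
clear
set seed 99
set obs 200
generate firm = _n
generate industry = ceil(firm / 10)
expand 5
bysort firm: generate year = 2007 + _n
generate IV = rnormal()
generate DV = 1 + 0.4 * IV + 0.2 * industry + rnormal()

* Heteroskedasticity-robust standard errors
regress DV IV i.year i.industry, vce(robust)
scalar se_robust = _se[IV]

* Standard errors clustered on industry, same point estimates
regress DV IV i.year i.industry, vce(cluster industry)
scalar se_cluster = _se[IV]

display "robust SE: " se_robust "   cluster SE: " se_cluster
```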
So I think you should use one of these two commands:
reg DV IV i.year i.industry, robust
areg DV IV i.year, absorb(industry) robust
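As a quick check that the two commands estimate the same coefficient, here is a sketch on simulated data. It assumes DV is the dependent variable and IV the regressor, as in the output quoted below; the data (5 industries, 8 firms each, 6 years) are hypothetical.

```stata
* Hypothetical toy data: 5 industries x 8 firms x 6 years
clear
set seed 42
set obs 40
generate firm = _n
generate industry = ceil(firm / 8)
expand 6
bysort firm: generate year = 2006 + _n
generate IV = rnormal() + 0.5 * industry
generate DV = 2 + 0.4 * IV + 0.3 * industry + 0.05 * (year - 2006) + rnormal()

* Explicit year and industry dummies; dependent variable comes first
regress DV IV i.year i.industry, vce(robust)
scalar b_reg = _b[IV]

* Same model, with the industry dummies absorbed instead of reported
areg DV IV i.year, absorb(industry) vce(robust)
scalar b_areg = _b[IV]

display "regress: " b_reg "   areg: " b_areg
```

The two coefficients on IV should agree up to rounding. areg is mainly a convenience when the absorbed group has many categories and you don't need to see its coefficients.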
About the ivreg2 command, it is used for instrumental variables. I
think your "IV" stands for independent variable, not instrumental
variable, so it is not relevant to your topic. ivreg2 will not help
you with a fixed-effects analysis.
Mike
On Wed, Aug 14, 2013 at 10:51 AM, Nahla Betelmal <[email protected]> wrote:
> Thank you so much Mike, your detailed comments are great help. I do
> appreciate it.
>
> As I am looking for industry year fixed effects rather than firm year,
> I tried to set the panel accordingly, but did not work due to repeated
> time values within panel.
>
> So, this time I grouped based on industry-year (thanks to your note
> about no repeated firms in different industries). I hope this time I
> did it the right way. Kindly let me know, please. I got identical
> coefficients for IV.
>
> Also, could you please explain your comment about ivreg2 further, or
> give an example of how to run it correctly to get fixed effects?
>
> the commands are:
> 1) egen industry_year= group(industry year) then xtset industry_year
> then xtreg DV IV, fe vce (cluster industry)
> 2) xi: reg DV IV i.year i.industry
> 3) areg DV IV, absorb ( industry_year ) cluster (industry)
>
> In the first command , I could not put i.year as it is omitted because
> of collinearity.
> In the second, I could not apply cluster (industry) option as F-test
> became missing.
> The third command gave almost identical results to the previous two
> with and without the cluster option. However, it gave slightly
> different R-Square 0.645 than that of regress 0.621. Is this OK or
> they should be identical.
>
>
>
> egen industry_year= group(industry year)
> xtset industry_year
> xtreg DV IV, fe vce (cluster industry )
>
> Fixed-effects (within) regression Number of obs = 23830
> Group variable: industry_year Number of groups = 1179
>
> R-sq: within = 0.5516 Obs per group: min = 1
> between = 0.5262 avg = 20.2
> overall = 0.4955 max = 155
>
> F(1,57) = 2233.13
> corr(u_i, Xb) = -0.1260 Prob > F = 0.0000
>
> (Std. Err. adjusted for 58 clusters in industry)
> ------------------------------------------------------------------------------
> | Robust
> DV| Coef. Std. Err. t P>|t| [95% Conf. Interval]
> -------------+----------------------------------------------------------------
> IV| .4393407 .009297 47.26 0.000 .4207237 .4579577
> _cons | 5.675498 .0673395 84.28 0.000 5.540653 5.810343
> -------------+----------------------------------------------------------------
> sigma_u | .40739078
> sigma_e | .58512671
> rho | .32648834 (fraction of variance due to u_i)
> ------------------------------------------------------------------------------
>
>
>
> Also I tried
> xi: reg DV IV i.year i.industry
>
> without a cluster(industry) as F-test became missing
>
> IV= .4397811 and SE= .0026298
> If I run xtreg without the cluster option, I get the same SE= .0026322
>
> the output is too long
>
> In addition
> areg DV IV, absorb ( industry_year ) cluster (industry)
>
> Linear regression, absorbing indicators Number of obs = 23830
> F( 1, 57) = 2122.73
> Prob > F = 0.0000
> R-squared = 0.6458
> Adj R-squared = 0.6274
> Root MSE = 0.5851
>
> (Std. Err. adjusted for 58 clusters in industry)
> ------------------------------------------------------------------------------
> | Robust
> DV | Coef. Std. Err. t P>|t| [95% Conf. Interval]
> -------------+----------------------------------------------------------------
> IV | .4393407 .0095357 46.07 0.000 .4202457 .4584357
> _cons | 5.675498 .0690685 82.17 0.000 5.537191 5.813805
> -------------+----------------------------------------------------------------
> industry_year | absorbed (1179 categories)
>
>
> Many thanks again
>
> Nahla
>
>
> On 14 August 2013 14:29, Michael Barker <[email protected]> wrote:
>> Hi Nahla,
>>
>> You are actually running several different models there. I'll describe
>> each one below, so you can see how they differ:
>>
>>> 1) xi: reg DV IV i.year, vce (cluster industry)
>> - Year fixed effects only.
>> - Include one dummy variable for each year:
>>
>>> 2) xtset firm year then xtreg DV IV i.year, fe vce (cluster industry)
>> - Year and firm fixed effects
>> - Equivalent to including one dummy for each year and one dummy for each firm.
>> - xtreg includes fixed effects for the panel variable, firm and you
>> include year dummies manually
>>
>>> 3) egen industry_firm= group (industry firm) then xtset industry_firm year then xtreg DV IV i.year, fe vce (cluster industry)
>> - year and industry-firm level fixed effects
>> - equivalent to including one dummy for each year and one dummy for
>> each industry-firm combination
>> - apparently no firm is in multiple industries, so this regression is
>> equivalent to regression 2.
>>
>>> 4) tsset industry_firm year then ivreg2 DV IV,cluster ( industry_firm year)
>> - No fixed effects
>> - You didn't specify the endogenous / IV variables, so this is just a
>> regular regression with clustered standard errors
>> - This is equivalent to "reg DV IV,cluster ( industry_firm year)"
>>
>>> 5) areg DV IV, absorb ( year ) cluster (industry)
>> - Year fixed effects only
>> - Equivalent to regression 1, without reporting year coefficients
>> - Notice that the coefficient and standard error estimates are the
>> same as the first regression.
>>>
>>
>> If you want firm and year fixed effects, I would use regression 2. If
>> you want to see equivalent results with alternative regressions, try
>> these:
>> xi: reg DV IV i.year i.firm, vce (cluster industry)
>> areg DV IV i.year, absorb (firm) cluster (industry)
>>
>> The first suggestion might not run, since you will have to include
>> many dummy variables for all of your firms. You may exceed the maximum
>> number of variables allowed, depending on your version of Stata.
>>
>> Mike
>>
>>
>>
>>
>> On Wed, Aug 14, 2013 at 8:22 AM, Nahla Betelmal <[email protected]> wrote:
>>> Hi Statalist,
>>>
>>> I have a panel data of firms and years, however, I would like to
>>> perform industry and year fixed effect regression. using different
>>> approaches, I got different IV coefficient and standard error,
>>> although it should be identical if I am doing it right. I would highly
>>> appreciate it if someone kindly explain what I am doing wrong and what
>>> is the right way to get industry and year fixed effects.
>>>
>>> the commands I used are:
>>>
>>> 1) xi: reg DV IV i.year, vce (cluster industry)
>>>
>>> 2) xtset firm year then xtreg DV IV i.year, fe vce (cluster industry)
>>>
>>> 3) egen industry_firm= group (industry firm) then xtset industry_firm
>>> year then xtreg DV IV i.year, fe vce (cluster industry)
>>>
>>> 4) tsset industry_firm year then ivreg2 DV IV,cluster ( industry_firm year)
>>>
>>> 5) areg DV IV, absorb ( year ) cluster (industry)
>>>
>>>
>>> under reg command: IV = 0.386 with SE= 0.022
>>> under xtreg command with firm year panel set: IV = .418 with SE= .0241
>>> under xtreg command with industry-firm year panel set: IV = .418 with SE= .024
>>> under ivreg2 command: IV = .410 with SE= .007
>>> under areg command: IV = 0.386 with SE= 0.022
>>>
>>>
>>> . xi: reg DV IV i.year, vce (cluster industry)
>>> i.year _Iyear_1992-2012 (naturally coded; _Iyear_1992 omitted)
>>>
>>> Linear regression Number of obs = 23830
>>> F( 21, 57) = 768.66
>>> Prob > F = 0.0000
>>> R-squared = 0.5461
>>> Root MSE = .6461
>>>
>>> (Std. Err. adjusted for 58 clusters in industry)
>>> -------------------------------------------------------------------------------
>>> | Robust
>>> DV | Coef. Std. Err. t P>|t| [95% Conf. Interval]
>>> --------------+----------------------------------------------------------------
>>> IV | .3869693 .0225831 17.14 0.000 .3417475 .4321911
>>> _Iyear_1993 | .150389 .0239546 6.28 0.000 .1024208 .1983573
>>> _Iyear_1994 | .2857099 .0271864 10.51 0.000 .2312702 .3401496
>>> _Iyear_1995 | .2927993 .0307951 9.51 0.000 .2311331 .3544654
>>> _Iyear_1996 | .4353512 .0304859 14.28 0.000 .3743044 .4963981
>>> _Iyear_1997 | .5286896 .0292151 18.10 0.000 .4701874 .5871917
>>> _Iyear_1998 | .5852497 .0337522 17.34 0.000 .5176621 .6528374
>>> _Iyear_1999 | .6969439 .0523892 13.30 0.000 .5920364 .8018514
>>> _Iyear_2000 | .8019949 .0666928 12.03 0.000 .6684448 .9355449
>>> _Iyear_2001 | .7710818 .0486744 15.84 0.000 .673613 .8685507
>>> _Iyear_2002 | .6978223 .0325914 21.41 0.000 .6325592 .7630854
>>> _Iyear_2003 | .6427671 .0347611 18.49 0.000 .5731593 .712375
>>> _Iyear_2004 | .7757021 .0394535 19.66 0.000 .6966978 .8547064
>>> _Iyear_2005 | .7806429 .0418054 18.67 0.000 .6969291 .8643566
>>> _Iyear_2006 | .7746051 .0462916 16.73 0.000 .6819076 .8673025
>>> _Iyear_2007 | .7758041 .0484202 16.02 0.000 .6788444 .8727639
>>> _Iyear_2008 | .7734638 .0508533 15.21 0.000 .6716317 .8752958
>>> _Iyear_2009 | .7319797 .0564072 12.98 0.000 .6190263 .8449332
>>> _Iyear_2010 | .8741285 .0506573 17.26 0.000 .772689 .975568
>>> _Iyear_2011 | .8889354 .0532101 16.71 0.000 .782384 .9954869
>>> _Iyear_2012 | .8979328 .0565989 15.86 0.000 .7845956 1.01127
>>> _cons | 5.403047 .1238831 43.61 0.000 5.154975 5.651118
>>> -------------------------------------------------------------------------------
>>>
>>>
>>>
>>>
>>> xtset firm year
>>> panel variable: firm (unbalanced)
>>> time variable: year, 1992 to 2012, but with gaps
>>> delta: 1 unit
>>>
>>> . xtreg DV IV i.year, fe vce (cluster industry)
>>>
>>> Fixed-effects (within) regression Number of obs = 23830
>>> Group variable: firm Number of groups = 2312
>>>
>>> R-sq: within = 0.4113 Obs per group: min = 1
>>> between = 0.5998 avg = 10.3
>>> overall = 0.5456 max = 21
>>>
>>> F(21,57) = 463.93
>>> corr(u_i, Xb) = -0.0970 Prob > F = 0.0000
>>>
>>> (Std. Err. adjusted for 58 clusters in industry)
>>> ------------------------------------------------------------------------------
>>> | Robust
>>> DV | Coef. Std. Err. t P>|t| [95% Conf. Interval]
>>> -------------+----------------------------------------------------------------
>>> IV | .4183645 .0241281 17.34 0.000 .3700488 .4666802
>>> |
>>> year |
>>> 1993 | .1560772 .0200202 7.80 0.000 .1159874 .196167
>>> 1994 | .2929982 .0224807 13.03 0.000 .2479813 .3380151
>>> 1995 | .3019359 .0268163 11.26 0.000 .2482373 .3556345
>>> 1996 | .4272691 .0264501 16.15 0.000 .3743038 .4802344
>>> 1997 | .5209287 .0266063 19.58 0.000 .4676506 .5742069
>>> 1998 | .5877827 .0276877 21.23 0.000 .5323391 .6432264
>>> 1999 | .6989115 .0427304 16.36 0.000 .6133453 .7844777
>>> 2000 | .7988406 .0477286 16.74 0.000 .7032657 .8944154
>>> 2001 | .7589164 .0375573 20.21 0.000 .6837091 .8341236
>>> 2002 | .687617 .034973 19.66 0.000 .6175848 .7576492
>>> 2003 | .6310008 .0488884 12.91 0.000 .5331035 .7288982
>>> 2004 | .7611996 .0507837 14.99 0.000 .659507 .8628921
>>> 2005 | .7687923 .0552525 13.91 0.000 .6581511 .8794336
>>> 2006 | .7524079 .0609127 12.35 0.000 .6304324 .8743834
>>> 2007 | .7519399 .0642041 11.71 0.000 .6233734 .8805064
>>> 2008 | .750493 .0684401 10.97 0.000 .6134441 .887542
>>> 2009 | .7118027 .067056 10.62 0.000 .5775254 .8460799
>>> 2010 | .8504969 .0632919 13.44 0.000 .7237569 .9772368
>>> 2011 | .8674839 .0664437 13.06 0.000 .7344328 1.000535
>>> 2012 | .863437 .0733127 11.78 0.000 .7166308 1.010243
>>> |
>>> _cons | 5.18669 .152373 34.04 0.000 4.881568 5.491812
>>> -------------+----------------------------------------------------------------
>>> sigma_u | .4935113
>>> sigma_e | .47151369
>>> rho | .52278302 (fraction of variance due to u_i)
>>> ------------------------------------------------------------------------------
>>>
>>>
>>> . egen industry_firm= group (industry firm)
>>>
>>> . xtset industry_firm year
>>> panel variable: industry_firm (unbalanced)
>>> time variable: year, 1992 to 2012, but with gaps
>>> delta: 1 unit
>>>
>>>
>>>
>>>
>>>
>>> . xtreg DV IV i.year, fe vce (cluster industry)
>>>
>>> Fixed-effects (within) regression Number of obs = 23830
>>> Group variable: industry_firm Number of groups = 2312
>>>
>>> R-sq: within = 0.4113 Obs per group: min = 1
>>> between = 0.5998 avg = 10.3
>>> overall = 0.5456 max = 21
>>>
>>> F(21,57) = 463.93
>>> corr(u_i, Xb) = -0.0970 Prob > F = 0.0000
>>>
>>> (Std. Err. adjusted for 58 clusters in industry)
>>> ------------------------------------------------------------------------------
>>> | Robust
>>> DV | Coef. Std. Err. t P>|t| [95% Conf. Interval]
>>> -------------+----------------------------------------------------------------
>>> IV | .4183645 .0241281 17.34 0.000 .3700488 .4666802
>>> |
>>> year |
>>> 1993 | .1560772 .0200202 7.80 0.000 .1159874 .196167
>>> 1994 | .2929982 .0224807 13.03 0.000 .2479813 .3380151
>>> 1995 | .3019359 .0268163 11.26 0.000 .2482373 .3556345
>>> 1996 | .4272691 .0264501 16.15 0.000 .3743038 .4802344
>>> 1997 | .5209287 .0266063 19.58 0.000 .4676506 .5742069
>>> 1998 | .5877827 .0276877 21.23 0.000 .5323391 .6432264
>>> 1999 | .6989115 .0427304 16.36 0.000 .6133453 .7844777
>>> 2000 | .7988406 .0477286 16.74 0.000 .7032657 .8944154
>>> 2001 | .7589164 .0375573 20.21 0.000 .6837091 .8341236
>>> 2002 | .687617 .034973 19.66 0.000 .6175848 .7576492
>>> 2003 | .6310008 .0488884 12.91 0.000 .5331035 .7288982
>>> 2004 | .7611996 .0507837 14.99 0.000 .659507 .8628921
>>> 2005 | .7687923 .0552525 13.91 0.000 .6581511 .8794336
>>> 2006 | .7524079 .0609127 12.35 0.000 .6304324 .8743834
>>> 2007 | .7519399 .0642041 11.71 0.000 .6233734 .8805064
>>> 2008 | .750493 .0684401 10.97 0.000 .6134441 .887542
>>> 2009 | .7118027 .067056 10.62 0.000 .5775254 .8460799
>>> 2010 | .8504969 .0632919 13.44 0.000 .7237569 .9772368
>>> 2011 | .8674839 .0664437 13.06 0.000 .7344328 1.000535
>>> 2012 | .863437 .0733127 11.78 0.000 .7166308 1.010243
>>> |
>>> _cons | 5.18669 .152373 34.04 0.000 4.881568 5.491812
>>> -------------+----------------------------------------------------------------
>>> sigma_u | .4935113
>>> sigma_e | .47151369
>>> rho | .52278302 (fraction of variance due to u_i)
>>> ------------------------------------------------------------------------------
>>>
>>>
>>>
>>> ivreg2 DV IV,cluster ( industry_firm year)
>>>
>>> OLS estimation
>>> --------------
>>>
>>> Estimates efficient for homoskedasticity only
>>> Statistics robust to heteroskedasticity and clustering on
>>> industry_firm and fyear2
>>>
>>> Number of clusters (industry_firm) = 2312 Number of obs = 23830
>>> Number of clusters (fyear2) = 21 F( 1, 20) = 2849.29
>>> Prob > F = 0.0000
>>> Total (centered) SS = 21896.66904 Centered R2 = 0.4955
>>> Total (uncentered) SS = 1891568.745 Uncentered R2 = 0.9942
>>> Residual SS = 11046.6797 Root MSE = .6809
>>>
>>> ------------------------------------------------------------------------------
>>> | Robust
>>> DV | Coef. Std. Err. z P>|z| [95% Conf. Interval]
>>> -------------+----------------------------------------------------------------
>>> IV | .410624 .0075071 54.70 0.000 .3959104 .4253377
>>> _cons | 5.883496 .0562149 104.66 0.000 5.773317 5.993675
>>> ------------------------------------------------------------------------------
>>> Included instruments: IV
>>>
>>>
>>>
>>>
>>> areg DV IV, absorb ( year ) cluster (industry)
>>>
>>> Linear regression, absorbing indicators Number of obs = 23830
>>> F( 1, 57) = 293.62
>>> Prob > F = 0.0000
>>> R-squared = 0.5461
>>> Adj R-squared = 0.5457
>>> Root MSE = 0.6461
>>>
>>> (Std. Err. adjusted for 58 clusters in twodigit)
>>> ------------------------------------------------------------------------------
>>> | Robust
>>> DV | Coef. Std. Err. t P>|t| [95% Conf. Interval]
>>> -------------+----------------------------------------------------------------
>>> IV | .3869693 .0225831 17.14 0.000 .3417475 .4321911
>>> _cons | 6.05483 .1337655 45.26 0.000 5.786969 6.322691
>>> -------------+----------------------------------------------------------------
>>> year | absorbed (21 categories)
>>>
>>>
>>>
>>> Many thanks in advance,
>>>
>>> Nahla Betelmal
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/