Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Endogeneity and Panel Data : treatreg, ivregress or .. ? Any suggestion would be really appreciated !
From
Cameron McIntosh <[email protected]>
To
STATA LIST <[email protected]>
Subject
RE: st: Endogeneity and Panel Data : treatreg, ivregress or .. ? Any suggestion would be really appreciated !
Date
Mon, 14 Nov 2011 08:11:59 -0500
John,
Thanks for clarifying. I think you still need to account for dependencies/serial correlation, but note that with about 70 data points per individual, I think you are now in the realm of large N, large T panel models.
I don't think you can rely on the asymptotics that were developed for just the large N, fixed (small) T case, and you would need specialized resampling methods as well, if you want to go that route (other types of robust SE estimators might be preferable):
Kapetanios, G. (2008). A Bootstrap Procedure for Panel Data Sets with Many Cross-Sectional Units. Econometrics Journal, 11(2), 377-395.
Hansen, C.B. (2007). Asymptotic properties of a robust variance matrix estimator for panel data when T is large. Journal of Econometrics, 141, 597–620.
http://www.utdallas.edu/~d.sul/Econo2/Hansen-2007-JoE-PanelHAC.pdf
http://faculty.chicagobooth.edu/christian.hansen/research/techapp_panel_cov_t.pdf
http://faculty.chicagobooth.edu/christian.hansen/research/stock_watson_note.pdf
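
As a rough illustration of the comparison raised above (an analytic cluster-robust variance estimator versus the cluster bootstrap), a Stata sketch along the following lines may be a useful starting point. It rests on assumptions: the variable names (y, x1, x2, z, instrument, age, birthplace, id) are placeholders that do not come from the thread, and vce(cluster id) is simply the usual clustered sandwich estimator, not necessarily the estimator studied in the papers listed above.

```stata
* Sketch only: all variable names are hypothetical placeholders.

* How many observations does each individual contribute?
bysort id: generate long n_obs = _N
egen first = tag(id)
summarize n_obs if first, detail

* treatreg with analytic cluster-robust (sandwich) standard errors
treatreg y x1 x2, treat(z = instrument age birthplace) vce(cluster id)
estimates store m_cluster

* Same model with the cluster bootstrap used earlier in the thread
treatreg y x1 x2, treat(z = instrument age birthplace) vce(bootstrap, cluster(id) reps(400))
estimates store m_boot

* Coefficients and standard errors side by side
estimates table m_cluster m_boot, b se
```

If the two sets of standard errors are similar and both much larger than the unclustered ones, that would point to within-individual dependence rather than to a problem with the bootstrap itself. In Stata 13 and later, treatreg was renamed etregress.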
Cam
> From: [email protected]
> Date: Mon, 14 Nov 2011 12:02:12 +0100
> Subject: Re: st: Endogeneity and Panel Data : treatreg, ivregress or .. ? Any suggestion would be really appreciated !
> To: [email protected]
>
> Dear Cameron,
>
> Thanks for your answer.
> Let me clarify the following.
>
> The average number of points per individual is about 70. There is no
> doubt that choices of the endogenous variable within an individual
> are correlated: some will always choose 2 particular values, some
> will choose other values, etc.
> The endogenous variable is not naturally ordered.
>
> When I run treatreg as described before, without any correction for
> clustering, the t stats are great.
> Of course, when I add the option vce(bootstrap, cluster(id) reps(400)),
> all significance vanishes (maybe my instrument is not as good as
> expected?)... but do I really need to correct for clustering here?
> I mean, treatreg is not really designed for panel data, right?
>
> Thx
>
> On 14 November 2011 06:08, Cameron McIntosh <[email protected]> wrote:
> > John,
> > I strongly suggest that you take a look at the following (as for clustered observations and missing data, you haven't really described the dataset enough to be able to comment -- how many time points are there?):
> > Greene, W.H., & Hensher, D.A. (2010). Modeling Ordered Choices: A Primer. Cambridge, UK: Cambridge University Press.
> > Chesher, A., & Smolinski, K. (2012). IV models of ordered choice. Journal of Econometrics, 166(1), 33-48.
> > Carrasco, R. (2001). Binary Choice With Binary Endogenous Regressors in Panel Data. Journal of Business and Economic Statistics, 19(4), 385-394.
> > Chesher, A., Rosen, A., & Smolinski, K. (February 11, 2011). An instrumental variable model of multiple discrete choice. cemmap working paper CWP06/11. The Institute for Fiscal Studies Department of Economics, UCL. http://www.ihs.ac.at/vienna/resources/Economics/Papers/20111117_Paper_Rosen.pdf
> > Mullahy, J. (2001). [Estimation of Limited Dependent Variable Models with Dummy Endogenous Regressors: Simple Strategies for Empirical Practice]: Comment. Journal of Business & Economic Statistics, 19(1), 23-25.
> > Altonji, J.G., & Matzkin, R.L. (2005). Cross Section And Panel Data Estimators For Nonseparable Models With Endogenous Regressors. Econometrica, 73(4), 1053-1101. http://down.cenet.org.cn/upfile/94/2005711152046101.pdf
> > Dong, Y., & Lewbel, A. (April 2011). Simple Estimators for Binary Choice Models With Endogenous Regressors. http://fmwww.bc.edu/ec-p/wp604.pdf
> > Lewbel, A. (March 2011). Binary Choice With Endogenous Or Mismeasured Regressors. http://www.indiana.edu/~caepr/visitors/2011/downloads/LewbelBinaryChoice.pdf
> > Cam
> >
> > ----------------------------------------
> >> From: [email protected]
> >> Date: Sun, 13 Nov 2011 15:41:46 +0100
> >> Subject: st: Endogeneity and Panel Data : treatreg, ivregress or .. ? Any suggestion would be really appreciated !
> >> To: [email protected]
> >>
> >> Dear Stata List,
> >> Dear Mark Schaffer (I guess ;-) )
> >>
> >> I have an econometric question related to endogenous variables and
> >> panel data, and I believe that it can be interesting for anyone who
> >> uses longitudinal data.
> >>
> >> Here's the context :
> >>
> >> I have a panel dataset of individuals who, at any time t, can
> >> endogenously choose the value of a variable E (for endogenous). E is
> >> not ordered and can take a few values (in my case, 6 possible
> >> choices).
> >>
> >> I am particularly interested in the effect of one of these choices on
> >> a fully continuous outcome variable Y.
> >>
> >> That is, at any time and for any individual I would like to estimate
> >>
> >> Y_it = a + b*X_it + c*Z_it + e_it
> >>
> >> where, for example, Z is a binary variable that equals 1 if
> >> individual i chooses E="the value of interest" at time t, and zero
> >> otherwise. Variables in X are assumed to be exogenous.
> >> I believe I have a good instrument for Z, along with other demographic
> >> control variables, and therefore I guess I have basically two
> >> choices in order to take into account the panel nature of my dataset:
> >>
> >> 1) using ivreg2 with the option cluster(id) and correcting for the
> >> endogenous part with (Z = instrument + age + location of birth).
> >> However, Z is a dummy variable... I know this should not be a problem
> >> but...
> >> 2) using treatreg with the option vce(bootstrap, cluster(id)
> >> reps(400)) and modeling the choice of E=2 (that is, Z=1) with treat(Z =
> >> instrument + age + location of birth)
> >> 3) I tried to use xtivreg2 with fixed effects, but location of birth
> >> is time invariant (and I believe very important in order to understand
> >> Z), so it cannot be estimated.
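
For reference, the three options listed above might look roughly as follows in Stata. This is a sketch rather than a recommendation: the variable names (y, x1, x2, z, instrument, age, birthplace, id, time) are placeholders, it assumes the user-written ivreg2 and xtivreg2 packages from SSC are the commands meant in options 1 and 3, and it follows the post in letting age and location of birth enter only the first-stage or treatment equation.

```stata
* Sketch only: all variable names are hypothetical placeholders.
* The user-written commands can be installed with:
*   ssc install ivreg2
*   ssc install xtivreg2

* Option 1: linear IV with a binary endogenous regressor, clustered on id
ivreg2 y x1 x2 (z = instrument age birthplace), cluster(id)

* Option 2: treatment-effects model with a cluster bootstrap
treatreg y x1 x2, treat(z = instrument age birthplace) vce(bootstrap, cluster(id) reps(400))

* Option 3: fixed-effects IV. Assumes a variable "time" that uniquely
* identifies observations within id. Time-invariant variables such as
* birthplace are absorbed by the fixed effects, so it is left out here.
xtset id time
xtivreg2 y x1 x2 (z = instrument age), fe cluster(id)
```

If location of birth is categorical, it would have to enter as a set of indicator variables rather than as a single regressor.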
> >>
> >> Is my approach correct? Do you have other ways to tackle
> >> this endogenous multiple-choice problem?
> >>
> >> Moreover, in the context of panel data, do I always need to use
> >> clustering on id in order to have correct standard errors ?
> >> My dataset is large, but I have much more time variation than
> >> clusters. About 200 000 individuals and 10 million observations for
> >> the whole dataset.
> >> The period where the instrument is available reduces the dataset
> >> considerably : 1 million observations and about 20 000 individuals.
> >> An important remark: the panel is NOT balanced, so individuals can
> >> come in and out of the dataset during the 10-year period covered by my
> >> dataset. Some thus have very few observations, and some have hundreds
> >> of rows.
> >>
> >>
> >>
> >> Many thanks in advance for your suggestions,
> >>
> >> Best,
> >> *
> >> * For searches and help try:
> >> * http://www.stata.com/help.cgi?search
> >> * http://www.stata.com/support/statalist/faq
> >> * http://www.ats.ucla.edu/stat/stata/