Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: "Millimet, Daniel" <millimet@mail.smu.edu>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: RE: st: RE: Multiple endogenous variables IV. Estimating the first stage regression with only a subset of the instruments
Date: Wed, 26 Oct 2011 18:30:33 +0000

First, you should see the FAQ about omitting exogenous variables from the first stage. The same reasoning should apply to instruments from other first stages when you have multiple endogenous variables.
http://www.stata.com/support/faqs/stat/ivreg.html

Second, including the x's from the second stage in the first stages does not break the linear dependence between perfectly collinear instruments.

Third, I cannot evaluate whether your IVs are valid or not based on limited information. However, if you have 2 endogenous regressors, you need two (linearly independent) instruments. Intuitively, these instruments must give you independent sources of exogenous variation in each endogenous regressor. If the instruments are "too" correlated, the model will be weakly identified (and the weak id tests in -ivreg2- should detect this). If they are perfectly correlated, the model is under-identified.

Daniel

****************************************************
Daniel L. Millimet
Research Fellow, IZA
Professor, Department of Economics
Box 0496
SMU
Dallas, TX 75275-0496
phone: 214.768.3269
fax: 214.768.1821
web: http://faculty.smu.edu/millimet
****************************************************

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nicolai Borgen
Sent: Wednesday, October 26, 2011 6:08 AM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Multiple endogenous variables IV. Estimating the first stage regression with only a subset of the instruments

Thank you for your answer, Daniel.

In the first stage, several control variables will be included as well. Thus, the fitted values for x1 and x2 should be less linearly dependent than z1 and z2?

Would this be less of a problem if I use living in the municipality of school x1 in adolescence (z1b) as an instrument for x1, and distance from the municipality of adolescence (z2) as an instrument for x2?
That is, again estimating the first stage of x1 using only z1b and the first stage of x2 using only z2?

Best,
Nicolai

On 25.10.2011 15:39, Millimet, Daniel wrote:
> No, the system is under-identified. The easiest way to see that your solution would still fail (even if it were acceptable to only include a subset of the exogenous vars in each first-stage) is that the fitted values for x1 and x2 will be linearly dependent since z1 and z2 are linearly dependent.
>
> Daniel
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nicolai Borgen
> Sent: Tuesday, October 25, 2011 8:00 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: Multiple endogenous variables IV. Estimating the first stage regression with only a subset of the instruments
>
> I have a question regarding the use of instrumental variables when having multiple endogenous variables. I am estimating the economic return of attending different educational institutions X1, X2 and X3.
> For simplicity, the models are presented without control variables and constant.
>
> [A1] Y = δ1*X1 + δ2*X2 + δ3*X3 + ε
>
> Since I have strong reasons to suspect that X1, X2 and X3 are correlated with ε, it is necessary to use an instrumental variable approach to estimate the return. I therefore use proximity (in km) between municipality of adolescence and these institutions as instruments (Z1, Z2 and Z3).
> Using regression commands such as ivregress and ivreg2 in Stata, all instruments are included in the first stage for each endogenous variable X1, X2 and X3:
>
> [B1] X1 = B1*Z1 + B2*Z2 + B3*Z3 + ε
> [C1] X2 = B1*Z1 + B2*Z2 + B3*Z3 + ε
> [D1] X3 = B1*Z1 + B2*Z2 + B3*Z3 + ε
>
> My problem occurs because the institutions X1 and X2 are located in the same city, and Z1 and Z2 are therefore perfectly collinear. Thus, in [B1] and [C1] either Z1 or Z2 is dropped. My model is therefore basically under-identified. Based on this, I have the following two questions:
>
> (1) Is it possible to estimate the first stage regression using a subset of the instruments?
>
> [B2] X1 = B1*Z1 + ε
> [C2] X2 = B2*Z2 + ε
> [D2] X3 = B3*Z3 + ε
>
> (2) This Stata page
> http://www.stata.com/support/faqs/stat/ivr_faq.html
> shows an example of how to perform the two-step computations for the instrumental variable estimator without using ivregress or ivreg2. Is this a feasible solution? Are there any Stata commands I can use that do this?
>
> Many thanks,
> Nicolai Borgen
> University of Oslo

--
Nicolai Borgen
Institutt for sosiologi og samfunnsgeografi
Universitet i Oslo
Postboks 1096 Blindern
0317 Oslo
Tel.: +47 22 85 86 67
email: nicolai.borgen@sosgeo.uio.no

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
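
A small numerical sketch may help to see why estimating each first stage with only "its own" instrument, as in [B2] and [C2], does not rescue the model when Z1 and Z2 are perfectly collinear. It is written in plain Python rather than Stata so that it can be run anywhere, it follows the simplified setting of the thread (no constant, no controls), and the data are made up for illustration only. The helper names slope and gram_det are hypothetical, not part of any package.

```python
from fractions import Fraction


def slope(z, x):
    """OLS slope of x on z with no constant, as in the simplified [B2] and [C2]."""
    return Fraction(sum(a * b for a, b in zip(z, x)), sum(a * a for a in z))


def gram_det(u, v):
    """Determinant of the 2x2 Gram matrix of u and v.

    It is zero exactly when u and v are linearly dependent.
    """
    uu = sum(a * a for a in u)
    vv = sum(b * b for b in v)
    uv = sum(a * b for a, b in zip(u, v))
    return uu * vv - uv * uv


# Made-up toy data (assumption): distances in km for six individuals.
z1 = [5, 12, 30, 44, 8, 21]   # distance to institution 1
z2 = list(z1)                 # institution 2 is in the same city, so z2 equals z1
x1 = [1, 1, 0, 0, 1, 0]       # attended institution 1
x2 = [0, 0, 1, 0, 0, 1]       # attended institution 2

# Separate first stages, each using only its "own" instrument.
b1 = slope(z1, x1)
b2 = slope(z2, x2)
x1_hat = [b1 * z for z in z1]
x2_hat = [b2 * z for z in z2]

# The observed regressors are linearly independent ...
print(gram_det(x1, x2))          # 6
# ... but the fitted values are both multiples of the same z, so they are not.
print(gram_det(x1_hat, x2_hat))  # 0
```

Because both fitted series are scalar multiples of the same distance variable, the second stage cannot separate the effect of x1 from that of x2, which is the under-identification described in the replies above. Exact fractions are used only so that the zero determinant is not blurred by floating-point rounding.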