Endogenous variables


Stata's ivregress command fits linear equations with endogenous regressors by two-stage least squares (2SLS), limited-information maximum likelihood (LIML), or the generalized method of moments (GMM).

To fit a model of quantity consumed on income, education level, and price by using the heteroskedasticity-robust GMM estimator, with the prices of raw materials and a competing product as additional instruments, you can fill in the ivregress dialog box or type

. ivregress gmm quantity income education (price = praw pcompete)

To use the LIML estimator instead, select LIML in the dialog box or change gmm to liml in the command.

ivregress can provide robust, cluster robust, jackknife, bootstrap, and heteroskedasticity- and autocorrelation-consistent (HAC) standard errors. With HAC standard errors you can select the Bartlett, Parzen, or quadratic spectral kernel, and you can specify the number of lags or request that Newey and West’s optimal lag-selection algorithm be used. The GMM estimator allows you to choose among robust, cluster robust, and HAC weight matrices.
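The options described above might be specified as in the following sketch. It reuses the hypothetical demand model from the earlier command and assumes those variables are already in memory. The cluster variable storeid and the time variable month are illustrative names that do not appear in the original example, and the HAC options require data that have been tsset.

```stata
* Hypothetical variables: quantity income education price praw pcompete storeid month

* Heteroskedasticity-robust standard errors with 2SLS
ivregress 2sls quantity income education (price = praw pcompete), vce(robust)

* Cluster-robust standard errors (storeid is an assumed cluster identifier)
ivregress 2sls quantity income education (price = praw pcompete), vce(cluster storeid)

* Bootstrap standard errors with 200 replications
ivregress 2sls quantity income education (price = praw pcompete), vce(bootstrap, reps(200))

* HAC standard errors need time-series data (month is an assumed time variable)
tsset month

* Bartlett kernel with 2 lags
ivregress 2sls quantity income education (price = praw pcompete), vce(hac bartlett 2)

* GMM with an HAC weight matrix; lag order chosen by Newey and West's algorithm
ivregress gmm quantity income education (price = praw pcompete), wmatrix(hac bartlett opt)
```

Replacing bartlett with parzen or quadraticspectral selects the other two kernels.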

After estimation with ivregress, you can use

estat firststage to obtain various statistics measuring the relevance of the instrumental variables. First-stage R², partial R², F statistics, Shea’s partial R², and the Cragg and Donald minimum eigenvalue statistic, along with Stock and Yogo’s critical values for tests of weak instruments, are available.
estat overid to obtain tests of overidentifying restrictions. For the 2SLS estimator, Sargan’s and Basmann’s chi-squared tests are available, as is Wooldridge’s robust score test. After LIML estimation, the Anderson–Rubin chi-squared test and Basmann’s F test are available, and after GMM estimation, Hansen’s J statistic is available.
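As a brief sketch of the workflow, both postestimation commands are run immediately after ivregress, and the tests reported by estat overid depend on the estimator that was used. The model here is the one from the hsng2 example below, fit by 2SLS and GMM rather than LIML; output is not shown.

```stata
webuse hsng2, clear

* 2SLS fit, then instrument-relevance statistics
ivregress 2sls rent pcturban (hsngval = faminc i.region)
estat firststage

* After 2SLS with the default VCE: Sargan's and Basmann's chi-squared tests
estat overid

* After GMM: Hansen's J statistic
ivregress gmm rent pcturban (hsngval = faminc i.region)
estat overid
```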

Example

Is the cost to rent an apartment related to the price of houses in a community? With state-level data on hand, we believe that the rental rate is a linear function of housing prices and the percentage of a state’s population living in urban areas. However, we suspect that random shocks that affect rental rates also affect housing prices, so we treat the housing price variable hsngval as endogenous. We have median family income data along with regional dummies that can be used as additional instruments.

Let’s fit our model by using the LIML estimator. In Stata, we type

. webuse hsng2
(1980 Census housing data)

. ivregress liml rent pcturban (hsngval = faminc i.region)

Instrumental variables (LIML) regression               Number of obs =      50
                                                       Wald chi2(2)  =   75.71
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.4901
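                                                       Root MSE      =  24.992

------------------------------------------------------------------------------
        rent | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     hsngval |   .0026686   .0004173     6.39   0.000     .0018507    .0034865
    pcturban |  -.1827391   .3571132    -0.51   0.609    -.8826681    .5171899
       _cons |   117.6087   17.22625     6.83   0.000     83.84587    151.3715
------------------------------------------------------------------------------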
Endogenous: hsngval
Exogenous:  pcturban faminc 2.region 3.region 4.region

Before we dwell on these results, we should first check to make sure that the instruments are sufficiently correlated with hsngval. We can do that by using estat firststage:

. estat firststage

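  First-stage regression summary statistics
  --------------------------------------------------------------------------
               |            Adjusted      Partial
      Variable |   R-sq.       R-sq.        R-sq.      F(4,44)   Prob > F
  -------------+------------------------------------------------------------
       hsngval |  0.6908      0.6557       0.5473      13.2978    0.0000
  --------------------------------------------------------------------------

  Minimum eigenvalue statistic = 13.2978

  Critical Values                      # of endogenous regressors:    1
  H0: Instruments are weak             # of excluded instruments:     4
  ---------------------------------------------------------------------
                                     |    5%     10%     20%     30%
  2SLS relative bias                 |  16.85   10.27    6.71    5.34
  -----------------------------------+---------------------------------
                                     |   10%     15%     20%     25%
  2SLS Size of nominal 5% Wald test  |  24.58   13.96   10.26    8.31
  LIML Size of nominal 5% Wald test  |   5.44    3.87    3.30    2.98
  ---------------------------------------------------------------------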

All the R² statistics are relatively high, so they do not imply a weak-instrument problem. The F statistic is above the often-used threshold of 10. Because we are using the LIML estimator, we look at the final line of critical values in the second table. Suppose that we are willing to accept at most a rejection rate of 10% of a nominal 5% Wald test. Here we can reject the null hypothesis that the instruments are weak, because the test statistic of 13.30 exceeds its critical value of 5.44. On the basis of this test, we do not have a weak-instrument problem. Because our model has only one endogenous regressor, the minimum eigenvalue statistic is equal to the F statistic reported in the first table.

We should also do a test of overidentifying restrictions to verify the validity of our excluded instruments. estat overid makes that easy:

. estat overid

  Tests of overidentifying restrictions:

  Anderson-Rubin chi2(3) =  12.8453  (p = 0.0050)
  Basmann F(3, 44)       =  3.76796  (p = 0.0172)

Here we reject the null hypothesis that our instruments are valid. If we were to pursue this model further, we would probably reconsider whether faminc should enter the model as a regressor rather than serve only as an excluded instrument: families with higher incomes probably demand larger, more expensive apartments, so income may affect rent directly. These tests also assume that the errors are independently and identically distributed, so heteroskedasticity could be affecting these results as well. After fitting a model with the 2SLS estimator, estat overid can perform a test of overidentifying restrictions that is robust to heteroskedasticity.
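One possible follow-up is sketched here; output is not shown. The first fit requests a robust VCE with 2SLS, after which estat overid reports Wooldridge’s robust score test instead of the Sargan and Basmann tests. The second fit is only an illustrative alternative specification, not one taken from the original example: it moves faminc into the rent equation and leaves the regional dummies as the excluded instruments.

```stata
webuse hsng2, clear

* 2SLS with heteroskedasticity-robust VCE; estat overid then uses the robust score test
ivregress 2sls rent pcturban (hsngval = faminc i.region), vce(robust)
estat overid

* Illustrative alternative: treat faminc as an included exogenous regressor
ivregress 2sls rent pcturban faminc (hsngval = i.region), vce(robust)
estat firststage
estat overid
```

With faminc moved into the equation, only the three regional dummies remain as excluded instruments, so instrument relevance should be rechecked before the overidentification test is interpreted.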


