Re: st: xttobit: Different versions, different results?
From: [email protected] (Vince Wiggins, StataCorp)
To: [email protected]
Subject: Re: st: xttobit: Different versions, different results?
Date: Fri, 05 Sep 2008 14:18:04 -0500
Nikos Nikiforakis <[email protected]> reports that he got
"substantially different results in versions 8, 9, and 10" when estimating a
model using -xttobit-, "despite using the same set of variables and the same
dataset."
The short answer is that Nikos needs to use -quadchk-, which evaluates the
accuracy of the estimates given the number of integration points used. Nikos
got different results from different Statas because (1) we have improved
Stata's quadrature routines three times, each producing more accurate results
for the same number of integration points, and (2) given Nikos' problem, there
were too few integration points. Given the improvements,
perhaps the results from Stata 10 are accurate, or perhaps they too need more
integration points. -quadchk- will answer that question. The number of
integration points defaults to 12 and Nikos can specify option -intpoints(#)-
on -xttobit- to use more.
Nikos could have known this. We do recommend use of -quadchk- and
-intpoints()- in the printed documentation, but I was disappointed to find
that none of that made it to the on-line help files. We will fix that.
Longer Explanation
------------------
The random effects (RE) tobit model, and many of the other RE models (see [R]
quadchk), do not have a closed-form solution for their likelihood. That means
we must approximate the solution. Stata uses Gauss-Hermite quadrature to
do this.
Quadrature refers to numerical integration and its accuracy depends on the
number of integration points used. Use more and, in general, you will get a
more accurate result. Use more and results will take longer to be computed.
So how do you know you have used enough? -quadchk- answers that question by
using a few more and a few fewer and reporting the differences in the parameter
estimates. If the differences are small, you have used enough. If they are
large, you need to use more, or perhaps your problem is just not amenable to
quadrature or other forms of numerical integration. In our experience,
unamenable problems rarely occur unless there is high within-panel
correlation or a large number of observations within panels. Either condition
makes the likelihood function extremely spiky.
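As a rough illustration of that check, here is a minimal Python sketch. It is not Stata code and not how -xttobit- or -quadchk- are implemented. It approximates a single integral over a standard normal random effect with Gauss-Hermite quadrature, then compares the 12-point answer with the answers from fewer and more points. The function names, the toy "bump" integrand, and the choice of 8 and 16 as the comparison points are all assumptions made for the example; -quadchk- itself refits the whole model and compares parameter estimates.

```python
import math


def hermite_values(x, n):
    """Return [p_0(x), ..., p_n(x)], the orthonormal Hermite polynomials
    for the weight exp(-x^2), built with the three-term recurrence."""
    vals = [math.pi ** -0.25]
    for j in range(n):
        prev = vals[j - 1] if j > 0 else 0.0
        vals.append(x * math.sqrt(2.0 / (j + 1)) * vals[j]
                    - math.sqrt(j / (j + 1)) * prev)
    return vals


def bisect_root(f, lo, hi, iters=100):
    """Bisection on [lo, hi]; assumes f changes sign exactly once inside."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gauss_hermite(n):
    """Nodes x_i and weights w_i such that sum_i w_i f(x_i) approximates
    the integral of exp(-x^2) f(x) over the real line."""
    roots = []
    for k in range(1, n + 1):
        # Roots of p_k interlace with those of p_{k-1} and lie in (-bound, bound).
        bound = math.sqrt(2.0 * k + 1.0)
        edges = [-bound] + roots + [bound]
        roots = [bisect_root(lambda x: hermite_values(x, k)[k], a, b)
                 for a, b in zip(edges, edges[1:])]
    # Christoffel formula for the weights of an orthonormal family.
    weights = [1.0 / sum(p * p for p in hermite_values(x, n - 1)) for x in roots]
    return roots, weights


def normal_expectation(g, n, sigma=1.0):
    """Approximate E[g(u)] for u ~ N(0, sigma^2) with n quadrature points."""
    nodes, weights = gauss_hermite(n)
    total = sum(w * g(math.sqrt(2.0) * sigma * x) for x, w in zip(nodes, weights))
    return total / math.sqrt(math.pi)


def bump(m, v):
    """Toy integrand exp(-(u - m)^2 / (2 v)); a small v gives a narrow spike."""
    return lambda u: math.exp(-(u - m) ** 2 / (2.0 * v))


def exact_bump_expectation(m, v):
    """Closed form of E[bump(m, v)(u)] for u ~ N(0, 1)."""
    return math.sqrt(v / (1.0 + v)) * math.exp(-m * m / (2.0 * (1.0 + v)))


def point_check(g, n=12, fewer=8, more=16):
    """Relative change in the approximation when n is swapped for fewer/more points."""
    base = normal_expectation(g, n)
    return {k: abs(normal_expectation(g, k) - base) / abs(base)
            for k in (fewer, more)}


for label, v in (("smooth", 4.0), ("spiky", 0.01)):
    g = bump(1.0, v)
    exact = exact_bump_expectation(1.0, v)
    rel_err = abs(normal_expectation(g, 12) - exact) / exact
    print(label, "relative error with 12 points:", rel_err)
    print(label, "relative change at 8 and 16 points:", point_check(g))
```

For the smooth integrand, the three answers should agree closely. For the spiky one, they should differ substantially, which is the same signal that large differences in the -quadchk- output give for a fitted model.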
Given a number of integration points, there are different ways the quadrature
can be performed. Stata 8 used one method. Stata 9 used a better method.
And Stata 10 uses an even better one. Stata 10's method is much better
because it can accurately evaluate integrals that the other two methods could
not, even if they were given hundreds of integration points.
Anyway, when Nikos reran his estimation from Stata 8 in Stata 9 and then again
in Stata 10, he in effect performed his own -quadchk-, and he discovered
that results changed, and that means more integration points are necessary.
Nikos' data has some warning signs that suggest quadrature may be difficult.
He has only 33 panels and each panel has 40 observations. Because the
asymptotic assumptions for the consistency of -xttobit- and other RE models
require the number of panels to go to infinity, having only 33 is cause for
concern. In addition, the large number of observations per panel (40) could
be of concern computationally. The integral that must be approximated is over
each set of those 40 observations and computing that accurately can be
difficult. Nikos' data also has over 40% of the observations censored and
this makes the estimates more sensitive to the shape of the approximated
integral. All of which is to say, Nikos might be okay with the default 12
integration points, might need more, perhaps a lot more, or might never
find a number large enough for -quadchk- to return satisfactory results.
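To get a feel for why many observations per panel or high within-panel correlation make the integrand narrow, consider a deliberately simplified case: a linear random-effects model with Gaussian errors rather than the tobit. In that case (an assumption made only for this sketch) the integrand, viewed as a function of the random effect, is proportional to a normal density whose standard deviation relative to that of the random effect is sqrt((1 - rho) / ((1 - rho) + T*rho)), where rho is the within-panel correlation and T the number of observations in the panel. The Python function name below is hypothetical, and the rho values are arbitrary illustrations, not estimates from Nikos' data.

```python
import math


def peak_width_ratio(rho, n_obs):
    """Width of the integrand in the random effect u, relative to sd(u).

    Assumes a linear model y_it = mu + u_i + e_it with Gaussian u_i and e_it,
    where rho = var(u) / (var(u) + var(e)). This is NOT the tobit likelihood;
    it only illustrates how the peak narrows as rho or n_obs grows.
    """
    return math.sqrt((1.0 - rho) / ((1.0 - rho) + n_obs * rho))


for n_obs in (5, 40):
    for rho in (0.1, 0.5, 0.9):
        print("obs per panel:", n_obs, "rho:", rho,
              "relative peak width:", round(peak_width_ratio(rho, n_obs), 3))
```

The narrower the peak relative to the spacing of the integration points, the more likely it is that a fixed set of points will straddle it rather than sample it.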
For those of you who use the other RE estimators in Stata, such as
-xtlogit-, -xtprobit-, or -xtpoisson-, quadrature problems arise more rarely
than they do with -xttobit- and -xtintreg-. That's because
of the censoring in -xttobit- and -xtintreg-.
If Nikos' data are not proprietary, I would very much appreciate it if he would
send me a copy. Such datasets help us fine-tune our methods for performing
the integrations.
-- Vince
[email protected]