"Mwale, McDonald" <[email protected]> wrote
> I am using Stata 8 to run a simple earnings frontier model. If I
> estimate for the whole sample, everything seems fine. However, the
> problem comes in when I split the sample across gender. Estimating
> the same model for males, I am getting the message "initial values
> not feasible" r(1400). I will appreciate if anyone could tell me
> what's wrong and how do I correct the problem.
By default, -frontier- uses a method of moments estimator to get starting
values for the maximum likelihood estimator. For some datasets, especially
those with almost no inefficiency effect, these starting values will not be
feasible. When this happens, the user should specify starting values using
the -from()- or the -ufrom()- option. When the method of moments starting
values are not well defined, a good method to obtain starting values is the
following:
1) Use a simple linear regression to get starting values for the
coefficients.
2) Let the natural log of the square of the root mean squared error
be the starting value for the -lnsig2v- parameter, the natural log of
the variance of the idiosyncratic error.
3) Use a small positive number, say .1, as the starting value for the
-lnsig2u- parameter, the natural log of the variance of the
inefficiency term.
The -from()- option passes the starting values as specified to the
optimization algorithm. This is appropriate for the half-normal and
exponential models. If McDonald is fitting either a half-normal or an
exponential model, then I would recommend using the -from()- option to
specify the starting parameters.
The -ufrom()- option transforms the specified starting values to the
parameterization used in maximizing the truncated-normal model. Thus,
if McDonald is fitting a truncated-normal model, then I would recommend
using the -ufrom()- option to specify the starting parameters.
To help clarify this method, allow me to consider an example.
I begin by loading one of the datasets used in the -frontier- manual entry.
. use http://www.stata-press.com/data/r8/frontier1.dta
Next, since McDonald has a binary variable used to split the sample, I will
generate an artificial one.
. gen group = (lnx1 < -1)
Now, let's use -regress- to get starting values for the coefficients and the
variance of the idiosyncratic error term.
. regress lny lnx1 lnx2 if group == 1
      Source |       SS       df       MS              Number of obs =     227
-------------+------------------------------           F(  2,   224) =  127.51
       Model |  643.383262     2  321.691631           Prob > F      =  0.0000
    Residual |  565.112633   224  2.52282426           R-squared     =  0.5324
-------------+------------------------------           Adj R-squared =  0.5282
       Total |   1208.4959   226  5.34732697           Root MSE      =  1.5883

------------------------------------------------------------------------------
         lny |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnx1 |   .6323774   .0539177    11.73   0.000     .5261267    .7386282
        lnx2 |   .4290466   .0477281     8.99   0.000     .3349931    .5231001
       _cons |  -2.003945   .1976947   -10.14   0.000    -2.393524   -1.614365
------------------------------------------------------------------------------
Now, save the coefficients, the natural log of the square of the root mean
square error and the value .1 in a matrix called b0.
. matrix b0 = e(b), ln(e(rmse)^2) , .1
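As a quick check on the fourth element of b0, the same quantity can be displayed directly while the -regress- results are still in memory. This is only a sketch: it assumes the saved results e(rmse), e(rss), and e(df_r) left behind by -regress- are still available, and all three lines should print the same number.

```stata
* run right after -regress-, while its e() results are still available
* starting value used for lnsig2v
display ln(e(rmse)^2)
* the same quantity, since ln(x^2) = 2*ln(x)
display 2*ln(e(rmse))
* log of the residual mean square, that is, residual SS over residual df
display ln(e(rss)/e(df_r))
```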
List out the matrix b0, just to see that it contains the values it should.
. matrix list b0
b0[1,5]
          lnx1        lnx2       _cons          c4          c5
y1   .63237743   .42904663  -2.0039447   .92537901          .1
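The column names c4 and c5 are placeholders, so the values in b0 can only be matched by position, not by name. An alternative is to name the columns. The sketch below assumes that -frontier- labels its parameters lny:lnx1, lny:lnx2, lny:_cons, lnsig2v:_cons, and lnsig2u:_cons; that assumption is worth confirming with -matrix list e(b)- after any successful -frontier- fit.

```stata
* ASSUMPTION: these equation:parameter names are the ones -frontier- uses;
* check them against -matrix list e(b)- from a model that did converge
matrix colnames b0 = lny:lnx1 lny:lnx2 lny:_cons lnsig2v:_cons lnsig2u:_cons
matrix list b0
```

With the columns named this way, the starting values should be matched by name rather than by position, so the order of the columns no longer matters.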
Finally, fit the frontier model. Since I am fitting a half-normal model, I
use the -from()- option. If I had been fitting a truncated-normal model, I
would have used the -ufrom()- option. The -copy- option in the -from()-
option specifies that the values should be used as they appear in the matrix
without any reference to their names. Also, note that I have included the
sample restriction in the command.
. frontier lny lnx1 lnx2 if group==1, from(b0, copy)
Iteration 0:   log likelihood = -453.81232  (not concave)
Iteration 1:   log likelihood = -426.39691  (not concave)
Iteration 2:   log likelihood = -423.20048
Iteration 3:   log likelihood = -422.63106
Iteration 4:   log likelihood =    -422.46
Iteration 5:   log likelihood = -422.45934
Iteration 6:   log likelihood = -422.45934

Stoc. frontier normal/half-normal model           Number of obs   =        227
                                                  Wald chi2(2)    =     289.32
Log likelihood = -422.45934                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
         lny |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnx1 |   .6285203   .0513946    12.23   0.000     .5277888    .7292518
        lnx2 |   .4401239    .045387     9.70   0.000     .3511669    .5290808
       _cons |  -.2963553   .2761849    -1.07   0.283    -.8376677    .2449572
-------------+----------------------------------------------------------------
    /lnsig2v |   -.117837    .347148    -0.34   0.734    -.7982346    .5625605
    /lnsig2u |    1.50636   .2535394     5.94   0.000     1.009432    2.003288
-------------+----------------------------------------------------------------
     sigma_v |   .9427836   .1636427                       .670912    1.324825
     sigma_u |   2.123743   .2692263                      1.656515    2.722755
      sigma2 |   5.399124    .922338                      3.591375    7.206874
      lambda |    2.25263   .4101298                      1.448791     3.05647
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01) = 6.32   Prob>=chibar2 = 0.006
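Since McDonald needs a separate fit for each gender, the steps above can be wrapped in a loop over the groups. This is a minimal sketch using the artificial variable group from this example in place of the gender indicator; there is no guarantee that these starting values will be feasible in every subsample.

```stata
forvalues g = 0/1 {
    * OLS gives starting values for the coefficients and for lnsig2v
    quietly regress lny lnx1 lnx2 if group == `g'
    * coefficients, ln(rmse^2), and .1 as the starting value for lnsig2u
    matrix b0 = e(b), ln(e(rmse)^2), .1
    * half-normal model; -copy- takes the values in matrix order
    frontier lny lnx1 lnx2 if group == `g', from(b0, copy)
}
```

Because b0 is rebuilt inside the loop, each subsample gets starting values from its own regression rather than from the other group's.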
I hope that this helps.
-David
[email protected]