Note: Apologies if this mail comes out twice.
Dear Statalisters,
I am trying to estimate a count model with endogenous switching as
proposed by Terza (1998). It involves a two-step method of moments
estimator: the first stage is a probit and the second stage is
nonlinear least squares. I have coded everything except the
correction for the covariance matrix. To calculate that correction I have
to create an intermediate matrix of the following general form:
Y = A'*W*A
A is an n x k matrix and W is an n x n diagonal matrix with the
regression's squared errors on the diagonal and zeros elsewhere. Let's say
that the variables a1 a2 a3 form matrix A, and that the squared errors are
saved as variable res2. I have more than 20,000 observations in my dataset.
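Written out, Y = A'*W*A is the sum over observations of res2[i]*a_i*a_i',
where a_i is the ith row of A taken as a column vector, so in principle W
never has to be stored. A minimal sketch of that idea, assuming res2 is
non-negative (it holds squared errors); b1, b2, b3 and Yalt are names
introduced here only for illustration, and I have not run this on the full
data:

```stata
* Sketch: scale each column of A by sqrt(res2), so that B'B = A'*W*A
* b1 b2 b3 and Yalt are illustrative names, not part of my original code
generate double b1 = a1*sqrt(res2)
generate double b2 = a2*sqrt(res2)
generate double b3 = a3*sqrt(res2)
* Unweighted cross-product matrix of the scaled variables, no constant term
matrix accum Yalt = b1 b2 b3, noconstant
matrix list Yalt
```

The rows and columns of Yalt would be labelled b1 b2 b3 rather than a1 a2 a3.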
Since it is not possible to create a 20,000 x 20,000 matrix in Stata,
Michael Blasnik very kindly suggested using the weighting feature
of matrix accum to calculate matrix Y. Basically, he suggested:
.matrix accum Y = a1 a2 a3 [aw=res2], noc
To check that this solution is correct, I estimated my model and the
residuals, then dropped observations and kept only 200. Then, as Nick Cox
also kindly suggested, I calculated matrices A and W in the following way:
.mkmat a1 a2 a3, matrix(A)
.local n = _N
.matrix W = J(`n',`n',0)
.forvalues i = 1/`n' {
.    matrix W[`i',`i'] = res2[`i']
.}
.matrix mymat = A'*W*A
.matrix list mymat
symmetric mymat[3,3]
           a1         a2         a3
a1  62054.362
a2  60504.697  60504.697
a3  1004.4707  1004.4707  1004.4707
This matrix is what I want, but with large data it cannot be calculated
using Nick's suggestion. Now, using the weighting feature of matrix accum:
.matrix accum H = a1 a2 a3 [aw=res2], noc
(sum of wgt is 2.4974e+03)
(obs=200)
.matrix list H
symmetric H[3,3]
           a1         a2         a3
a1  4969.5487
a2  4845.4456  4845.4456
a3  80.441821  80.441821  80.441821
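Comparing the two listings, each element of H looks like the matching
element of mymat times about 0.0801, which is 200/2497.4, that is, the
number of observations over the reported sum of weights. That would fit
aweights being rescaled to sum to the number of observations, but this is
an assumption on my part. Under that assumption, a sketch of two ways to
undo the scaling (sw, nobs, f, Yr and Yi are illustrative names):

```stata
* Assumption: [aw=] weights are rescaled so that they sum to the number of observations
quietly summarize res2
* Sum of the raw weights and number of observations
scalar sw = r(sum)
scalar nobs = r(N)
scalar f = sw/nobs
matrix accum H = a1 a2 a3 [aw=res2], noconstant
* Multiply back by (sum of weights)/(number of observations)
matrix Yr = f*H
* Alternative: importance weights, which as far as I know are used as given
matrix accum Yi = a1 a2 a3 [iw=res2], noconstant
matrix list Yr
matrix list Yi
```

If the assumption holds, both Yr and Yi should reproduce mymat on the
200-observation subsample.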
Clearly, mymat and H are different. Jiang, Tao very kindly suggested an
alternative which would be:
.sca m=10000
.matrix accum Z1= a1 a2 a3 [fw=res2+m], noc
.matrix accum Z2= a1 a2 a3, noc
.matrix Z=Z1-m*Z2
However, since res2 is not an integer, frequency weights cannot be
used. I did what Jiang, Tao suggested using analytic weights:
.sca m=10000
.matrix accum Z1= a1 a2 a3 [aw=res2+m], noc
(sum of wgt is 2.0002e+07)
(obs=200)
.matrix accum Z2= a1 a2 a3, noc
(obs=200)
.matrix Z=Z1-m*Z2
.matrix list Z
symmetric Z[3,3]
            a1          a2          a3
a1  -4.670e+08
a2  -4.438e+08  -4.438e+08
a3    -4512856    -4512856    -4512856
which is also different from mymat. It seems, then, that none of the
suggested strategies yields the matrix that I need. Does anyone have
another idea?
Many thanks,
Alfonso Miranda
University of Warwick