Well, yes and no...
Yes, in the sense that this is an FGLS estimator.
No, in the sense that one has to -tsset- and what have you, and I was
interested in a problem with grouped data (but not panel data).
To be concrete: In a basic text, Powers and Xie, *Statistical Methods for
Categorical Data Analysis*, there is a simple table of six rates, for
three ages and two time periods:
     +----------------------------+
     |   y       n   A2   A3   P2 |
     |----------------------------|
  1. |  19    1073    0    0    0 |
  2. |  70    3084    1    0    0 |
  3. | 134   18520    0    1    0 |
  4. |  10     339    0    0    1 |
  5. |  23     967    1    0    1 |
     |----------------------------|
  6. |  69    4611    0    1    1 |
     +----------------------------+
Make the response variable ln_p = ln(y/n) (it is called lograte in the
command further down).
Make a weight w = n*p / (1 - p), where p = y/n.
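
For anyone following along, here is a minimal sketch of that setup in Stata. It assumes the six cells are typed in by hand from the table above; the name lograte is borrowed from the -glm- command, and p and w are just illustrative names. Since w = n*p/(1 - p) is algebraically y/(1 - p), listing w next to y shows how close the two are when the rates are small.

```stata
* Sketch only: enter the six cells from the table by hand
clear
input y n A2 A3 P2
 19  1073 0 0 0
 70  3084 1 0 0
134 18520 0 1 0
 10   339 0 0 1
 23   967 1 0 1
 69  4611 0 1 1
end

generate double p       = y/n           // observed rate in each cell
generate double lograte = ln(p)         // response: log of the rate
generate double w       = n*p/(1 - p)   // weight; equals y/(1 - p)

* Compare the weight w with the raw count y
list y n p lograte w, clean
```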
The matrix rendering of the FGLS estimator, and of the estimated standard
errors (see below), is quite straightforward and yields the results shown
in Table 2.3 in their text. You can also get the coefficients and the
correct standard errors "the old-fashioned way," which is to say by
rescaling all variables (multiplying each by sqrt(w)) and then
adjusting the standard errors by dividing through by the RMSE. But the
only way that I have found to have Stata do this in one "canned" swoop is:
. glm lograte A2 A3 P2 [fweight=y], scale(1)
Iteration 0:   log likelihood = -301.55676

Generalized linear models                          No. of obs      =       325
Optimization     : ML                              Residual df     =       321
                                                   Scale parameter =         1
Deviance         =  5.803478257                    (1/df) Deviance =  .0180794
Pearson          =  5.803478257                    (1/df) Pearson  =  .0180794

Variance function: V(u) = 1                        [Gaussian]
Link function    : g(u) = u                        [Identity]

                                                   AIC             =  1.880349
Log likelihood   = -301.5567624                    BIC             = -1850.804

------------------------------------------------------------------------------
             |                 OIM
     lograte |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          A2 |   .1362063   .2130081     0.64   0.523    -.2812819    .5536945
          A3 |   -.821337   .1985175    -4.14   0.000    -1.210424   -.4322497
          P2 |   .5366818   .1200295     4.47   0.000     .3014284    .7719353
       _cons |  -4.042851   .1902521   -21.25   0.000    -4.415739   -3.669964
------------------------------------------------------------------------------
(Standard errors scaled using dispersion equal to square root of 1)
which strikes me as kind of ugly since it is iterative and involves a
weight that only happens to resemble w because p is close to zero! (This
is the Stata analogue of the GENMOD commands that Powers has on his web
site for this example....)
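
The "old-fashioned" rescaling route mentioned above can be sketched roughly as follows. This is an illustration under stated assumptions, not a canned solution: the six cells are typed in by hand, and the names sw, ys, A2s, A3s, and P2s are made up for the example. On the rescaled data -regress- reports s^2*invsym(X'*W*X), so dividing e(V) by e(rmse)^2 leaves invsym(X'*W*X).

```stata
* Sketch only: WLS by rescaling, then undoing the RMSE scaling
clear
input y n A2 A3 P2
 19  1073 0 0 0
 70  3084 1 0 0
134 18520 0 1 0
 10   339 0 0 1
 23   967 1 0 1
 69  4611 0 1 1
end

generate double p       = y/n
generate double lograte = ln(p)
generate double w       = n*p/(1 - p)
generate double sw      = sqrt(w)        // also serves as the rescaled constant

* Multiply the response and every regressor by sqrt(w)
generate double ys = lograte*sw
foreach v of varlist A2 A3 P2 {
    generate double `v's = `v'*sw
}

* OLS on the rescaled data; sw stands in for the intercept
regress ys A2s A3s P2s sw, noconstant

* Divide the reported variances by RMSE^2 to get invsym(X'*W*X)
matrix V = e(V)/(e(rmse)^2)
matrix list V

* Coefficients with the rescaled standard errors
foreach v in A2s A3s P2s sw {
    display "`v'" _col(12) _b[`v'] _col(28) _se[`v']/e(rmse)
}
```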
As I say below, what I am looking for is a routine that does
var(b-gls) = invsym(X'*W*X)
without doing ML, or having to fake panel data, etc. If it doesn't exist,
it doesn't exist, and it is simple enough to write one....
--Herb
Herbert L. Smith
Professor of Sociology and
Director, Population Studies Center
230 McNeil Building
3718 Locust Walk CR
University of Pennsylvania
Philadelphia, PA 19104-6298
[email protected]
215.898.7768 (office)
215.898.2124 (fax)
On Tue, 16 Jan 2007, Clive Nicholas wrote:
> Herbert Smith wrote:
>
> > For a garden-variety, cross-sectional regression, an estimator of
> >
> > var(b)
> >
> > is
> >
> > var(b)=invsym(X'*W*X)
> >
> > where X is the design matrix and W is a diagonalized weight matrix.
> >
> > Is there a way in Stata to get the FGLS estimated var-cov in a single
> > command? By which I mean:
> >
> > -regress depvar indvars [pweight=w]-
> >
> > gives the GLS estimates for b
> >
> > b=invsym(X'*W*X)*(X'*W*y)
> >
> > but the standard errors are computed as though
> >
> > -regress depvar indvars [pweight=w], vce(robust)-
> >
> > and are close to the FGLS estimates, but are not the same....
>
> Isn't this satisfactory?
>
> . webuse grunfeld, clear
>
> . tsset company year
> panel variable: company (strongly balanced)
> time variable: year, 1935 to 1954
>
> . xtgls invest mvalue kstock time
>
> Cross-sectional time-series FGLS regression
>
> Coefficients: generalized least squares
> Panels: homoskedastic
> Correlation: no autocorrelation
>
> Estimated covariances      =         1          Number of obs      =       200
> Estimated autocorrelations =         0          Number of groups   =        10
> Estimated coefficients     =         4          Time periods       =        20
>                                                 Wald chi2(3)       =    867.82
> Log likelihood             = -1191.645          Prob > chi2        =    0.0000
>
> ----------------------------------------------------------------------------
>     invest |      Coef.   Std. Err.      z    P>|z|    [95% Conf. Interval]
> -----------+----------------------------------------------------------------
>     mvalue |   .1163783   .0059669    19.50   0.000    .1046834    .1280732
>     kstock |   .2213351   .0302499     7.32   0.000    .1620463    .2806239
>       time |   .7737904   1.377808     0.56   0.574   -1.926665    3.474245
>      _cons |  -49.14306   14.83261    -3.31   0.001   -78.21443   -20.07169
> ----------------------------------------------------------------------------
>
> . matrix list e(V)
>
> symmetric e(V)[4,4]
>             mvalue      kstock        time       _cons
> mvalue    .0000356
> kstock  -.00009563   .00091506
>   time   .00200231  -.02292234   1.8983561
>  _cons  -.03314052   .09155466  -15.771641   220.00619
>
> Or am I missing something? :)
>
> CLIVE NICHOLAS |t: 0(044)7903 397793
> Politics |e: [email protected]
> Newcastle University |http://www.ncl.ac.uk/geps
>
> Wherever you go and whatever you do, just remember this. No matter how
> many like you, admire you, love you or adore you, the number of people
> turning up to your funeral will be largely determined by local weather
> conditions.
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>