Statalist The Stata Listserver



Re: st: (Feasible) generalized least squares


From   Herb Smith <[email protected]>
To   [email protected]
Subject   Re: st: (Feasible) generalized least squares
Date   Tue, 16 Jan 2007 17:01:38 -0500 (EST)

Well, yes and no...

Yes, in the sense that, yes, this is an FGLS estimator

No, in the sense that one has to tsset and what have you, and I was
interested in a problem with grouped data (but not panel data).

To be concrete:  In a basic text, Powers and Xie, *Statistical Methods for
Categorical Data Analysis*, there is a simple table of six rates, for
three ages and two time periods

     +----------------------------+
     |   y       n   A2   A3   P2 |
     |----------------------------|
  1. |  19    1073    0    0    0 |
  2. |  70    3084    1    0    0 |
  3. | 134   18520    0    1    0 |
  4. |  10     339    0    0    1 |
  5. |  23     967    1    0    1 |
     |----------------------------|
  6. |  69    4611    0    1    1 |
     +----------------------------+

Make the response variable ln_p = ln(y/n) (called lograte in the command below).

Make a weight w = n*p / (1 - p)   where p = y/n
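
As a sketch of that setup, the six rows can be entered and the two variables built in a do-file like the one below. The names p, lograte, and w are chosen here for illustration; lograte is assumed to be the same variable as ln_p, since that is the name used in the glm call further down.

```stata
* Enter the six-row table shown above (illustrative do-file, not from the text)
clear
input y n A2 A3 P2
 19  1073 0 0 0
 70  3084 1 0 0
134 18520 0 1 0
 10   339 0 0 1
 23   967 1 0 1
 69  4611 0 1 1
end

generate double p       = y/n            // observed rate
generate double lograte = ln(p)          // response, ln(y/n)
generate double w       = n*p/(1 - p)    // weight as defined above

* Compare the weight with the raw count
list y w, noobs
```

Since w = n*p/(1 - p) = y/(1 - p), each weight is slightly larger than the corresponding count y whenever p is small.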

The matrix rendering of the FGLS estimator and of the estimated standard
errors (see below) is quite straightforward and yields the results shown
in Table 2.3 in their text. You can also get the coefficients and the
correct standard errors "the old-fashioned way," which is to say by
rescaling all variables (multiplying them by sqrt(w)) and then adjusting
the standard errors by dividing through by the RMSE. But the only way
that I have found to have Stata do this in one "canned" swoop is:

. glm  lograte  A2 A3 P2 [fweight=y], scale(1)

Iteration 0:   log likelihood = -301.55676

Generalized linear models                          No. of obs      =       325
Optimization     : ML                              Residual df     =       321
                                                   Scale parameter =         1
Deviance         =  5.803478257                    (1/df) Deviance =  .0180794
Pearson          =  5.803478257                    (1/df) Pearson  =  .0180794

Variance function: V(u) = 1                        [Gaussian]
Link function    : g(u) = u                        [Identity]

                                                   AIC             =  1.880349
Log likelihood   = -301.5567624                    BIC             = -1850.804

------------------------------------------------------------------------------
             |                 OIM
     lograte |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          A2 |   .1362063   .2130081     0.64   0.523    -.2812819    .5536945
          A3 |   -.821337   .1985175    -4.14   0.000    -1.210424   -.4322497
          P2 |   .5366818   .1200295     4.47   0.000     .3014284    .7719353
       _cons |  -4.042851   .1902521   -21.25   0.000    -4.415739   -3.669964
------------------------------------------------------------------------------
(Standard errors scaled using dispersion equal to square root of 1)

which strikes me as kind of ugly since it is iterative and involves a
weight that only happens to resemble w because p is close to zero!  (This
is the Stata analogue of the GENMOD commands that Powers has on his web
site for this example....)

As I say below, what I am looking for is a routine that does

var(b-gls) = invsym(X'*W*X)

without doing ML, or having to fake panel data, etc.  If it doesn't exist,
it doesn't exist, and it is simple enough to write one....
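
For what it is worth, a minimal sketch of such a routine might look like the following do-file. It assumes the six-row table above and uses illustrative names (p, lograte, w, sw, s_*, X, yv, wv, s2, V_resc). It is not taken from Powers and Xie or from any canned Stata command. The first half computes invsym(X'*W*X) directly in Mata; the second half repeats the calculation "the old-fashioned way" as a cross-check.

```stata
clear
input y n A2 A3 P2
 19  1073 0 0 0
 70  3084 1 0 0
134 18520 0 1 0
 10   339 0 0 1
 23   967 1 0 1
 69  4611 0 1 1
end
generate double p       = y/n
generate double lograte = ln(p)
generate double w       = n*p/(1 - p)

* Direct matrix version: b = invsym(X'WX)*(X'Wy), var(b) = invsym(X'WX)
mata:
X   = st_data(., ("A2", "A3", "P2")), J(st_nobs(), 1, 1)  // constant in last column
yv  = st_data(., "lograte")
wv  = st_data(., "w")
V   = invsym(cross(X, wv, X))       // cross(X, wv, X) is X'*diag(wv)*X
b   = V*cross(X, wv, yv)
(b, sqrt(diagonal(V)))              // coefficients and standard errors
end

* Cross-check: rescale everything (including the constant) by sqrt(w),
* run OLS without a constant, and divide e(V) by RMSE^2
generate double sw = sqrt(w)
foreach v of varlist lograte A2 A3 P2 {
    generate double s_`v' = `v'*sw
}
regress s_lograte s_A2 s_A3 s_P2 sw, noconstant
scalar s2 = e(rmse)^2
matrix V_resc = e(V)/s2
matrix list V_resc
```

The two routes should agree: the rescaled regression reports e(V) = RMSE^2 * invsym(X'*W*X), so dividing by RMSE^2 leaves the same matrix that Mata computes directly, with the columns in the same order (A2, A3, P2, constant).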

--Herb

Herbert L. Smith
Professor of Sociology and
Director, Population Studies Center
230 McNeil Building
3718 Locust Walk CR
University of Pennsylvania
Philadelphia, PA  19104-6298

[email protected]

215.898.7768 (office)
215.898.2124 (fax)

On Tue, 16 Jan 2007, Clive Nicholas wrote:

> Herbert Smith wrote:
>
> > For a garden-variety, cross-sectional regression, an estimator of
> >
> > var(b)
> >
> > is
> >
> > var(b)=invsym(X'*W*X)
> >
> > where X is the design matrix and W is a diagonalized weight matrix.
> >
> > Is there a way in Stata to get the FGLS estimated var-cov in a single
> > command?  By which I mean:
> >
> > -regress depvar indvars [pweight=w]-
> >
> > gives the GLS estimates for b
> >
> > b=invsym(X'*W*X)*(X'*W*y)
> >
> > but the standard errors are computed as though
> >
> > -regress depvar indvars [pweight=w], vce(robust)-
> >
> > and are close to the FGLS estimates, but are not the same....
>
> Isn't this satisfactory?
>
> . webuse grunfeld, clear
>
> . tsset company year
>        panel variable:  company (strongly balanced)
>         time variable:  year, 1935 to 1954
>
> . xtgls invest mvalue kstock time
>
> Cross-sectional time-series FGLS regression
>
> Coefficients:  generalized least squares
> Panels:        homoskedastic
> Correlation:   no autocorrelation
>
> Estimated covariances      =         1        Number of obs      =       200
> Estimated autocorrelations =         0        Number of groups   =        10
> Estimated coefficients     =         4        Time periods       =        20
>                                               Wald chi2(3)       =    867.82
> Log likelihood             = -1191.645        Prob > chi2        =    0.0000
>
> ----------------------------------------------------------------------------
>     invest |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -----------+----------------------------------------------------------------
>     mvalue |   .1163783   .0059669    19.50   0.000     .1046834    .1280732
>     kstock |   .2213351   .0302499     7.32   0.000     .1620463    .2806239
>       time |   .7737904   1.377808     0.56   0.574    -1.926665    3.474245
>      _cons |  -49.14306   14.83261    -3.31   0.001    -78.21443   -20.07169
> ----------------------------------------------------------------------------
>
> . matrix list e(V)
>
> symmetric e(V)[4,4]
>             mvalue      kstock        time       _cons
> mvalue    .0000356
> kstock  -.00009563   .00091506
>   time   .00200231  -.02292234   1.8983561
>  _cons  -.03314052   .09155466  -15.771641   220.00619
>
> Or am I missing something? :)
>
> CLIVE NICHOLAS        |t: 0(044)7903 397793
> Politics              |e: [email protected]
> Newcastle University  |http://www.ncl.ac.uk/geps
>
> Wherever you go and whatever you do, just remember this. No matter how
> many like you, admire you, love you or adore you, the number of people
> turning up to your funeral will be largely determined by local weather
> conditions.
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


