I wrote:
>> By definition, aweights are for cell means data, i.e. data which have been
>> collapsed through averaging, and pweights are for sampling weights.
and Mark Schaffer <M.E.Schaffer@hw.ac.uk> asks:
> Is this true by definition, strictly speaking? One reason for using
> aweights may be to do WLS. We might have a view on the form
> heteroskedasticity takes and use WLS to eliminate it. It might be caused by
> collapsing data through averaging, but there are other reasons it can arise.
Yes, this is true by definition. In the case of least squares, the variance
of the response for cell mean data is sigma^2/n_i (n_i being the aweight)
instead of sigma^2, so indeed by picking the right n_i's you can control for
all sorts of heteroskedasticity without having to think things out as cell
means.
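
To make the cell-means reading concrete, here is a minimal sketch in plain Python (not Stata, and not how -regress- is implemented). The data and the helper name wls are made up for illustration, and x is assumed constant within each cell. It checks that weighting each cell mean by its n_i reproduces the least-squares coefficients you would get from the uncollapsed observations.

```python
def wls(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b

# Hypothetical individual-level data: key is x (constant within a cell),
# value is the list of responses observed in that cell.
cells = {1.0: [2.1, 1.9, 2.3], 2.0: [3.8, 4.1], 3.0: [6.2, 5.9, 6.0, 6.3]}

# Ordinary least squares on the individual observations (all weights 1).
x_ind = [x for x, ys in cells.items() for _ in ys]
y_ind = [y for ys in cells.values() for y in ys]
a_ind, b_ind = wls(x_ind, y_ind, [1.0] * len(y_ind))

# Weighted least squares on the cell means, each weighted by its size n_i.
x_cell = list(cells)
y_cell = [sum(ys) / len(ys) for ys in cells.values()]
n_cell = [float(len(ys)) for ys in cells.values()]
a_cell, b_cell = wls(x_cell, y_cell, n_cell)

print(a_ind, b_ind)    # fit from individual data
print(a_cell, b_cell)  # fit from weighted cell means
```

The two fits agree up to rounding. Multiplying every n_i by the same constant leaves the coefficients unchanged, which is why thinking in terms of relative variances works just as well.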
Because of the ever-wonderful Taylor linearization, this argument carries
through to general maximum likelihood estimation and, in fact, I can't think
of an example where you would ever get in trouble if you just continued
thinking in terms of relative variances rather than cell means.
I think in terms of cell means mainly so that I can (a) respect the documented
definition and (b) not try to "aweight" a command that has no business being
aweighted, e.g. -streg-.
--Bobby
rgutierrez@stata.com