Mark Schaffer <[email protected]> noticed an odd behavior that -regress-
exhibits with non-integer-valued -iweight-s:
> Hi all. I have a question about the behavior of -regress- with
> iweights. It seems peculiar to me, and isn't documented anywhere I can
> find, and I wonder if anyone can see any logic to it.
>
> The behavior is the treatment of the sample size. Iweights are not
> normalized, and in principle the sample size N can take non-integer
> values.
>
> If I recall correctly, Stata allows only integer values of N to be
> posted with the obs() option of -ereturn-. If iweights generates a
> non-integer N, -regress- rounds down before posting. That makes sense
> to me.
>
> But what I don't understand is why -regress- seems to use the same
> rounded-down N for calculations of things like the var-cov matrix and
> the error variance. Wouldn't it make sense to use the more precise,
> unrounded N when calculating them?
>
> In practice, iweights are, I think, used mostly by programmers, so this
> is probably relevant only to those who are using -regress- to produce
> these things.
>
> An example using the toy auto dataset is below.
>
> I am using Stata 11 but this behavior of -regress- appears in earlier
> versions of Stata as well.
Thanks to Mark for pointing this out. We've found where in -regress- this
rounded value is unintentionally being applied and will have it fixed in the
next executable update.

--Jeff
[email protected]
> *********** EXAMPLE **************
>
> . sysuse auto, clear
> (1978 Automobile Data)
>
> . qui reg mpg weight [iweight=headroom]
>
> . predict double e, resid
>
> . gen double e2=e^2
>
> . qui sum e2 [iweight=headroom], meanonly
>
> . di r(sum_w)
> 221.5
>
> . di e(N)
> 221
>
> . scalar s2unrounded=r(sum)/(221.5-2)
>
> . scalar s2rounded =r(sum)/(221-2)
>
> . di sqrt(s2unrounded)
> 3.2509661
>
> . di sqrt(s2rounded)
> 3.2546751
>
> . di e(rmse)
> 3.2546751
>
> .
> . mat accum XX = weight [iweight=headroom]
> (obs=221.5)
>
> . mat XXinv=syminv(XX)
>
> . mat Vunrounded = XXinv*s2unrounded
>
> . mat list Vunrounded
>
> symmetric Vunrounded[2,2]
>             weight       _cons
> weight   8.109e-08
>  _cons  -.00025336   .83926154
>
> . mat Vrounded = XXinv*s2rounded
>
> . mat list Vrounded
>
> symmetric Vrounded[2,2]
>             weight       _cons
> weight   8.128e-08
>  _cons  -.00025394   .84117766
>
> . mat list e(V)
>
> symmetric e(V)[2,2]
>             weight       _cons
> weight   8.128e-08
>  _cons  -.00025394   .84117766
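
The arithmetic in the example above can be summarized as follows. This is only a sketch in notation introduced here, not taken from the -regress- documentation: w_i is the iweight (headroom), e_i is the residual, N is the sum of the weights (221.5, as reported by r(sum_w)), and k = 2 is the number of estimated coefficients (weight and _cons).

```latex
s^2_{\text{unrounded}} = \frac{\sum_i w_i e_i^2}{N - k}
                       = \frac{\sum_i w_i e_i^2}{221.5 - 2},
\qquad
s^2_{\text{rounded}} = \frac{\sum_i w_i e_i^2}{\lfloor N \rfloor - k}
                     = \frac{\sum_i w_i e_i^2}{221 - 2}

\frac{s^2_{\text{rounded}}}{s^2_{\text{unrounded}}} = \frac{219.5}{219} \approx 1.002283,
\qquad
\frac{s_{\text{rounded}}}{s_{\text{unrounded}}} = \sqrt{\frac{219.5}{219}} \approx 1.001141
```

Because both matrices in the example are XXinv multiplied by a scalar, every element of Vrounded is larger than the matching element of Vunrounded by the same factor of about 1.002283. That agrees with the listings: .83926154 times 1.002283 is about .841178, and 3.2509661 times 1.001141 is about 3.25468, the values reported in e(V) and e(rmse).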