Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Spss's aggregate vs stata's collapse.
From: Ulrich Kohler <[email protected]>
To: [email protected]
Subject: Re: st: Spss's aggregate vs stata's collapse.
Date: Wed, 13 Apr 2011 11:52:17 +0200
On Wednesday, 13.04.2011, at 10:05 +0100, Brendan Halpin wrote:
> On Wed, Apr 13 2011, Amadou DIALLO wrote:
>
> > Hi,
> > I am translating spss commands to stata and have trouble with different outputs.
> > Results are different after "aggregate" for ceb (children ever born).
>
> If the two files are exactly identical at the collapse/aggregate point
> (and that's worth verifying, as the generate/if and compute/if commands
> will not necessarily be identical in the case of missing values on the
> right hand side), I would guess it has to do with SPSS and Stata
> handling weights differently in this situation. You could test this by
> re-running the manipulation without weights. Note the
> "negative/zero/missing weight" warning you get with SPSS.
>
> If that is the problem, one possible workaround is to handle the weights
> yourself: multiply ceb by the weight variable, and sum the result in the
> -collapse- statement.
This reminds me of something. SPSS might itself be inconsistent in how
it handles weights. In Stata, doing something like this
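. sysuse auto, clear
. * d1: coefficient on foreign from the weighted regression
. regress price foreign [aweight=gear_ratio]
. scalar d1 = _b[foreign]
. * d2: difference between the weighted group means from -summarize-
. summarize price if !foreign [aweight=gear_ratio]
. scalar d2 = r(mean)
. summarize price if foreign [aweight=gear_ratio]
. scalar d2 = r(mean)-d2
. * d3: difference between the weighted group means from -collapse-
. collapse price [aweight=gear_ratio], by(foreign)
. scalar d3 = price[2]-price[1]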
yields (almost) identical results for the scalars d1, d2, and d3:
. scalar list d1 d2 d3
d1 = 478.0205
d2 = 478.0205
d3 = 478.02051
The last time I checked (some years ago), this was not the case in SPSS.
With non-integer weights, SPSS yielded different results for d1 than
for d2 and d3. If I remember correctly, SPSS-aggregate and
SPSS-descriptives seemed to use some kind of rounding for non-integer
weights, although I did not find out what kind of rounding they used.
I wonder whether this observation is still valid.
Uli
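
For reference, the manual-weight workaround quoted above (multiply the variable by the weight and sum it in -collapse-) might look roughly like the sketch below. It is only an illustration on the auto data, with price standing in for ceb and gear_ratio standing in for the weight variable; the names wprice, w, and wmean are made up for this example. Dividing the weighted sum by the sum of the weights should reproduce the group means that -collapse- gives with aweights.

```stata
sysuse auto, clear

* Reference result: group means from -collapse- with analytic weights
preserve
collapse (mean) price [aweight=gear_ratio], by(foreign)
list foreign price
restore

* Manual version: weight times value, and the weight itself
* (wprice, w, and wmean are hypothetical names for this sketch)
generate double wprice = price * gear_ratio
* Count the weight only where the value is not missing
generate double w = gear_ratio if !missing(price)

* Sum both within groups, then divide to get the weighted mean
collapse (sum) wprice w, by(foreign)
generate double wmean = wprice / w
list foreign wmean
```

Because no weights are passed to -collapse- in the manual version, any rounding or special handling of non-integer weights by the software is taken out of the picture, which makes the same arithmetic easy to mirror on the SPSS side for comparison.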