Dear Statalisters,
can anybody give me a clue as to the array of weighting options in Stata? I
have an important project where I would really like to make headway...
My dataset is 2.4 GB as a .csv file. When I convert it to SPSS format, it
comes to 2.7 GB, while the equivalent Stata dataset takes up 5.5 GB (!).
Anyway, I usually pick out the variables of interest beforehand because
Stata is unable to open the entire dataset. The first column of the data
contains sampling weights. The data provider ships a PDF with descriptives
for the marginal distributions of the variables in the population, so I know
the true values.
Now here lies the rub: when I weight -summarize- with analytic weights, the
approximately correct mean and standard deviation pop out. When I let Stata
estimate the mean with the -mean- command, with analytic weights attached in
the same fashion, I get widely differing results for the point estimate of
the mean, far from the true values. In SPSS, I simply go to -weight cases-
and everything comes out correct.
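
To make the comparison concrete, here is a minimal sketch of the weighted
mean I would expect both commands to report. It is written in Python rather
than Stata, the numbers are made up, and the helper names weighted_mean and
rescale_to_n are my own, not anything from Stata or SPSS. It also checks that
rescaling the weights to sum to the number of observations (which, as far as
I understand, is what analytic weights do) leaves the point estimate alone.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


def rescale_to_n(weights):
    """Rescale weights so that they sum to the number of observations."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]


# Made-up illustration data, not taken from my dataset.
x = [1.0, 2.0, 3.0]
w = [1.0, 1.0, 2.0]

raw = weighted_mean(x, w)                      # weights used as shipped
rescaled = weighted_mean(x, rescale_to_n(w))   # weights rescaled to sum to n
print(raw, rescaled)
```

If that reasoning is right, the type or scale of the weights should only
affect the standard errors, not the point estimate of the mean.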
Do I have to -svyset- the data? When I try to apply frequency weights,
Stata complains that non-integer weights are not allowed, while SPSS does
not seem to object to them. Why is it that SPSS needs one command at the
beginning of the session, while Stata has a (differing) tab dedicated to
weighting for every single command?
Martin Weiss