I ran -svymean- on some data and got the following result:
. svymean congo;
Survey mean estimation
pweight:  wgt                                  Number of obs    =       937
Strata:   dis                                  Number of strata =         4
PSU:      <observations>                       Number of PSUs   =       937
                                               Population size  = 72787.999
------------------------------------------------------------------------------
    Mean |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
---------+--------------------------------------------------------------------
   congo |   .0052457     .003009    -.0006595     .011151    1.624072
------------------------------------------------------------------------------
I wanted to try the jackknife method of computing the variance, so I
used -svrmean- by Nick Winter:
. survwgt create jkn, strata(dis) psu(ssn) weight(wgt) stem(jkwgt_);
Generating replicate weights...........................
[snip a bunch of output about the replicate weights]
. svrset set meth jkn;
. svrset set pw wgt;
. svrset set rw "jkwgt_1-jkwgt_937";
. svrmean congo;
Survey mean estimation, replication (jkn) variance method
Analysis weight:      wgt                      Number of obs      =       937
Replicate weights:    jkwgt_1...               Population size    = 72787.999
Number of replicates: 937                      Degrees of freedom =       933
------------------------------------------------------------------------------
    Mean |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
---------+--------------------------------------------------------------------
   congo |   .0052457     .003009    -.0006595     .011151    1.624072
------------------------------------------------------------------------------
It seems strange to me that this is exactly the same result that
-svymean- gave. Is this possible? My PSUs are the individual
observations; would -svrmean- give the same result under such a survey
design?
On a related note, I have been looking for a way to do -ci congo,
binomial wilson- using data with unequal sampling weights. I have not
been able to find much of anything, even in software other than Stata.
In Gauss I ran a bootstrap that treats the observations as iid within
each stratum, but not across strata, and took the empirical confidence
interval from the bootstrap distribution (which is reassuringly close
to the confidence interval from the unweighted -ci- results). Judging
from the software packages I have seen, the full bootstrap is not used
very often to compute standard errors for survey data. Why is that? Do
more complicated survey designs make a full bootstrap intractable?
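
A minimal sketch of the kind of stratified bootstrap described above,
written in Python rather than Gauss. It is an illustration under stated
assumptions, not the code that produced the results in this message:
the function name, its arguments, the number of replications, and the
seed are all hypothetical. It resamples observations with replacement
within each stratum, recomputes the weighted mean each time, and takes
the 2.5th and 97.5th empirical percentiles as the interval.

```python
import random


def stratified_bootstrap_mean(y, w, strata, reps=1000, seed=12345):
    """Bootstrap a weighted mean, resampling observations within strata.

    y, w, strata are equal-length sequences (outcome, sampling weight,
    stratum label). Returns (point estimate, bootstrap SE, (lower, upper)).
    """
    rng = random.Random(seed)

    # Group observation indices by stratum.
    groups = {}
    for i, s in enumerate(strata):
        groups.setdefault(s, []).append(i)

    def wmean(idx):
        # Weighted mean over the given observation indices.
        num = sum(w[i] * y[i] for i in idx)
        den = sum(w[i] for i in idx)
        return num / den

    point = wmean(range(len(y)))

    draws = []
    for _ in range(reps):
        sample = []
        for idx in groups.values():
            # Draw n_h observations with replacement from stratum h,
            # independently of the other strata.
            sample.extend(rng.choices(idx, k=len(idx)))
        draws.append(wmean(sample))

    draws.sort()
    mean_b = sum(draws) / reps
    se = (sum((d - mean_b) ** 2 for d in draws) / (reps - 1)) ** 0.5

    # Simple empirical percentile interval (approximate order statistics).
    lower = draws[int(0.025 * reps)]
    upper = draws[min(reps - 1, int(0.975 * reps))]
    return point, se, (lower, upper)
```

With a rare binary outcome such as congo, many bootstrap samples contain
no positive observations at all, which is why the lower limit of an
empirical interval of this kind can sit exactly at zero.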
For comparison, below are the output of -ci, binomial wilson- and my
bootstrapped estimates.
. ci congo, binomial wilson
                                                         ------ Wilson ------
    Variable |       Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
       congo |       937    .0032017    .0018455        .0010895    .0093708
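
As a check on what -ci- is computing, the unweighted Wilson interval
can be reproduced by hand. The short Python sketch below applies the
standard Wilson score formula. The count of 3 positive observations is
an assumption inferred from the reported mean (3/937 = .0032017), and
the function name is hypothetical. Sampling weights are ignored
entirely, as they are in -ci-.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (unweighted)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Assumed inputs: 3 positives out of 937 observations.
lo, hi = wilson_interval(3, 937)
print(lo, hi)  # should be close to the -ci- limits .0010895 and .0093708
```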
My bootstrap results:
       Mean    Std Error      95% confidence interval
  0.0052548    0.0030023    0.0000000    0.0122401
-Tim