Tell us more about the problem. As far as I know, sampling networks is
a heck of a mess. To simulate anything, you would need an almost
perfect understanding of how your network was formed. Any simulation is
only as good as the model used to create the data for that
simulation. And the survey bootstrap is a moderately crazy topic. No
textbook covers it sufficiently well, unfortunately. Certainly not
Efron's book; there is a chapter in the Shao & Tu (1995) Springer book,
but it only covers material up to the late 1980s. The newer (and
important!) methods are only out there in the papers.
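The Yates-Grundy-Sen variance estimator for the Horvitz-Thompson
estimator is
sum over sampled pairs j<k of (p[j]*p[k] - p[j,k])/p[j,k] * (y[j]/p[j] - y[k]/p[k])^2
where p[j] is the selection probability of unit j and p[j,k] is the
joint selection probability of units j and k.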
If you can write Mata functions to compute the unit and pair
probabilities of selection, you can have pretty compact code for
your variance estimator. You won't have to store the huge matrices of
pairwise selection probabilities, which likely have a well-structured
form if you are talking about cluster sampling.
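
To make the "functions instead of matrices" idea concrete, here is a minimal sketch. It is written in Python only so that it is easy to run and check; the same double loop carries over to a Mata function. It uses the sample form of the estimator, in which each pair's term is divided by p[j,k]. The names ygs_variance, unit_prob and pair_prob are made up for illustration, and the simple-random-sampling probabilities at the bottom are only a stand-in for whatever your network design implies.

```python
from itertools import combinations


def ygs_variance(y, unit_prob, pair_prob):
    """Yates-Grundy-Sen variance estimate for the Horvitz-Thompson total.

    y         : values observed on the sampled units (e.g. cluster totals)
    unit_prob : hypothetical function j -> selection probability of unit j
    pair_prob : hypothetical function (j, k) -> joint selection probability
    """
    m = len(y)
    p = [unit_prob(j) for j in range(m)]       # one number per sampled unit
    v = 0.0
    for j, k in combinations(range(m), 2):     # every sampled pair with j < k
        pjk = pair_prob(j, k)                  # computed on demand, never stored
        v += (p[j] * p[k] - pjk) / pjk * (y[j] / p[j] - y[k] / p[k]) ** 2
    return v


# Stand-in design (an assumption, not the network design from the question):
# simple random sampling without replacement of n = 4 units out of N = 10.
N, n = 10, 4
y = [1.0, 2.0, 3.0, 4.0]
v = ygs_variance(
    y,
    unit_prob=lambda j: n / N,
    pair_prob=lambda j, k: n * (n - 1) / (N * (N - 1)),
)
print(v)
```

For this stand-in design the result should agree, up to rounding, with the textbook formula N^2 * (1 - n/N) * s^2 / n, which is 25 for these numbers. Only the vector of unit probabilities is kept in memory; each pair probability is used once and thrown away.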
On Wed, Mar 11, 2009 at 3:49 AM, Inna Becher wrote:
I can calculate the probability for each network (=cluster) to be
included in the sample. I can also calculate the probability for each
pair of selected clusters to be included in the sample.
My problem is: these probabilities have to be saved somewhere. Should it
be a matrix? I have not yet worked with matrices to calculate variances.
The version of the H-T estimator I need is not implemented in svy-.
I wrote an ado for the sampling design that I need and implemented the
H-T estimator for the mean, but not for the variance.