Jian Zhang has a -svyset- question:
> The sample was obtained as follows. I sampled the population by
> stratifying it first, and then I randomly selected several clusters for
> each stratum. Within each cluster, I then randomly selected several
> subclusters, and then for each subcluster, I randomly selected a certain
> number of observations. For this sampling plan, how do I set up the
> sampling plan using command svyset in STATA?
> Would it be:
> svyset [pweight = pwt], fpc(fpc) psu(cluster) strata(strata)?
I'll assume Stata 9, since this is the first release where -svyset- has a
syntax to deal with multiple stages of clustered sampling.
Let's make up some variable names to represent survey design characteristics:
pwt - sampling weights
strata1 - stage 1 strata
su1 - stage 1 sampling units (PSU)
fpc1 - stage 1 finite population correction
strata2 - stage 2 strata
su2 - stage 2 sampling units (SSU)
fpc2 - stage 2 finite population correction
... you get the idea
Given Jian's description above, the -svyset- command should be as follows:
svyset su1 [pw=pwt], strata(strata1) fpc(fpc1) ///
|| su2, fpc(fpc2) || _n, fpc(fpc3)
(Note: '///' tells Stata that the command continues on the next line; this
works in do-files and ado-files, not at the interactive prompt.)
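As a rough sketch of what typically comes next, assuming the made-up variables
above actually exist in the dataset and that 'y' is a hypothetical analysis
variable (it is not part of Jian's description): typing -svyset- by itself
redisplays the current settings, and estimation commands take the -svy:-
prefix.

```stata
* Sketch only: pwt, strata1, su1, su2, fpc1-fpc3 are the made-up names
* from above; y is a hypothetical outcome variable.
svyset su1 [pw=pwt], strata(strata1) fpc(fpc1) ///
        || su2, fpc(fpc2) || _n, fpc(fpc3)

svyset              // redisplay the survey design settings just declared
svy: mean y         // design-based estimate of the mean of y
```

The '//' comments, like '///', are meant for do-files; drop them if you type
the commands interactively.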
> I know this is for stratified TWO-stage cluster sampling plan, which is "
> sample the population by stratifying it first, and then randomly select
> several clusters for each stratum. Within each cluster, then randomly
> select a certain number of observations."
>
> Would the svyset for multiple-stage cluster sample (more than 2 stages)
> with stratification be same as TWO-stage cluster sampling with
> stratification?
Actually, Jian's original -svyset- command:
> svyset [pweight = pwt], fpc(fpc) psu(cluster) strata(strata)
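should not be used with a design that samples within PSUs, because an -fpc()-
was specified for the first stage but nothing was mentioned about the later
stages. With only a first-stage -fpc()-, the variance estimates would not
account for the sampling within PSUs.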
Prior to Stata 9, -svyset- only allowed you to specify the first-stage design
variables, and we recommended that you omit -fpc()- if the design involved
sampling within PSUs. In Stata 9, you can specify the design variables for
each stage, provided you have them, using '||' to separate the stages.
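If the later-stage design variables are not available, one option in the
spirit of the earlier recommendation is to declare only the first stage and
leave out -fpc()-. A minimal sketch in Stata 9 syntax, reusing the made-up
variable names from above:

```stata
* Sketch only: first-stage design variables, no fpc(), because sampling
* continued within the PSUs and the later stages are not described.
svyset su1 [pw=pwt], strata(strata1)
```

Without an -fpc()-, the PSUs are treated as if they were sampled with
replacement, which is usually taken to be the conservative choice for the
standard errors.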
> More complicated is that what if I do cluster sampling first, and then
> stratify each cluster, and then do cluster sampling again, what would
> the command svy for setting up this sampling plan be?
In this case Jian stratified in the second stage, so Jian should have a
variable like 'strata2' instead of 'strata1':
svyset su1 [pw=pwt], fpc(fpc1) ///
|| su2, strata(strata2) fpc(fpc2) || _n, fpc(fpc3)
> Similarly, if I stratify the population first, and then do the cluster,
> and then do stratification again and then do cluster sampling again, what
> would the svyset command be for this sampling plan?
svyset su1 [pw=pwt], strata(strata1) fpc(fpc1) ///
|| su2, strata(strata2) fpc(fpc2) || _n, fpc(fpc3)
> To generalize the question, if we change the order of cluster sampling
> and stratification sampling when sampling the population, would the
> svyset command be different?
Yes.
In Stata 9, you need to know which stage each stratum variable belongs to,
because -strata()- is specified as part of that stage. See -[SVY] svyset-
for more examples of how to -svyset- multistage designs.
Prior to Stata 9, you would only use the -strata()- option if your design had
stratification in the first stage.
--Jeff