Re: st: svyset question
Jeff, I recommend that you consult a sampling statistician. You have:
CODE PROBLEMS
1. With only one district selected within each stratum, Stata has no
replicates with which to compute a SE. You have several choices:
(a) Combine neighboring regions to get three strata. Be warned that
the t-multiplier for standard errors will be 3.18 on 6 - 3 = 3
degrees of freedom (compared to 1.96 for a normal approximation).
(b) If regions do not differ much (but how can you tell with only
one obs per region?), then omit the stratum specification and get
6 - 1 = 5 degrees of freedom for the highest level of sampling. The
t-multiplier in this case is 2.57, about 20% less. Personally I
would go with 90% intervals, so that the t-multiplier is 2.35 for
3 strata.
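The multipliers quoted in item 1 can be checked directly in Stata. This
is only a sketch, assuming the design degrees of freedom are the number
of sampled districts minus the number of strata; invttail() and
invnormal() are built-in functions.

```stata
* 95% two-sided, 6 districts in 3 collapsed strata (3 df): about 3.18
display invttail(3, 0.025)
* 95% two-sided, 6 districts, no strata (5 df): about 2.57
display invttail(5, 0.025)
* 90% two-sided, 3 collapsed strata (3 df): about 2.35
display invttail(3, 0.05)
* 95% two-sided, normal approximation: about 1.96
display invnormal(0.975)
```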
2. The weight variable must be specified in a [pweight= ] expression
before the comma.
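A minimal sketch of where the weight goes, combined with the collapsed
strata from item 1. The variable names (region, district, wt, stratum3)
are hypothetical, and the pairing rule assumes region is coded 1-6 with
neighboring regions given adjacent codes, which may not match the real
data.

```stata
* Hypothetical names: region (coded 1-6), district (PSU), wt (sampling weight)
* Pair regions 1-2, 3-4, 5-6 into three collapsed strata
generate byte stratum3 = ceil(region/2)
* The weight expression comes before the comma, the options after it
svyset district [pweight = wt], strata(stratum3)
* List strata and PSU counts to confirm each stratum now holds 2 districts
svydescribe
```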
CONCEPTUAL PROBLEMS
3. It looks like there are 3-4 stages of sampling:
(a) district within strata;
(b) village/ward within district;
(c) hh within village/ward;
(d) person within hh.
Only the first 3 would be specified if only one person is selected
from each hh.
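In svyset syntax the stages are separated by "||". The following is a
sketch only, for a do-file (the "///" continuation does not work
interactively). All variable names are hypothetical, stratum3 is assumed
to already hold collapsed strata, and hhid is an assumed household
identifier. As far as I know, stages after the first affect the variance
estimate only when an fpc() is given for the earlier stages; without one,
Stata treats stage 1 as sampled with replacement.

```stata
* Hypothetical names: district, samplingunit, hhid, wt, stratum3
* Stage 1: district within stratum; stage 2: village/ward; stage 3: household
svyset district [pweight = wt], strata(stratum3) ///
    || samplingunit                              ///
    || hhid
```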
4. It looks like there was a second level of stratification (urban
vs rural) someplace in the design. Your description of "PPS" sampling
makes no sense unless this is true.
ANALYTIC ISSUES
5. In a survey of this size, especially with no replication at the
first stage, some post-stratification or sample raking would be
standard practice.
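For post-stratification, svyset accepts poststrata() and postweight()
options. A hedged sketch for a do-file: agesex (a post-stratum
identifier) and popsize (the known population count for each
post-stratum) are hypothetical variables you would have to build from
census or similar totals, and stratum3 is assumed to hold collapsed
strata.

```stata
* Hypothetical names: district, wt, stratum3, agesex, popsize
* poststrata() names the post-strata; postweight() gives their population sizes
svyset district [pweight = wt], strata(stratum3) ///
    poststrata(agesex) postweight(popsize)
```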
Regards,
Steven
On Apr 20, 2007, at 10:24 AM, Jeff Edmeades wrote:
Hi all,
I am working with survey data with the following design (as described
to me): "Respondents were selected through stratified cluster sampling,
with one district randomly selected from six geographic regions. Ten
sampling units (villages in rural areas and urban wards in urban areas)
were then selected in each district through probability proportional to
size sampling, with purposeful oversampling of urban areas to ensure
sufficient cases for the analysis of rural-urban differences. A
household listing was conducted in each of the sampling units, from
which 40 eligible individuals were randomly selected."
My understanding of the correct syntax for this (following from the SVY
manual) is:
svyset district, strata(region) fpc(ndistricts) || samplingunit
[sampling weight for urban oversample] fpc(nsamplingunits)
Is this correct??
Many thanks,