Hi;
I have two surveys, 1997 and 2007, both with complex survey designs
and both supplied with jackknife replicate weights for analysis. The
designs were more or less equivalent: nationally representative,
independent random samples drawn without replacement.
A possible issue is that the 1997 survey included persons aged 18
years or older (no upper limit), whereas the 2007 survey included
persons aged 16–85 years inclusive. Combining the two surveys may
therefore be problematic, because the replicate weights are calibrated
to different population structures. If that is the explanation for
what follows, I am not sure there is any way around it.
I want to age-sex standardize the 1997 data to the 2007 population
structure. To do so, I first limited the two samples to respondents
aged 20-69 years inclusive, to get like-with-like comparisons. I then
created a new variable indicating the age-sex strata (10 five-year age
bands x 2 sexes = 20 strata; variable name, st_agesex). I then
estimated the 2007 population size for each of these 20 age-sex strata
(variable name, st_wt).
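For concreteness, a minimal sketch of how the two variables could be
built. The names age, sex and ageband are hypothetical (they are not
given in this post), and taking st_wt as the sum of the 2007 final
weights within each stratum is an assumption, only one possible way of
estimating the 2007 stratum population sizes:

```stata
* Hypothetical inputs: age (in years) and sex (1 = male, 2 = female).
* mhsfinwt and nsmhwb are the variables named in this post.
keep if inrange(age, 20, 69)

* Ten five-year age bands: 20-24 -> 1, 25-29 -> 2, ..., 65-69 -> 10
generate byte ageband = floor((age - 20) / 5) + 1

* Twenty age-sex strata: 1-10 for sex == 1, 11-20 for sex == 2
generate byte st_agesex = (sex - 1) * 10 + ageband

* Assumed definition: weighted 2007 total in each stratum,
* copied onto every observation (1997 and 2007) in that stratum
egen double st_wt = total(mhsfinwt * (nsmhwb == 1)), by(st_agesex)
```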
I then made three passes through the data.
The first specifies the complex survey design:
. quietly svyset [pweight=mhsfinwt], jkrweight(wpm*, multiplier(1)) ///
vce(jackknife) mse
Stata output reports: Number of strata = 1; Population size = 1.39
million. All this makes sense.
I then estimate the proportion who consulted a psychologist for mental
health problems in the last 12 months (mhpsyo12, coded 0/1) in each of
the two surveys (nsmhwb: 0 = 1997; 1 = 2007):
. svy jackknife, nodots : proportion mhpsyo12, over(nsmhwb)
These give estimates for the unadjusted populations:
1997 – 17.0% (SE 1.3%);
2007 – 37.3% (SE 2.5%).
All good so far.
The second pass through the data re-declares the complex survey
design, this time specifying poststrata and poststratification weights:
. quietly svyset [pweight=mhsfinwt], poststrata(st_agesex) postweight(st_wt) ///
jkrweight(wpm*, multiplier(1)) vce(jackknife) mse
Stata output reports Number of strata = 1 and N. of std strata = 20;
both of these make sense. Stata also reports a Population size of 1. I
don’t understand the Population size figure: why isn’t it 1.39
million, as above?
I then estimate the proportion who consulted a psychologist for mental
health problems in the last 12 months, adjusted for the age-sex
strata:
. svy jackknife, nodots : proportion mhpsyo12, over(nsmhwb)
These give estimates for the ‘adjusted’ age-sex standardized populations:
1997 – 15.7% (SE 1.4%);
2007 – 37.1% (SE 2.6%).
I expected age-sex adjustment to reduce the 1997 estimate, and it
does. But I do not understand why the 2007 ‘adjusted’ estimate differs
at all from the 2007 ‘unadjusted’ estimate, given that the standard
population is the 2007 one.
Finally, to try to unravel this, I reverted to the original complex
survey design declaration:
. quietly svyset [pweight=mhsfinwt], jkrweight(wpm*, multiplier(1)) ///
vce(jackknife) mse
Stata output reports Number of strata = 1; N. of poststrata = 20, and
a Population size of 1.39 million. All of these make sense.
I then tried to ‘directly standardize’ the proportions:
. svy jackknife, nodots : proportion mhpsyo12, stdize(st_agesex) ///
stdweight(st_wt) over(nsmhwb)
These give estimates for the ‘adjusted’ age-sex standardized populations:
1997 – 15.6% (SE 1.4%);
2007 – 37.8% (SE 3.0%).
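As a reference point for what ‘directly standardized’ means here, a
minimal sketch of the usual arithmetic on made-up numbers: the
standardized proportion is the sum of st_wt x stratum proportion
divided by the sum of st_wt, whereas the crude proportion uses each
survey's own weighted stratum totals instead. Everything in the toy
data is hypothetical (two strata rather than 20, and the names survey,
stratum, p_h, w_h, p_crude and p_std); it is not claimed to reproduce
Stata's internal computation or the figures above.

```stata
* Toy data, one row per survey x stratum (all values are made up):
*   p_h   = weighted proportion within the stratum
*   w_h   = sum of the survey's own sampling weights in the stratum
*   st_wt = standard (reference) population size for the stratum
clear
input survey stratum p_h w_h st_wt
0 1 0.10 600 500
0 2 0.30 400 500
1 1 0.20 500 500
1 2 0.50 500 500
end

* Crude proportion: stratum proportions weighted by the survey's own totals
egen double c_num = total(w_h * p_h), by(survey)
egen double c_den = total(w_h), by(survey)
generate double p_crude = c_num / c_den

* Directly standardized proportion: same p_h, weighted by st_wt instead
egen double s_num = total(st_wt * p_h), by(survey)
egen double s_den = total(st_wt), by(survey)
generate double p_std = s_num / s_den

tabstat p_crude p_std, by(survey) statistics(mean) nototal
```

In this toy example survey 1 has w_h equal to st_wt in every stratum,
so its crude and standardized proportions coincide (about 0.35), while
survey 0 moves from about 0.18 to about 0.20. By the same arithmetic, a
reference survey's estimate is left unchanged only when st_wt is
proportional to that survey's own weighted stratum totals.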
So, I’m confused. I understand why the 1997 estimates change with
age-sex adjustment (although I am a bit confused about why the results
differ between poststratification and direct standardization); I have
more trouble understanding why the 2007 estimates change.
I’m struggling to understand all of this and welcome any ideas! It’s
likely I do not properly understand the poststratification process.
I’m using Stata 11.0, born 21 October 2009.
I would be most grateful for any thoughts or ideas!
Thanks;
Philip.