Friends,
I'm comparing tobacco survey data for two years. I want to standardize
estimates for the later year (2005) to the age-sex-ethnicity distribution
of the earlier year (2001). When I run the estimates separately using
-svy: prop-, I get the expected results: the standardized and
unstandardized estimates are essentially the same for 2001 but differ
for 2005. However, when I run the contrast directly using -svy: tab-,
the standardized estimate for 2001 (the reference year for the
standard weights) changes by about 0.1 percentage point, while the 2005
estimate barely changes. Can someone explain why this happens? Here's
the output.
thx...arnold
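For reference, here is a sketch of the usual direct-standardization estimator. The notation is illustrative and not taken from the Stata output. It assumes that g indexes the G = 48 standard strata in stdstrata, that W_g is the 2001 standard weight stored in stdwgt01, and that \hat{p}_{g,t} is the survey-weighted smoking prevalence in stratum g in year t.

```latex
\hat{p}^{\,\mathrm{std}}_{t} \;=\; \sum_{g=1}^{G} \pi_g \, \hat{p}_{g,t},
\qquad
\pi_g \;=\; \frac{W_g}{\sum_{h=1}^{G} W_h}
```

Under these assumptions, if the weighted stratum shares in year t already equal \pi_g, the standardized and crude estimates coincide. That is why essentially no change is expected for 2001 when the standard is the 2001 distribution.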
/* estimated smoking prevalence is same for 2001 data using
standardization to 2001 sex-age-ethnicity*/
. svy: prop nowsmoke if year==2001, stdize(stdstrata) stdweight(stdwgt01)
(running proportion on estimation sample)
Survey: Proportion estimation
Number of strata = 8 Number of obs = 12964
Number of PSUs = 11219 Population size = 3.1e+06
N. of std strata = 48 Design df = 11211
             |             Linearized        Binomial Wald
             | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
nowsmoke     |
           0 |   .8026479    .0043218     .7941765    .8111193
           1 |   .1973521    .0043218     .1888807    .2058235
. svy: prop nowsmoke if year==2001,
(running proportion on estimation sample)
Survey: Proportion estimation
Number of strata = 8 Number of obs = 12964
Number of PSUs = 11219 Population size = 3.1e+06
Design df = 11211
             |             Linearized        Binomial Wald
             | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
nowsmoke     |
           0 |   .8026311    .0043816     .7940423    .8112199
           1 |   .1973689    .0043816     .1887801    .2059577
/* estimated smoking prevalence is different (as expected) for 2005 data
using standardization to 2001 sex-age-ethnicity*/
. svy: prop nowsmoke if year==2005, stdize(stdstrata) stdweight(stdwgt01)
(running proportion on estimation sample)
Survey: Proportion estimation
Number of strata = 65 Number of obs = 12086
Number of PSUs = 11440 Population size = 3.4e+06
N. of std strata = 48 Design df = 11375
             |             Linearized        Binomial Wald
             | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
nowsmoke     |
           0 |   .8254121    .0050684     .8154771    .8353471
           1 |   .1745879    .0050684     .1646529    .1845229
. svy: prop nowsmoke if year==2005,
(running proportion on estimation sample)
Survey: Proportion estimation
Number of strata = 65 Number of obs = 12086
Number of PSUs = 11440 Population size = 3.4e+06
Design df = 11375
             |             Linearized        Binomial Wald
             | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
nowsmoke     |
           0 |   .8267205    .0050774     .8167679    .8366731
           1 |   .1732795    .0050774     .1633269    .1832321
/* using tab, standardized difference appears in 2001 data, not 2005
data*/
. svy: tab nowsmoke year, col
(running tabulate on estimation sample)
Number of strata = 73 Number of obs = 25050
Number of PSUs = 22659 Population size = 6480351.9
Design df = 22586
          |          year
 nowsmoke |   2001    2005   Total
----------+-----------------------
        0 |  .8026   .8267   .8152
        1 |  .1974   .1733   .1848
    Total |      1       1       1
Key: column proportions
Pearson:
Uncorrected chi2(1) = 24.0803
Design-based F(1, 22586) = 12.6389 P = 0.0004
. svy: tab nowsmoke year, col stdize(stdstrata) stdweight(stdwgt01)
(running tabulate on estimation sample)
Number of strata = 73 Number of obs = 25050
Number of PSUs = 22659 Population size = 6480351.9
N. of std strata = 48 Design df = 22586
          |          year
 nowsmoke |   2001    2005   Total
----------+-----------------------
        0 |  .8017   .8263   .8144
        1 |  .1983   .1737   .1856
    Total |      1       1       1
Key: column proportions
Pearson:
Uncorrected chi2(1) = 24.9490
Design-based F(1, 22586) = 13.2600 P = 0.0003
Arnold H. Levinson, Ph.D.
Director, Tobacco Program Evaluation Group (TPEG)
Assistant Professor of Preventive Medicine
University of Colorado at Denver & Health Sciences Center
AMC Cancer Research Center
1600 Pierce Street, Lakewood CO 80214
[email protected]
303-239-3402