Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Poststratification weighting, subpop, and missing values
From
<[email protected]>
To
<[email protected]>
Subject
st: Poststratification weighting, subpop, and missing values
Date
Wed, 26 Sep 2012 09:25:55 -0400
Hi everyone,
I'm analyzing the results of a survey and have run into some strange results when combining poststratification weights with the subpop() option. In the example below, we are simply totaling 2011 sales. The flag variable identifies the subpopulation we're interested in. When the subpopulation is limited by flag alone, the command calculates the total over 2,624 PSUs. When we further limit it to observations with flag equal to one and non-missing total sales, it calculates over 2,639 PSUs. In the second command, Stata seems to be including the 15 observations with missing values in its calculations. The total for the more restricted subpopulation is also lower, which is not what we expect from removing missing values, given how that should affect the background calculation of the adjusted weights.
Could someone shed some light on why this is happening?
Thank you,
Ricky Ubee
. svyset uniqueID [pweight=weight_prop], strata(strata2) singleunit(scaled) poststrata(type2) postweight(postwt4) fpc(N)
pweight: weight_prop
VCE: linearized
Poststrata: type2
Postweight: postwt4
Single unit: scaled
Strata 1: strata2
SU 1: uniqueID
FPC 1: N
. svy, subpop(if flag==1): total TOT_SALES_11
(running total on estimation sample)
Survey: Total estimation

Number of strata =      26          Number of obs    =    2624
Number of PSUs   =    2624          Population size  =   23794
N. of poststrata =      16          Subpop. no. obs  =     652
                                    Subpop. size     = 5245.94
                                    Design df        =    2598

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
TOT_SALES_11 |   2.20e+12   2.77e+11      1.65e+12    2.74e+12
--------------------------------------------------------------
Note: 2 strata omitted because they contain no subpopulation
      members.
. svy, subpop(if flag==1 & TOT_SALES_11~=.): total TOT_SALES_11
(running total on estimation sample)
Survey: Total estimation

Number of strata =      26          Number of obs    =    2639
Number of PSUs   =    2639          Population size  =   23794
N. of poststrata =      16          Subpop. no. obs  =     652
                                    Subpop. size     = 5222.38
                                    Design df        =    2613

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
TOT_SALES_11 |   2.18e+12   2.76e+11      1.64e+12    2.72e+12
--------------------------------------------------------------
Note: 2 strata omitted because they contain no subpopulation
      members.
. count if flag==1 & TOT_SALES_11==.
15
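
The two outputs above report different numbers of observations (2,624 versus 2,639) for the same data, which suggests that the two commands are working from different estimation samples. One possible way to check this, offered only as a sketch and not as a confirmed explanation, is to save e(sample) after each command and compare the two markers. The variable names in_a and in_b are hypothetical and assume no existing variables have those names.

```stata
* Sketch: compare the estimation samples used by the two commands.
* in_a and in_b are illustrative names, not variables from the survey data.

* First command: subpopulation defined by flag only
quietly svy, subpop(if flag==1): total TOT_SALES_11
generate byte in_a = e(sample)

* Second command: subpopulation also excludes missing sales
quietly svy, subpop(if flag==1 & TOT_SALES_11~=.): total TOT_SALES_11
generate byte in_b = e(sample)

* Cross-tabulate the two markers to see where the samples differ
tabulate in_a in_b

* Count observations used only by the second command that are
* flagged and have missing sales
count if in_b & !in_a & flag==1 & missing(TOT_SALES_11)
```

If the last count returns 15, the extra observations in the second run would be the flagged cases with missing sales, which would be consistent with the difference of 15 in the reported number of observations.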