Jason,
There is a very important difference in using the subpop option of -svy- vs.
using an -if- statement to drop cases.
A good link to get you started:
http://www.cpc.unc.edu/services/computer/presentations/statatutorial/example33.html
I also recommend Stata's Survey Data manual.
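To make the distinction concrete, here is a rough sketch of the two approaches. It assumes the variable names from the question quoted below (pps, sex, age); the indicator name young is made up for illustration. It is not meant as the definitive setup for this dataset.

```stata
* Sketch only: pps, sex and age come from the quoted question;
* the indicator name "young" is hypothetical.
svyset [pweight = pps], strata(sex)

* Approach 1: keep every observation and flag the subpopulation.
* All cases still contribute to the variance calculation.
generate byte young = inrange(age, 16, 24)
svy, subpop(young): tabulate sex

* Approach 2: drop the other cases first.
* The design information carried by the dropped cases is lost.
keep if inrange(age, 16, 24)
svy: tabulate sex
```

With the original weights left untouched, both approaches should give the same point estimates, but the standard errors can differ, because only subpop() treats the subpopulation size as something estimated from the full sample. Recomputing the weights after dropping cases is a further change on top of that and can move the point estimates as well.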
HTH,
Carter
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Jason Ferris
Sent: Sunday, March 18, 2007 6:47 PM
To: [email protected]
Subject: st: Weights in survey design
I have a large dataset with weights calculated as PPS based on household
size, stratified by sex. Respondents are aged 16-64.
I am interested in looking at data only from those aged 16-24. I can
use the subpop option "subpop(if age>=16 & age<=24)" for all the
commands. But I am wondering if I can drop all other cases (keep if
age>=16 & age<=24) and then 'reset' my weights based only on those aged
16-24.
In the original form (with all data) I have the following summary data:
(note the survey design is quite a simple one)
. svyset
      pweight: pps
          VCE: linearized
     Strata 1: sex
         SU 1: <observations>
        FPC 1: <zero>
. svy: tab sex
(running tabulate on estimation sample)
Number of strata =     2          Number of obs    =   8664
Number of PSUs   =  8664          Population size  =   8664
                                  Design df        =   8662
-----------------------
      sex | proportions
----------+------------
   female |       .5046
     male |       .4954
          |
    Total |           1
-----------------------
  Key:  proportions  =  cell proportions
If I select the subgroup (age 16-24):
. svy,subpop(if age<=24): tab sex
(running tabulate on estimation sample)
Number of strata =     2          Number of obs      =      8664
Number of PSUs   =  8664          Population size    =      8664
                                  Subpop. no. of obs =       999
                                  Subpop. size       = 1438.7586
                                  Design df          =      8662
-----------------------
      sex | proportions
----------+------------
   female |       .4599
     male |       .5401
          |
    Total |           1
-----------------------
  Key:  proportions  =  cell proportions
When I reset my weights with data only representing those 16-24 years of
age (i.e., as if this were the way I originally designed my study) I get the
following results:
. svy: tab sex
(running tabulate on estimation sample)
Number of strata =     2          Number of obs    =    999
Number of PSUs   =   999          Population size  =    999
                                  Design df        =    997
-----------------------
      sex | proportions
----------+------------
   female |       .4655
     male |       .5345
          |
    Total |           1
-----------------------
  Key:  proportions  =  cell proportions
As can be seen, there is now a difference in the proportions between
using subpop and resetting my weights. Is this a problem?
Jason
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/