All,
I am running a survey regression with the subpop() option; see below.
svyset psuscid [pweight = gswgt1], strata(region)
svy, subpop(allsp): regress esteem white3 hisp child_sex
However, the number of observations in the output does not match the
total number of cases in my dataset. I have 18924 cases in the original
dataset, but the output reports only 18768 observations.
The subpopulation number of observations does appear correct
(10224).
Survey: Linear regression

Number of strata =       4          Number of obs      =     18768
Number of PSUs   =     132          Population size    =  22000302
                                    Subpop. no. of obs =     10224
                                    Subpop. size       =  12582072
                                    Design df          =       128
                                    F(   3,    126)    =     47.41
                                    Prob > F           =    0.0000
                                    R-squared          =    0.0331

In another output, not only is the number of observations incorrect
(should be 10244), but the number of PSUs is also lower.
svyset psuscid [pweight = gswgt1], strata(region)
svy, subpop(if bhsp == 1): regress esteem racebh

Number of strata =       4          Number of obs      =      6973
Number of PSUs   =     126          Population size    = 5955176.3
                                    Subpop. no. of obs =      4017
                                    Subpop. size       = 3346350.8
                                    Design df          =       122
                                    F(   1,    122)    =     33.36
                                    Prob > F           =    0.0000
                                    R-squared          =    0.0253

Some of the variables in my regression have missing values. Is Stata
dropping these cases from the original number of observations? My
subpop() specification does exclude cases with missing data.
If Stata is dropping observations from my original dataset because of
incomplete data, is the survey design information from those
observations still used in the calculation of the standard errors?
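
A minimal sketch of how the gap might be checked, using the variable names from the first model above. It assumes (this is not confirmed anywhere in this post) that the difference of 156 cases (18924 - 18768) comes from missing values in either the model variables or the survey design variables. Comparing these counts with the reported number of observations should show which, if either, accounts for it.

```stata
* Count cases with a missing value on any variable in the first model
count if missing(esteem, white3, hisp, child_sex)

* Count cases with a missing value on any survey design variable
* (assumption: weight, stratum, or PSU may be missing for some cases)
count if missing(gswgt1, region, psuscid)

* After estimation, e(sample) should mark the observations actually used
svy, subpop(allsp): regress esteem white3 hisp child_sex
count if e(sample)
```

If the last count equals the "Number of obs" in the output, the cases outside e(sample) are the ones that were excluded from the estimation.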
In every example I have found of Stata output from survey regression
with the subpop() option, the number of observations matches the total
number of cases in the dataset.
Thanks,
Heather