All,
I recently posted a question about losing observations and PSUs when
using svy: regress with the subpop() option. Jeff Pitblado recommended
that I update my copy of Stata to correct the problem:
"Heather should check that her Stata is fully up-to-date. On 02apr2008, we
posted an ado-file update that fixed a problem similar to what Heather is
describing above. Here is the corresponding entry from -help whatsnew-:
    5.  svy's linearized variance estimator was marking out observations
        that had missing values in the independent variables for
        observations outside the subpopulation.  This affects the
        estimated variance values when the primary sampling units were
        the individual observations and could decrease the design degrees
        of freedom.  Both of these effects are very slight and inversely
        related to the sample size.  This has been fixed.
Note that, prior to this update, entire PSU's can be dropped if each
observation within the PSU contains a missing value in one of the
variables in the model fit. With an updated Stata, only observations
containing missing values within the subpop are dropped."
The update did in fact fix the problem with svy: regress. However, I am
still having a similar problem when I run svy: mean:
    svyset psuscid [pweight = gswgt1], strata(region)
    svy, subpop(bhgsp): mean efficacy esteem child_sex age meduc new_inc2 ///
        dm_support dm_college m_indep MDCOLLEG md_spprt MDACHIEV, over(racebh)
Survey: Mean estimation

Number of strata =       4          Number of obs    =    4657
Number of PSUs   =     121          Population size  = 3921354
                                    Subpop. no. obs  =    2126
                                    Subpop. size     = 1736680
                                    Design df        =     117
If Stata is dropping observations from my original dataset because of
incomplete data, is the survey design information from those
observations still retained when the standard errors are calculated? Is
there a way to fix this, or another way to get means while using the
survey commands with subpop()?
Thanks,
Heather