Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Bootstrapping in svy with vce(linearized)
From
Stas Kolenikov <[email protected]>
To
[email protected]
Subject
Re: st: Bootstrapping in svy with vce(linearized)
Date
Thu, 12 Apr 2012 13:58:57 -0500
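-if- should never be a part of an -svy- command: it drops observations
(and possibly whole PSUs) from the design, which is what leaves you with
single-PSU strata. Incorporate the condition into the -subpop- option:
svy, subpop(if s_gaby_diab_1318 & homa_1318<4 & htn==0): regress egfr abdobes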
See also http://stata-journal.com/article.html?article=st0153
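An equivalent way to write this, as a minimal sketch: build an indicator
variable first and pass it to -subpop-. The name insub is hypothetical, and
the sketch assumes s_gaby_diab_1318 is coded 0/1; the -svyset- line is the
one from the original question.

```stata
* Declare the design as in the original question.
svyset psu [pweight=wtsaf], strata(strata) vce(linearized) singleunit(missing)

* Flag the subpopulation (insub is a hypothetical name; assumes 0/1 coding).
* Observations with missing homa_1318 or htn get 0: missing compares as
* larger than any number, so homa_1318<4 and htn==0 are both false for it.
generate byte insub = s_gaby_diab_1318 == 1 & homa_1318 < 4 & htn == 0

* Observations outside the subpopulation stay in the design, so they
* still contribute to the variance computation.
svy, subpop(insub): regress egfr abdobes
```

With an explicit indicator it is easy to -tabulate insub- and check how many
observations fall in the subpopulation before fitting the model.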
The bootstrap with complex survey data is also complex, and the
existing Stata -bootstrap- command is not the right tool. See
http://stata-journal.com/article.html?article=st0187.
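One possible workflow with bootstrap replicate weights is sketched below. It
assumes the user-written -bsweights- command is installed (try -findit
bsweights-) and a Stata version whose -svyset- accepts vce(bootstrap) with
bsrweight(). The option names are given from memory of the package help and
should be checked against -help bsweights-; bw is a hypothetical prefix for
the replicate-weight variables.

```stata
* Declare the design first; bsweights reads the svyset settings.
svyset psu [pweight=wtsaf], strata(strata)

* Create 500 sets of replicate weights, resampling n_h - 1 PSUs per
* stratum (n(-1)). Every stratum needs at least two PSUs for this.
bsweights bw, reps(500) n(-1) seed(89)

* Re-declare the design so that svy uses the replicate weights.
svyset psu [pweight=wtsaf], strata(strata) vce(bootstrap) bsrweight(bw*)

* Keep the condition in subpop(), not in -if-.
svy, subpop(if s_gaby_diab_1318 & homa_1318<4 & htn==0): regress egfr abdobes
```

The point of the design is that whole PSUs are resampled within strata, which
a plain -bootstrap- call on individual observations does not do.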
Note also that you can hardly ever get anything unbiased. The only unbiased
estimators in practical use are the sample mean and the regression
coefficients for i.i.d. data, and sometimes the total estimator for
complex survey data. Everything else is a nonlinear statistic of the
underlying random variables, and hence is biased. In particular, the
standard errors are almost always biased downward (unless you deliberately
use methods that inflate the standard errors, e.g. by ignoring
finite population corrections and strata when specifying complex survey
designs).
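One standard illustration of the downward bias, under the assumption (not
made in the post) that the variance estimator \hat v is itself unbiased for
the true variance V: the standard error is the square root of \hat v, the
square root is concave, and Jensen's inequality gives

```latex
\[
  \mathrm{E}\bigl[\widehat{\mathrm{se}}\bigr]
  = \mathrm{E}\bigl[\sqrt{\hat v}\,\bigr]
  \le \sqrt{\mathrm{E}[\hat v]}
  = \sqrt{V},
\]
```

with strict inequality whenever \hat v is not constant. This is only one
source of bias; it says nothing about cases where \hat v is itself biased.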
On Thu, Apr 12, 2012 at 1:18 PM, Rini Rao <[email protected]> wrote:
> Hello fellow Stata folks,
>
> I am using the NHANES data with the svy command and vce(linearized).
> For my analysis, I have created subgroups based on certain parameters
> for complete data, e.g. (s_white_1840).
>
> For certain subgroups, I have ended up with very small sample sizes
> (e.g. n=353) and in some cases I only get a model coefficient
> estimate without accompanying standard errors.
> e.g.
>
> . svyset psu [pweight=wtsaf], strata(strata) vce(linearized) singleunit(missing)
>
>
> . xi: svy, sub(s_gaby_diab_1318): regress egfr abdobes if homa_1318<4 & htn==0
> (running regress on estimation sample)
>
> Survey: Linear regression
>
> Number of strata   =        74                  Number of obs      =      1544
> Number of PSUs     =       148                  Population size    = 6726762.5
>                                                 Subpop. no. of obs =      1544
>                                                 Subpop. size       = 6726762.5
>                                                 Design df          =        74
>                                                 F(   0,     74)    =         .
>                                                 Prob > F           =         .
>                                                 R-squared          =    0.0029
>
>
> ------------------------------------------------------------------------------
>              |             Linearized
>         egfr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>      abdobes |   3.147383          .        .       .            .           .
>        _cons |   92.33331          .        .       .            .           .
> ------------------------------------------------------------------------------
> Note: missing standard errors because of stratum with single sampling unit.
>
>
> Now, how can I obtain unbiased standard errors and confidence intervals
> for this estimate? Is it acceptable to run a bootstrap using this
> command? (I get reasonable output from it, but I don't know whether it
> makes sense to use it.)
>
> bootstrap _b _se if homa_1318<4 & fabg<100 & htn==0, reps(1000)
> strata(s_gaby_diab_1318) seed(89) : regress egfr abdobes
--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.