Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: subpop() and over() with survey mean estimation using SDR weights
From: [email protected] (Jeff Pitblado, StataCorp LP)
To: [email protected]
Subject: Re: st: subpop() and over() with survey mean estimation using SDR weights
Date: Mon, 30 Jan 2012 16:06:44 -0600
Sam Schulhofer-Wohl <[email protected]> is getting a syntax error from
the -svy:- prefix when using SDR weights with both -subpop()- and -mean, over()-:
> I get a syntax error message, r(198), when I try to estimate a survey
> mean using both the subpop() and over() options and SDR weights.
>
> I can't find any indication in the manual that this combination of
> options and weights is not allowed. Also, for other kinds of weights,
> there is no error. Some examples are below.
>
> Has anyone run into this problem before? If so, any suggestions on how
> to solve it? (Obviously, I can make a new variable that combines my
> over() and subpop() categories, but in the particular analysis I'm
> using, it would be more convenient to be able to use over() and
> subpop() simultaneously.)
We are able to reproduce the syntax error and believe that this is a bug in
-svy sdr-.
We hope to have this fixed in a future update to Stata 12.
In the meantime, Sam can trick the -svy bootstrap- prefix into producing the
SDR standard errors by -svyset-ting the SDR weights as -bsrweight()- instead
and specifying option -bsn(4)-.
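For readers wondering why -bsn(4)- does the job, here is a sketch of the reasoning. It is written from memory of the [SVY] variance estimation entry, so the notation is an assumption and covers only the scalar case with no finite population correction. By default, both estimators sum squared deviations of the replicate estimates from their mean and differ only in the multiplier:

```latex
\widehat{V}_{\mathrm{SDR}}(\hat\theta)
  = \frac{4}{R}\sum_{r=1}^{R}\bigl(\hat\theta_{(r)}-\bar\theta\bigr)^{2},
\qquad
\widehat{V}_{\mathrm{boot}}(\hat\theta)
  = \frac{b}{R}\sum_{r=1}^{R}\bigl(\hat\theta_{(r)}-\bar\theta\bigr)^{2}
```

Here R is the number of replicate-weight variables (80 in Sam's data), \hat\theta_{(r)} is the estimate computed with the r-th replicate weight, \bar\theta is the mean of those estimates, and b is the value given in -bsn()-. With b = 4 the two expressions coincide, so the bootstrap machinery should return the SDR standard errors.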
Assuming that the SDR weight variables are the only ones matching 'pwgtp?*'
(that is, 'pwgtp' followed by at least one more character), Sam's
-svyset- command would be
. svyset [pw=pwgtp], bsrweight(pwgtp?*) bsn(4) vce(bootstrap)
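Put together with Sam's example, the workaround might look like the sketch below. It has not been run here. The -svyset, clear- step is an added precaution so that the dataset's stored SDR settings cannot linger, and the comparison against Sam's -over()--only output is a suggested sanity check rather than a verified result.

```stata
* Sketch of the workaround applied to Sam's example (not run here)
webuse ss07ptx, clear

* Drop the stored SDR design, then declare the same replicate weights
* as bootstrap weights; bsn(4) is meant to mimic the SDR multiplier
svyset, clear
svyset [pw=pwgtp], bsrweight(pwgtp?*) bsn(4) vce(bootstrap)

* Sanity check: the standard errors should agree with the SDR results
* Sam posted for this command (.0470986 for Male, .0386393 for Female)
svy: mean agep, over(sex)

* The combination that produced r(198) under -svy sdr-
set seed 12345
gen mysubpop=(uniform()<.5)
svy, subpop(mysubpop): mean agep, over(sex)
```

If the check on the -over()--only command agrees, the same settings can be used for the combined -subpop()- and -over()- call until the bug is fixed.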
--Jeff
[email protected]
This is Sam's example:
> *example of the error using SDR weights
>
> . webuse ss07ptx, clear
>
> . svyset
>
> pweight: pwgtp
> VCE: sdr
> MSE: off
> sdrweight: pwgtp1 pwgtp2 pwgtp3 pwgtp4 pwgtp5 pwgtp6 pwgtp7 pwgtp8 pwgtp9
> pwgtp10 pwgtp11 pwgtp12 pwgtp13 pwgtp14 pwgtp15 pwgtp16 pwgtp17
> pwgtp18 pwgtp19 pwgtp20 pwgtp21 pwgtp22 pwgtp23 pwgtp24 pwgtp25
> pwgtp26 pwgtp27 pwgtp28 pwgtp29 pwgtp30 pwgtp31 pwgtp32 pwgtp33
> pwgtp34 pwgtp35 pwgtp36 pwgtp37 pwgtp38 pwgtp39 pwgtp40 pwgtp41
> pwgtp42 pwgtp43 pwgtp44 pwgtp45 pwgtp46 pwgtp47 pwgtp48 pwgtp49
> pwgtp50 pwgtp51 pwgtp52 pwgtp53 pwgtp54 pwgtp55 pwgtp56 pwgtp57
> pwgtp58 pwgtp59 pwgtp60 pwgtp61 pwgtp62 pwgtp63 pwgtp64 pwgtp65
> pwgtp66 pwgtp67 pwgtp68 pwgtp69 pwgtp70 pwgtp71 pwgtp72 pwgtp73
> pwgtp74 pwgtp75 pwgtp76 pwgtp77 pwgtp78 pwgtp79 pwgtp80
> Single unit: missing
> Strata 1: <one>
> SU 1: <observations>
> FPC 1: <zero>
>
> . set seed 12345
>
> . gen mysubpop=(uniform()<.5)
>
> . svy, subpop(mysubpop): mean agep, over(sex)
> (running mean on estimation sample)
>
> SDR replications (80)
> ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
> .................................................. 50
> ..............................
> invalid syntax
> r(198);
>
> *no error if I use only over()
>
> . svy: mean agep, over(sex)
> (running mean on estimation sample)
>
> SDR replications (80)
> ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
> .................................................. 50
> ..............................
>
> Survey: Mean estimation           Number of obs    =      230817
>                                   Population size  =    23904380
>                                   Replications     =          80
>
>          Male: sex = Male
>        Female: sex = Female
>
> --------------------------------------------------------------
>              |                 SDR
>         Over |       Mean   Std. Err.     [95% Conf. Interval]
> -------------+------------------------------------------------
> agep         |
>         Male |   33.24486   .0470986      33.15255    33.33717
>       Female |   35.23908   .0386393      35.16335    35.31481
> --------------------------------------------------------------
>
>
> *no error if I use only subpop()
>
> . svy, subpop(mysubpop): mean agep
> (running mean on estimation sample)
>
> SDR replications (80)
> ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
> .................................................. 50
> ..............................
>
> Survey: Mean estimation           Number of obs    =      230817
>                                   Population size  =    23904380
>                                   Subpop. no. obs  =      115334
>                                   Subpop. size     =    11954384
>                                   Replications     =          80
>
> --------------------------------------------------------------
>              |                 SDR
>              |       Mean   Std. Err.     [95% Conf. Interval]
> -------------+------------------------------------------------
>         agep |   34.17763   .0666841      34.04693    34.30833
> --------------------------------------------------------------
>
>
> *no error when using subpop() and over() with pweights
> *(example from page 62 of [SVY])
>
> . use http://www.stata-press.com/data/r12/nmihs, clear
>
> . svyset
>
> pweight: finwgt
> VCE: linearized
> Single unit: missing
> Strata 1: stratan
> SU 1: <observations>
> FPC 1: <zero>
>
> . generate nonblack = (race == 0) if !missing(race)
>
> . svy, subpop(nonblack): mean birthwgt, over(marital age20)
> (running mean on estimation sample)
>
> Survey: Mean estimation
>
> Number of strata =       3          Number of obs    =      4724
> Number of PSUs   =    4724          Population size  =   3230403
>                                     Subpop. no. obs  =      4724
>                                     Subpop. size     =   3230403
>                                     Design df        =      4721
>
> Over: marital age20
>     _subpop_1: single age20+
>     _subpop_2: single age<20
>     _subpop_3: married age20+
>     _subpop_4: married age<20
>
> --------------------------------------------------------------
>              |             Linearized
>         Over |       Mean   Std. Err.     [95% Conf. Interval]
> -------------+------------------------------------------------
> birthwgt     |
>    _subpop_1 |   3312.012    24.2869      3264.398    3359.625
>    _subpop_2 |   3244.709   36.85934      3172.448    3316.971
>    _subpop_3 |   3434.923   8.674633      3417.916    3451.929
>    _subpop_4 |   3287.301   34.15988      3220.332    3354.271
> --------------------------------------------------------------
> Note: 3 strata omitted because they contain no subpopulation
>       members.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/