Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: st: Stata coding to SAS


From   Steve Samuels <[email protected]>
To   [email protected]
Subject   Re: st: st: Stata coding to SAS
Date   Wed, 23 Jan 2013 09:28:24 -0500

Reading the -help- for -svyset-, or my previous post, would have told you what the postweight() variable should be and how to convert the Stata variable to a formal probability weight for SAS. And, as the error message you reported separately indicated, the postweight() variable must have the same value for everyone in the same poststratum, so giving cases a constant value is incorrect. Another issue is that SAS is seeing 5 more observations than Stata; you will have to track this one down.
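
A minimal Stata sketch of these checks, offered as an illustration rather than a tested solution. It assumes a hypothetical variable popsize that holds the population total N_h of each poststratum, and it takes n_h to be the number of observations in the data for that poststratum. The names popsize, n_h and sas_wt are invented here and appear nowhere in the original posts.

```stata
* Assumption: popsize (hypothetical name) holds the population total N_h
* of each poststratum; poststrata is the poststratum identifier.

* 1. The postweight() variable must be constant within a poststratum.
*    -assert- stops with an error if any poststratum holds two values.
bysort poststrata (popsize): assert popsize[1] == popsize[_N] if !missing(poststrata)

* 2. n_h = number of sample observations in each poststratum.
bysort poststrata: generate long n_h = _N

* 3. Candidate probability weight for the SAS WEIGHT statement: N_h / n_h.
generate double sas_wt = popsize / n_h

* 4. Observations with a missing design variable are one place to look
*    when the two packages report different numbers of observations.
count if missing(poststrata)
count if missing(popsize)
```

For example, in a poststratum with N_h = 200 and n_h = 50, each observation would get a weight of 4, and the 50 weights would sum to 200. If observations are later dropped from the analysis for other reasons, n_h computed this way may not match the count the estimation command actually uses.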

But you have other problems:  Re-weighting by deprivation will lead to biased estimates of associations with deprivation or with related factors.  Moreover, the point of matching as you did it is to make control distributions resemble case distributions, not vice-versa. 

As you don't really have a probability sample, use of the subpop() option is unnecessary.

A good sampling reference is Sharon Lohr: Sampling: Design and Analysis, 2009, Cengage.

I'm curious: Where are you working or studying?

Steve

 
On Jan 21, 2013, at 11:36 PM, K C Wong wrote:

Thanks Steven and Stas for your kind reply.
And my apology for my previous uninformative post.

The study is a multi-ethnic, age- and ethnicity-matched population
case-control study.
Because of the low response rates and differential non-response by
deprivation quintile, a weight was calculated for each stratum of
ethnicity (say European, Asian and Indian) x deprivation (5 categories)
by dividing the expected deprivation distribution of each ethnic group
by the observed deprivation distribution in the controls from our
study. The expected distributions were estimated to be xx%, .....,
xx% for European and xx%,...., xx% for both Asian and Indian. I have a
total of 4342 respondents.

The Stata weighting coding is done as below:
. gen aweight=1 if case==0
. replace aweight = 0.881 if ethnicity==0 & deprivation==0 & case==0
. replace aweight = 0.813 if ethnicity==0 & deprivation==1 & case==0
. replace aweight = 0.962 if ethnicity==0 & deprivation==2 & case==0
. replace aweight = 1.105 if ethnicity==0 & deprivation==3 & case==0
. replace aweight = 1.667 if ethnicity==0 & deprivation==4 & case==0
. replace aweight = 0.881 if ethnicity==0 & deprivation==.  & case==0

. replace aweight = 0.187 if ethnicity==1 & deprivation==0 & case==0
. replace aweight = 0.190 if ethnicity==1 & deprivation==1 & case==0
. replace aweight = 0.565 if ethnicity==1 & deprivation==2 & case==0
. replace aweight = 0.833 if ethnicity==1 & deprivation==3 & case==0
. replace aweight = 2.044 if ethnicity==1 & deprivation==4 & case==0

. replace aweight = 0.435 if ethnicity==2 & deprivation==0 & case==0
. replace aweight = 0.323 if ethnicity==2 & deprivation==1 & case==0
. replace aweight = 0.926 if ethnicity==2 & deprivation==2 & case==0
. replace aweight = 0.948 if ethnicity==2 & deprivation==3 & case==0
. replace aweight = 1.236 if ethnicity==2 & deprivation==4 & case==0
. replace aweight = 1.236 if ethnicity==2 & deprivation==.  & case==0

. replace aweight = 1 if case==1 /* since the weighting is only for controls */

. gen poststrata=0

. recode poststrata 0=1 if ethnicity==0 & nzdepgrp==0 & case==0
. recode poststrata 0=2 if ethnicity==0 & nzdepgrp==1 & case==0
. recode poststrata 0=3 if ethnicity==0 & nzdepgrp==2 & case==0
. recode poststrata 0=4 if ethnicity==0 & nzdepgrp==3 & case==0
. recode poststrata 0=5 if ethnicity==0 & nzdepgrp==4 & case==0
. recode poststrata 0=.  if ethnicity==0 & nzdepgrp==.  & case==0

. recode poststrata 0=6   if ethnicity==1 & nzdepgrp==0 & case==0
. recode poststrata 0=7   if ethnicity==1 & nzdepgrp==1 & case==0
. recode poststrata 0=8   if ethnicity==1 & nzdepgrp==2 & case==0
. recode poststrata 0=9   if ethnicity==1 & nzdepgrp==3 & case==0
. recode poststrata 0=10 if ethnicity==1 & nzdepgrp==4 & case==0

. recode poststrata 0=11 if ethnicity==2 & nzdepgrp==0 & case==0
. recode poststrata 0=12 if ethnicity==2 & nzdepgrp==1 & case==0
. recode poststrata 0=13 if ethnicity==2 & nzdepgrp==2 & case==0
. recode poststrata 0=14 if ethnicity==2 & nzdepgrp==3 & case==0
. recode poststrata 0=15 if ethnicity==2 & nzdepgrp==4 & case==0
. recode poststrata 0=.    if ethnicity==2 & nzdepgrp==.  & case==0

. recode poststrata 0=16 if case==1

. gen european=0
. recode european 0=1 if eth_new==1

. gen asian=0
. recode asian 0=1 if eth_new==2

. gen indian=0
. recode indian 0=1 if eth_new==0

. svyset _n,  poststrata(poststrata) postweight(aweight)
. svy, subpop(european): logistic case i.age_cat i.interviewmethod i.status i.deprivation
(running logistic on estimation sample)

Survey: Logistic regression

Number of strata   =         1        Number of obs       =       4337
Number of PSUs     =      4337        Population size     =     14.996
N. of poststrata   =        17        Subpop. no. of obs  =        259
                                      Subpop. size        =  3.9069105
                                      Design df           =       4336
                                      F(  10,   4327)     =       1.92
                                      Prob > F            =     0.0382


SAS:
proc surveylogistic data=bc;
 strata poststrata;
 weight aweight;
 domain european;
 class age_cat(ref="0") interviewmethod(ref="3") deprivation(ref="0")
/ param=ref;
 model case(event="1") = age_cat  interviewmethod status deprivation;
run;

                 Domain Summary


Number of Observations                                4342
Number of Observations in Domain              264
Number of Observations not in Domain        4078
Sum of Weights in Domain                            267.82300

            Variance Estimation
Method                                             Taylor Series
Variance Adjustment                       Degrees of Freedom (DF)
Number of Observations Read       4342
Number of Observations Used       4331
Sum of Weights Read                     267.823
Sum of Weights Used                     261.643

I understand that I have not done this correctly because of my limited
understanding of weighting.
I would greatly appreciate it if anyone could shed further light on this.
Once again, thanks Steven and Stas.



On Thu, Jan 17, 2013 at 6:17 AM, Steve Samuels <[email protected]> wrote:
> I agree with Stas's diagnosis.
> 
> Stratum h:  n_h subjects, N_h population total
> 
> In Stata: the postweight() values are N_h, but the normalized weights
> are N_h/n_h, so they sum to N_h in the stratum.
> 
> In SAS: if N_h is given as the probability weight, the weights sum to
> n_h x N_h in the stratum.
> 
> To get corresponding weights in SAS, KC could create "probability
> weights" for SAS as wt_h = N_h/n_h.
> 
> But post-strata are technically subpopulations (also known as "domains"), so,
> depending on sample size, the standard errors given by SAS could  be
> wrong. There's also a question of whether reference groups are the same for
> the predictor with reference group "3" in SAS.
> 
> 
> KC specifies a subpopulation "european" in his analyses.
> Post-stratifying with a subpopulation can give poor results if
> population and subpopulation stratum proportions are very different. In
> such a case KC could be better off not post-stratifying. KC's sample
> might not have been drawn SRS, so that even Stata might not be giving
> proper standard errors. All in all, I'd like to see more information and the
> results as I requested.
> 
> 
> Steven Samuels
> Consulting Statistician
> 18 Cantine's Island
> Saugerties, NY 12477 USA
> 845-246-0774
> 
> 
> 
> 
> On Jan 16, 2013, at 10:57 AM, Stas Kolenikov wrote:
> 
> Stata's -postweight()- is the target sum of weights for a given
> poststratum, rather than a weight variable as you specified for SAS.
> Besides, I am not sure SAS supports poststratification, unlike Stata.
> 
> 
> --
> -- Stas Kolenikov, PhD, PStat (SSC)  ::  http://stas.kolenikov.name
> -- Senior Survey Statistician, Abt SRBI  ::  work email kolenikovs at
> srbi dot com
> -- Opinions stated in this email are mine only, and do not reflect the
> position of my employer
> 
> On Wed, Jan 16, 2013 at 9:05 AM, Steve Samuels <[email protected]> wrote:
>> KC
>> 
>> You will have a better chance of getting an answer to your question if,
>> as the FAQ requests, you show the results of all commands.
>> 
>> Also note that the one "rule" on Statalist is to use full names.
>> If you are professionally known as "KC Wong", then please give your
>> affiliation.
>> 
>> Steve
>> 
>> 
>> Steven Samuels
>> Consulting Statistician
>> 18 Cantine's Island
>> Saugerties, NY 12477 USA
>> 845-246-0774
>> 
>> 
>> On 12 January 2013 KC Wong wrote:
>> 
>> 
>>> I wish to translate the below Stata coding to SAS and I'm wondering if
>>> I have the SAS coding right because the result from Stata differs from
>>> SAS's.
>>> I'm now using StataIC 11.
>>> 
>>> Stata:
>>> svyset _n,  poststrata(poststrata) postweight(aweight)
>>> svy, subpop(european): logistic case  i.age_cat i.interviewmethod
>>> i.status i.deprivation
>>> 
>>> SAS:
>>> proc surveylogistic data=bc;
>>> strata poststrata;
>>> weight aweight;
>>> domain european;
>>> class age_cat(ref="0") interviewmethod(ref="3") deprivation(ref="0")
>>> / param=ref;
>>> model case(event="1") = age_cat  interviewmethod status deprivation;
>>> run;
>> 
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/

