Re: st: Sampling weights (pweights) and regression analysis
From: Steve Samuels <[email protected]>
To: [email protected]
Subject: Re: st: Sampling weights (pweights) and regression analysis
Date: Thu, 12 Jul 2012 20:16:25 -0400
On Jul 11, 2012, at 4:15 PM, Fatih Yilmaz wrote:
> I am having trouble with using sampling weights in my simple regression
> analysis.
>
> Here is the story:
>
> The survey data I have is not representative, where some groups were
> deliberately over or under-sampled.
> The weights I was provided are computed as follows:
>
> For group one (strata), population weight is 60%
> sample weight is 40%
> Final Pweight = 60%/40%=1.5
>
> My questions:
>
> 1- I needed to drop some of the observations from the survey data: outliers,
> missing observations and also unrelated data.
> so, can I still use the old (initial) weights or do I have to re-weight the
> data with respect to the dropped observations?
> Or how problematic could it be to use old weights?
>
You should reweight for non-response. Not doing so could be quite problematic.
How you do this depends on what you know about the population. See the sections
on nonresponse weighting in the books by Lohr or Groves et al. and in the PEAS page
referenced below. If you are dropping observations because of missing data for
some variables, you have a couple of choices. The simpler one is to treat these as
"nonrespondents" and reweight. Better would be to impute missing values with Stata's
multiple imputation commands (see the help for -mi svyset-), but this would take
your analysis out of the realm of the "simple".
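
For what it's worth, here is a minimal sketch of a weighting-class adjustment within strata, one of the simpler nonresponse adjustments. The variable names (w, stratum, y, x, keep, w_all, w_kept, w_adj) are placeholders, and the rule that defines a dropped case (missing y or x) is only an assumption for illustration. The adjustment also assumes that dropped cases resemble the retained cases in the same stratum, which you would have to judge for your own data.

```stata
* Hypothetical names: w = original pweight, stratum = sampling stratum
* Assumed rule: a case is retained when both y and x are observed
generate byte keep = !missing(y, x)

* Weight totals per stratum: all sampled cases vs. retained cases only
bysort stratum: egen double w_all  = total(w)
bysort stratum: egen double w_kept = total(w * keep)

* Inflate the weights of retained cases so that each stratum
* keeps its original weight total; dropped cases get a missing weight
generate double w_adj = w * w_all / w_kept if keep

* Check: the two sums should agree within every stratum
tabstat w w_adj, by(stratum) statistics(sum)
```

If the sums agree, w_adj would take the place of w when you -svyset- the data.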
Note that if you want to analyze a subgroup, it is an error to discard
members of the sample who are not in the subgroup. Doing so risks standard
errors that are too small. See the section on "subpopulations" in
Stata's survey manual and in Lohr's book (referenced below).
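
A short sketch of the subgroup point, with placeholder names: w, stratum, y, and x are assumed variables, and insub is a hypothetical 0/1 indicator for membership in the subgroup. It assumes a stratified design in which individuals were sampled directly, with no clusters.

```stata
* Declare the design once, on the full sample (_n = individuals are the sampling units)
svyset _n [pweight = w], strata(stratum)

* Keep every sampled case in memory and let subpop() select the subgroup,
* so the variance estimate still reflects the full design
svy, subpop(insub): regress y x

* What to avoid: -drop if insub == 0- or an -if- qualifier before estimation
```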
> 2- Since, my weights were computed as w=(pop%)/(sample%) (in general, some other
> researchers may compute them as w=(sample%)/(pop%) ),
> when I estimate weighted OLS should I use "reg y x [pw=1/w]" or
> "reg y x [pw=w]".
>
Other researchers may, but they would be wrong. From your description, I think that
you have the right weights. You can check by seeing if the stratum weight totals
add up to the known stratum population sizes ("total w, over(stratum)").
To do survey regression in Stata, you -svyset- the data and identify weights,
sampling strata, and clusters, if any. The regression estimation command is
-svy, subpop(): regress-
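
Put together, the check and the setup might look like the sketch below. The names w, stratum, y, and x are placeholders, and the design is assumed to be stratified with no clusters. One caveat on the check: if the weights are ratios of percentages, as in the 60%/40% example, the weight totals will be proportional to the stratum population shares (total sample size times the share) rather than equal to the population counts.

```stata
* Sum of the weights within each stratum, to compare with what is
* known about the population
total w, over(stratum)

* Declare the design: weights and strata; _n means no clustering is assumed
svyset _n [pweight = w], strata(stratum)

* Summarize the declared design (strata and numbers of sampling units)
svydescribe

* Design-based regression on the full sample
svy: regress y x
```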
> Could you pls also recommend some resources on sampling weights and regression
> analysis (preferably practical sources ),
>
Resources:
Lohr, S. L. (1999, 1st ed.; 2009, 2nd ed.). Sampling: Design and Analysis (2nd
ed.). Boston, MA: Cengage Brooks/Cole.
Groves, R. M., Fowler, F. J., Couper, M. P., Lepkowski, J. M., Singer, E., &
Tourangeau, R. (2004, 1st ed.; 2009, 2nd ed.). Survey Methodology. Hoboken, NJ: Wiley.
http://www.restore.ac.uk/PEAS/about.php
especially http://www.restore.ac.uk/PEAS/theory.php
with sections on weighting and non-response
and the exemplars page
http://www.restore.ac.uk/PEAS/aboutex2.php
http://www.statcan.gc.ca/edu/power-pouvoir/ch13/5214895-eng.htm. See especially:
http://www.statcan.gc.ca/edu/power-pouvoir/ch13/estimation/5214893-eng.htm.
help.pop.psu.edu/help-by-statistical-method/weighting
http://www.ats.ucla.edu/stat/stata/seminars/applied_svy_stata11/default.htm
Steve
[email protected]