Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Weighted Averages
From: Steven Samuels <[email protected]>
To: [email protected]
Subject: Re: st: Weighted Averages
Date: Mon, 17 Jan 2011 14:07:46 -0500
You are welcome, Christopher. You will find communication easier if
you use standard terms. In survey usage, "N" is population size; "n"
is sample size. (In some data sets, weights have been normalized to
sum to "n", a practice I don't like.) Also, the proper term for
Stata's default variance estimate in regression is "linearized". Its
publication (Woodruff, 1971) preceded White's publication by over 10
years. And White's estimate, I believe, applied only to a standard
generated model (mean + error term), not to a finite population
sampling design with weights (not to mention clusters and strata).
Steve
Woodruff, R. S. 1971. A simple method for approximating the variance
of a complicated estimate. Journal of the American Statistical
Association 66(334): 411-414.
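A minimal sketch of what "linearized" means for a weighted mean,
assuming a design with probability weights only (each observation is
its own PSU, one stratum, no finite population correction). Under
those assumptions the variance is n/(n-1) * sum[(w*(y - ybar)/W)^2],
where W is the sum of the weights. The auto dataset and the weight
variable pw are illustrative assumptions, not data from this thread.

```stata
* Illustration only: auto.dta ships with Stata; the weight variable pw
* is made up for this sketch and does not come from the thread.
sysuse auto, clear
generate double pw = 100 * (1 + mod(_n, 4))

* Design-based estimate: weighted mean with a linearized standard error
svyset [pweight=pw]
svy: mean price
matrix V = e(V)
scalar se_svy = sqrt(V[1,1])

* Hand calculation under the stated assumptions
quietly summarize price [aweight=pw]
scalar ybar = r(mean)                      // weighted mean
quietly summarize pw
scalar sum_w = r(sum)                      // W, the sum of the weights
scalar n_obs = r(N)                        // n, the sample size
generate double z2 = (pw * (price - scalar(ybar)) / scalar(sum_w))^2
quietly summarize z2
scalar se_hand = sqrt(scalar(n_obs) / (scalar(n_obs) - 1) * r(sum))

display "svy linearized SE: " scalar(se_svy)
display "hand-computed SE:  " scalar(se_hand)
```

The two displayed values should agree up to rounding. With strata,
clusters, or an FPC the hand formula above no longer applies.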
On Jan 17, 2011, at 1:34 PM, Christopher Steiner wrote:
Thanks Steve!
I noticed that with binary data, the discrepancy I received was near
the sqrt(DEFF), which I didn't realize Stata was accounting for.
Also, I did not realize that the formula I posted was the frequency
weight formula, so thanks for pointing that out.
In this particular application, the average of the weights is 1, so
summing them is equivalent to N.
This is my first time with survey data, so I'm learning fast. I'm
confident now that Stata's doing things "correctly," and did a little
reading of the survey book last night.
Thanks,
Christopher Paul Steiner
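A short sketch of how the design effect mentioned above can be
inspected directly, assuming a pweight-only design. DEFT is the ratio
of the design-based standard error to the one expected under simple
random sampling, and equals sqrt(DEFF) when no finite population
correction is specified. The dataset and the weight variable pw are
illustrative assumptions, not data from this thread.

```stata
* Illustration only: made-up weights on the auto dataset shipped with Stata
sysuse auto, clear
generate double pw = 100 * (1 + mod(_n, 4))

svyset [pweight=pw]
svy: mean price

* Report the design effects DEFF and DEFT for the estimated mean
estat effects
```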
On Mon, Jan 17, 2011 at 8:04 AM, Steven Samuels <[email protected]> wrote:
Christopher:
After looking more closely at your formulas, typos aside, I think
that you were trying to estimate the variance of the mean as:
(Estimated Population Variance)/(sum of weights)
This would be true only if you had a simple random sample with
replacement and your weights were frequency weights, not probability
weights. The sum of probability weights is an estimate of N. Dividing
a variance by N would ordinarily make the standard error of the mean
much too small. If yours are sometimes larger than the linearized
variance estimates, you probably also made other mistakes in the
formula or your calculations.
Steve
[email protected]
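The point about dividing by the sum of the weights can be checked
numerically. The sketch below, on made-up data, computes
(estimated population variance)/(sum of weights) twice: once with
expansion-style weights that sum to an estimate of N, and once with
the same weights rescaled to sum to n. It also computes the linearized
standard error under both scalings. The dataset, the weight variable
pw, and its rescaled copy pw_norm are illustrative assumptions.

```stata
* Illustration only: made-up weights on the auto dataset shipped with Stata
sysuse auto, clear
generate double pw = 100 * (1 + mod(_n, 4))    // expansion-style weights
quietly summarize pw
generate double pw_norm = pw * r(N) / r(sum)   // same weights, summing to n

* Naive formula: (estimated population variance) / (sum of weights).
* With aweights, r(Var) is the same under either scaling of the weights.
quietly summarize price [aweight=pw]
scalar s2 = r(Var)
quietly summarize pw
scalar se_naive = sqrt(scalar(s2) / r(sum))
quietly summarize pw_norm
scalar se_naive_norm = sqrt(scalar(s2) / r(sum))

* Linearized standard error under each scaling
svyset [pweight=pw]
quietly svy: mean price
matrix V = e(V)
scalar se_lin = sqrt(V[1,1])
svyset [pweight=pw_norm]
quietly svy: mean price
matrix V = e(V)
scalar se_lin_norm = sqrt(V[1,1])

display "naive SE, weights sum to N-hat: " scalar(se_naive)
display "naive SE, weights sum to n:     " scalar(se_naive_norm)
display "linearized SE, original pw:     " scalar(se_lin)
display "linearized SE, rescaled pw:     " scalar(se_lin_norm)
```

The naive figure shrinks as the weights are scaled up, while the
linearized standard error does not depend on the scale of the weights.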
On Jan 16, 2011, at 3:46 PM, Steven Samuels wrote:
Christopher:
The variance formula you present has little relation to the true
formula, whether for sampling with or without replacement. See for
example page 230 of Sharon Lohr. 2009. Sampling: Design and Analysis.
Boston, MA: Cengage Brooks/Cole.
On Jan 15, 2011, at 8:02 PM, Christopher Steiner wrote:
Hello everyone:
I am computing some basic summary statistics with weighted means from
a weighted, but otherwise simple design survey. When I use the
following commands:
svyset [pweight=weight2]
svy: reg fcost_1
I get a weighted average of "fcost_1" that matches my hand
calculation. I also receive White "robust" standard errors, which is
fine. However, when I do a hand calculation of regular standard
errors using the formula:
sigma^2 = [sum(weights*(x-xbar))/sum(weights)] * (N/N-1)
and then divide by sum(weights) to get the standard error, I often
receive *larger* standard errors than the robust estimate. Is this a
function of the pweights? Around 10% of the values are also missing,
so is it a function of this? Or am I doing something incorrectly?
Thanks so much,
Christopher Paul Steiner
--
Christopher Paul Steiner
Third Year Grad Student, Ph.D. Economics
University of California, San Diego
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*