I apologize if my question was confusing. I know that the weights in my
sample are frequency weights. The problem is not in accounting for weights
in the regression but in the statistical significance of the coefficients. I
remember from the literature that with weighted data one must be careful
when interpreting statistical significance, because t-statistics tend to be
overstated. I am curious whether anyone knows how to account for this
statistically.
MM
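
A minimal numeric sketch of the effect described above, in plain Python
rather than Stata so it can be run anywhere. It fits a one-regressor weighted
least-squares line and computes the slope's standard error two ways: reading
the weights as frequencies (row i stands for w[i] identical observations),
and reading them as sampling weights with a sandwich (robust) variance based
on the number of rows actually observed. The function name weighted_slope,
the simulated data, and the n/(n-2) finite-sample factor are assumptions made
for illustration; they are not taken from the thread, and packages differ in
the correction they apply.

```python
import math
import random


def weighted_slope(x, y, w):
    """Weighted least-squares fit of y = a + b*x.

    Returns (slope, se_freq, se_robust). Hypothetical helper for illustration.
    """
    n = len(x)   # number of rows actually observed
    sw = sum(w)  # sum of the weights
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]

    # Frequency-weight reading: the data set is treated as if it had sum(w)
    # observations, so the residual degrees of freedom are sum(w) - 2.
    s2 = sum(wi * e * e for wi, e in zip(w, resid)) / (sw - 2)
    se_freq = math.sqrt(s2 / sxx)

    # Sampling-weight reading: only n rows were observed. Sandwich variance
    # for the slope, with an assumed n/(n-2) finite-sample factor.
    meat = sum((wi * (xi - xbar) * e) ** 2 for wi, xi, e in zip(w, x, resid))
    se_robust = math.sqrt(n / (n - 2) * meat / sxx ** 2)
    return b, se_freq, se_robust


# Simulated data (an assumption): a weak true effect, weights from 1 to 500.
random.seed(1)
n = 200
x = [random.random() for _ in range(n)]
y = [0.1 * xi + random.gauss(0, 1) for xi in x]
w = [random.randint(1, 500) for _ in range(n)]

b, se_f, se_r = weighted_slope(x, y, w)
print("slope:", b)
print("t, weights read as frequencies:     ", b / se_f)
print("t, weights read as sampling weights:", b / se_r)
```

The slope is the same under both readings; only the standard error changes.
With weights this variable, the frequency reading behaves as if there were
sum(w) independent observations instead of 200, which is why the
t-statistic jumps.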
----- Original Message -----
From: "Copeland, Laurel" <[email protected]>
To: <[email protected]>
Sent: Friday, May 23, 2003 4:09 PM
Subject: st: RE: Re: RE: statistical significance in a data set with
weighted observations
> As I understand things, the t-statistics for the parameter estimates
> correctly reflect the importance of your predictors in your analysis,
> assuming the sample was taken as represented by the weights. The effect is
> taken into account if you use -svy...- specifying weights (psu, strata).
>
> If you do not include the weights (so analyze the small sample as if it
> were an entity unto itself), you will not get correct parameter estimates
> (or accompanying t-statistics) to generalize.
>
> The fact that the t-statistics are significant or insignificant is
> immaterial. Your approach need only be consistent with what the data
> actually represent.
>
> You mention fweights (frequency weights). I am assuming these are
> 1/pweight (probability weights) for your dataset. If this is not the case,
> you may need to find out more about your sample and how it was taken.
>
> Actually, you should find out as much as you can about your sample and how
> it was taken, regardless.
>
> -Laurel
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Friday, May 23, 2003 3:32 PM
> To: [email protected]
> Subject: st: Re: RE: statistical significance in a data set with
> weighted observations
>
> Thank you!
>
> I do account for the weights using fweights in my regression, but the
> weights increase the impact of individual observations and thereby inflate
> the t-statistics, so that all explanatory variables appear significant. Is
> there a way of accounting for that effect on the t-statistics?
>
> thanks,
>
> Mikhail
> ----- Original Message -----
> From: "Copeland, Laurel" <[email protected]>
> To: <[email protected]>
> Sent: Friday, May 23, 2003 3:05 PM
> Subject: st: RE: statistical significance in a data set with weighted
> observations
>
>
> > The data can be weighted to reflect the sampling design. The sampling
> > design is complex to give you a sample that is representative of the
> > underlying population, and to allow inferential statistics. The complex
> > sampling lets you get a good sample of a large population of unlisted
> > smaller units (e.g., all US residents), based on a complete list of
> > larger units (e.g., US census tracts). The weight is the inverse of the
> > probability of getting sampled. In your sample, individual units had
> > differing probabilities of being sampled, so they have differing
> > weights. The calculated size of the population that is represented by
> > your sample will be produced by Stata -svy-- commands. To analyze such a
> > sample properly, you must include the PSU, strata, and weights in your
> > analysis, if they exist. Without the weights, the estimates you get will
> > be biased. Sometimes weights are used to allow post-stratification (for
> > matching to a known distribution) or to deal with non-response.
> > -Laurel
> >
> > -----Original Message-----
> > From: [email protected] [mailto:[email protected]]
> > Sent: Friday, May 23, 2003 2:52 PM
> > To: [email protected]
> > Subject: st: statistical significance in a data set with weighted
> > observations
> >
> > Dear Stata Users,
> >
> > I have encountered this small problem and since I am not sure about how
> > to address it myself I've decided to ask you all. Thank you in advance
> > for any advice you might have for me.
> >
> > I am working with a dataset that has weights for all observations, and
> > these weights exhibit large variation, from 1 to over 500. When I run a
> > nonweighted estimation my t-statistics are relatively small, but when
> > weights are introduced, the t-statistics jump. Is there a way of
> > determining the true statistical significance of coefficients in this
> > case?
> >
> > Thanks again for any help you might have,
> >
> > MM
> >
> > *
> > * For searches and help try:
> > * http://www.stata.com/support/faqs/res/findit.html
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/