Re: st: Median test & ANOVA with sampling weights
hafida--
You've given us very little information about your survey sample and
its design. More would have been helpful.
You appear to be misusing the terms "sample" and "population". A
"population" is the larger group of people represented by the sample;
statistics for a population are known from outside sources such as a
census. For example, in the U.S. a sample of 1500 people might
represent the population of millions. What you are calling "sample"
and "population" appear to be, respectively, one subgroup of a
sample (those with dmstat=1) and the entire sample.
The proper way to compare one subgroup to the whole group is to
compare the subgroup to the others. So, form two groups: group = 1 if
dmstat =1 and group = 2 if dmstat is not 1 (the rest of the sample).
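A minimal sketch of that grouping, assuming dmstat is numeric and that
observations with missing dmstat should be left out of both groups (the
variable name group is mine):

```stata
* group = 1 for the dmstat==1 subgroup, 2 for the rest of the sample
generate byte group = 2 if !missing(dmstat)
replace group = 1 if dmstat == 1

* check the coding against the original variable
tabulate dmstat group, missing
```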
-pctile- will estimate weighted medians, but the CI's will not be
correct, for they assume independent observations. To proceed, you
must know the sampling design, including cluster and stratum
information. The program -cendif- by Roger Newson (-findit cendif-)
will estimate differences in the medians and accommodates sampling
weights and clustering. The sign test, in contrast, is for a set of
paired independent observations, not for any list of paired numbers.
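Since I don't know your design, I can only sketch the -cendif- call. It
assumes you have installed the package and that your data contain an
identifier for the primary sampling units, which I call psu_id here (a
placeholder; substitute your own variable):

```stata
* group is the two-level variable described above
*   (1 = dmstat==1, 2 = rest of the sample)
* psu_id is a hypothetical cluster (PSU) identifier
cendif o4gh [pweight=o1wtarea], by(group) centile(50) cluster(psu_id)
```

Read -help cendif- for the exact definition of the difference it reports
before interpreting the estimate and its confidence interval.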
To do ANOVA, you must first -svyset- your data and use -svy: reg-.
There is nothing special about -svy: reg-; just set up the ANOVA as
you would do with ordinary -reg-. To compare individual groups to one
another, after the regression run -test-, with options -mtest(holm)-
or -mtest(sidak)-.
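As a rough sketch only: the names psu_id, stratum_id and agegrp below are
placeholders of mine, standing in for your cluster identifier, your
stratum identifier and a three-level factor coded 1, 2, 3. With -xi-, the
lowest level is the omitted reference category:

```stata
* psu_id, stratum_id and agegrp are hypothetical names
svyset psu_id [pweight=o1wtarea], strata(stratum_id)

* one-way ANOVA set up as a regression on indicator variables
xi: svy: regress o4gh i.agegrp

* each non-reference level against the reference level, Holm-adjusted
test (_Iagegrp_2 = 0) (_Iagegrp_3 = 0), mtest(holm)
```

This compares levels 2 and 3 with level 1; to compare levels 2 and 3 with
each other, test _Iagegrp_2 = _Iagegrp_3 as well.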
Your post shows that you are fairly new to sampling concepts. Before
proceeding, I suggest that you look at a good text; I recommend
"Sampling Design and Analysis", by Sharon Lohr. Your faculty may be
able to suggest local resources.
-Steve
On Sep 19, 2008, at 7:53 AM,
[email protected] wrote:
I'm using survey data and wonder how I can perform a comparison
between the median in the sample and in the population. Medians were
separately obtained using -pctile- or -_pctile-.
. pctile pctGH = o4gh [pw=o1wtarea], nq(4) genp(percent)
. list percent pct in 1/4
+-----------------+
| percent pctGH |
|-----------------|
1. | 25 50 |
2. | 50 67 |
3. | 75 77 |
4. | . . |
+-----------------+
. pctile pctileGH1 = o4gh if dmstat==1 [pw=o1wtarea], nq(4) genp(pctGH1)
. list pctGH1 pctileGH1 in 1/4
+------------------+
| pctGH1 pctileGH1 |
|------------------|
1. | 25 40 |
2. | 50 60 |
3. | 75 72 |
4. | . . |
+------------------+
Should I calculate the difference between each value in the sample
and population first and carry out a sign test then? If so, how is
sampling weight taken into account? (I mean, can I use weighted
median in the population to subtract each 'unweighted' value?)
Secondly, is it possible to perform one-way ANOVA with sampling
weight, particularly for post-hoc comparison? Using svy: regress
did not give enough information.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/