
Re: SV: st: Survey - raking - calibration - post stratification - calculating weights

From	Steven Samuels <[email protected]>
To	[email protected]
Subject	Re: SV: st: Survey - raking - calibration - post stratification - calculating weights
Date	Sun, 7 Dec 2008 00:43:08 -0500

--

Stas, I am envious of statisticians who draw samples from those lists. This is a double sample and I agree with your advice: give everyone the weight for their age stratum:

                          weight1 = N_i/n_i

where "N_i" denotes the population count and "n_i" the sample size in age stratum i. Kristian apparently thinks of the 5,000 person sample as his "population"; the figure that he linked to does not show the initial sampling step at all. He may not have access to the one-year census counts. If he does not, I suggest that he use the N's from the 5,000. I suggest below that he also form geographic categories and rake those, with population counts if possible, otherwise with counts from the 5,000. I roughly calculate that with 5,000 in the first-phase sample, bias in estimates and in standard errors will be small.
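A minimal sketch of the weight calculation above, assuming one observation per participant in the final sample. The variable name N_age is hypothetical; it stands for the population count (or the count among the 5,000) for each participant's age stratum, already merged onto the data.

```stata
* N_age is a hypothetical variable: population (or first-phase) count
* for the participant's age stratum, merged in beforehand
bysort agegp: generate n_age = _N      // final-sample size in each age stratum
generate double weight1 = N_age / n_age
```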

Kristian, here is how to simultaneously match the age distribution and the geographic distribution of the final sample to your population. (This is called "sample balancing" or "raking".) Form age groups (agegp) and geographical groupings (geogp) and get the population counts (or percentages, see below) in each group.


**************************CODE BEGINS**************************
* tot_agegp = population total for the participant's age group (agegp)
* tot_geogp = population total for the participant's geographical group (geogp)
**************************************************************

survwgt rake weight1, ///
      by(agegp geogp) ///
      totvars(tot_agegp tot_geogp) ///
      gen(weight2)
***************************CODE ENDS***************************
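
One way to check the result, sketched under the assumption that weight2 and tot_agegp exist as in the code above: sum the raked weights within each age group and compare with the target total. The variable name sum_w2_age is made up for this illustration; the same check can be repeated for geogp.

```stata
* Sum of raked weights within each age group (sum_w2_age is a hypothetical name)
bysort agegp: egen double sum_w2_age = total(weight2)

* Both columns should agree closely within each age group if raking converged
tabstat sum_w2_age tot_agegp, by(agegp) statistics(mean)
```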

Raking can present problems, so I suggest that you read http://www.abtassociates.com/presentations/raking_survey_data_2_JOS.pdf. If you cannot get population counts, perhaps you can get population percentages; multiply by 10 or 100 and round to the nearest whole number (e.g. 5.12% becomes 51 or 512), so that the population "size" is 1,000 or 10,000. For estimating means and proportions, these will yield nearly the same results as actual population counts. The Denmark census counts or percentages might be available only in larger age categories than the ones you used to draw the sample: say (60-64, 65-69, 70-74). If so, use those for the raking calculations.
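
A rough sketch of the percentage approach, assuming hypothetical variables pct_agegp and pct_geogp that hold the population percentage (e.g. 5.12) for each participant's age and geographical group. Multiplying by 100 gives a nominal population "size" of about 10,000.

```stata
* pct_agegp and pct_geogp are hypothetical: population percentages per group
generate tot_agegp = round(pct_agegp * 100)   // 5.12 -> 512
generate tot_geogp = round(pct_geogp * 100)
```

Because of rounding, the age totals and the geographic totals may not add up to exactly the same number, so it may be worth checking both sums before raking.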

If you have, say, four geographical categories, you may be tempted to use 4 x 15 = 60 stratification combinations. However, with only 600 people in the final sample (an average of 10 per cell), the numbers in individual cells will be too small for reliable estimation.
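
To see how thin the cells would be, a cross-tabulation of the two grouping variables is enough. This assumes agegp and geogp are defined as above.

```stata
* Cell counts for every age-by-geography combination in the final sample
tabulate agegp geogp
```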

Theory for double sampling can be found in WG Cochran, 1973, Sampling Techniques, pp 117-119, 327-334, or in most other texts. Unfortunately, raking will not completely solve the problem of non-response.


-Steven

On Dec 6, 2008, at 11:19 PM, Stas Kolenikov wrote:

Steven,

you might be shocked, but people in Nordic countries do have their
population completely enumerated. Putting NJC's hat on :)), let me
remind you that this is an international list, and different countries
have different standards of how they collect and store their official
data. Denmark has a register with an equivalent of SSN that makes it
possible to combine the data three ways from economic, medical and
social perspectives. That's a survey statistician's and a
microeconometrician's dream... and they actually do have the capacity of
drawing SRS. That is, the first 5000 were SRS of the population, and
then Kristian continued with a stratified second phase sampling.

I would probably just give everybody the weight = # in age group
across Denmark (in some meaningfully defined period of the study) / #
in age group in the sample. If you treat sample groups as
non-response adjustment cells, that's what this will probably boil
down to after multiplication of three or so fractions.
