[no subject]
I'm using Stata 9.2.
Many thanks for your time and interest.
Angel Rodriguez-Laso
2009/6/4 Jeff Pitblado, StataCorp LP <[email protected]>:
> Angel Rodriguez-Laso <[email protected]>
>
>> I'm confused with the following results:
>>
>>
>>
>> . svyset psu [pweight=weight2007], strata(healtharea)fpc(psusperhealtharea)
>>
>> pweight: weight2007
>> VCE: linearized
>> Strata 1: healtharea
>> SU 1: psu
>> FPC 1: psusperhealtharea
>>
>> .
>> end of do-file
>>
>> . svy: tab p29, deff deft
>> (running tabulate on estimation sample)
>>
>> Number of strata = 11 Number of obs = 12140
>> Number of PSUs = 1266 Population size = 12134,139
>> Design df = 1255
>>
>> -------------------------------------------------
>> Any permanent
>> disability | proportions       deff       deft
>> -----------+-------------------------------------
>>   0, no    |       ,8887      -1981      ,9783
>>   1, yes   |       ,1113      -1981      ,9783
>>            |
>>      Total |           1
>> -------------------------------------------------
>> Key: proportions = cell proportions
>> deff = deff for variances of cell proportions
>> deft = deft for variances of cell proportions
>>
>>
>>
>>
>> Why do I get large negative deff values? Deft resembles more what I
>> was expecting, but it should be the square root of deff and obviously
>> this is not the case. Do you have any explanation for these results?
>
> Stas Kolenikov <[email protected]> already pointed out that the sampling
> weights appear to be normalized by the sample size. In fact, the sum of the
> weights is less than the sample size. When the first stage is sampled without
> replacement (i.e. the 'fpc()' in the above -svyset-), the 'deff' calculation
> is
>
>     deff = V_db / ( (1 - n/W) * V_srswr )
>
> where 'V_db' is the design-based variance estimate, 'V_srswr' is the simple
> random sample with replacement variance estimate, 'n' is the sample size, and
> 'W' is an estimate of the population size. Here 'W' is the sum of the
> sampling weights. Since Angel's sampling weights are normalized, they cannot
> be used to estimate the population size, thus the above 'deff' calculation is
> not valid. Without knowing the population size, we can't compute a valid
> 'deff' statistic.
>
> On the other hand, the 'deft' calculation is
>
> deft = sqrt( V_db / V_srswr )
>
> which does not need an estimate of the population size, and thus will always
> produce a valid value.
>
> We will look into changing -svy: tabulate- and -estat effects- to report
> missing values for 'deff' in the case where the 'W' calculation is less than
> or equal to 'n'.
>
> --Jeff
> [email protected]
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
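As a rough arithmetic check of the two formulas quoted above, here is a minimal Python sketch (not Stata code) that plugs in the figures from the -svy: tab- output. It assumes the "Population size" of 12134,139 uses a decimal comma (i.e. 12134.139) and that it equals the sum of the weights 'W'; the function names are only illustrative.

```python
import math

def deff_with_fpc(v_db, v_srswr, n, w):
    """deff = V_db / ((1 - n/W) * V_srswr), the formula quoted above."""
    return v_db / ((1.0 - n / w) * v_srswr)

def deft(v_db, v_srswr):
    """deft = sqrt(V_db / V_srswr); no population size needed."""
    return math.sqrt(v_db / v_srswr)

n = 12140               # "Number of obs" in the output
w = 12134.139           # "Population size" (assumed: sum of the normalized weights)
reported_deft = 0.9783  # deft reported for both cells

# Only the ratio V_db / V_srswr matters, so recover it from the reported deft
# and treat V_srswr as 1.
ratio = reported_deft ** 2
deff = deff_with_fpc(ratio, 1.0, n, w)

print("1 - n/W =", 1 - n / w)   # negative, because W < n
print("deff    =", deff)
print("deft    =", deft(ratio, 1.0))
```

Because the sum of the weights is slightly smaller than the sample size, the factor 1 - n/W is a small negative number, and dividing by it yields a deff of roughly -1981, in line with the table, while deft is unaffected.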