I see that in my previous post I confused two issues: 1) the sample
size requirements for validity of the survey-adjusted chi-square
tests in Stata; 2) the sample size requirements for estimates of cell
totals or proportions with small counts. Ángel asked about the
first issue. Bottom line: I really don't have an answer.
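
For what it is worth, the i.i.d. rule of thumb that Ángel quoted (no more
than 20% of expected counts below 5, none below 1) is easy to check by
hand. Below is a minimal sketch in Python rather than Stata. The function
names and the table are made up for illustration, and it applies only to
the ordinary Pearson test on unweighted counts; it says nothing about the
survey-adjusted tests.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def rule_of_thumb_check(table):
    """Return (ok, share_below_5, smallest) for the i.i.d. rule of thumb.

    ok is True when at most 20% of expected counts are below 5
    and no expected count is below 1.
    """
    cells = [e for row in expected_counts(table) for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    smallest = min(cells)
    ok = share_below_5 <= 0.20 and smallest >= 1
    return ok, share_below_5, smallest


# Hypothetical 3x2 table of unweighted counts, invented for illustration.
counts = [[10, 5], [22, 3], [8, 2]]
ok, share, smallest = rule_of_thumb_check(counts)
print(ok, round(share, 2), round(smallest, 2))
```

In Stata itself, the expected counts for a non-survey table come from the
'expected' option of tabulate, as Ángel describes below.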
-Steve
On Nov 6, 2008, at 5:04 PM, Steven Samuels wrote:
>
> I've looked through Chapters 6-7 of Chambers and Skinner's book
> Analysis of Survey Data, Wiley, 2003, but I have no definitive
> answer. I do have some thoughts:
>
> * "Expected" count is not a guide in the survey setting--it is a
> sum of weights of sample observations in the table cell.
>
> * The accuracy of the second-order Rao-Scott chi-square statistic,
> probably the best test in -svy: tab-, is apt to depend on the
> number of clusters, on the crude counts, and on the distribution of
> the observations across clusters. The rule of thumb of 5
> observations (or 1) in a cell is based on theory of i.i.d.
> observations that does not hold in the complex survey setting.
>
> * With a small number of events, I ordinarily display only
> unweighted numbers and do not report weighted estimates or
> confidence intervals. When I have wanted to infer something about a
> proportion based on a small outcome count, I've resorted to the
> methods on pp. 64-68 of Korn and Graubard (1999) Analysis of Health
> Surveys, Wiley.
>
> A quick Google search turned up one survey which would not report a
> cell with fewer than 25 observations
> (http://www.nsf.gov/statistics/showsrvy.cfm?srvy_CatID=5&srvy_Seri=16)
> and another in which the minimum cell size was 4,000!
> (http://www.phac-aspc.gc.ca/publicat/cdic-mcc/17-3/a_e.html).
>
> So a guess for Ángel is that not even five observations in a table
> cell are enough.
>
> -Steve
>
> On Nov 6, 2008, at 7:33 AM, Nick Cox wrote:
>
>> There is no need to invoke belief! My -tabchi- and -tabchii-
>> (programs) from the -tab_chi- package on SSC do indeed give
>> warnings. (There is no Stata program called tab-chi.)
>>
>> But these old warnings are very conservative. Many writers now
>> advise that chi-square works fine so long as all expected
>> frequencies are above about 1. In any case, the point can be
>> explored by simulations or bootstrapping. Often it is better to
>> use Fisher's exact test.
>>
>> I can't advise on the main issue, which is for svy-savvy people,
>> but in general very low expected frequencies could be problematic
>> for any method.
>>
>> Nick
>>
>> Ángel Rodríguez Laso
>>
>>
>> I've been reviewing the manuals and statalist archives and I've
>> confirmed that Stata does not give any automatic warning message when
>> requirements for a valid chi-square test are not met (i.e. no more
>> than 20% of the expected values in a table are less than 5 and none
>> are less than 1), which I think is a nuisance. I suppose this can
>> only be worked out by specifying the 'expected' option with tabulate
>> and checking for oneself whether the requirements are met. I believe
>> Cox's tab-chi package does give a warning when requirements are not met.
>>
>> I wonder also if the Rao and Scott correction of Pearson chi-square
>> that is recommended for survey designs has the same requirements.
>> The problem then would be that -svy:tab- doesn't support the
>> 'expected' option, nor is tab-chi suitable for survey analysis.
>>
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
>
Steven Samuels
845-246-0774
18 Cantine's Island
Saugerties, NY 12477
EFax: 208-498-7441