Jennifer,
Off the top of my head: if degrees of freedom are roughly the
'independent pieces of information' in the model, then, because your
clusters (villages) were chosen at random, those 20 villages are the
only independent pieces of information in your sample. Anything
selected within them (sub-villages, then households) is related,
because it sits in the same cluster, or PSU.
BUT, given the further sampling of households out of 'sub-villages' -
isn't this multi-stage sampling? Stata can handle that as well, I
believe, although I haven't had cause to use it (yet). I am not sure
how that would affect the degrees of freedom as Stata calculates it.
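To make the "independent pieces of information" idea concrete, here is a
rough sketch in Python (not Stata). It assumes the usual rule for a
stratified, clustered design: each stratum contributes its number of PSUs
minus one, so the total is PSUs minus strata. The split of the 20 villages
across NW, SW, and E is hypothetical, since the thread gives only the
totals.

```python
# Hypothetical split of the 20 villages (PSUs) across the 3 strata.
# The thread reports only the totals (20 PSUs, 3 strata), not this breakdown.
psus_per_stratum = {"NW": 7, "SW": 7, "E": 6}


def design_df(psu_counts):
    """Sum (PSUs - 1) over strata, which equals total PSUs minus strata."""
    return sum(n - 1 for n in psu_counts.values())


n_psu = sum(psus_per_stratum.values())
n_strata = len(psus_per_stratum)
df = design_df(psus_per_stratum)

print(n_psu, n_strata, df)
```

Households do not appear anywhere in this count, which is why a sample of
489 households does not raise the figure. The thread subtracts one more
for the constant to arrive at 16; that extra step is taken from the
message as written and is not derived here.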
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Jennifer
Schmitt
Sent: September 3, 2009 1:48 PM
To: [email protected]
Subject: Re: st: RE: Survey design degrees of freedom help
My sample size is 489, but I only have 20 PSU, 3 Strata so my design df
= 20-3-1 (for the constant). I have a stratified, clustered sample and
so my design df are based off my PSU and are thus really low. I just do
not know why this is the case (I have accepted that it is the
case), but I can't defend that without knowing why (or maybe it is
wrong, but everything in the FAQ and help sections online suggest it is
correct). Thanks.
[email protected] wrote:
Your degrees of freedom are 16??
What is your sample size?
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Jennifer
Schmitt
Sent: September 3, 2009 11:26 AM
To: [email protected]
Subject: st: Survey design degrees of freedom help
Hello everyone,
I need some important clarification about the design degrees of freedom
for stratified clustered survey analysis. I have data that was
stratified by three areas (NW, SW, and E). We randomly chose villages
from these three areas, then chose 5 subvillages within each village,
and within each subvillage we chose households (the unit of interest for
my analysis). I am running logistic regressions with PSU = village,
strata = area, and probability weights. My design degrees of freedom are
16 (PSUs minus strata minus one for the constant term). I get that. What
I do not understand is WHY, and how to explain to others unfamiliar with
Stata that it is correct. Any answers to this would be greatly
appreciated. The reason this is an issue is that I want to test more
than 16 variables at once and obviously I can't with only 16 df. Thank
you.
Jennifer
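The limit on how many variables can be tested at once can be sketched as
follows, in Python rather than Stata. This assumes the joint test is an
adjusted Wald test referred to an F distribution with k and d - k + 1
degrees of freedom, where d is the design degrees of freedom and k the
number of terms tested. The function name is hypothetical, and d = 16 is
the figure quoted in the message above.

```python
def adjusted_wald_dfs(design_df, k):
    """Return (numerator df, denominator df) for jointly testing k terms.

    Assumes an adjusted Wald test with F(k, d - k + 1), d = design df.
    Raises ValueError when the denominator df would fall below 1.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    denom = design_df - k + 1
    if denom < 1:
        raise ValueError(
            f"cannot jointly test {k} terms with {design_df} design df"
        )
    return k, denom


d = 16  # design df as quoted in the thread
for k in (5, 16, 17):
    try:
        print(k, adjusted_wald_dfs(d, k))
    except ValueError as err:
        print(k, "not testable:", err)
```

Under that assumption the denominator degrees of freedom shrink by one for
every extra term, so with d = 16 a joint test of 17 terms has nothing left
to estimate the error from.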