Re: st: gllamm & stratified sampling design
Many thanks to Jay and Sebastian for the references. I just finished
reading the paper, but I'm not sure I have fully understood what is
going on.
Steven,
I appreciate your help and interest. I only have one cross-section. The
main features of the survey design are the following:
*1.* The sample was drawn from a household database of approximately 11
million households in the United States that are identified as Latino or
Hispanic. The universe of analysis contains approximately
87.5% of the US Hispanic population.
*2.* The survey covers 15 states and the District of Columbia
metropolitan area (including counties and municipalities in Virginia and
Maryland). States were selected based on the overall size of the
Latino/Hispanic population.
*3.* The sample is stratified by geographic designation, meaning that
each state sample is a valid, stand-alone representation of that state's
Latino population.
*4.* Respondents were selected randomly from the Latino households in
the jurisdictions covered (states) without replacement.
*5.* State sample sizes vary as a result of specific funders' requests.
The smallest sample size for any unit was 400, yielding a margin of
error of less than ±5% for each state.
*6.* A number of states were stratified internally. In each case but
California, internal strata were represented proportionately in the
final sample. In California, additional strata were imposed in a
non-proportional fashion, owing in part to the larger sample size, to
allow greater between-region comparisons.
*7.* I don't have the formula for how weights were computed. The
survey's documentation says that national weights were constructed so
that the numbers are accurately representative of the universe covered
by the study.
Please let me know if you think my svyset statement is inaccurate:
svyset [pweight=wt_natio], strata(usstate)
wt_natio is the national weight described in *#7* above. usstate is the
var that identifies the within-state strata.
I think I should care about conducting a multilevel analysis because I
have merged two types of state-level characteristics to each individual
in my sample. One reflects state-level characteristics of the person's
country of origin before (s)he arrived in the US. The other reflects the
characteristics of the state in which the person lives currently.
Thanks very much,
Mabel
Steven Samuels wrote:
I have not read the article Sebastian referred to so I will ask only
about your design. This is a multistage design, so, for a start, your
-svyset- statement is incomplete. Please give more details. Exactly
what was the sampling protocol? What was the frame? What were the target
populations at each stage of sampling? How did the surveyors get
from states to communities to individuals? Was there intermediate
sampling of households or areas smaller than communities, or both? Was
sampling with or without replacement, and at what stages? How were
the weights computed? Was there post-stratification weighting?
Have you multiple years of data?
Regards,
Steven
On Apr 21, 2008, at 10:51 AM, Mabel Andalon wrote:
Dear All,
I am estimating a model of community participation (1-0) using
individual-level data. These data are on immigrants in the US and
come from a stratified simple random sampling survey. The strata are
US states (usstate). I've always used the svy option when analyzing
these data setting:
svyset [pweight=wt_natio], strata(usstate)
I just merged these data with contextual data from people's state of
origin in a foreign country based on year of arrival to the US. And I
also merged US state-level data based on current state of residence.
That is, any two people who arrived in the same year from the same
state and country and who live in the same US state were assigned the
same state-level data.
I have two questions:
1. Is this considered multilevel data?
2. If so, how can I conduct a true multilevel analysis using gllamm
and still include the features of sampling design (i.e. stratification)?
So far, I have estimated:
gllamm participation $xvars , i(individual fostate year usstate)
pweight(wt) f(binom) l(logit) adapt
i = individuals/immigrants
fostate = foreign state of residence
year= year of arrival to the US
usstate= current state of residence
I'm not even sure that I have correctly defined the hierarchical,
nested clusters in the i() option. The weights are individual's
sampling weights.
Any suggestions will be highly appreciated.
Best,
Mabel
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/