Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: svy subpop option and e(sample)
From: Steven Samuels <[email protected]>
To: [email protected]
Subject: Re: st: svy subpop option and e(sample)
Date: Fri, 27 May 2011 15:37:55 -0400
Austin--
Like Richard, I forgot about your post and about the need to pool singleton strata. Your "better" estimation procedure is a complete solution.
In Hitesh's case, keeping all data in memory isn't feasible. For dealing with missing data, what do you think about MI restricted to the subpopulation?
Steve
[email protected]
On May 27, 2011, at 12:13 PM, Austin Nichols wrote:
Richard--
I claimed in http://www.stata.com/statalist/archive/2007-11/msg00810.html
that "It is tempting to write a -svysubset- package
to automate this subsetting procedure, but for any given model, the
pattern of missing values might be different, which means the
automatic-subsetting package could offer no savings in general over
keeping all the data in memory." Maybe a bit strong, but the general point is
that the ad hoc solution is not straightforward to generalize in the presence
of missing data.
On Fri, May 27, 2011 at 12:25 PM, Richard Williams
<[email protected]> wrote:
> At 10:08 AM 5/27/2011, Steven Samuels wrote:
>>
>> Hitesh
>>
>> After reading Section 5.4 of Korn and Graubard (1999), I return to Stas's
>> advice: you need a good reason not to do the correct analysis. Here lack of
>> memory won't be a reason, for, as you have apparently surmised, you don't
>> need to load the entire original data set. Instead create _one_ dummy
>> observation for each PSU that contains no members of the sub-population. For
>> this observation, set the value of all the analysis variables to zero or to
>> some other convenient value.
>
> Interesting. Would it be fairly straightforward to create an -svyextract-
> command then? It seems like such a command could be quite useful for those
> who would otherwise have to deal with massive data sets. Maybe even add a
> property to the svysettings so the dof would be right when analyzing the
> extract. This might be a good wish list item for Stata 12.
>
>> There is one more thing to do: in the -svyset- statement, use the -dof()-
>> option to set the degrees of freedom to: number of PSUs with members of the
>> subpopulation minus number of strata with observations in the
>> sub-population (Korn & Graubard, 1999, p. 209).
>>
>> Ref: Korn, Edward Lee, and Barry I Graubard. 1999. Analysis of Health
>> Surveys. New York: Wiley.
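
The bookkeeping in the quoted advice (one dummy observation per PSU with no subpopulation members, plus the Korn and Graubard degrees of freedom) can be sketched roughly as follows. This is only an illustration in Python, not Stata code and not anything posted in the thread. The field names `stratum`, `psu`, `in_subpop`, and `y` and both function names are hypothetical. Sampling weights and the pooling of singleton strata are left out. The records are read one at a time, so the full data set never has to sit in memory.

```python
def build_extract(records):
    """Keep subpopulation rows and add one dummy row per PSU without any.

    `records` is any iterable of dicts with the (hypothetical) keys
    stratum, psu, in_subpop, and y; it is consumed in a single pass.
    """
    all_psus = set()
    subpop_psus = set()
    extract = []
    for r in records:
        key = (r["stratum"], r["psu"])
        all_psus.add(key)
        if r["in_subpop"]:
            subpop_psus.add(key)
            extract.append(dict(r))
    # One dummy observation for each PSU with no subpopulation members.
    # The analysis variable is set to zero, an arbitrary convenient value,
    # and the row is flagged as outside the subpopulation (an assumption).
    for stratum, psu in sorted(all_psus - subpop_psus):
        extract.append(
            {"stratum": stratum, "psu": psu, "in_subpop": False, "y": 0.0}
        )
    return extract, subpop_psus


def subpop_dof(subpop_psus):
    """PSUs with subpopulation members minus strata with such members."""
    strata = {stratum for stratum, _ in subpop_psus}
    return len(subpop_psus) - len(strata)


# Made-up example: 3 strata with 2 PSUs each; members only in a, b, and c.
sample = [
    {"stratum": 1, "psu": "a", "in_subpop": True, "y": 2.0},
    {"stratum": 1, "psu": "b", "in_subpop": True, "y": 3.5},
    {"stratum": 1, "psu": "b", "in_subpop": False, "y": 1.0},
    {"stratum": 2, "psu": "c", "in_subpop": True, "y": 4.0},
    {"stratum": 2, "psu": "d", "in_subpop": False, "y": 0.5},
    {"stratum": 3, "psu": "e", "in_subpop": False, "y": 1.5},
    {"stratum": 3, "psu": "f", "in_subpop": False, "y": 2.5},
]
extract, psus_with_members = build_extract(sample)
dof = subpop_dof(psus_with_members)
```

In a Stata workflow, the resulting count would be the value passed to the -dof()- option mentioned above, and the extract would still be analyzed with the subpopulation indicator so that the dummy rows only carry the design information.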
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/