
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: svy subpop option and e(sample)


From   Steven Samuels <[email protected]>
To   [email protected]
Subject   Re: st: svy subpop option and e(sample)
Date   Fri, 27 May 2011 15:37:55 -0400

Austin--

Like Richard, I forgot about your post and about the need to pool singleton strata. Your "better" estimation procedure is a complete solution. 

In Hitesh's case, keeping all data in memory isn't feasible. For dealing with missing data, what do you think about MI restricted to the subpopulation? 

Steve
[email protected]

On May 27, 2011, at 12:13 PM, Austin Nichols wrote:

Richard--
I claimed in http://www.stata.com/statalist/archive/2007-11/msg00810.html
that "It is tempting to write a -svysubset- package
to automate this subsetting procedure, but for any given model, the
pattern of missing values might be different, which means the
automatic-subsetting package could offer no savings in general over
keeping all the data in memory."  Maybe a bit strong, but the general point is
that the ad hoc solution is not straightforward to generalize in the presence
of missing data.

On Fri, May 27, 2011 at 12:25 PM, Richard Williams
<[email protected]> wrote:
> At 10:08 AM 5/27/2011, Steven Samuels wrote:
>> 
>> Hitesh
>> 
>> After reading Section 5.4 of Korn and Graubard (1999), I return to Stas's
>> advice: you need a good reason not to do the correct analysis. Here lack of
>> memory won't be a reason, for, as you have apparently surmised, you don't
>> need to load the entire original data set. Instead create _one_ dummy
>> observation for each PSU that contains no members of the sub-population. For
>> this observation, set the value of all the analysis variables to zero or to
>> some other convenient value.
> 
> Interesting. Would it be fairly straightforward to create an -svyextract-
> command then? It seems like such a command could be quite useful for those
> who would otherwise have to deal with massive data sets. Maybe even add a
> property to the svysettings so the dof would be right when analyzing the
> extract. This might be a good wish list item for Stata 12.
> 
>> There is one more thing to do: in the -svyset- statement, use the -dof()-
>> option to set the degrees of freedom to the number of PSUs with members of
>> the sub-population minus the number of strata with observations in the
>> sub-population (Korn & Graubard, 1999, p. 209).
>> 
>> Ref: Korn, Edward Lee, and Barry I Graubard. 1999. Analysis of Health
>> Surveys. New York: Wiley.
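
The extract-plus-dof() procedure quoted above might be sketched roughly as follows. This is only an illustration under assumptions, not a tested -svyextract-: the file name full.dta and the variables stratum, psu, wt, insub (coded 0/1, 1 = in the subpopulation), y, x1 and x2 are all hypothetical. For simplicity it reads only the needed variables from the full file in one pass; with truly massive data the same flags would have to be built in pieces.

```stata
* Sketch only. Hypothetical names: full.dta, stratum, psu, wt, insub, y, x1, x2.
* Read just the design and analysis variables, not the whole file.
use stratum psu wt insub y x1 x2 using full.dta, clear

* Flag PSUs that contain at least one subpopulation member
bysort stratum psu: egen byte anysub = max(insub == 1)

* Design df as in the quoted advice (Korn & Graubard 1999, p. 209):
* PSUs with subpopulation members minus strata with subpopulation members
egen byte psutag = tag(stratum psu) if insub == 1
egen byte strtag = tag(stratum)     if insub == 1
quietly count if psutag == 1
local npsu = r(N)
quietly count if strtag == 1
local nstr = r(N)
local df = `npsu' - `nstr'

* Keep subpopulation members plus ONE dummy row for each PSU that has none
bysort stratum psu: generate byte firstrow = (_n == 1)
keep if insub == 1 | (anysub == 0 & firstrow == 1)

* Dummy rows carry design information only: mark them as outside the
* subpopulation and give the analysis variables a convenient nonmissing value
replace insub = 0 if insub != 1
foreach v of varlist y x1 x2 {
    replace `v' = 0 if insub == 0
}

* Declare the design with the adjusted df, then analyze the subpopulation
svyset psu [pweight = wt], strata(stratum) dof(`df')
svy, subpop(insub): regress y x1 x2
```

The sketch ignores missing values in y, x1 and x2 among subpopulation members. As Austin notes, the pattern of missing values can differ from model to model, so the set of PSUs and strata that actually contribute, and hence the df, may need to be recomputed for each model.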

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



