Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Number of Obs with svy , suppop()
From: Michael Norman Mitchell <[email protected]>
To: [email protected]
Subject: Re: st: Number of Obs with svy , suppop()
Date: Fri, 19 Mar 2010 01:17:05 -0700
Dear Phil,
Thank you for your reply. I am still struggling to understand this
fully, so perhaps I have a more fundamental question: what is the
formula for the "Number of obs" in the context of the -svy- commands? It
sounds like, in the absence of the -subpop()- option, it is the number
of observations with non-missing values on the tabulated variable. And
in the presence of the -subpop()- option, it is the total number of
observations minus the number of observations that are in the
subpopulation and are missing on the tabulated variable. Am I on the
right track here?
Many thanks!
Michael N. Mitchell
See the Stata tidbit of the week at...
http://www.MichaelNormanMitchell.com
On 2010-03-18 5.04 PM, Phil Schumm wrote:
On Mar 18, 2010, at 6:20 PM, Michael Mitchell wrote:
Here is the tabulation of sex by race.
<snip>
. tab sex race, missing

   1=male, |         1=white, 2=black, 3=other
  2=female |     White      Black      Other          . |     Total
-----------+--------------------------------------------+----------
      male |     1,676        193         35         34 |     1,938
    female |     1,824        238         34         37 |     2,133
-----------+--------------------------------------------+----------
     Total |     3,500        431         69         71 |     4,071
<snip>
But now I want to analyze just the sub-population of males (sex==1)
and it shows that the number of obs is now 4037 (see below). How can
the number of observations increase when adding a -subpop()- option?
There are suddenly 37 extra observations. Note this corresponds to
the number of females with a missing race.
. svy , subpop(if sex==1): tab race, count format(%13.2fc)
(running tabulate on estimation sample)

Number of strata   =         1        Number of obs      =      4037
Number of PSUs     =      4037        Population size    = 7932333.9
                                      Subpop. no. of obs =      1904
                                      Subpop. size       = 3780355.3
                                      Design df          =      4036
This is as it should be, since information about race is not required
on those observations outside of the subpopulation. Remember,
observations outside the subpopulation are relevant only insofar as
they reflect the variability in the proportion(s) of sampled PSUs with
at least one observation in the subpopulation.
In fact, at one point Stata did not behave properly in this regard;
this was fixed in an update to Stata 10 on 02apr2008 (see -help
whatsnew10- and search for "02apr2008").
-- Phil
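
A worked check of the counting rule discussed in this thread may help. The sketch below only redoes the arithmetic from the -tab sex race, missing- output quoted above. The scalar names are made up for illustration, and it assumes no observations are dropped for other reasons (for example, missing weights or design variables).

```stata
* Cell counts copied from the -tab sex race, missing- output in this thread.
* Scalar names are illustrative only.
scalar n_total       = 4071
scalar n_male        = 1938
scalar n_male_miss   = 34
scalar n_female_miss = 37

* Without -subpop()-: all observations missing on race are dropped.
display n_total - n_male_miss - n_female_miss

* With -subpop(if sex==1)-: only males missing on race are dropped.
display n_total - n_male_miss

* Subpopulation observations: males with non-missing race.
display n_male - n_male_miss
```

The three displays should give 4000, 4037, and 1904. The last two match the "Number of obs" and "Subpop. no. of obs" in the -svy- header above, and the difference between the first two is the 37 females with missing race. With the data in memory, -count if !(sex==1 & missing(race))- should presumably return 4037 as well.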
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/