Gaby Serdan wrote:
I have data on deaths. I need to calculate the mean &
CI of females as a proportion of the whole population. I'm
trying first to create a variable for each month, then
take the total number of females per month and then
divide by the total number of deaths per month.
- and Clive Nicholas gave suggestions.
---------------------------------------------------------------
I understand that you want to estimate the proportion
of females among the persons who died each month. The
data you provided are a bit surprising for the purpose,
with one female, three males, and 21 with unknown sex.
To create some more illustrative data, I ran:
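clear
set obs 200
set seed 54321
* random year (2004 or 2005) and month (1-12)
gen year = int(2004+2*uniform())
gen month = int(1+12*uniform())
* keep only October 2004 through March 2005
drop if year==2004 & month<10
drop if year==2005 & month>3
* assign sex: about 40% female, 40% male, 20% unknown
gen x=uniform()
gen female=1 if x<0.4
gen male=1 if x>=0.4 & x<0.8
gen sex_unknown=1 if x>=0.8
* turn the missing indicators into zeros
recode female male sex_unknown (.=0)
* every observation is one death
generate persons = female + male + sex_unknown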
This dataset includes:
. table month female , by(year)
----------------------
year and  |   female
month     |    0     1
----------+-----------
2004      |
       10 |    5     3
       11 |    4     4
       12 |    4     3
----------+-----------
2005      |
        1 |    5     4
        2 |    4     4
        3 |    4     3
----------------------
One possibility is the -proportion- command:
. proportion female , over(year month)
Proportion estimation                    Number of obs    =      47

      _prop_1: female = 0
      _prop_2: female = 1

         Over: year month
    _subpop_1: 2004 10
    _subpop_2: 2004 11
    _subpop_3: 2004 12
    _subpop_4: 2005 1
    _subpop_5: 2005 2
    _subpop_6: 2005 3

--------------------------------------------------------------
             |                                   Binomial Wald
        Over | Proportion   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
_prop_1      |
   _subpop_1 |       .625    .1829813     .2566778    .9933222
   _subpop_2 |         .5    .1889822     .1195985    .8804015
   _subpop_3 |   .5714286    .2020305     .1647622    .9780949
   _subpop_4 |   .5555556    .1756821     .2019258    .9091853
   _subpop_5 |         .5    .1889822     .1195985    .8804015
   _subpop_6 |   .5714286    .2020305     .1647622    .9780949
-------------+------------------------------------------------
_prop_2      |
   _subpop_1 |       .375    .1829813     .0066778    .7433222
   _subpop_2 |         .5    .1889822     .1195985    .8804015
   _subpop_3 |   .4285714    .2020305     .0219051    .8352378
   _subpop_4 |   .4444444    .1756821     .0908147    .7980742
   _subpop_5 |         .5    .1889822     .1195985    .8804015
   _subpop_6 |   .4285714    .2020305     .0219051    .8352378
--------------------------------------------------------------
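As a rough check on where the Wald numbers come from, the first _prop_2 row (3 females among 8 deaths in October 2004) can be reproduced by hand. This is only a sketch: the formulas used here (standard error with n-1 in the denominator, t multiplier on 47-1 = 46 degrees of freedom) are inferred from the printed output, not quoted from the manual.

```stata
* Assumption: se = sqrt(p*(1-p)/(n-1)), limits = p +/- t(46 df)*se
* standard error for p = 3/8, n = 8
display sqrt(0.375*0.625/7)
* lower and upper 95% limits
display 0.375 - invttail(46, 0.025)*sqrt(0.375*0.625/7)
display 0.375 + invttail(46, 0.025)*sqrt(0.375*0.625/7)
```

The three results should agree with the _subpop_1 row under _prop_2 up to display rounding. Note that the lower limit is barely above zero: with only 7 to 9 deaths per month, Wald limits can end up very close to, or outside, the 0 to 1 range.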
To get exact binomial confidence intervals, use -ci- :
. bysort year month: ci female , binomial
----------------------------------------------------------------------------
-> year = 2004, month = 10
                                                        -- Binomial Exact --
    Variable |       Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+--------------------------------------------------------------
      female |         8        .375     .1711633       .0852334    .7551368

----------------------------------------------------------------------------
-> year = 2004, month = 11
                                                        -- Binomial Exact --
    Variable |       Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+--------------------------------------------------------------
      female |         8          .5     .1767767       .1570128    .8429872
.....
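The exact limits can also be checked by hand. A minimal sketch, assuming that "exact" here means the usual Clopper-Pearson interval, which can be written as quantiles of a beta distribution; the counts (k = 3 females among n = 8 deaths) are taken from the October 2004 group above.

```stata
* standard error as printed by -ci- (n, not n-1, in the denominator)
display sqrt(0.375*0.625/8)
* lower limit: 2.5% quantile of Beta(k, n-k+1) = Beta(3, 6)
display invibeta(3, 6, 0.025)
* upper limit: 97.5% quantile of Beta(k+1, n-k) = Beta(4, 5)
display invibeta(4, 5, 0.975)
```

If the assumption holds, the results should match the first -ci- block up to display rounding. Unlike the Wald limits, these always stay within 0 and 1.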
To get one time variable, use the time series facilities:
. gen mdate = ym(year,month)
. format mdate %tm
. tab1 mdate
-> tabulation of mdate

      mdate |      Freq.     Percent        Cum.
------------+-----------------------------------
    2004m10 |          8       17.02       17.02
    2004m11 |          8       17.02       34.04
    2004m12 |          7       14.89       48.94
     2005m1 |          9       19.15       68.09
     2005m2 |          8       17.02       85.11
     2005m3 |          7       14.89      100.00
------------+-----------------------------------
      Total |         47      100.00
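With mdate in place, the calculation described in the original question (females per month divided by deaths per month) could be sketched as below. This assumes the female, persons and mdate variables created above; prop_female is a made-up name used only for illustration.

```stata
* keep the individual-level data safe while we aggregate
preserve
* one row per month: number of female deaths and total deaths
collapse (sum) female persons, by(mdate)
* monthly proportion of females among all deaths
gen prop_female = female/persons
list mdate female persons prop_female, clean
* return to the individual-level data
restore
```

This gives the point estimates only; for confidence intervals the -proportion- or -ci- approach on the individual-level data is the simpler route.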
Hope this helps
Svend
__________________________________________
Svend Juul
Institut for Folkesundhed, Afdeling for Epidemiologi
(Institute of Public Health, Department of Epidemiology)
Vennelyst Boulevard 6
DK-8000 Aarhus C, Denmark
Phone: +45 8942 6090
Home: +45 8693 7796
Email: [email protected]
__________________________________________