Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: stptime command issue
From: [email protected] (Wesley D. Eddings, StataCorp)
To: [email protected]
Subject: Re: st: stptime command issue
Date: Thu, 06 Jan 2011 12:49:33 -0600
Elisabetta Petracci <[email protected]> asked how to reproduce the
person-time calculations of -stptime-:
> I am running the example 2 (page 338) of the st.pdf Stata 11 manual to
> calculate age-specific incidence rates.
...
> Then I run:
> stptime, per(1000) at(40(10)70) trim
(output inserted below)
         failure _d:  fail == 1 3 13
   analysis time _t:  (dox-origin)/365.25
             origin:  time dob
  enter on or after:  time doe
                 id:  id
note: _group<=40 trimmed

    Cohort | person-time   failures        rate   [95% Conf. Interval]
-----------+----------------------------------------------------------
 (40 - 50] |   907.00616          6   6.6151701   2.971936    14.72457
 (50 - 60] |   2107.0418         18   8.5427828   5.382317    13.55906
 (60 - 70] |   1493.2923         22   14.732548   9.700656    22.37457
-----------+----------------------------------------------------------
     total |   4507.3402         46   10.205575   7.644246    13.62512
> At this point I wanted to check by hand the calculation of person-time at
> least for the first age group.
> I don't get 907 person-time for the first age-group. Shouldn't I take those
> subjects who have _t<=50 and then sum over the _t within this interval?
The -stptime- estimate of 907 cannot be reproduced by summing the values of _t.
(The _t variable was created by -stset- and records analysis time.) Let's look
at the observations that satisfy _t <= 50:
. sort _t
.
. list _st _d _t _t0 if _t<=50
     +----------------------------------+
     | _st   _d          _t         _t0 |
     |----------------------------------|
  1. |   1    1    42.57358     31.4141 |
  2. |   1    0   46.201232   30.075291 |
  3. |   1    0   46.373717   30.332649 |
  4. |   1    0    46.86653    36.15332 |
  5. |   1    0    47.26078   31.134839 |
     |----------------------------------|
  6. |   1    0   47.753593    36.87885 |
  7. |   1    0    47.86037   37.314168 |
  8. |   1    1     47.8987    45.46475 |
  9. |   1    0   47.915127   32.125941 |
 10. |   1    1   47.964408    43.59206 |
     |----------------------------------|
 11. |   1    1   48.041068   46.546201 |
 12. |   1    0   48.741958   48.180698 |
 13. |   1    1   49.062286   45.442847 |
 14. |   1    1   49.062286     44.5859 |
 15. |   1    0     49.1718   38.297057 |
     |----------------------------------|
 16. |   1    0   49.976728    46.89117 |
     +----------------------------------+
(The variable _t0 tells when each record begins.) Summing the sixteen values of
_t gives 762.7, which doesn't match the 907 reported by -stptime-. The
calculations don't match because _t records the ending time for each record, not
the person-time contribution. Only time under observation between ages 40 and 50
counts toward the (40, 50] cohort. The contribution for the first record, for
example, is not 42.57358; it's 42.57358 - 40 = 2.57358, because that record
begins before 40. And the contribution for the tenth observation is only
47.964408 - 43.59206 = 4.372348, because the record does not begin until time
_t0 = 43.59206.
We've listed only the records with _t <= 50, but records that end after 50 can
contribute person-time to this cohort too, as long as they begin before 50.
Here's how to create a variable "pt4050" containing the contributions:
. generate pt4050 = min(50 - 40, _t - 40) if _t0 <= 40
(311 missing values generated)
.
. replace pt4050 = min(50 - _t0, _t - _t0) if _t0 > 40 & _t0 <= 50
(170 real changes made)
The observations with missing values of "pt4050" do not contribute person-time
for the (40, 50] cohort. The sum of the values of "pt4050" is the estimate we
need:
. summarize pt4050
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      pt4050 |       196    4.627582    3.057497   .1327858         10
.
. display r(sum)
907.00616
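The two-step -generate-/-replace- above can also be written as a single overlap
formula: a record observed over (_t0, _t] contributes min(_t, hi) - max(_t0, lo)
to the band (lo, hi], or nothing if that difference is negative. The following
is only a sketch, not a description of what -stptime- does internally. The
variable names pt40, pt50, and pt60 are made up for illustration, and the code
assumes the data are still -stset- as in the example. If the reasoning above
carries over to the other bands, the three sums should agree with the
person-time column reported by -stptime-.

```stata
* Person-time per age band, computed as the overlap of (_t0, _t] with (lo, hi].
* pt40, pt50, and pt60 are hypothetical variable names used for illustration.
foreach lo of numlist 40 50 60 {
    local hi = `lo' + 10
    * Overlap length, floored at 0 for records that miss the band entirely
    generate double pt`lo' = max(0, min(_t, `hi') - max(_t0, `lo')) if _st == 1
    quietly summarize pt`lo'
    display "(`lo' - `hi']: " %10.5f r(sum)
}
```

Records that do not overlap a band get 0 here rather than a missing value, so
the sums are unaffected, but the Obs count shown by -summarize- will differ from
the 196 reported for "pt4050".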
-- Wes
[email protected]