An alternative approach is to conduct a nested case-control study and use
conditional logistic regression. This is a particularly good idea when
the proportion of cohort members who develop the event of interest is
small and measuring the covariates is expensive -- say, requiring some
assay of biological samples collected at entry into the cohort. It has
been shown that, with 10 controls per case, the loss of power is close
to zero compared with analyzing the entire cohort. With, say, two
controls per case the loss of power is often modest relative to the
cost savings, and this design can make the financial and labor costs of
many studies in molecular epidemiology feasible. If you sample your
controls from the risk set of event-free people with at least as much
follow-up as the matched case, then conditional logistic regression is
similar in spirit to proportional hazards regression and should give
similar answers (see Lubin and Gail, Biometrics, 1984).
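
For what it's worth, a minimal Stata sketch of that design (the dataset
and variable names are hypothetical; -sttocc- does the risk-set sampling
once the data have been -stset- with an id() variable):

    use cohort, clear
    stset exittime, failure(event) id(id)    // declare the cohort as survival-time data
    set seed 20030618                        // make the control sampling reproducible
    sttocc, n(2)                             // draw 2 controls per case from each risk set
                                             // (creates _case, _set, and _time)
    clogit _case assayresult age, group(_set) or   // conditional logistic regression on the matched sets

The expensive assay then needs to be run only on the cases and their
sampled controls rather than on the whole cohort.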
Bill Dupont
-----Original Message-----
From: Nick Cox [mailto:[email protected]]
Sent: Wednesday, June 18, 2003 6:00 PM
To: [email protected]
Subject: st: RE: [Cox model]
roger webb
> I need to run a Cox model on a very large cohort (of approximately 1.5
> million subjects). Has anyone implemented a memory efficient routine
> that uses a sample from (as opposed to all) the individuals at risk?
Nothing to do with me, but I doubt that any
special procedure is needed here. That is,
you should just take a sample upstream and then
fit a Cox model on the sample data, I guess.
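
Something along these lines, perhaps (dataset and variable names are
only placeholders, and one record per subject is assumed):

    set seed 12345
    sample 10                                // keep a random 10% of the subjects
    stset time, failure(died) id(id)
    stcox age smoker                         // Cox model fit to the sampled data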
Nick
[email protected]
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/