>>> [email protected] 03/31/03 12:22 PM >>>
On Mon, 31 Mar 2003 10:41:32 -0500 Robert Bozick <[email protected]>
wrote:
> Hello everyone --
> I am currently working on a project that requires the use of
> hazard/event history models. I am relatively new to estimating these
> models. The basic model I am estimating is the rate of college degree
> completion as a function of a set of covariates:
>
> COMPLETION = a + bX
>
> I have two questions:
> 1) How do I adjust the estimates for the clustered-stratified nature
> of the sample?
> The data set I am using is a two-stage stratified cluster sample
> (i.e. the National Education Longitudinal Study, for those of you who
> use NCES data sets). In the first stage, schools are randomly sampled
> with a probability proportional to a given stratum (defined by
> socioeconomic status, urban v. suburban, etc.). In the second stage,
> students are sampled randomly within the schools. In typical logit
> models using this data, I use 'survey' commands to adjust for the
> strata and hierarchically clustered design of the sample using the
> code:
>
> svyset psu psu
> svyset strata stratum
>
> (where psu = the primary sampling unit (the school) and stratum is
> the stratum that the schools were proportionally sampled from). Then,
> when I estimate a logit model, I use the command:
>
> svylogit y x1 x2 x3
>
> That command estimates the model correcting for the sample design. I
> noticed there is no 'survey' command for Cox proportional hazard
> models. How do I correct for the sample design (cluster and strata)
> when estimating a Cox proportional hazard model?
>
> 2) How do you weight data when estimating a Cox proportional hazard
> model?
> I tried the command:
>
> stcox x1 x2 x3 [pweight = weight]
>
> Stata gave me the response:
>
> weights not allowed
>
> Are you not allowed to weight data when estimating a Cox proportional
> hazard model, or is there some other procedure that I need to follow
> to incorporate a probability weight when estimating this type of
> model?
>
> Thanks in advance for any help with these issues!
As far as I know, there is no Stata -svy- command for the Cox
proportional hazard model (though there might be in specialist software
such as SUDAAN).
... but how about the following idea?
The Cox PH model is a continuous time hazard model. Suppose that you
used a discrete time model instead (see the Manual entry under
-discrete- in the version 8 ST Manual). This may be what you should use
anyway if your data are interval-censored. (Do you have exact dates for
survival times? Or are they grouped?)
If you went the discrete time route, and estimated a discrete time
logistic hazard model, then maybe you could then take advantage of
Stata's -svylogit- estimator.
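A minimal sketch of that route, under stated assumptions: the data start
with one row per student, and the variable names id, dur (whole months at
risk), completed (1 = degree observed, 0 = censored at interview) and
weight are hypothetical, as is the log-time baseline. The -svyset- lines
follow the syntax quoted above and may need adjusting for your Stata
version. Right censoring is handled by the person-period layout itself: a
censored student contributes only rows with event = 0.

```stata
* Sketch only: id, dur, completed and weight are hypothetical names.
* Start from one row per student; dur must be a positive whole number.
* Create one row per student-month at risk.
expand dur
* Month counter within student.
bysort id: gen period = _n
* event = 1 only in the final month of a student who completed a degree;
* censored students keep event = 0 in every row.
bysort id: gen byte event = (completed == 1 & _n == _N)
* Assumed baseline hazard shape: log of elapsed months.
gen lnperiod = ln(period)

* Declare the design, as in the logit example quoted above.
svyset psu psu
svyset strata stratum
svyset pweight weight

* Discrete time logistic hazard model with design-based standard errors.
svylogit event lnperiod x1 x2 x3
```

The baseline term is a modelling choice; period dummies or a polynomial
in period could be used in place of lnperiod.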
Perhaps the survey design effect experts out there could comment on
whether this 'trick' is OK?
Stephen
Thanks Stephen --
I do have exact dates of degree completion (month/year). I had wanted to
use hazard models because of the right censoring issue in the data: a
large proportion of the sample had not completed a degree before the time
of the interview. I guess that leaves me in a bind: If I use the logit
model, I can obtain the 'proper' standard errors, but not correct for the
censoring. If I use the hazard model, I can correct for the right
censoring problem, but not have the proper standard errors.
Am I looking at this correctly? Any other thoughts?
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/