Re: st: Large standard error, Cox PH
From: Steve Samuels <[email protected]>
To: [email protected]
Subject: Re: st: Large standard error, Cox PH
Date: Sat, 28 Jul 2012 11:27:21 -0400
A scatter plot of "Minority" against your time variable is likely to show
very little overlap of minority/non-minority countries. If so, the effect of the
"Minority" variable is not accurately described by a proportional hazards model.
The ordinary solution would be to designate "Minority" as a stratum variable
in the Cox model.
But you have a far more serious problem: overfitting (Babyak, 2004). Rules of
thumb are not easy to come by, but I would say that the ratio of the number of
failures to the number of predictors should be at least 5:1. At 19:10, you
are far short of that. Thus you must throw the entire model out and start from
scratch. You simply cannot assess the simultaneous effects of all those
predictors.
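To make the arithmetic behind that rule of thumb concrete, here is a minimal Python sketch (not part of the original exchange). It reads the rule as requiring roughly five failures per predictor, takes the 19 failures from the printout quoted below and the predictor count of 10 used above, and uses hypothetical helper names (events_per_predictor, max_predictors).

```python
import math


def events_per_predictor(n_failures: int, n_predictors: int) -> float:
    """Ratio of observed failures (events) to candidate predictors."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_failures / n_predictors


def max_predictors(n_failures: int, min_ratio: float) -> int:
    """Largest predictor count that keeps failures/predictors >= min_ratio."""
    if min_ratio <= 0:
        raise ValueError("min_ratio must be positive")
    return math.floor(n_failures / min_ratio)


n_failures = 19    # "No. of failures" in the quoted printout
n_predictors = 10  # predictor count used in this reply
min_ratio = 5      # assumed reading of the 5:1 rule of thumb

ratio = events_per_predictor(n_failures, n_predictors)
print(f"failures per predictor: {ratio:.1f}")
print(f"predictors supportable at {min_ratio}:1: "
      f"{max_predictors(n_failures, min_ratio)}")
```

On these numbers the model has about 1.9 failures per predictor, and a 5:1 rule would support about 3 predictors.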
For solutions, see Chapters 4 and 5 of Harrell (2001). If you have access to the
R statistical package, you can employ the lasso (Tibshirani, 1997) for
coefficient shrinkage, which is available in packages -glmpath-, -glmnet-, and
-penalized-.
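A side note on the printout quoted below, offered as a sketch rather than a definitive account: the large standard error on the hazard-ratio scale and the significant z statistic are not in conflict if, as assumed here, the SE reported for the hazard ratio is a delta-method SE (SE of the ratio = ratio times SE of the log-scale coefficient) and the test and confidence interval are computed on the coefficient scale. The function name log_scale_summary is hypothetical.

```python
import math


def log_scale_summary(hr: float, se_hr: float, z_crit: float = 1.96):
    """Recover log-scale quantities from a reported hazard ratio and its SE.

    Assumption: se_hr is a delta-method standard error, se_hr = hr * se_b,
    where b = log(hr) is the underlying Cox coefficient.
    """
    b = math.log(hr)                      # coefficient on the log-hazard scale
    se_b = se_hr / hr                     # undo the delta-method scaling
    z = b / se_b                          # Wald statistic on the coefficient scale
    lower = math.exp(b - z_crit * se_b)   # CI endpoints, transformed back
    upper = math.exp(b + z_crit * se_b)
    return b, se_b, z, lower, upper


# Reported values for "Minority" in the quoted printout
b, se_b, z, lower, upper = log_scale_summary(77.01, 56.61)
print(f"b = {b:.3f}, se(b) = {se_b:.3f}, z = {z:.2f}")
print(f"95% CI for the hazard ratio: {lower:.2f} to {upper:.2f}")
```

With the reported values for Minority (77.01 and 56.61), this gives z of about 5.91 and an interval of roughly 18.2 to 325.3, in line with the printout. The width of that interval reflects how imprecisely the effect is estimated.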
References:
Babyak, MA. 2004. What you see may not be what you get: a brief, nontechnical
introduction to overfitting in regression-type models. Psychosom Med 66,
no.3:411-421. http://www.psychosomaticmedicine.org/cgi/content-nw/full/66/3/411/
Harrell, Frank E. 2001. Regression modeling strategies : with applications to
linear models, logistic regression, and survival analysis. New York: Springer.
Tibshirani, R. 1997. The lasso method for variable selection in the Cox model.
Stat Med 16, no. 4: 385-395.
Steve
[email protected]
> On Jul 28, 2012, at 8:48 AM, Lee Savage wrote:
>
> The study is an analysis of government termination, estimated using a Cox
> proportional hazards model. The problem variable is 'Minority', this is a binary
> variable that indicates whether or not a government holds a parliamentary
> majority. The problem is that the standard error of the coefficient is extremely
> high. I have only seen this before when the coefficient was insignificant but in
> this case the coefficient is significant (as you can see below).
> Multicollinearity isn't a problem. I'm looking for advice on whether or not this
> is a problem or can I simply report the model and just state that the high SE of
> the 'Minority' variable means that it can't really be generalized?
>
> Here is the printout.
>
> Iteration 0: log pseudolikelihood = -40.288812
> Iteration 1: log pseudolikelihood = -28.304301
> Iteration 2: log pseudolikelihood = -26.968036
> Iteration 3: log pseudolikelihood = -26.902024
> Iteration 4: log pseudolikelihood = -26.901788
> Refining estimates:
> Iteration 0: log pseudolikelihood = -26.901788
>
> Cox regression -- Breslow method for ties
>
> No. of subjects =           19             Number of obs   =        347
> No. of failures =           19
> Time at risk    =          347
>                                            Wald chi2(7)    =    1603.76
> Log pseudolikelihood = -26.901788          Prob > chi2     =     0.0000
>
>                 Haz.   Robust
>                Ratio       SE      z    P>z  [95% Conf. Int.]
> Minority       77.01    56.61   5.91   0.00    18.23   325.28
> Ideology        0.84     0.20  -0.73   0.47     0.52     1.35
> formdays        0.94     0.03  -2.16   0.03     0.90     0.99
> nogovtpart~s    1.51     1.68   0.37   0.71     0.17    13.34
> ciep12          1.36     1.25   0.34   0.74     0.23     8.21
> ConsNoCon       1.28     1.18   0.27   0.79     0.21     7.86
> tvc
> Unemployment    0.99     0.01  -2.31   0.02     0.98     1.00
> GDP             1.00     0.00   2.24   0.03     1.00     1.00
> Inflation       0.98     0.01  -4.38   0.00     0.97     0.99
>
>
>
>
> __________________________
>
>
> From: Steve Samuels <[email protected]>
> To: [email protected]
> Sent: Friday, 27 July 2012, 21:30
> Subject: Re: st: Large standard error, Cox PH
>
>
> To answer your questions, we'd need more detail. Describe the study and the
> problem variable in particular.
> As the FAQ requests, "Say exactly what you typed and exactly what Stata typed (or
> did) in response".
>
> Steve
> [email protected]
>
>
>
>
> On Jul 27, 2012, at 2:20 PM, Lee Savage wrote:
>
> I have estimated a Cox PH model using a small sample (n=19, 347 months at risk).
> For one of my covariates I have found a large hazard ratio (77.01) with a
> correspondingly large standard error (56.61). I have seen this before but every
> time the covariate was insignificant, in the current model the covariate is
> significant (p=.001). I have tested the covariates for collinearity and
> everything looks fine. I think the probable cause is the small sample size.
>
>
> So my question is: is this a problem for my overall model? My inclination
> is to report the model as it is and just state that the significant effect for
> the covariate in question should be treated with extreme caution, perhaps even
> ignored.
>
> I'd appreciate any advice on this.
>
> Thanks.
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/