Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Adam Olszewski <adam.olszewski@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: st: flexible parametric models in small datasets
Date: Sat, 11 Aug 2012 12:45:22 -0400

Hello listers,

I was wondering if anyone could clarify an issue for me. I am trying to fit a Royston-Parmar relative survival model using -stpm2- on a small dataset (580 observations, 43 events). The model does not converge and, depending on the number of degrees of freedom, it fails with different errors: most commonly "cannot compute an improvement -- discontinuous region encountered", or sometimes "initial values not possible". This does not happen if I leave out the relative survival option (bhaz()), and it goes away if I coarsen the expected mortality rate by rounding it.

Does this indicate that there are limits on cell sizes or degrees of freedom in such an analysis? Is there a rule? I have not found one explicitly discussed in the literature. RP models are typically used for large datasets, but is there something like a "1 variable : 10 events" rule that can be used to judge when a dataset is too small?

Here is code that illustrates the issue (the dataset is 14KB; I hope it is not against the list rules to post a link):

. use "http://dl.dropbox.com/u/7142569/Stats/rp.dta"
. stset surv, f(dead) exit(month60)
. stpm2 treat, df(3) scale(hazard) eform
. g roundrate=round(rate,0.01)
. stpm2 treat, df(3) scale(hazard) eform bhaz(roundrate)
. stpm2 treat, df(3) scale(hazard) eform bhaz(rate)

Thanks for any insight,
Adam Olszewski