Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: centred mean age
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: centred mean age
Date: Thu, 31 Jan 2013 10:44:39 +0000
Sorry, neither yes nor no from me. I would never make such decisions
based on such criteria. What's a better model depends on more than
single figures of merit. In this case, everything of interest can be
graphed as curves relating predicted weight to age and which curves
looked a better match in terms of modelling both global and local
behaviour would weigh [pun intended] very heavily with me.
Besides, I've seen others burnt so badly by cubics that I would still
distrust them even when they seem to be doing a good job.
Nick
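
A minimal sketch of the graphical comparison described above, on simulated data. The variable names (lw, fit2, fit3) and the data-generating numbers are assumptions made for illustration, not taken from the thread, and plain regress stands in for the multilevel model. Factor-variable notation needs Stata 11 or later, and the /// continuation needs a do-file.

```stata
* Simulated stand-in for the real data (assumption: ages 20 to 40 weeks)
clear
set seed 2013
set obs 300
generate double age = 20 + 20*runiform()
generate double lw  = -3 + 0.12*age - 0.001*(age - 30)^2 + rnormal(0, 0.05)

* Quadratic and cubic fits; regress stands in for the multilevel model
regress lw c.age##c.age
predict double fit2
regress lw c.age##c.age##c.age
predict double fit3

* Overlay both fitted curves on the data and judge the match by eye
sort age
twoway (scatter lw age, msize(tiny)) (line fit2 age) (line fit3 age), ///
    legend(order(2 "quadratic" 3 "cubic")) ///
    ytitle("log weight") xtitle("age (weeks)")
```

Looking at where the two curves part company, especially near the ends of the age range, is the kind of evidence that a small difference in AIC or BIC cannot show.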
On Thu, Jan 31, 2013 at 10:30 AM, Thomas Norris <[email protected]> wrote:
> Nick,
>
> Thank you for this.
>
> Going back to your previous point that the instability may be due to over-fitting with the cubic polynomial: when I was determining model fit, I was using AIC, BIC and residuals for the model and random effects. The cubic polynomial with 2 random age terms (AIC: -27621.8, BIC: -27539, sd residual random effects: 0.0594447, sd overall model resid: 0.0440447)
> was marginally better than the quadratic with 2 random age terms (AIC: -27611.16, BIC: -27535.92, sd residual random effects: 0.0594943, sd overall model resid: 0.0441144).
>
> I know the figures are all slightly better with cubic, but are they 'better enough' for you to have concluded that the cubic was better?
>
> Many thanks,
>
> Tom
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: 31 January 2013 09:58
> To: [email protected]
> Subject: Re: st: centred mean age
>
> As said, this is just a matter of presentation, although powering very small numbers does strain the calculations. But the same point stands.
>
> Your problem sounds like one where I might use log10() rather than
> ln() for the mundane reason that it would make it a little easier to edit resulting graphs to show weights, not log weights. Even if you are using a log scale for fitting, the reporting should make reference to weight.
>
> In that vein log10(weight in grams) = 3 + log10(weight in kilograms), so a change of units is still pertinent when using logarithms.
>
> It might be even better to use a generalised linear model with log link.
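
A minimal sketch of the log-link suggestion, on simulated data. The thread does not name a family, so family(gamma) is an assumption here, chosen because the variability is said to grow with age. The variable names and numbers are also illustrative, and glm ignores the multilevel structure of the real data.

```stata
* Simulated stand-in: weight in kg, with scatter that grows with the mean
clear
set seed 2013
set obs 500
generate double age    = 20 + 20*runiform()
generate double weight = exp(-3 + 0.12*age) * exp(rnormal(0, 0.1))

* Log link: models log(mean weight), leaving weight itself untransformed
* family(gamma) is an assumption; the thread does not specify a family
glm weight age, family(gamma) link(log)

* Predictions come back on the weight scale, so no back-transformation
predict double mu_hat, mu
```

The design point is that the log link models the log of the mean rather than the mean of the log, so fitted values can be graphed and reported directly as weights.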
>
> Nick
>
> On Thu, Jan 31, 2013 at 9:44 AM, Thomas Norris <[email protected]> wrote:
>
>> Weight is actually on the log scale ( ln(weight) ), not kilograms, as it showed increasing variability with age. This wouldn't have an effect, would it?
>
> Nick Cox
>
>> It is difficult to give really good advice without being able to look at the data, but it seems unusual to me that a cubic polynomial outperforms competitors. Independently of your main issues I'd advise a change of units of measurement if only to ease presentation (e.g. kilograms to grams).
>>
>> Very generally, instability of coefficients often signals possible over-fitting.
>
> On Thu, Jan 31, 2013 at 9:13 AM, Thomas Norris <[email protected]> wrote:
>
>>> Thank you very much for your advice. If I may clarify just so I can progress without doubt. I found that the best fitting multilevel model for my prenatal weight dataset was a cubic polynomial (tried fracpolys and spline). I then decided to centre the age term as it is not intuitive to have an intercept at 0 as, in prenatal life, there should be nothing at zero.
>>>
>>> I have created a dummy variable for ethnicity, to see if there are differences between two ethnic groups, and interacted this with age (pakage, pakage2, pakage3) and centred age (in the centred model).
>>>
>>> The coefficients in the uncentred model were:
>>> Age: 0.256372
>>> Age2: -0.0009669
>>> Age3: -0.0000291
>>> Pak: -0.5843112
>>> Pakage: 0.0617149
>>> Pakage2: -0.0021505
>>> Pakage3: 0.0000234
>>>
>>> In the centred model:
>>> Age: 0.1062287
>>> Age2: -0.0037464
>>> Age3: -0.0000291
>>> Pak: -0.0427686
>>> Pakage: -0.0039254
>>> Pakage2: 0.0000899
>>> Pakage3: 0.0000234
>>>
>>> As people have since told me, it is fine that the coefficients change value after centring, but the interactions of ethnicity with age and age2 have switched sign (positive to negative and vice versa) after centring. Is this what one would expect?
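
Sign changes of this kind follow from the algebra. Writing age = cage + m, where m is the mean, and expanding b1*age + b2*age^2 + b3*age^3 leaves the cubic coefficient at b3, turns the squared coefficient into b2 + 3*b3*m, and turns the linear coefficient into b1 + 2*b2*m + 3*b3*m^2. The two sets of coefficients quoted above fit this pattern for a mean age of roughly 32 weeks. That figure is back-calculated from the squared terms and is not stated in the thread; for example, -0.0009669 + 3*(-0.0000291)*31.84 is about -0.0037465. The sketch below checks the same identities on simulated data. All names and numbers in it are assumptions, and regress stands in for the multilevel model.

```stata
* Simulated stand-in for the real data (assumption: ages 20 to 40 weeks)
clear
set seed 2013
set obs 500
generate double age = 20 + 20*runiform()
generate double y   = 0.25*age - 0.001*age^2 - 0.00003*age^3 + rnormal(0, 0.05)

* Centre age at its mean
summarize age, meanonly
scalar m = r(mean)
generate double cage = age - m

* Uncentred cubic
generate double age2 = age^2
generate double age3 = age^3
regress y age age2 age3
scalar b1 = _b[age]
scalar b2 = _b[age2]
scalar b3 = _b[age3]

* Centred coefficients implied by the binomial expansion
scalar g1 = b1 + 2*b2*m + 3*b3*m^2
scalar g2 = b2 + 3*b3*m

* Centred cubic: same fitted curve, different parameterisation
generate double cage2 = cage^2
generate double cage3 = cage^3
regress y cage cage2 cage3

display "linear:  fitted " _b[cage]  "  implied " g1
display "squared: fitted " _b[cage2] "  implied " g2
display "cubic:   fitted " _b[cage3] "  implied " b3
```

Only the highest-order coefficient is invariant to centring; every lower-order coefficient absorbs a multiple of the higher-order ones, so it can change size or sign without the fitted curve changing at all.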
>
>
>>>> I was under the impression that the age coefficients in a centred model shouldn't be different to an uncentred model though, and mine change.
>>>>
>>>> Is this change therefore ok?
>
> Nick Cox
>
>>>> Whether or not it helps in your model, I see no problem in what you
>>>> describe. It's the way that linear, quadratic and cubic terms work
>>>> together in a model that's important.
>>>>
>>>> All that said, there are quite possibly better ways of doing what
>>>> you want, such as cubic splines or fractional polynomials, which are
>>>> well supported in Stata.
>
> On Wed, Jan 30, 2013 at 7:08 PM, Thomas Norris <[email protected]> wrote:
>
>>>>> I am having trouble with centering my independent variable (age) in a cubic polynomial.
>>>>>
>>>>> I have generated the centred age by using gen centrage = age - r(mean)
>>>>> and then to get the centred quadratic and cubic I simply raise
>>>>> centrage to ^2 and ^3 respectively (gen centrage2 = centrage^2) (gen
>>>>> centrage3 = centrage^3)
>>>>>
>>>>> However, the negative centred age terms (i.e. those smaller than the mean) become positive when squared, which is mathematically correct, but it doesn't help my models.
>>>>>
>>>>> If for example the mean was 30 weeks and I had 2 separate obs, one at 25 weeks and one at 35 weeks, the centred age would be -5 and 5, but the centred age^2 are both 25.
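
A minimal sketch of the centring steps described in this message, on simulated ages. The data are an assumption; the variable names follow the message. Note that r(mean) is only available after a command such as summarize has been run, so the two commands must be adjacent.

```stata
* Simulated ages in weeks (assumption: stands in for the real data)
clear
set seed 2013
set obs 200
generate age = 20 + 20*runiform()

* summarize leaves the mean in r(mean); use it straight away
summarize age, meanonly
generate centrage  = age - r(mean)
generate centrage2 = centrage^2
generate centrage3 = centrage^3

* Centred age has mean zero; the square is never negative,
* while the cube keeps the sign of centrage
summarize centrage centrage2 centrage3
```

Observations at 25 and 35 weeks do share the same squared term, but their linear and cubic terms differ in sign, which is why the terms have to be read together, as the reply above says.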
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/