Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Quantile regression
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Quantile regression
Date: Sat, 22 Sep 2012 10:52:20 +0100
I will number your commands for ease of discussion.
1. xtile bmi_q = bmi, nquantiles(4)
2. bysort bmi_q sex:sum glucose, detail
3. bysort sex: anova glucose_log bmi_q
4. bysort sex: qreg bmi glucose age
#2 gives descriptive statistics, which no doubt could be useful. I
would expect graphs to be as or more useful, e.g.
scatter glucose bmi || lowess glucose bmi, by(sex)
#1 and #3 are choices that seem very hard to defend in any statistical
discussion. You are throwing away information on variability within
quartile groups of -bmi- and degrading the data.
#4 is puzzling too. Why expect a linear relation between -bmi- and its
predictors? If there are different relationships according to -sex-,
the most usual tactic is not to fit separate models, but to fit a
joint model with interactions between age and sex.
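A minimal sketch of such a joint model, assuming the variable names from the commands above, that -glucose- is the response, and that -sex- is a numeric indicator (the 0/1 coding is an assumption, not something stated in the thread):

```stata
* Sketch only, not the poster's analysis. Assumes glucose, bmi and age are
* numeric and sex is a numeric 0/1 indicator.
* One joint model: the slopes of bmi and age are allowed to differ by sex.
regress glucose c.bmi##i.sex c.age##i.sex

* Joint test of whether either slope differs between the sexes.
testparm i.sex#c.bmi i.sex#c.age

* Estimated slope of glucose on bmi within each sex.
margins sex, dydx(bmi)

* The same specification for the median rather than the mean.
qreg glucose c.bmi##i.sex c.age##i.sex
```

If the interaction terms turn out to be negligible, a simpler model with common slopes may be adequate.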
If -glucose- is the response, it should not be the predictor in #4.
Why is glucose treated as linear in one model and logged in another?
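If fold changes are the quantity of interest, one consistent option is to model the logarithm of -glucose- throughout and exponentiate the coefficients. The following is a hedged sketch under stated assumptions: -glucose- is strictly positive, -sex- is coded 0/1, the new variable name ln_glucose is hypothetical, and the 5-unit step in -bmi- is arbitrary and chosen only for illustration.

```stata
* Sketch only. Assumes glucose is strictly positive and sex is coded 0/1.
* ln_glucose is a hypothetical variable name.
generate double ln_glucose = ln(glucose)
regress ln_glucose c.bmi##i.sex age

* Ratio of geometric mean glucose for a 5-unit increase in bmi, sex == 0.
lincom 5*bmi, eform

* The same ratio for sex == 1.
lincom 5*bmi + 5*1.sex#c.bmi, eform
```

This keeps -bmi- continuous, so the fold change comes with a confidence interval without first cutting -bmi- into quartile groups.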
This is not my field, but I find it difficult to imagine that the
science _demands_ thinking in terms of quartiles. Quartiles are at best
a convenient categorisation and at worst an arbitrary and inefficient
one.
Identifying a best predictor is never easy and often futile.
Nick
On Sat, Sep 22, 2012 at 9:05 AM, Vasan Kandaswamy
<[email protected]> wrote:
> Thank you very much. I sincerely apologize for not having made my question clear.
>
> The scientific questions that I would like to address are:
> 1. What fold increase in the outcome variable (glucose) is observed from Quartile 1 to Quartile 4 of the predictor variable (BMI), and is this difference across quartiles significant?
> 2. How much unit change is observed in the outcome variable?
> 3. With various predictors (BMI, waist, body fat, weight etc.), I want to see which one best predicts the outcome variable.
> 4. I would like to see all analyses separately for men and women.
>
> To address these : I went about this way
> 1. derived mean/median of outcome variable in each quartile
> 2. To compare the mean of glucose across quartiles of BMI for males (not to compare the male mean and female mean in each quartile), I intend to do a one-way ANOVA (but a two-way was suggested).
> 3. To observe the unit change across quartiles, I wanted to do a regression model using qreg.
> 4. Finally, I am not sure as to how to go about with finding out which is the best predictor of the outcome. ( If I am not mistaken, I do not think I can do a standardized beta in qreg).
>
> The script I used are
> xtile bmi_q = bmi, nquantiles(4)
> bysort bmi_q sex:sum glucose, detail
> bysort sex: anova glucose_log bmi_q
> bysort sex: qreg bmi glucose age
>
> I hope I have made it more understandable now.
> Would be really very useful if I have your suggestions on these.
David Hoaglin [[email protected]]
> I'm puzzled. From the way in which you described your analysis in
> your first message, I don't understand why you would use quantile
> regression. As I recall, you wanted to compare the means of some
> variables across quartiles of BMI for males and females. In that
> description, it was not clear to me whether you wanted to compare the
> mean of a variable in data from males among the quartiles of BMI and
> similarly in data from females, or whether you wanted to compare the
> female mean and the male mean within each quartile of BMI, or whether
> you wanted to make both of these types of comparisons. I did not see
> any mention of the numbers of observations or the source of the data
> or, importantly, the scientific question that you are addressing.
>
> As I read the command below, you are asking -qreg- to fit a
> regression model to the median of BMI with predictors fast_glucose,
> etc. (the median is the default quantile in -qreg-). This seems far
> from what you set out to do.
>
> Those of us who are following this thread would be better able to
> advise you if you went back to the beginning and gave us more
> information on the data and the context. I do not know, for example,
> whether the data that you are analyzing are suitable for ANOVA. They
> may be (perhaps after a transformation), and you may have given up on
> ANOVA too quickly.
> On Wed, Sep 19, 2012 at 5:33 PM, Vasan Kandaswamy wrote:
>> Now, I have given up on ANOVA since I cannot derive p values for gender separately, but did a regression.
>>
>> A quantile regression this way comes up this way
>> bysort bmi_q sex:sum g0mmol
>> bysort sex: qreg bmi fast_glucose age pr ( adjusted for age)
>>
>> I tabulate the output this way
>> BMI      Q1    Q2    Q3    Q4    Beta (95% CI)         P value
>> Male     5.3   5.4   5.5   5.6   2.61 (1.46, 3.76)     8.91 x 10^-06
>> Female   5.4   5.4   5.4   5.7   0.36 (-0.15, 0.86)    0.168
>>
>> IF you actually look at the mean glucose values in Q1-Q5, there is not much difference, but the regression shows a clear difference with p values of males significant, while females are not.
>>
>> Could you please explain if my approach is correct.
>> The basic question I would like to ask is if the fold change from Q1 to Q5 is significant.