Re: st: Quantile regression

From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Quantile regression
Date: Sat, 22 Sep 2012 10:52:20 +0100

I will number your commands for ease of discussion.

1. xtile bmi_q = bmi, nquantiles(4)

2. bysort bmi_q sex:sum glucose, detail

3. bysort sex: anova glucose_log bmi_q

4. bysort sex: qreg bmi glucose age

#2 gives descriptive statistics, which no doubt could be useful. I
would expect graphs to be as or more useful, e.g.

scatter glucose bmi || lowess glucose bmi, by(sex)

#1 and #3 are choices that seem very hard to defend in any statistical
discussion. You are throwing away information on variability within
quartile groups of -bmi- and degrading the data.
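
A minimal sketch of the contrast, not the poster's actual analysis: the data are simulated stand-ins (the variable names follow the thread, the values and coefficients are invented), and the second model simply keeps -bmi- on its measured scale instead of cutting it into four groups.

```stata
* Simulated stand-in data: names follow the thread, values are invented
clear
set seed 2012
set obs 500
generate byte sex = runiform() < 0.5
generate age = 30 + 40*runiform()
generate bmi = rnormal(26, 4)
generate glucose = exp(1.6 + 0.004*bmi + 0.001*age + rnormal(0, 0.1))

* Quartile approach from the thread: variation in bmi within each group is discarded
xtile bmi_q = bmi, nquantiles(4)
anova glucose bmi_q

* Alternative: use bmi as a continuous predictor
regress glucose bmi
```

The second fit uses one model degree of freedom instead of three and draws on every observed value of -bmi-. Whether a straight line is adequate is a separate question that the graph suggested earlier can help answer.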

#4 is puzzling too. Why expect a linear relation between -bmi- and its
predictors? If there are different relationships according to -sex-,
the most usual tactic is not to fit separate models, but to fit a
joint model with interactions between age and sex.

If -glucose- is the response, it should not be the predictor in #4.
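
One way to sketch a joint model with -glucose- as the response, assuming simulated stand-in data (names follow the thread; values, coefficients and the choice of which predictors interact with -sex- are illustrative assumptions, not something stated in the thread):

```stata
* Simulated stand-in data: names follow the thread, values are invented
clear
set seed 2012
set obs 500
generate byte sex = runiform() < 0.5
generate age = 30 + 40*runiform()
generate bmi = rnormal(26, 4)
generate glucose = exp(1.6 + 0.004*bmi + 0.003*bmi*sex + 0.001*age + rnormal(0, 0.1))

* One joint model: glucose is the response, predictors interact with sex
regress glucose c.bmi##i.sex c.age##i.sex

* Joint test of whether the slopes differ by sex
testparm i.sex#c.bmi i.sex#c.age

* Slope of glucose on bmi, reported separately for each sex
margins sex, dydx(bmi)
```

Compared with -bysort sex:-, a single fit like this gives a direct test of whether the relationships differ between the sexes, rather than leaving the reader to compare two separate sets of p-values.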

Why is glucose treated as linear in one model and logged in another?
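
If the logged scale is the one chosen, it can be used throughout, and it happens to speak to the fold-change question. The following is a sketch under that assumption, again with simulated stand-in data (names follow the thread, values are invented):

```stata
* Simulated stand-in data: names follow the thread, values are invented
clear
set seed 2012
set obs 500
generate age = 30 + 40*runiform()
generate bmi = rnormal(26, 4)
generate glucose = exp(1.6 + 0.004*bmi + 0.001*age + rnormal(0, 0.1))

* Use one scale for the response in every model; here, the log scale
generate glucose_log = ln(glucose)
regress glucose_log bmi age

* exp(coefficient) is the multiplicative (fold) change in the geometric
* mean of glucose for a one-unit increase in bmi, holding age fixed
display "Fold change per unit of bmi: " exp(_b[bmi])
```

On this scale a coefficient describes a ratio rather than a difference, so "fold change" has a direct estimate without any need to form quartiles.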

This is not my field, but I find it difficult to imagine that the
science _demands_ thinking in terms of quartiles. Quartiles are at best
a convenient categorisation and at worst an arbitrary and inefficient
one.

Identifying a best predictor is never easy and often futile.

Nick

On Sat, Sep 22, 2012 at 9:05 AM, Vasan Kandaswamy
<[email protected]> wrote:

> Thank you very much. I sincerely apologize for not having made my question clear.
>
> The scientific questions that I would like to address are:
> 1. How large a fold increase in the outcome variable (glucose) is observed from Quartile 1 to Quartile 4 of the predictor variable (BMI), and is this difference across quartiles significant?
> 2. How much unit change is observed in the outcome variable?
> 3. With various predictors (BMI, waist, body fat, weight, etc.), I want to see which one best predicts the outcome variable.
> 4. I would like to see all analyses separately for men and women.
>
> To address these, I went about it this way:
> 1. I derived the mean/median of the outcome variable in each quartile.
> 2. To compare the mean of glucose across quartiles of BMI for males (not to compare the male mean and female mean in each quartile), I intend to do a one-way ANOVA (but a two-way was suggested).
> 3. To observe the unit change across quartiles, I wanted to fit a regression model using qreg.
> 4. Finally, I am not sure how to go about finding out which is the best predictor of the outcome. (If I am not mistaken, I cannot get a standardized beta from qreg.)
>
> The commands I used are:
> xtile bmi_q = bmi, nquantiles(4)
> bysort bmi_q sex:sum glucose, detail
> bysort sex: anova glucose_log bmi_q
> bysort sex: qreg bmi glucose age
>
> I hope I have made it more understandable now.
> It would be very useful to have your suggestions on these.

David Hoaglin [[email protected]]

> I'm puzzled.  From the way in which you described your analysis in
> your first message, I don't understand why you would use quantile
> regression.  As I recall, you wanted to compare the means of some
> variables across quartiles of BMI for males and females.  In that
> description, it was not clear to me whether you wanted to compare the
> mean of a variable in data from males among the quartiles of BMI and
> similarly in data from females, or whether you wanted to compare the
> female mean and the male mean within each quartile of BMI, or whether
> you wanted to make both of these types of comparisons.  I did not see
> any mention of the numbers of observations or the source of the data
> or, importantly, the scientific question that you are addressing.
>
> As I read the command below, you are asking -qreg- to fit a
> regression model to the median of BMI with predictors fast_glucose,
> etc. (the median is the default quantile in -qreg-).  This seems far
> from what you set out to do.
>
> Those of us who are following this thread would be better able to
> advise you if you went back to the beginning and gave us more
> information on the data and the context.  I do not know, for example,
> whether the data that you are analyzing are suitable for ANOVA.  They
> may be (perhaps after a transformation), and you may have given up on
> ANOVA too quickly.

> On Wed, Sep 19, 2012 at 5:33 PM, Vasan Kandaswamy

>> Now, I have given up on ANOVA since I cannot derive p values for each gender separately, but did a regression.
>>
>> The quantile regression is set up this way:
>> bysort bmi_q sex:sum g0mmol
>> bysort sex: qreg bmi fast_glucose age pr ( adjusted for age)
>>
>> I tabulate the output this way
>> Sex      BMI Q1   BMI Q2   BMI Q3   BMI Q4   Beta (95% CI)        P value
>> Male     5.3      5.4      5.5      5.6      2.61 (1.46, 3.76)    8.91 x 10^-06
>> Female   5.4      5.4      5.4      5.7      0.36 (-0.15, 0.86)   0.168
>>
>> If you actually look at the mean glucose values in Q1-Q4, there is not much difference, but the regression shows a clear difference, with the p value significant for males but not for females.
>>
>> Could you please explain whether my approach is correct?
>> The basic question I would like to ask is whether the fold change from Q1 to Q4 is significant.