Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Relative Importance of predictors in regression
From: Lucas <[email protected]>
To: [email protected]
Subject: Re: st: Relative Importance of predictors in regression
Date: Wed, 6 Nov 2013 07:26:11 -0800
David M.,
Thanks for weighing in. Maybe your doing so will help out. Indeed,
what you say is how I have interpreted this issue in the past.
Clearly, in some cases (e.g., X and X^2) one cannot hold one variable
constant and difference the other. In other cases, however, the held
constant interpretation seems completely reasonable (e.g.,
E(Y)=b1*YrsSchl+b2*Sex). [Parenthetically, this is structurally the
same as saying "change is relevant for some models, impossible to
reference for others"--i.e., content matters.]
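
As a sketch of what a "held constant" expression could look like in the predictor (p-dimensional) space, the difference in conditional expectations can be written out for the two cases mentioned above. The symbols S (YrsSchl), G (Sex), s, g, and x are notation introduced here for illustration, and the quadratic model is an assumed example; this does not address the n-dimensional question raised later in the thread.

```latex
% Model as written in the message (no intercept):
%   E(Y | S, G) = b_1 S + b_2 G,  with S = YrsSchl and G = Sex.
% Difference for one more year of schooling, with Sex fixed at g:
\begin{align*}
E(Y \mid S = s + 1,\, G = g) - E(Y \mid S = s,\, G = g)
  &= \bigl[b_1 (s + 1) + b_2 g\bigr] - \bigl[b_1 s + b_2 g\bigr] \\
  &= b_1 .
\end{align*}
% Assumed quadratic model: E(Y | X = x) = b_1 x + b_2 x^2.
% X^2 cannot stay fixed while X moves, so no single coefficient
% carries the difference; it depends on the starting value x:
\begin{align*}
E(Y \mid X = x + 1) - E(Y \mid X = x)
  &= \bigl[b_1 (x + 1) + b_2 (x + 1)^2\bigr] - \bigl[b_1 x + b_2 x^2\bigr] \\
  &= b_1 + b_2 (2x + 1) .
\end{align*}
```

In the first case the result does not depend on s or g, which is what the "held constant" reading asserts; in the second it does depend on x.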
What piqued my interest is that David H. indicated he had a mathematical
expression that would straightforwardly show that "held constant" is
always wrong. Yet, after I asked for it for a couple of days, the
expression has not been conveyed, nor has a citation been provided
(well, two textbooks were cited, but it was unclear which, if either,
had the expression rather than just a differently interpretable
derivation). That's more than a little disappointing.
Perhaps someone else has the expression. If so, it'd be great to
either see it or be pointed to where it can be found.
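
One candidate is the partial-regression identity (often called the Frisch-Waugh-Lovell theorem), which corresponds to the residual-on-residual regression described in words in David Hoaglin's quoted message below. Whether it is the expression he had in mind is an assumption. The following is a minimal numerical sketch in plain Python; the data values and the helper names (solve, ols, residuals) are made up for illustration.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares coefficients of y on the predictor columns."""
    k = len(columns)
    xtx = [[sum(a * b for a, b in zip(columns[i], columns[j]))
            for j in range(k)] for i in range(k)]
    xty = [sum(a * b for a, b in zip(columns[i], y)) for i in range(k)]
    return solve(xtx, xty)


def residuals(columns, y):
    """Residuals of y after regressing it on the predictor columns."""
    coef = ols(columns, y)
    return [yi - sum(c * col[i] for c, col in zip(coef, columns))
            for i, yi in enumerate(y)]


# Made-up illustrative data (not from the thread).
ones = [1.0] * 8
x1 = [8, 10, 12, 12, 14, 16, 16, 18]    # e.g. years of schooling
x2 = [0, 1, 0, 1, 1, 0, 1, 0]           # e.g. a 0/1 indicator
y = [20, 27, 31, 35, 40, 44, 49, 50]

# Coefficient on x2 from the full regression of y on (1, x1, x2).
b_full = ols([ones, x1, x2], y)

# Same coefficient from regressing the y-residuals on the x2-residuals,
# where both sets of residuals come from regressions on (1, x1).
ry = residuals([ones, x1], y)
rx2 = residuals([ones, x1], x2)
b2_partial = sum(a * b for a, b in zip(rx2, ry)) / sum(a * a for a in rx2)

print("b2 from full regression:   ", b_full[2])
print("b2 from partial regression:", b2_partial)
```

The two printed values agree up to rounding. The sketch shows only that the coefficient equals the slope relating Y adjusted for the other predictors to X2 adjusted for the other predictors; it does not by itself decide which verbal interpretation is preferable.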
Or, perhaps there is no such expression. No disrespect intended.
But, we cannot accept a claim--or expect our students or clients to
accept a claim--on the basis of someone saying, "I have the evidence
here, I just can't show it to you."
Sam
On Wed, Nov 6, 2013 at 6:38 AM, David Muller <[email protected]> wrote:
> I may be misunderstanding or mischaracterising David Hoaglin's
> problems with the term "holding constant" for describing adjustment
> for covariates in multiple regression, so forgive me for interjecting
> if I am off the mark.
>
> I think the main issue is that the data used to fit the model won't
> necessarily support a difference/change in one variable with all other
> variables held constant. This is trivially the case when, for
> instance, both x and x^2 are used as predictors. When data are sparse
> or continuous it is also unlikely that there will be observations that
> differ on one variable but are _identical_ on all others.
>
> Personally, I don't think this is a big deal. If one sees regression
> coefficients as differences in conditional expectations, then the
> "held constant" interpretation is just a model-based interpolation or
> extrapolation. It's up to the person fitting and interpreting the
> model to justify any such extrapolation.
>
> All the best,
> David Muller
>
>
> On 6 November 2013 01:19, Lucas <[email protected]> wrote:
>> Dear David,
>>
>> I am confused. You first write the following (emphasis capitalization added):
>>
>> "I would add a note of caution, however. Nathans et al. (and many
>> others) interpret a beta weight (or a regression coefficient more
>> generally) in a way that involves holding all the other predictor
>> variables constant. The "held constant" part of that interpretation
>> is not correct. STRAIGHTFORWARD MATHEMATICS shows that it does not
>> reflect the way that multiple regression actually works."
>>
>> In response I wrote:
>>
>> "What would be the mathematical expression for "held constant"? And
>> what is the mathematical expression to which you are comparing it that
>> leads you to reject "held constant"? Thanks a bunch!"
>>
>> It seemed to me both pieces of information would be necessary for
>> someone to rule that one is appropriate and the other wrong (or, at
>> least, it should be demonstrable that the wrong one has no formal
>> expression). To this David replied:
>>
>> "I'm not sure what you mean by "the mathematical expression for 'held
>> constant,'" other than setting each of the other predictors to some
>> particular value."
>>
>> This latter reply suggests David and I agree that a mathematical
>> expression will be an equation--not a derivation. I responded,
>> writing:
>>
>> "I presumed you had a mathematical representation of the two
>> interpretations and could then show that the former is wrong because
>> the actual regression model is accurately represented by the latter.
>> However, instead of a formula, you provided more text, which is
>> necessarily somewhat imprecise."
>>
>> In that message I introduced a critique of David's use of "change" when
>> "difference" is generally correct--the aim of doing so was to suggest
>> that maybe we all can cut each other some slack. I had expected David
>> to just say, "Sure, yeah, that's right, my bad" but David resists this
>> obvious fact. Okay, fine--it's a general discussion, but he prefers to
>> use the specific language. Anyway, David does address the request for
>> a mathematical expression by responding that:
>>
>> "I do have all the necessary mathematical expressions for the proper
>> general interpretation. A plain-text message, however, is not
>> suitable for displaying them. I am not aware of a mathematical
>> representation of the "held constant" interpretation in the
>> n-dimensional geometry in which ordinary least squares operates. It
>> is easy to represent the "held constant" interpretation in the
>> p-dimensional geometry, but that is not the relevant geometry. The
>> absence of a representation for the "held constant" interpretation in
>> the n-dimensional geometry is evidence for its lack of validity. If
>> you have a suitable representation in mind, I would be interested in
>> seeing it."
>>
>> I have not offered a representation because I have not maintained one
>> is right and the other wrong, so it seems I would not be required to
>> distinguish two things I am not sure can be distinguished. In an
>> effort to understand David's point, every response I have written
>> since has been asking for one simple thing: Where can I find this
>> point made in n-dimensional geometry?
>>
>> Other matters are not directly relevant--David won't accept that if
>> you have 2 terms, one general, and one specific, the general applying
>> everywhere, the specific applying in a smaller subset, one should use
>> the general language. Pedagogically and scientifically this seems
>> obvious. Okay. This just means this is not the ideal speech
>> community one might have hoped. Still, I ask--which of the two
>> textbooks David mentioned has the n-dimensional expression David
>> intimated existed? Does either of them have it? Both? Neither? If
>> neither, is there another citation to which I (we?) could turn? Just
>> answering this question with the relevant citation(s) would be
>> immensely helpful. Of course, it is not your job to be helpful. But
>> you've made this point several times on statalist, which led me to
>> think you might want people to get the point. I'm asking for help in
>> getting the point. Rather than more analogies and your plain text
>> derivations (which you indicate are intrinsically sub-optimal), a
>> citation I (and perhaps others) can peruse would be incredibly
>> helpful.
>>
>> Again, thanks a bunch!
>>
>> Sam
>>
>> On Tue, Nov 5, 2013 at 9:26 AM, David Hoaglin <[email protected]> wrote:
>>> Dear Sam,
>>>
>>> It would help communication if you explained, as specifically as
>>> possible, what sort of "mathematical expression" you are looking for.
>>>
>>> The material in my previous message that you reject as a "mathematical
>>> manipulation" needs only one further step, involving straightforward
>>> algebra: In the result of regressing the Y-residuals on the
>>> X2-residuals, multiply out the right-hand side, rearrange the equation
>>> to leave only Y on the left-hand side, and compare the result term by
>>> term against the original model. Since the adjustments for the
>>> contributions of the other predictors are shown explicitly, the
>>> interpretation of b2 is clear. Please explain how you would interpret
>>> the demonstration differently.
>>>
>>> The fact that regression coefficients are a type of slope does not
>>> provide any basis for the "held constant" interpretation. I do not
>>> see the connection between a regression model and your analogy of the
>>> position of two people on a hill. Please explain further.
>>>
>>> When you said that I "retain one mis-interpretation of the regression
>>> model that is extremely elementary and easily corrected," I assume you
>>> are referring to the distinction that you make between "change" and
>>> "difference." I explained earlier that I would use words appropriate
>>> to the particular context and application, so I am not making any
>>> mis-interpretation.
>>>
>>> I remind you that you have not offered any mathematical expression for
>>> the "held constant" interpretation.
>>>
>>> Regards,
>>>
>>> David Hoaglin
>>>
>>> On Tue, Nov 5, 2013 at 9:37 AM, Lucas <[email protected]> wrote:
>>>> Hi David,
>>>>
>>>> I am looking for the mathematical expression you indicated would make
>>>> it clear which interpretation is correct. The mathematical
>>>> manipulation isn't very helpful, because someone who already
>>>> interprets the issue differently than you do can also interpret this
>>>> demonstration differently than you do. So, does either of those books
>>>> have the mathematical expression you mentioned? If so, I'll check it out.
>>>>
>>>> On change vs. difference, discrete things change or do not, and
>>>> non-discrete things change or do not. The distinction between "change
>>>> and difference" is orthogonal to the distinction between "discrete and
>>>> non-discrete."
>>>>
>>>> Indeed, the analogy you deploy to support the change interpretation,
>>>> using slopes and hills, is one reason people say "held constant." The
>>>> difference (slope) between my height on the hill and Joe's height on
>>>> the hill is distinct from (and independently estimable given) our
>>>> horizontal placement on the hill. Horizontal placement, thus, is "held
>>>> constant." If this is incorrect, it shows why analogies are less
>>>> helpful than mathematical expressions. Thus, my request for the
>>>> mathematical expression you indicated was available.
>>>>
>>>> I do not understand why you retain one mis-interpretation of the
>>>> regression model that is extremely elementary and easily corrected,
>>>> but are adamant that everyone else is wrong if they use (what you
>>>> call) another mis-interpretation of the model, a mis-interpretation
>>>> that 1) can be shown with straightforward mathematical expressions but
>>>> then 2) seems so complex that it cannot be written in plain text.
>>>>
>>>> Anyway, please let me know which of those textbooks has the
>>>> mathematical expression you referenced earlier. I'll pull it from the
>>>> library and take a look.
>>>>
>>>> Thanks!
>>>>
>>>> Sam
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/