Re: st: Graphing Quadratic Interactions
This request has generated no replies as yet, and it's not hard to see
why. Note, especially, "I need a pretty graph".
I looked up "pretty graph" in Stata and there's no routine available,
so we'll have to think this one out. Apologies - this is a long post.
On 14 Dec 2008, at 16:41, Susanna Khavul wrote:
I would like to graph a quadratic interaction after running rreg
(Robust Regression). The model is as follows:
xi:rreg Y i.X1 X2 X3 X4 X5 X5squared X5*X4 X5squared*X4
(interaction terms are centered)
Where:
X5 -- independent variable
X5squared -- square of the independent variable
X4 -- moderator
X5*X4 -- independent variable x moderator
X5squared*X4 -- independent variable squared x moderator
The rest (X1 X2 X3) are controls which I want to account for as well.
One standard deviation above and below the mean would be fine.
I want to include both components of the interaction (linear and
quadratic). Is there a module that will do it? If not, what is the
optimum graphing command?
Any help would be much appreciated. I need a pretty graph.
It appears that X2, X3 and X4 are binary and that X1 has more than two
categories. Even more difficult, X5 enters two interaction terms.
Of course, there's no general answer to this, and a graph can only be
constructed on the basis of the scientific question that the analysis
answers.
The first question is whether any of the predictor variables defines
subsets of the data which are of primary interest. There may be two
reasons for this:
1. The relationship is different in each subset. For example, in our
National Health and Lifestyle Survey, we found that the relationship
of well-being to age was different in men and women. In men, it
declined continuously, while in women it declined sharply initially
then recovered in middle and later life.
2. There may be a priori reasons for testing the hypothesis in two
subsets. We've got a paper in press looking at the effect of worry on
quality of life. Because of its association with depression, we have
graphed this relationship in the depressed and the non-depressed
participants, side by side, to show that the relationship is similar
in depressed and non-depressed people, but that the depressed have a
worse baseline quality of life. (Interestingly, non-depressed severe
worriers have a quality of life just as bad as non-worried depressed
people).
If either of these two conditions is true, you probably need to
construct a chart using -by- to make separate graphs for each subgroup.
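
As a minimal sketch of such a display, assuming Stata's bundled auto data as a stand-in (price, weight and foreign play the roles of outcome, predictor and subgroup; none of them come from the original question):

```stata
* Sketch only: the auto data stand in for the real variables
sysuse auto, clear

* One panel per subgroup: raw data with a quadratic fit overlaid
twoway (scatter price weight) (qfit price weight), by(foreign)
```

Putting the panels side by side on common axes makes it easy to see whether the shape of the relationship, or only its level, differs between subgroups.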
The second question is whether any of the covariates can be
'standardised out'. By this I mean constructing the graphs at a chosen
value of the covariate.
For example, worry declines very significantly with age, as does
quality of life. For this reason, we constructed our graphs of quality
of life for a person aged 75 (using -adjust-). With continuous
predictors which are not of primary interest, but which must be
controlled as confounders, graphing relationships at fixed (sensible)
values of the predictors may be the best way of displaying the data.
The same logic may be applied to binary predictors, again using
-adjust- to generate predictions for fixed prevalences of the predictors.
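
For a model shaped like the one in the question, the same idea can be sketched by hand. The code below does not use -adjust-; it computes predictions directly from the stored coefficients, holding one control at its mean. Everything in it is an assumption made for illustration: the auto data stand in for the real variables (weight for X5, the binary foreign for the moderator X4, mpg for a control), and the names w, w2, wf, w2f, yhat0 and yhat1 are made up.

```stata
* Sketch only: auto data and all new variable names are stand-ins
sysuse auto, clear

* Centre the predictor, then build the quadratic and interaction terms
summarize weight, meanonly
generate double w   = (weight - r(mean))/1000   // centred, in 1000 lb
generate double w2  = w^2
generate double wf  = w*foreign
generate double w2f = w2*foreign

rreg price w w2 foreign wf w2f mpg

* Predicted curves for each moderator value, control fixed at its mean
summarize mpg, meanonly
local mpgbar = r(mean)
generate double yhat0 = _b[_cons] + _b[w]*w + _b[w2]*w2 + _b[mpg]*`mpgbar'
generate double yhat1 = yhat0 + _b[foreign] + _b[wf]*w + _b[w2f]*w2

twoway (line yhat0 w, sort) (line yhat1 w, sort), ///
    legend(order(1 "Domestic" 2 "Foreign")) ///
    ytitle("Predicted price") xtitle("Weight (centred, 1000 lb)")
```

Both curves are drawn over the whole observed range of the predictor, so part of each may be an extrapolation beyond the data for that subgroup; restricting each line with -if- to the range actually observed is probably wiser in practice.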
This leaves us with the final question of the form of the graph.
Again, without knowing the science behind the question, it's
impossible to answer. However, adding one standard deviation above and
below the mean is probably not a good data display. The standard
deviation measures the scatter of observations, and if you're going to
show scatter, then you are probably better off graphing the actual
data, whose scatter can be quite different to that implied by the
standard deviation. Adding boxes or means plus confidence intervals
allows you to superimpose a data summary (both implemented in Nick
Cox's remarkably useful -stripplot-, which is the first user-written
graphic routine I require my students to download).
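
A rough sketch of both displays follows, again using the auto data as a stand-in. The box and bar options are as I understand them from the -stripplot- help file; check -help stripplot- for the version you install.

```stata
* stripplot is user-written (Nick Cox); install it once from SSC
ssc install stripplot

sysuse auto, clear

* Raw data points with a box plot summary superimposed
stripplot mpg, over(foreign) box

* Raw data points with means and confidence intervals superimposed
stripplot mpg, over(foreign) bar
```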
That's as much as I can say about the graph based on general
principle. A good graph shows something interesting about the data. To
make a good graph, you must first identify that interesting something.
And this, clearly, is impossible in the absence of knowing either the
hypothesis or the results of the analysis.
However, I would be wary of -rreg- as a primary analysis tool. It is a
good procedure for reassuring yourself that the results of your
analysis would not change substantively if influential observations
were down-weighted, but as a primary model-building tool it suffers, I
think, from a fundamental difficulty: that it violates the logic of
hypothesis testing.
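
Used as a check rather than as the primary analysis, the comparison might look like the sketch below. The auto data, the model and the names ols, rob and rw are assumptions for illustration, and the 0.5 cut-off for listing down-weighted observations is arbitrary.

```stata
* Sketch only: data, model and stored names are stand-ins
sysuse auto, clear

* Primary analysis: ordinary least squares
regress price weight mpg foreign
estimates store ols

* Sensitivity check: robust regression, keeping the final weights
rreg price weight mpg foreign, genwt(rw)
estimates store rob

* Coefficients and standard errors side by side
estimates table ols rob, b(%9.2f) se

* Which observations were heavily down-weighted?
list make price rw if rw < 0.5, noobs
```

If the two columns tell the same substantive story, the least-squares results are probably not being driven by a few influential observations.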
In a hypothesis test, the investigator specifies the form of the model
and then calculates the model parameters. For example, the model
baby weight = a constant + (mother's height x something1) +
(gestational age x something2) + error
needs three parameters calculated. For each parameter, we can test
whether the proportional reduction in error is statistically
significant.
However, robust regression rebuilds the model as a complex equation in
which individual observations are entered in a reweighted form. Thus,
the investigator is leaving the selection of the model itself to
chance features of the data. As such, the hypothesis tests which
follow violate the central assumption of any such test - that the
model was specified independently of the data.
Perhaps I'm being a little jaundiced here, but I use -rreg- for
support, not illumination.
Ronan Conroy
=================================
Royal College of Surgeons in Ireland
Epidemiology Department,
Beaux Lane House, Dublin 2, Ireland
+353 (0)1 402 2431
+353 (0)87 799 97 95
+353 (0)1 402 2764 (Fax - remember them?)
http://rcsi.academia.edu/RonanConroy