Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Clarification requested about the at() option of -margins-
From: Richard Williams <[email protected]>
To: [email protected], [email protected]
Subject: Re: st: Clarification requested about the at() option of -margins-
Date: Thu, 24 Oct 2013 13:30:30 -0500
At 12:03 PM 10/24/2013, Trevor Zink wrote:
Thanks, Richard.
I had actually run across some of your materials before; they were helpful.
My actual problem is obviously more complex than the simple example I
illustrated with. At what point (of complexity) is -margins- no longer
"just plugging numbers into formulas"? In my actual problem I'm not
using interactions, but I am using multiple regressors and factor variables.
I'm not sure that it ever stops. If you have a bunch of other
variables in the model, it may be plugging in means for them, or (if
using asobserved) it may be doing calculations on a case by case
basis and averaging them. If you give it a specific number it will do
the calculation with that number.
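The three behaviours described above can be sketched roughly as follows. This is only an illustration: the model (foreign on weight and mpg in the auto data) and the value 3000 are assumptions chosen for the example, not Trevor's actual model.

```stata
* Sketch: three ways -margins- can treat the covariates in a model.
* Assumed example model; not taken from the thread.
sysuse auto, clear
logit foreign weight mpg

* 1. Plug in the sample mean of every covariate
margins, atmeans

* 2. Default (asobserved): compute a prediction for each car using its
*    own covariate values, then average those predictions
margins

* 3. Fix weight at a specific number; mpg is left as observed and the
*    predictions are averaged over the sample
margins, at(weight=3000)
```

In a nonlinear model such as logit, the first two results will generally differ, because the prediction at the means is not the same thing as the mean of the predictions.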
Whenever you do these calculations, you can add a qualifier like
"assuming the model is correct." The model may not be correct for all
numbers, especially numbers that fall out of sample. For example,
even if a model is correct for weights that fall between 2000 and
5000 pounds, there is no guarantee that it will be correct for, say,
a 10,000 pound car. If you actually had 10,000 pound cars in your
sample you might find that after 5,000 pounds the slope changes, or
that you need an x^2 term, or whatever.
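A rough sketch of this point, assuming the auto data and an mpg-on-weight regression (an example model chosen here for illustration, not one used in the thread): -margins- will report a prediction for a 10,000 pound car under either specification, but the two specifications need not agree once you leave the observed range.

```stata
* Sketch: the same out-of-sample value under two assumed specifications.
sysuse auto, clear
summarize weight            // observed weights run from 1,760 to 4,840 pounds

* Linear in weight
regress mpg weight
margins, at(weight=(2000 5000 10000))

* Adding a squared term via factor-variable notation
regress mpg c.weight##c.weight
margins, at(weight=(2000 5000 10000))
```

Neither set of predictions at 10,000 pounds is backed by data. Comparing them mainly shows how much the answer depends on the functional form you assumed.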
There are also other weird sorts of calculations that margins can do,
like compare the predicted probability of success for a 70 year old
who is retired with the predicted probability of success for an 18
year old who also happens to be retired.
So again I would say, you can plug in whatever numbers you want into
a formula, but that doesn't mean the results will be right or
sensible. But margins is plugging numbers into formulas. You have to
think about whether the numbers you are feeding it are sensible.
Thanks
Trevor
On 10/24/2013 10:24 AM, Richard Williams wrote:
At 11:01 AM 10/24/2013, Trevor Zink wrote:
Paul,
Thanks very much for your detailed answer. If I may ask a few
follow-up questions to make sure I understand properly...
1) "-margins- converts log-odds (and their slopes) to
probabilities (and their slopes) for us". I have spent a lot of
time reading about -margins- over the past few weeks, and I can't
recall ever hearing it explained like this. Is this really what
-margins- does? Simply converts from log-odds back to probability?
If so, that is great news--it makes the interpretation of the
output much easier.
2) "Although the slope for the log-odds is fixed, that for the
probability is not. As 0 and 1 are approached, the slope tends to
0, and the possible values and SE are also constrained". So what
you're saying is that the slope going to 0 near 0 and 1 isn't
because of any extrapolation, as I had assumed; it's simply a
product of mapping onto the logistic function?
3) You used both -margins, at()- and -margins, dydx() at()-. My
understanding of the difference after reading your answer is that
-margins, at()- gives the /probability/ of Y==1 at the specified
values of X, whereas -margins, dydx() at()- gives the /change in
the probability/ of Y==1 from an infinitesimal change in X at the
specified values of X. Correct?
(as a side note, I wouldn't have expected weight to predict
foreign vs domestic as well as it does)
Thanks again for your answer.
Trevor
Trevor, particularly for a simple problem like yours, margins is
just plugging numbers into formulas. So, for example, if you had a formula like
y = 2 + 3*x
you could plug in whatever value you wanted for x (including a
totally nonsensical one, e.g. a negative value for weight) and you
could get a value for y. Margins doesn't know or care whether the
numbers are sensible or not. Sometimes it is realistic to go a bit
outside the observed sample range, e.g. try a 5,000 pound car, but
in this case it would be silly to go up to 100,000 pounds. But you
have to figure that out, not margins.
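As a small sketch of "plugging numbers into a formula", the following fits a linear regression (mpg on weight in the auto data, an assumed example), computes the prediction by hand from the stored coefficients, and checks that -margins- returns the same number. The scalar name yhat_hand and matrix name M are made up for the illustration.

```stata
* Sketch: -margins, at()- reproduces a hand calculation from the coefficients.
sysuse auto, clear
regress mpg weight

* By hand: intercept + slope * 5000
scalar yhat_hand = _b[_cons] + _b[weight]*5000
display scalar(yhat_hand)

* The same value via -margins-
margins, at(weight=5000)
matrix M = r(b)
assert reldif(M[1,1], scalar(yhat_hand)) < 1e-6

* A nonsensical value works just as well; -margins- does not object
margins, at(weight=-1000)
```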
If you want to know more about how margins works, see
http://www3.nd.edu/~rwilliam/xsoc73994/Margins01.pptx
http://www3.nd.edu/~rwilliam/xsoc73994/Margins02.pdf
http://www3.nd.edu/~rwilliam/xsoc73994/Margins03.pdf
On 10/24/2013 3:13 AM, Seed, Paul wrote:
Dear Statalist,
Trevor Zink asks why -margins- does not behave as he would expect following
logistic regression.
The answer is found only by going back to exactly what logistic regression
actually does, and how it compares to linear regression.
Linear regression is carried out under an assumption of constant slope, and
therefore has no problem estimating the slope at any value of the predictors.
With a single predictor, the estimated slope does not change. (Point 1 of
Stata output). However, it is inappropriate for a binary outcome, as it can
lead to estimated proportions below 0 or above 1. (Point 2).
Logistic regression solves this by working with the log-odds rather than the
probability. There are no impossible values. Extreme log-odds correspond to
probabilities close to 0 or 1.
-margins- converts log-odds (and their slopes) to probabilities (and their
slopes) for us. (Point 3)
Although the slope for the log-odds is fixed, that for the probability is
not. As 0 and 1 are approached, the slope tends to 0, and the possible values
and SE are also constrained. (Point 4)
Plotting the estimated values against weight reveals this quite clearly.
(Point 5)
The code below uses Trevor's example (amended and expanded).
************** Begin Stata code*******************
set more off
sysuse auto, clear
* Change units to make results easier to understand
gen wt_tons = weight/2240
summarize wt_tons
* maximum weight is 2.1607 tons
* Linear regression
regress foreign wt_tons
* Point 1: the slope is the same at every weight
margins, dydx(wt_tons) at(wt_tons=(0(0.2)2 20))
* Point 2: predicted proportions can fall outside 0 and 1
margins, at(wt_tons=(0(0.2)2))
* Logistic regression
logit foreign wt_tons
* Point 3: log-odds converted to probabilities
margins, at(wt_tons=(0(0.2)2))
* Point 4: the slope of the probability varies with weight
margins, dydx(wt_tons) at(wt_tons=(0(0.2)2))
* Point 5: plot the estimated probabilities against weight
predict Foreign if foreign
predict USA if !foreign
label var Foreign "Foreign"
label var USA "USA"
label var wt_tons "Car weight (tons)"
gr7 Foreign USA wt_tons, xlab(0 1 2) ylab(0 .5 1.0) ///
    l1title("Estimated probability of car being foreign")
**************** End Stata code *************
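Points 3 and 4 can also be checked by hand. With linear predictor xb = b0 + b1*x, the logit model gives p = invlogit(xb) = 1/(1 + exp(-xb)), and the slope of the probability is dp/dx = b1*p*(1 - p), which shrinks toward 0 as p approaches 0 or 1. The sketch below evaluates both at wt_tons = 1 (an arbitrary value picked for illustration) and compares them with -margins-. The scalar and matrix names are made up for the example.

```stata
* Sketch: reproduce -margins- after -logit- from the coefficients.
sysuse auto, clear
gen wt_tons = weight/2240
logit foreign wt_tons

* Linear predictor (log-odds), probability, and slope at wt_tons = 1
scalar xb_hand    = _b[_cons] + _b[wt_tons]*1
scalar prob_hand  = invlogit(scalar(xb_hand))
scalar slope_hand = _b[wt_tons]*scalar(prob_hand)*(1 - scalar(prob_hand))

* Predicted probability from -margins-
margins, at(wt_tons=1)
matrix P = r(b)
assert reldif(P[1,1], scalar(prob_hand)) < 1e-5

* Slope of the probability from -margins-
margins, dydx(wt_tons) at(wt_tons=1)
matrix S = r(b)
assert reldif(S[1,1], scalar(slope_hand)) < 1e-5
```

Because the factor p*(1 - p) goes to 0 at extreme weights, both the estimated slope and its standard error are pushed toward 0 there. That is consistent with the shrinking standard errors in the original question.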
Best wishes,
Paul T Seed, Senior Lecturer in Medical Statistics,
Division of Women's Health, King's College London
Women's Health Academic Centre, King's Health Partners
(+44) (0) 20 7188 3642.
Date: Wed, 23 Oct 2013 23:21:13 -0700
From: Trevor Zink <[email protected]>
Subject: st: Clarification requested about the at() option of -margins-
Long-time lurker, first-time post. I couldn't find a good explanation in
the archives.
I'm confused about what, specifically, -margins- is doing with the at()
option, such that it can calculate margins for values of a variable that
don't exist in the data. To illustrate with an example:
sysuse auto
summarize weight //maximum weight is 4840
logit foreign weight //nonsensical, but ok for the example
margins, dydx(weight) at(weight=(0(1000)10000 100000))
Here I ask for the slope of the function at a variety of weights from 0
to 10,000 and also 100,000. The maximum weight observed in the
data is 4840.
My understanding of -margins- with at() was that it calculates the slope
of the function holding the specified variables constant at the
specified levels. But if the specified level doesn't appear in the data,
how can Stata determine what the slope is at this value? Ok, it's
clearly extrapolating, but based on what information? The only other
information included in the above model is a constant. When I try the
above but specifying the nocons option to -logit- Stata returns an
error, so it must be forecasting based on the constant; but specifically
how?
What's even more strange to me is that the standard errors *shrink* as
the estimates extend beyond the observed data. If Stata is forecasting
based on only the constant this seems counter-intuitive to me.
Thanks, and sorry if this is silly.
Trevor Zink
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
--
Trevor Zink, MBA, MA
Ph.D. Candidate
UC Regents Special Fellow
Bren School of Environmental Science and Management
University of California, Santa Barbara
[email protected] <mailto:[email protected]>
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: [email protected]
WWW: http://www.nd.edu/~rwilliam