Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Clarification requested about the at() option of -margins-
From: Trevor Zink <[email protected]>
To: [email protected]
Subject: Re: st: Clarification requested about the at() option of -margins-
Date: Thu, 24 Oct 2013 10:03:27 -0700
Thanks, Richard.
I had actually run across some of your materials before; they were helpful.
My actual problem is obviously more complex than the simple example I
illustrated with. At what point (of complexity) is -margins- no longer
"just plugging numbers into formulas"? In my actual problem I'm not
using interactions, but I am using multiple regressors and factor variables.
Thanks
Trevor
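As a hedged illustration of the multiple-regressor case raised above (not
taken from the thread): with more than one covariate, -margins, at()- by
default fixes the named variable at the requested value, leaves the other
covariates at their observed values, and averages the predictions over the
estimation sample. The sketch below checks this by hand. The model
(foreign on weight and mpg), the value 3000, and the names phat3000 and
avg3000 are arbitrary choices for illustration.

```stata
sysuse auto, clear
logit foreign weight mpg

* What -margins- reports: average predicted probability with weight
* set to 3000 and mpg left as observed
margins, at(weight=3000)

* The same number by hand
preserve
replace weight = 3000                     // overwrite weight for every car
predict double phat3000 if e(sample), pr  // plug into the fitted formula
summarize phat3000, meanonly
scalar avg3000 = r(mean)                  // average over the estimation sample
restore

display "Average predicted probability at weight = 3000: " avg3000
```

So it is still plugging numbers into the fitted formula; the extra step is
the averaging over the sample (or, with the atmeans option, evaluating at
the covariate means instead).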
On 10/24/2013 10:24 AM, Richard Williams wrote:
At 11:01 AM 10/24/2013, Trevor Zink wrote:
Paul,
Thanks very much for your detailed answer. If I may ask a few
follow-up questions to make sure I understand properly...
1) "-margins- converts log-odds (and their slopes) to probabilities
(and their slopes) for us". I have spent a lot of time reading about
-margins- over the past few weeks, and I can't recall ever hearing it
explained like this. Is this really what -margins- does? Simply
converts from log-odds back to probability? If so, that is great
news--it makes the interpretation of the output much easier.
2) "Although the slope for the log-odds is fixed; that for the
probability is not. As 0 and 1 are approached, the slope tends to 0,
and the possible values and SE are also constrained". So what you're
saying is that the slope going to 0 near 0 and 1 isn't because of
any extrapolation, as I assumed; it's simply a product of mapping
onto the logistic function?
3) You used both -margins, at()- and -margins, dydx() at()-. My
understanding of the difference after reading your answer is that
-margins, at()- gives the /probability/ of Y==1 at the specified
values of X, whereas -margins, dydx() at()- gives the /change in the
probability/ of Y==1 from an infinitesimal change in X at the
specified values of X. Correct?
(as a side note, I wouldn't have expected weight to predict foreign
vs domestic as well as it does)
Thanks again for your answer.
Trevor
Trevor, particularly for a simple problem like yours, margins is just
plugging numbers into formulas. So, for example, if you had a formula
like
y = 2 + 3*x
you could plug in whatever value you wanted for x (including a totally
nonsensical one, e.g. a negative value for weight) and you could get a
value for y. Margins doesn't know or care whether the numbers are
sensible or not. Sometimes it is realistic to go a bit outside the
observed sample range, e.g. try a 5,000 pound car, but in this case it
would be silly to go up to 100,000 pounds. But you have to figure that
out, not margins.
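To make the "plugging numbers into formulas" point concrete, here is a
minimal sketch (not from the original posts) that reproduces by hand the
point estimates from -margins, at()- and -margins, dydx() at()- after a
one-predictor logit. The value 100000 is Trevor's out-of-range weight; the
scalar names xb, phat, and slope are arbitrary.

```stata
sysuse auto, clear
logit foreign weight

* Linear predictor (log-odds) at weight = 100000, far outside the data
scalar xb = _b[_cons] + _b[weight]*100000

* Predicted probability: inverse logit of the log-odds
scalar phat = invlogit(xb)

* Slope of the probability with respect to weight: b*p*(1-p)
scalar slope = _b[weight]*phat*(1 - phat)

display "Pr(foreign) = " phat "   dPr/dweight = " slope

* These should match the point estimates reported by:
margins, at(weight=100000)
margins, dydx(weight) at(weight=100000)
```

The sketch covers the point estimates only. The standard errors that
-margins- reports come from the delta method, which is not reproduced here.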
If you want to know more about how margins works, see
http://www3.nd.edu/~rwilliam/xsoc73994/Margins01.pptx
http://www3.nd.edu/~rwilliam/xsoc73994/Margins02.pdf
http://www3.nd.edu/~rwilliam/xsoc73994/Margins03.pdf
On 10/24/2013 3:13 AM, Seed, Paul wrote:
Dear Statalist,
Trevor Zink asks why -margins- does not behave as he would expect
following logistic regression.
The answer is found only by going back to exactly what logistic
regression actually does, and how it compares to linear regression.
Linear regression is carried out under an assumption of constant
slope, and therefore has no problem estimating the slope at any
value of the predictors. With a single predictor, the estimated
slope does not change. (Point 1 of Stata output).
However, it is inappropriate for a binary outcome, as it can lead to
estimated proportions below 0 or above 1. (Point 2).
Logistic regression solves this by working with the log-odds, rather
than the probability. There are no impossible values. Extreme
log-odds correspond to probabilities close to 0 or 1.
-margins- converts log-odds (and their slopes) to probabilities (and
their slopes) for us. (Point 3)
Although the slope for the log-odds is fixed, that for the
probability is not. As 0 and 1 are approached, the slope tends to 0,
and the possible values and SE are also constrained. (Point 4)
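A short derivation may help with Point 4. This is the standard logistic
algebra for a single predictor x, written here in LaTeX; the symbols
\beta_0, \beta_1, and p(x) are notation introduced for this note, not
taken from the Stata output.

```latex
\[
\log\frac{p}{1-p} = \beta_0 + \beta_1 x
\quad\Longrightarrow\quad
p(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}
\]
\[
\frac{dp}{dx} = \beta_1 \, p(x)\,\bigl(1 - p(x)\bigr)
\]
```

On the log-odds scale the slope is the constant \beta_1. On the probability
scale it is \beta_1 multiplied by p(1-p), which is largest (1/4) at p = 0.5
and shrinks towards 0 as p approaches 0 or 1.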
Plotting the estimated values against weight reveals this quite clearly.
(Point 5)
The code below uses Trevor's example (amended and expanded).
************** Begin Stata code*******************
set more off
sysuse auto, clear
gen wt_tons = weight/2240
* Change units to make results easier to understand
summarize wt_tons
* maximum weight is 2.1607 tons
regress foreign wt_tons
margins, dydx(wt_tons) at(wt_tons=(0(0.2)2 20))
* Point 1
margins, at(wt_tons=(0(0.2)2 ))
* Point 2
logit foreign wt_tons
margins, at(wt_tons=(0(0.2)2 ))
* Point 3
margins, dydx(wt_tons) at(wt_tons=(0(0.2)2 ))
* Point 4
predict Foreign if foreign
predict USA if !foreign
label var Foreign Foreign
label var USA USA
label var wt_tons "Car weight (tons)"
gr7 Foreign USA wt_tons, xlab(0 1 2) ylab(0 .5 1.0) ///
    l1title("Estimated probability of car being foreign")
* Point 5
**************** End Stata code *************
Best wishes,
Paul T Seed, Senior Lecturer in Medical Statistics,
Division of Women's Health, King's College London
Women's Health Academic Centre, King's Health Partners
(+44) (0) 20 7188 3642.
Date: Wed, 23 Oct 2013 23:21:13 -0700
From: Trevor Zink <[email protected]>
Subject: st: Clarification requested about the at() option of -margins-
Long-time lurker, first-time post. I couldn't find a good explanation
in the archives.
I'm confused about what, specifically, -margins- is doing with the
at() option, such that it can calculate margins for values of a
variable that don't exist in the data. To illustrate with an example:
sysuse auto
summarize weight //maximum weight is 4840
logit foreign weight //nonsensical, but ok for the example
margins, dydx(weight) at(weight=(0(1000)10000 100000))
Here I ask for the slope of the function at a variety of weights from
0 to 10,000 and also 100,000. The maximum weight observed in the data
is 4840.
My understanding of -margins- with at() was that it calculates the
slope of the function holding the specified variables constant at the
specified levels. But if the specified level doesn't appear in the
data, how can Stata determine what the slope is at this value? Ok,
it's clearly extrapolating, but based on what information? The only
other information included in the above model is a constant. When I
try the above but specifying the nocons option to -logit- Stata
returns an error, so it must be forecasting based on the constant;
but specifically how?
What's even more strange to me is that the standard errors *shrink* as
the estimates extend beyond the observed data. If Stata is forecasting
based on only the constant this seems counter-intuitive to me.
Thanks, and sorry if this is silly.
Trevor Zink
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
--
Trevor Zink, MBA, MA
Ph.D. Candidate
UC Regents Special Fellow
Bren School of Environmental Science and Management
University of California, Santa Barbara
[email protected] <mailto:[email protected]>
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: [email protected]
WWW: http://www.nd.edu/~rwilliam