Title: A terminology problem: odds ratio versus odds
Authors: William Gould, StataCorp; James Hardin, StataCorp
Unfortunately, statistical terminology is not used uniformly across fields. One example is the pair of terms odds and odds ratio. Economists especially refer to what others call the odds as the odds ratio. Below, we will be careful to define our terms.
Let there be a binary outcome y; we will say y=0 or y=1, and let us assume that
Pr(y==1) = F(Xb)
where X and b are vectors and F() is some cumulative distribution function.
If F() is the cumulative normal distribution, we have the probit estimator.
If F() is the cumulative logistic distribution, we have the logit (logistic) estimator.
The cumulative distribution function of the logistic distribution is
F(Xb) = exp(Xb) / [1 + exp(Xb)]
Thus,
Pr(y==1) = exp(Xb) / [1 + exp(Xb)]
Let us write p for Pr(y==1):
p = exp(Xb) / [1 + exp(Xb)]
The odds p/(1−p) is therefore
p / (1-p) = {exp(Xb) / [1 + exp(Xb)]} / {1 - exp(Xb)/[1 + exp(Xb)]}
          = {exp(Xb) / [1 + exp(Xb)]} / {1 / [1 + exp(Xb)]}
          = exp(Xb)
Many authors present this formula as
log( p/[1-p] ) = Xb
which also means
p / (1-p) = exp(Xb)
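As a quick numerical check of this identity, the following sketch uses Stata's built-in invlogit() and logit() functions. The value 0.5 for Xb is an arbitrary assumption chosen for illustration; it is not an estimate from any model, and the scalar names xb and p are introduced here only for the example.

```stata
* Hypothetical value of the linear predictor Xb (an assumption, not an estimate)
scalar xb = 0.5

* p = exp(Xb) / [1 + exp(Xb)]
scalar p = invlogit(scalar(xb))

* The odds p/(1-p) and exp(Xb) should agree up to rounding
display "p            = " scalar(p)
display "p/(1-p)      = " scalar(p)/(1 - scalar(p))
display "exp(Xb)      = " exp(scalar(xb))

* Taking logs recovers Xb: log( p/[1-p] ) = Xb
display "log(p/[1-p]) = " logit(scalar(p))
```

Changing the value assigned to xb changes p and the odds, but the second and third displayed values should always match each other.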
The language here is sometimes confusing because some authors call this quantity the odds ratio. In plain English, they are correct: it is the odds, and the odds are computed as a ratio. It is not, however, the odds ratio that is meant when results are reported.
When results are reported, the odds ratio refers to the ratio of two odds or, if you prefer those authors' terminology, the ratio of two odds ratios.
That is, let us write
o(Xb) = exp(Xb)
The odds ratio is
o(evaluated at one place) / o(evaluated at another)
In particular, we want to consider the ratio of the odds for a one-unit change in one of the components of X. Let us now write
Xb = b0 + b1*x1 + b2*x2 + ... + bk*xk
Let us arbitrarily consider what is called the odds ratio for x1:
o(b0 + b1*(x1+1) + b2*x2 + ... + bk*xk) / o(b0 + b1*x1 + b2*x2 + ... + bk*xk)
  = o(b0 + b1*x1 + b2*x2 + ... + bk*xk + b1) / o(b0 + b1*x1 + b2*x2 + ... + bk*xk)
Now, remember, o() = exp(), so
  = exp(b0 + b1*x1 + b2*x2 + ... + bk*xk + b1) / exp(b0 + b1*x1 + b2*x2 + ... + bk*xk)
  = [exp(b0 + b1*x1 + b2*x2 + ... + bk*xk) * exp(b1)] / exp(b0 + b1*x1 + b2*x2 + ... + bk*xk)
  = exp(b1)
This is the standard result. The ratio of the odds for a one-unit increase in xi is exp(bi).
This ratio is constant: it does not depend on the value of xi or of the other x's because they cancel out in the calculation.
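The cancellation can also be seen numerically. The sketch below assumes the auto dataset shipped with Stata and fits the model with logit, which reports the coefficients b rather than odds ratios. The covariate values plugged in (5000 and 9000 for price, 3000 and 2000 for weight) and the scalar names are arbitrary choices made for illustration.

```stata
sysuse auto, clear

* logit reports coefficients (b) rather than odds ratios
logit foreign price weight

* Odds ratio for price: exp(b1)
display "exp(b_price) = " exp(_b[price])

* Odds at an arbitrary point A, and at A with price one unit higher
scalar o_a0 = exp(_b[_cons] + _b[price]*5000 + _b[weight]*3000)
scalar o_a1 = exp(_b[_cons] + _b[price]*5001 + _b[weight]*3000)

* Same comparison at a different, equally arbitrary point B
scalar o_b0 = exp(_b[_cons] + _b[price]*9000 + _b[weight]*2000)
scalar o_b1 = exp(_b[_cons] + _b[price]*9001 + _b[weight]*2000)

* Both ratios should equal exp(b_price), although the odds themselves differ
display "ratio at point A = " scalar(o_a1)/scalar(o_a0)
display "ratio at point B = " scalar(o_b1)/scalar(o_b0)
display "odds at A = " scalar(o_a0) ",  odds at B = " scalar(o_b0)
```

The two ratios should agree with each other and with exp(_b[price]), while the odds at A and at B should not agree.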
Be careful about language:
It is the language, and not the math, that leads to the confusion. When we say that in a logistic model, the odds ratio is constant, we mean
o(evaluated at one point) / o(evaluated somewhere else) is constant.
We do not mean that
o(evaluated at one point) is constant.
(that is, we do not mean the odds are constant).
Here is an example of computing the odds ratio and the odds with a logistic regression:
. sysuse auto
(1978 Automobile Data)

. * show odds ratios by default
. logistic foreign price weight

Logistic regression                                     Number of obs =     74
                                                        LR chi2(2)    =  54.11
                                                        Prob > chi2   = 0.0000
Log likelihood = -17.976341                             Pseudo R2     = 0.6008

------------------------------------------------------------------------------
     foreign | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       price |    1.00093   .0003002     3.10   0.002     1.000342    1.001519
      weight |   .9941387   .0016887    -3.46   0.001     .9908345    .9974539
       _cons |   8106.921   21301.56     3.43   0.001     47.01734     1397828
------------------------------------------------------------------------------
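The fitted odds, unlike the odds ratios, differ from one observation to the next. A minimal sketch of how one might check this, assuming the same auto data and the same model as in the output above; phat and odds are hypothetical variable names introduced here for illustration:

```stata
sysuse auto, clear
logistic foreign price weight

* Predicted probability p = Pr(foreign==1) for each car
predict phat

* Fitted odds p/(1-p), which equal exp(Xb) and vary with price and weight
generate odds = phat/(1 - phat)
summarize odds

* The odds ratio for price is exp(b1), the same for every observation
display exp(_b[price])
```

The value displayed on the last line should agree with the odds ratio reported for price in the table, whereas summarize should show a wide range of fitted odds. Note also that the _cons row is exp(b0), the odds when price and weight are both zero: it is an odds, not a ratio of two odds.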