Re: st: Interpretation of categorical independent variable

From	Maarten buis <[email protected]>
To	[email protected]
Subject	Re: st: Interpretation of categorical independent variable
Date	Fri, 10 Sep 2010 08:23:52 +0000 (GMT)

--- On Fri, 10/9/10, Meng Zhao wrote:
> I use a three-category ID to predict a binomial dependent
> variable. I included dummy variables for the first two
> categories in the model. The result is:
>
>               Odds Ratio     p
> Category 1:     116.45     0.000
> Category 2:      17.76     0.000
> 
> Is the following interpretation correct?
> 
> 1.compared to category 3, being category 1 increases the
> Odds Ratio by 116.45 for DV to happen (whatever it
> represents).
> 
> 2.compared to category 3, being category 2 increases the
> Odds Ratio by 17.76. 
> 
> 3.So category 1 has a stronger effect on DV than category
> 2, and category 2 is stronger than category 3

Not quite: an odds ratio is a ratio of odds, while the way you 
formulated the results suggests that it is a difference. The
odds are the expected number of successes for every failure,
and the odds ratio is the factor by which these odds change.
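
To make the distinction concrete, here is a minimal sketch of the
arithmetic. The odds of 0.01 for category 3 is a made-up number,
used only for illustration; the output quoted above does not
report it.

*--------------- begin example ------------------
* hypothetical odds of the DV in category 3 (an assumption)
scalar odds3 = 0.01

* an odds ratio multiplies the odds, it is not added to them
scalar odds1 = odds3 * 116.45
scalar odds2 = odds3 * 17.76

di "odds in category 1: " odds1
di "odds in category 2: " odds2
*--------------- end example ----------------------

With a different value for category 3 the odds in categories 1
and 2 change as well, which is why the baseline odds matter for
the interpretation.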

When interpreting the odds ratios, I find it helpful to have
the baseline odds. Unfortunately, Stata suppresses this by
default, but there is a trick you can use to get it displayed,
which I learned from Newson (2003).

Consider the example below:

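*--------------- begin example ------------------
sysuse auto, clear
* merge the sparse categories 1 and 2 into category 3
recode rep78 1/2=3

* a column of ones that stands in for the constant
gen byte baseline = 1

* center price at its mean in the estimation sample
sum price if !missing(foreign, rep78), meanonly
gen c_price = price - r(mean)

* noconst drops the real constant, so the "odds ratio"
* displayed for baseline is the baseline odds
logit foreign i.rep78 c_price baseline, noconst or
*--------------- end example ----------------------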
(For more on examples I sent to the Statalist see: 
http://www.maartenbuis.nl/example_faq )

The number reported for baseline is the odds of
being foreign when a car belongs to category 3 of rep78
and has an average price (I created c_price to be
0 when the price is average). So for this type of car
we expect to find 0.08 foreign cars for every domestic
car. These odds of being a foreign car change by a
factor of 12 (i.e. increase by (12 - 1)*100% = 1100%)
when the car belongs to category 4, and by a factor of
56 (i.e. increase by 5500%) when the car belongs to
category 5.
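
These numbers can also be computed from the stored coefficients
instead of being read off the table. The sketch below refits the
model from the example above quietly and then exponentiates the
coefficients, since _b[] holds them on the log odds scale. The
results should agree with the rounded figures quoted above.

*--------------- begin example ------------------
sysuse auto, clear
recode rep78 1/2=3
gen byte baseline = 1
sum price if !missing(foreign, rep78), meanonly
gen c_price = price - r(mean)
quietly logit foreign i.rep78 c_price baseline, noconst

* _b[] contains log odds (ratios), so exponentiate
di "baseline odds:              " exp(_b[baseline])
di "odds ratio, 4 versus 3:     " exp(_b[4.rep78])
di "percentage change in odds:  " (exp(_b[4.rep78]) - 1)*100

* baseline odds times the odds ratio gives the odds in category 4
di "implied odds in category 4: " exp(_b[baseline] + _b[4.rep78])
*--------------- end example ----------------------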

To get more of a feeling for what that means I often find it
useful to look at the odds directly. To get these
you can include the baseline category in your model and 
omit the constant (in our case the "variable" 
baseline). The numbers reported for your 
categories are then the odds of being a foreign car 
within each category for an average-priced car. 

So for category 3 we already knew that that was 
0.08 foreign cars for every domestic car.

For category 4 cars the odds are 1 foreign car for every 
domestic car (which is, fortunately, 12 times larger than 
the odds for category 3 cars, so we are getting exactly
the same results as in our previous model). 

For category 5 cars we expect to find 4.5 foreign cars 
for every domestic car (which is 56 times larger than 
the odds for category 3 cars). 

*-------------- begin example -----------------
logit foreign ibn.rep78 c_price, noconst or
di exp(_b[4.rep78])/exp(_b[3bn.rep78])
di exp(_b[5.rep78])/exp(_b[3bn.rep78])
*---------------- end example ------------------
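
If probabilities are easier to think in than odds, the odds can
be converted with p = odds/(1 + odds). A minimal sketch, which
refits the second model and uses Stata's invlogit() function to
do this conversion directly from the log odds:

*--------------- begin example ------------------
sysuse auto, clear
recode rep78 1/2=3
sum price if !missing(foreign, rep78), meanonly
gen c_price = price - r(mean)
quietly logit foreign ibn.rep78 c_price, noconst

* invlogit(x) = exp(x)/(1 + exp(x)), i.e. odds/(1 + odds)
* each line is for an average-priced car in that category
di "Pr(foreign), category 3: " invlogit(_b[3bn.rep78])
di "Pr(foreign), category 4: " invlogit(_b[4.rep78])
di "Pr(foreign), category 5: " invlogit(_b[5.rep78])
*--------------- end example ----------------------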

Hope this helps,
Maarten

Roger Newson (2003) "Stata tip 1: The eform() option of 
regress". The Stata Journal, 3(4): 445.
<http://www.stata-journal.com/article.html?article=st0054>

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------
