st: Re: Stata logit interaction
From: Maarten buis <[email protected]>
To: stata list <[email protected]>
Subject: st: Re: Stata logit interaction
Date: Tue, 26 Apr 2011 15:40:04 +0100 (BST)
--- On Sat, 23/4/11, Z.L. Deng wrote me privately:
> I've read through your Stata Journal paper (2010) regarding how to
> interpret interaction terms in logit models. You mentioned in the
> final paragraph "I used in this tip a relatively simple example with
> only binary variables and no control variables. However, the basic
> argument still holds when using continuous variables and when control
> variables are added." However I'm a junior logit modeller and have two
> queries which need your kind help:
>
> (1) I don't know how to judge if an interaction between two continuous
> variables has a significant effect. For example
> Y(0/1) = a0+a1*X1+a2*X2+a3*X1*X2+....
> . logit y x1 x2 x1_x2 ...
> How could I judge if X1*X2 has a significant effect? Could you please
> offer me the subsequent commands as you did in your Stata Journal
> paper, e.g.
> . margins , over(black collgrad) expression(exp(xb())) post
> . lincom 0.black#1.collgrad - 0.black#0.collgrad
> . lincom 1.black#1.collgrad - 1.black#0.collgrad
>
> (2) Some scholars argue that by centering X1 and X2, we can
> alleviate the multicollinearity problem. So, I wanted to try
> Y(0/1) = a0+a1*X1+a2*X2+a3*(X1-mean(X1))*(X2-mean(X2))+....
> . logit y x1 x2 x1centered_x2centered ...
>
> Is your method still applicable here? As far as I tried, the
> -inteff- command, which was proposed by Norton et al. (2004), failed
> in that case.
Ziliang:
I forwarded this answer to the Statalist as this type of question
pops up every now and then there. I also forwarded it to
Edward Norton as part of the question involves his -inteff- program
and he is obviously much more of an expert on that than I am.
To answer your first question, whether I can give an example of a
continuous by continuous interaction in a logit model with other
control variables: consider the example below.
As always I start with the baseline odds (a convenient trick to
introduce, or refresh the reader's memory on, what odds and odds
ratios are). Within the group of persons with an average education
(grade), experience (ttl_exp), and age, who are white, widowed or
divorced, and not living in the south (the reference category of
i.south), we expect to find .46 persons with a good job for every
person with a bad job.
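If one gets a year more education the odds change by a factor of
1.25, i.e. they increase by 25%. Because the variables are centered
and interacted, this is the effect for someone with average
experience. Similarly, for someone with average education, a decade
increase in experience (notice that in the data preparation stage I
divided ttl_exp by 10) leads to a 160% increase in the odds of
getting a good job.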
The interaction effect says that the effect of education (its odds
ratio) changes by a factor of .90, i.e. it decreases by 10%, when one
gets a decade more experience. The test reported next to that
coefficient is the test of the null hypothesis that this factor
equals 1, i.e. that the change equals 0%. As such it can be
meaningfully interpreted as a test of one operationalization of the
interaction effect. In this case the interaction effect is negative
and (just) significant at the 5% level.
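To make the multiplicative reading concrete, the arithmetic can be
checked by hand. The sketch below uses only the rounded odds ratios
quoted above (1.25 and .90), so the results are approximate and are
not output from the model; the scalar names are just for
illustration.
*-------------------- begin example ----------------------
// rounded odds ratios quoted in the text (assumed values)
scalar or_grade = 1.25 // education, at average experience
scalar or_inter = .90  // interaction term
// odds ratio of a year of education at average experience,
// and at one and two decades above average experience
display "average:    " %5.3f scalar(or_grade)
display "+1 decade:  " %5.3f scalar(or_grade)*scalar(or_inter)
display "+2 decades: " %5.3f scalar(or_grade)*scalar(or_inter)^2
*------------------------ end example ----------------------
So the 25% increase per year of education at average experience
shrinks to about 12.5% one decade above average (1.25 x .90 = 1.125)
and to roughly 1% two decades above average (1.25 x .81 = 1.0125).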
Norton et al. (2004) focus on a different operationalization of the
interaction effect, in terms of marginal effects rather than odds
ratios. These different operationalizations can lead to apparently
very different and even opposite conclusions. In my Stata tip, Buis
(2010), I tried to make the point that this is the result of whether
or not you want to control for the baseline odds or probability.
*-------------------- begin example ----------------------
//================================= data preparation
// load data
sysuse nlsw88, clear
/* Categories 1 & 2 are classified as 1, i.e. "good occupation";
   the rest is classified as 0, i.e. "bad occupation".
   The categories for occupation are:
    1 Professional/technical
    2 Managers/admin
    3 Sales
    4 Clerical/unskilled
    5 Craftsmen
    6 Operatives
    7 Transport
    8 Laborers
    9 Farmers
   10 Farm laborers
   11 Service
   12 Household workers
   13 Other
*/
gen byte good_occ = occupation < 3 ///
    if occupation < .
// marital status is present in the data
// as two dummy variables, these are
// combined into one categorical variable
// marst so it will work more nicely with
// Stata's new factor variable notation
gen byte marst = never_married + 2*married
label define marst 0 "widowed/divorced" ///
                   1 "never married"    ///
                   2 "married"
label value marst marst
label variable marst "marital status"
// a trick to report the baseline odds, see
// <http://www.maartenbuis.nl/example_faq/index.html#baseline>
gen byte baseline = 1
// center variables
sum grade, meanonly
gen c_grade = grade - r(mean)
sum ttl_exp, meanonly
gen c_ttl_exp = (ttl_exp - r(mean))/10
sum age, meanonly
gen c_age = age - r(mean)
//=============================== estimate the model
logit good_occ c.c_grade##c.c_ttl_exp        ///
      i.race i.south c_age i.marst baseline, ///
      nocons or
*------------------------ end example ----------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )
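The odds ratio of education at other levels of experience, with its
confidence interval, can be obtained with -lincom- after the model.
The following is a minimal sketch, not part of the Stata tip: it
repeats the data preparation so it can be run on its own, and the
choice of one decade below and above average experience is just an
illustration.
*-------------------- begin example ----------------------
// data preparation as in the example above
sysuse nlsw88, clear
gen byte good_occ = occupation < 3 if occupation < .
gen byte marst = never_married + 2*married
gen byte baseline = 1
sum grade, meanonly
gen c_grade = grade - r(mean)
sum ttl_exp, meanonly
gen c_ttl_exp = (ttl_exp - r(mean))/10
sum age, meanonly
gen c_age = age - r(mean)
logit good_occ c.c_grade##c.c_ttl_exp        ///
      i.race i.south c_age i.marst baseline, ///
      nocons or
// odds ratio of a year of education for someone with
// one decade less than average experience
lincom c_grade - c.c_grade#c.c_ttl_exp, or
// odds ratio of a year of education for someone with
// one decade more than average experience
lincom c_grade + c.c_grade#c.c_ttl_exp, or
*------------------------ end example ----------------------
The ratio of the second odds ratio to the odds ratio at average
experience is the interaction term reported by -logit-, so these
commands add no new test; they only show the effect of education at
specific levels of experience.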
As to your second question, I can use -inteff- with centered
variables as you can see in the example below. One thing I
can imagine is that you have an old version of -inteff-. It
appears that the most current version of that program can
be obtained from <http://www.unc.edu/~enorton/>.
*----------------------- begin example ----------------------
// load data
sysuse nlsw88, clear
gen byte good_occ = occupation < 3 ///
    if occupation < .
// center variables
sum grade, meanonly
gen c_grade = grade - r(mean)
sum ttl_exp, meanonly
gen c_ttl_exp = (ttl_exp - r(mean))/10
sum age, meanonly
gen c_age = age - r(mean)
// -inteff- does not (yet) recognize Stata's new
// factor variable notation, so we need to make
// our own dummies and interactions
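// (full variable name used, so this does not rely on
// variable abbreviation being switched on)
gen c_gradeXc_ttl = c_grade*c_ttl_exp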
gen black = race == 2 if race < .
gen other = race == 3 if race < .
// use inteff
logit good_occ c_grade c_ttl_exp c_gradeXc_ttl ///
      black other south c_age married never_married
inteff good_occ c_grade c_ttl_exp c_gradeXc_ttl ///
       black other south c_age married never_married
*------------------------- end example --------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )
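If you prefer to stay with factor variable notation, -margins-
(Stata 11 or later) gives a related view in terms of marginal
effects. The sketch below is an illustration under my own
assumptions, not a replacement for -inteff-: it reports the average
marginal effect of education on the probability of a good job at
three levels of experience, whereas -inteff- reports the
cross-partial derivative for each observation.
*-------------------- begin example ----------------------
sysuse nlsw88, clear
gen byte good_occ = occupation < 3 if occupation < .
sum grade, meanonly
gen c_grade = grade - r(mean)
sum ttl_exp, meanonly
gen c_ttl_exp = (ttl_exp - r(mean))/10
sum age, meanonly
gen c_age = age - r(mean)
// same model as above, but in factor variable notation so
// -margins- knows c_grade also enters via the interaction
logit good_occ c.c_grade##c.c_ttl_exp ///
      i.race i.south c_age i.married i.never_married
// average marginal effect of a year of education at one
// decade below, at, and one decade above average experience
margins, dydx(c_grade) at(c_ttl_exp=(-1 0 1))
*------------------------ end example ----------------------
How these marginal effects differ across the three rows depends on
the baseline probability as well as on the interaction term, which
is why this operationalization can lead to different conclusions
than the odds ratio one.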
Hope this helps,
Maarten
References:
Maarten L. Buis (2010) Stata tip 87: Interpretation of interactions
in non-linear models. The Stata Journal, 10(2): 305--308.
Edward C. Norton, Hua Wang, Chunrong Ai (2004) Computing interaction
effects and standard errors in logit and probit models. The Stata
Journal, 4(2): 154--167.
--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany
http://www.maartenbuis.nl
--------------------------