From | Maarten buis <maartenbuis@yahoo.co.uk> |
To | stata list <statalist@hsphsun2.harvard.edu> |
Subject | st: Re: Stata logit interaction |
Date | Tue, 26 Apr 2011 15:40:04 +0100 (BST) |
--- On Sat, 23/4/11, Z.L. Deng wrote me privately:
> I've read through your Stata Journal paper (2010) regarding how to
> interpret interaction terms in logit models. You mentioned in the
> final paragraph "I used in this tip a relatively simple example with
> only binary variables and no control variables. However, the basic
> argument still holds when using continuous variables and when control
> variables are added." However I'm a junior logit modeller and have two
> queries which need your kind help:
>
> (1) I don't know how to judge if an interaction between two continuous
> variables has a significant effect. For example
> Y(0/1) = a0+a1*X1+a2*X2+a3*X1*X2+....
> . logit y x1 x2 x1_x2 ...
> How could I judge if X1*X2 has a significant effect? Could you please
> offer me the subsequent commands as you did in your Stata Journal
> paper, e.g.
> . margins , over(black collgrad) expression(exp(xb())) post
> . lincom 0.black#1.collgrad - 0.black#0.collgrad
> . lincom 1.black#1.collgrad - 1.black#0.collgrad
>
> (2) Some scholars argue that by centering X1 and X2, we can
> alleviate the multicollinearity problem. So, I wanted to try
> Y(0/1) = a0+a1*X1+a2*X2+a3*(X1-mean(X1))*(X2-mean(X2))+....
> . logit y x1 x2 x1centered_x2centered ...
>
> Is your method still applicable here? As far as I tried, the
> -inteff- command, which was proposed by Norton et al. (2004), failed
> in that case.

Ziliang:

I forwarded this answer to the Statalist as this type of question pops
up every now and then there. I also forwarded it to Edward Norton as
part of the question involves his -inteff- program and he is obviously
much more of an expert on that than I am.

To answer your first question, whether I can give an example of a
continuous by continuous interaction in a logit model with other
control variables, consider the example below. As always I start with
the baseline odds (a convenient trick to introduce, or refresh the
reader's memory on, what odds and odds ratios are).
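The odds-based reading that follows rests on writing the logit model in
multiplicative form. A sketch of that algebra is given here; the symbols
b_0 to b_3, x_1 and x_2 are shorthand introduced for this illustration
(the coefficients on the constant, the two centered continuous variables
and their product), with all other covariates held at their reference
values.

```latex
\begin{align}
\mathrm{odds}(y = 1 \mid x_1, x_2)
  &= \exp(b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2) \\
  &= e^{b_0}\,\bigl(e^{b_1}\bigr)^{x_1}\,\bigl(e^{b_2}\bigr)^{x_2}\,
     \bigl(e^{b_3}\bigr)^{x_1 x_2} \\[4pt]
\frac{\mathrm{odds}(y = 1 \mid x_1 + 1, x_2)}
     {\mathrm{odds}(y = 1 \mid x_1, x_2)}
  &= e^{b_1}\,\bigl(e^{b_3}\bigr)^{x_2}
\end{align}
```

Under this reading e^{b_0} is the baseline odds (x_1 = x_2 = 0), e^{b_1}
is the odds ratio for x_1 when x_2 = 0, and e^{b_3} is the factor by
which that odds ratio is multiplied for each unit increase in x_2.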
Within the group of persons with an average education (grade),
experience (ttl_exp), and age, who are white, widowed or divorced, and
not living in the south, we expect to find .46 persons with a good job
for every person with a bad job. If one gets a year more education this
odds changes by a ratio of 1.25, i.e. it increases by 25%. Similarly, a
decade increase in experience (notice that in the data preparation
stage I divided ttl_exp by 10) leads to a 160% increase in the odds of
getting a good job. The interaction effect says that the effect of
education changes by a factor of .90, i.e. it decreases by 10%, when
one gets a decade more experience. The test reported next to that
coefficient is the test of the null hypothesis that this factor by
which the effect of education changes equals 1, i.e. the "change"
equals 0%. As such it can be meaningfully interpreted as a test of one
operationalization of the interaction effect. In this case the
interaction effect is negative and (just) significant at the 5% level.

Norton et al. (2004) focus on a different operationalization of the
interaction effect, in terms of marginal effects rather than odds
ratios. These different operationalizations can lead to apparently very
different and even opposite conclusions. In my Stata tip, Buis (2010),
I tried to make the point that this difference results from whether or
not you want to control for the baseline odds or probability.

*-------------------- begin example ----------------------
//================================= data preparation
// load data
sysuse nlsw88, clear

/* Categories 1 & 2 are classified as 1, i.e. "good occupations"
   the rest is classified as 0, i.e. "bad occupation"
   The categories for occupation are:
    1 Professional/technical
    2 Managers/admin
    3 Sales
    4 Clerical/unskilled
    5 Craftsmen
    6 Operatives
    7 Transport
    8 Laborers
    9 Farmers
   10 Farm laborers
   11 Service
   12 Household workers
   13 Other */
gen byte good_occ = occupation < 3 ///
    if occupation < .
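
// optional check, not part of the original post:
// occupations 1 and 2 should all fall under good_occ == 1, the
// other occupations under good_occ == 0, and observations with a
// missing occupation should also be missing on good_occ
tab occupation good_occ, missing
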
// marital status is present in the data
// as two dummy variables, these are
// combined into one categorical variable
// marst so it will work more nicely with
// Stata's new factor variable notation
gen byte marst = never_married + 2*married
label define marst 0 "widowed/divorced" ///
                   1 "never married"    ///
                   2 "married"
label value marst marst
label variable marst "marital status"

// a trick to report the baseline odds, see
// <http://www.maartenbuis.nl/example_faq/index.html#baseline>
gen byte baseline = 1

// center variables
sum grade, meanonly
gen c_grade = grade - r(mean)
sum ttl_exp, meanonly
gen c_ttl_exp = (ttl_exp - r(mean))/10
sum age, meanonly
gen c_age = age - r(mean)

//=============================== estimate the model
logit good_occ c.c_grade##c.c_ttl_exp         ///
      i.race i.south c_age i.marst baseline,  ///
      nocons or
*------------------------ end example ----------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )

As to your second question, I can use -inteff- with centered variables,
as you can see in the example below. One thing I can imagine is that
you have an old version of -inteff-. It appears that the most current
version of that program can be obtained from
<http://www.unc.edu/~enorton/>.

*----------------------- begin example ----------------------
// load data
sysuse nlsw88, clear
gen byte good_occ = occupation < 3 ///
    if occupation < .

// center variables
sum grade, meanonly
gen c_grade = grade - r(mean)
sum ttl_exp, meanonly
gen c_ttl_exp = (ttl_exp - r(mean))/10
sum age, meanonly
gen c_age = age - r(mean)

// -inteff- does not (yet) recognize Stata's new
// factor variable notation, so we need to make
// our own dummies and interactions
gen c_gradeXc_ttl = c_grade*c_ttl_exp
gen black = race == 2 if race < .
gen other = race == 3 if race < .

// use inteff
logit good_occ c_grade c_ttl_exp c_gradeXc_ttl ///
      black other south c_age married never_married
inteff good_occ c_grade c_ttl_exp c_gradeXc_ttl ///
      black other south c_age married never_married
*------------------------- end example --------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )

Hope this helps,
Maarten

References:

Maarten L. Buis (2010) Stata tip 87: Interpretation of interactions in
non-linear models. The Stata Journal, 10(2): 305--308.

Edward C. Norton, Hua Wang, Chunrong Ai (2004) Computing interaction
effects and standard errors in logit and probit models. The Stata
Journal, 4(2): 154--167.

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------
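
The multiplicative interaction from the first example can also be
inspected directly. The sketch below is an illustration rather than
part of the original post: it assumes the -logit- command of the first
example has just been run (before the second example reloads the data),
and it uses -lincom- with the or option to display the odds ratio of a
year of education at three levels of experience. The three levels (the
mean, a decade above, a decade below) are arbitrary choices made for
this illustration.

```stata
*----------------------- begin example ----------------------
// run directly after the -logit- command of the first example,
// so that its estimation results are still active

// odds ratio of a year of education at mean experience
// (c_ttl_exp == 0)
lincom c_grade, or

// odds ratio of a year of education a decade above mean
// experience (c_ttl_exp == 1): exp(b_grade + b_interaction)
lincom c_grade + c.c_grade#c.c_ttl_exp, or

// odds ratio of a year of education a decade below mean
// experience (c_ttl_exp == -1): exp(b_grade - b_interaction)
lincom c_grade - c.c_grade#c.c_ttl_exp, or
*------------------------ end example ----------------------
```

With the rounded figures quoted in the post (1.25 and .90), the last
two odds ratios should come out at roughly 1.1 and 1.4, i.e. the first
is 1.25 multiplied by .90 and the second is 1.25 divided by .90.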