Hi,
I am puzzled by what look to me like contradictory results (opposite signs)
between the interaction terms of a multinomial logit model and the predicted
probabilities, as tabulated with prtab and plotted with prgen and graph.
I am doing research on the returns to human capital investment in terms of
occupational attainment. For theoretical reasons, my dependent variable
(occup_att_2, see below) is coded as follows:
1. Managers
2. Professionals
3. Associate Professionals
4. Clerks
5. Lower service and other occupations
'Clerks' is the reference category of the dependent variable.
I have fitted a multinomial logit model to the sample for one of my national
case studies. My data set merges cross-sectional labour force surveys
covering up to eight different years.
Since I am especially interested in the TREND in the returns to human
capital investment, I have interacted the variable 'year' (capturing the
survey years included in the data) with educational attainment.
Here I present the results of one of my models. I have left out the
coefficients of other independent variables that I am less interested in.
. xi: mlogit occup_att_2 i.tert_ed*year_3 sex_2 mstatus_2 age national_2_2 national_2_3 tenure perm_2_2 perm_2_3, b(4) nolog
i.tert_ed         _Itert_ed_1-5       (naturally coded; _Itert_ed_3 omitted)
i.tert~d*year_3   _IterXyear__#       (coded as above)

Multinomial logistic regression                   Number of obs   =     525579
                                                  LR chi2(68)     =  432135.20
                                                  Prob > chi2     =     0.0000
Log likelihood = -414328.03                       Pseudo R2       =     0.3427

------------------------------------------------------------------------------
 occup_att_2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Managers     |
 _Itert_ed_1 |   .8369161   .1164738     7.19   0.000     .6086317    1.065201
 _Itert_ed_2 |  -.5057224   .1448549    -3.49   0.000    -.7896329    -.221812
 _Itert_ed_4 |  -.1405644   .1441132    -0.98   0.329    -.4230211    .1418922
 _Itert_ed_5 |   .4043363   .1040787     3.88   0.000     .2003458    .6083269
      year_3 |   .0006106    .009962     0.06   0.951    -.0189145    .0201357
_IterXyear~1 |    .024218   .0122985     1.97   0.049     .0001133    .0483227
_IterXyear~2 |    .033616   .0151505     2.22   0.026     .0039216    .0633103
_IterXyear~4 |  -.0013059   .0143873    -0.09   0.928    -.0295046    .0268927
_IterXyear~5 |    .000611   .0112359     0.05   0.957    -.0214109    .0226329
       _cons |  -3.208653   .0965292   -33.24   0.000    -3.397847   -3.019459
-------------+----------------------------------------------------------------
Profession~s |
 _Itert_ed_1 |   3.870921   .1599636    24.20   0.000     3.557398    4.184444
 _Itert_ed_2 |  -.5270488   .1966783    -2.68   0.007    -.9125312   -.1415665
 _Itert_ed_4 |  -.3029058   .2470132    -1.23   0.220    -.7870428    .1812312
 _Itert_ed_5 |  -2.130236     .24443    -8.72   0.000     -2.60931   -1.651162
      year_3 |  -.0705025   .0171069    -4.12   0.000    -.1040313   -.0369736
_IterXyear~1 |    .058865   .0177997     3.31   0.001     .0239782    .0937517
_IterXyear~2 |   .1749493    .021033     8.32   0.000     .1337253    .2161732
_IterXyear~4 |   .0403727   .0249201     1.62   0.105    -.0084698    .0892152
_IterXyear~5 |   .0838841   .0262562     3.19   0.001     .0324228    .1353453
       _cons |  -3.485732   .1563276   -22.30   0.000    -3.792128   -3.179335
-------------+----------------------------------------------------------------
Associate ~s |
 _Itert_ed_1 |   .5250349    .088949     5.90   0.000     .3506981    .6993717
 _Itert_ed_2 |   .2204853   .0970563     2.27   0.023     .0302584    .4107123
 _Itert_ed_4 |    .219423   .1074958     2.04   0.041     .0087351    .4301109
 _Itert_ed_5 |  -.2276642   .0876818    -2.60   0.009    -.3995174   -.0558111
      year_3 |   .0345487   .0073366     4.71   0.000     .0201691    .0489282
_IterXyear~1 |   .0072084   .0093549     0.77   0.441    -.0111268    .0255436
_IterXyear~2 |   .0260399   .0102047     2.55   0.011     .0060391    .0460406
_IterXyear~4 |  -.0186982   .0106685    -1.75   0.080    -.0396081    .0022117
_IterXyear~5 |  -.0133634   .0093698    -1.43   0.154    -.0317279    .0050011
       _cons |  -.8378662   .0728109   -11.51   0.000     -.980573   -.6951594
-------------+----------------------------------------------------------------
Low servic~r |
 _Itert_ed_1 |  -.6625195   .0883424    -7.50   0.000    -.8356674   -.4893716
 _Itert_ed_2 |   .7419491   .0849377     8.74   0.000     .5754743    .9084238
 _Itert_ed_4 |   2.149201   .0871306    24.67   0.000     1.978429    2.319974
 _Itert_ed_5 |   2.418502   .0701088    34.50   0.000     2.281091    2.555912
      year_3 |   .0643842   .0063551    10.13   0.000     .0519284      .07684
_IterXyear~1 |  -.0171698   .0091566    -1.88   0.061    -.0351165    .0007769
_IterXyear~2 |  -.0339005   .0089674    -3.78   0.000    -.0514763   -.0163247
_IterXyear~4 |  -.1356851   .0087742   -15.46   0.000    -.1528822   -.1184881
_IterXyear~5 |  -.0473144   .0075397    -6.28   0.000    -.0620919   -.0325369
       _cons |   .2592752   .0623827     4.16   0.000     .1370073    .3815432
------------------------------------------------------------------------------
(occup_att_2==Clerks is the base outcome)
As you can see, the coefficient on the interaction between time (year_3) and
the dummy variable for the highest educational attainment (university
degree, _Itert_ed_1) is positive for the category 'Professionals' of the
dependent variable. A university degree not only seems to increase the
likelihood of being in this category rather than in the reference category
('Clerks'); time also seems to increase this likelihood relative to that of
being in the reference category.
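A minimal sketch of how the year trend for university graduates could be
read off this model. It assumes the interaction model above is still the
active estimation, that the equation is named Professionals, and that xi
named the interaction variable _IterXyear__1 (both names are inferred from
the output above, so treat them as assumptions). In log odds relative to
'Clerks', the year slope for graduates is the year_3 coefficient plus the
interaction coefficient, not the interaction coefficient alone:

```stata
* Assumes the mlogit with interactions is the most recent estimation.
* The equation name [Professionals] and the variable name _IterXyear__1
* are assumptions based on the labels shown in the output.
lincom [Professionals]year_3 + [Professionals]_IterXyear__1

* The same sum by hand, using the coefficients reported above
display -.0705025 + .058865
```

With the reported coefficients the sum is -.0705025 + .058865 = -.0116375,
so the positive interaction makes the year slope for graduates less negative
than for the omitted education category rather than positive. The predicted
probability of 'Professionals' also depends on the year slopes in the other
three equations, so this sum alone does not fix the direction of the
probability trend.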
To present this trend graphically:
a) I have run another multinomial logit model without the interactions
between time and the educational attainment dummies. Please note that I have
excluded JUST those interactions from the previous model; apart from that,
both models are identical.
b) I have used the prgen command to generate the predicted probabilities
across 'year_3' when the dummy for university degree (_Itert_ed_1) is 1, the
dummies for the other educational attainment levels are 0 and (by default)
the remaining independent variables are held at their means:
prgen year_3, x(_Itert_ed_1=1 _Itert_ed_2=0 _Itert_ed_4=0 _Itert_ed_5=0) f(6) t(13) gen(univ)
and c) I have drawn the graph with:
graph twoway (scatter univp1 univp2 univp3 univp5 univp4 univx, connect(l l l l l) xtitle(University) ytitle(probability))
Now, the trend in the graph (not shown here) reveals a DECLINING predicted
probability of being a 'Professional' for those with a university degree.
This matches the decreasing predicted probabilities I get when I run the
prtab command as follows:
prtab _Itert_ed_1 year_3, x(_Itert_ed_2=0 _Itert_ed_4=0 _Itert_ed_5=0)
I show only the predicted probabilities for the category 'Professionals' of
the dependent variable:
mlogit: Predicted probabilities for occup_att_2

Predicted probability of outcome 2 (Professionals)

--------------------------------------------------------------------------
tert_ed== |                             year_3
        1 |      6       7       8       9      10      11      12      13
----------+---------------------------------------------------------------
        0 | 0.0248  0.0240  0.0232  0.0225  0.0217  0.0210  0.0203  0.0197
        1 | 0.6741  0.6662  0.6580  0.6498  0.6414  0.6329  0.6242  0.6155
--------------------------------------------------------------------------
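If the aim were to obtain these probabilities from the model with
interactions rather than from the one without, the interaction variables
would have to be set by hand to match year_3, because they enter the
estimation as separate variables. A hedged sketch using SPost's prvalue,
assuming the interaction model has been re-estimated immediately before,
that the xi variables are named _IterXyear__1, _IterXyear__2, _IterXyear__4
and _IterXyear__5 (inferred from the xi note in the output), and that the
remaining covariates are held at their means:

```stata
* Run from a do-file (the /// line continuation is not valid interactively).
* Assumes the mlogit with interactions is the active estimation and that
* SPost (prvalue) is installed. Interaction variable names are assumptions.
forvalues y = 6/13 {
    * University degree: _Itert_ed_1 = 1, so _IterXyear__1 must equal year_3;
    * the other education dummies are 0, so their interactions are 0 too.
    prvalue, x(_Itert_ed_1=1 _Itert_ed_2=0 _Itert_ed_4=0 _Itert_ed_5=0 ///
               year_3=`y' _IterXyear__1=`y' _IterXyear__2=0 ///
               _IterXyear__4=0 _IterXyear__5=0) rest(mean)
}
```

Each pass prints the predicted probabilities of all five outcomes for a
university graduate in the given year, which can then be compared with the
table above from the model without interactions.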
Now to my question. I do not understand how these decreasing probabilities
can arise when the interaction of year_3 and _Itert_ed_1 was positive in the
initial model. How should I interpret this discrepancy? How is it possible?
As suggested in the Statalist guidelines, I have searched Statalist itself
for help, but I am afraid I am still stuck with this problem.
I would very much appreciate your help on this.
In any case, my apologies if this query is too long, and my thanks for your
attention if you have read this far.
-.-.-.-.-.-.-.-
Luis Ortiz
Profesor Agregado
Departament de Ciencies Polítiques i Socials
Universitat Pompeu Fabra
Ramon Trias Fargas, 25-27
08005 Barcelona
Phone: +34-93-5422368
Fax: +34-93-5422372
http://www.upf.edu/dcpis/
http://sociodemo.upf.edu/