Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: xi3 / nlogit
From: Nils Wlömert <[email protected]>
To: [email protected]
Subject: Re: st: xi3 / nlogit
Date: Sat, 4 Dec 2010 17:58:21 +0100
Many thanks Michael, the definition of the effect-coded variables
works great now. However, I think there is another issue related to
variable coding in the model (empty cells that are coded as zeros).
I estimated the asclogit choice model (6 attributes with 4 levels each
+ no-choice base option) in Stata with linear coding of all attributes,
and the results are identical to those obtained for the 1-segment
solution in Latent Gold Choice, which also uses McFadden's choice
model:
Attribute           Coefficient
p_ppd (price I)     -0,4507
p_fl (price II)     -0,8754
p_drm               -0,1731
fl_drm              -0,5922
ad                  -0,2764
cat                  0,2817
nobuy               -1,6554
When it comes to effect coding, there are some difficulties in Stata.
I first define the effect-coded variables as you suggested (the price
attributes remain linearly coded):
. xi3 e.p_drm e.fl_drm e.ad e.cat e.nobuy
e.p_drm           _Ip_drm_0-4         (naturally coded; _Ip_drm_0 omitted)
e.fl_drm          _Ifl_drm_0-4        (naturally coded; _Ifl_drm_0 omitted)
e.ad              _Iad_0-4            (naturally coded; _Iad_0 omitted)
e.cat             _Icat_0-4           (naturally coded; _Icat_0 omitted)
e.nobuy           _Inobuy_0-1         (naturally coded; _Inobuy_0 omitted)
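For readers unfamiliar with effect coding, here is a minimal sketch of
building such variables by hand in Stata. It uses the auto data rather
than the choice data from this post, the names eff_rep78_2 to
eff_rep78_5 are made up for illustration, and it assumes that the e.
prefix of -xi3- produces the usual +1/-1/0 pattern with the omitted
level coded -1.

```stata
* Minimal sketch: effect coding of rep78 by hand, reference level 1.
* The eff_rep78_* names are hypothetical, not created by -xi3-.
sysuse auto, clear
forvalues k = 2/5 {
    * +1 for level k, -1 for the reference level, 0 for all other levels;
    * observations with missing rep78 stay missing
    generate byte eff_rep78_`k' = (rep78 == `k') - (rep78 == 1) if !missing(rep78)
}
* Inspect the coding pattern of one of the new variables
tabulate rep78 eff_rep78_2, missing
```

The point of the sketch is that the reference level is not dropped from
the data: it is the level that carries -1 in every column.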
When trying to estimate asclogit, some levels are dropped because of
collinearity and the model does not converge.
asclogit decision p_ppd p_fl _Ip_drm_1 _Ip_drm_2 _Ip_drm_3 _Ip_drm_4
_Ifl_drm_1 _Ifl_drm_2 _Ifl_drm_3 _Ifl_drm_4 _Iad_1 _Iad_2 _Iad_3
_Iad_4 _Icat_1 _Icat_2 _Icat_3 _Icat_4 _Inobuy_1, noconst case(case)
alternatives(alternative)
note: _Icat_4 dropped because of collinearity
note: _Iad_4 dropped because of collinearity
note: model has collinear variables; convergence may not be achieved
This is somewhat surprising as the model is estimated without any
difficulties in Latent Gold:
Attribute (coding)          Level   Part-worth
p_ppd (price I - linear)            -0,3540
p_fl (price II - linear)            -1,1689
p_drm (effect)              1        0,4179
                            2       -0,2083
                            3        0,2317
                            4       -0,4412
fl_drm (effect)             1        0,5906
                            2        0,3027
                            3       -0,2918
                            4       -0,6015
ad (effect)                 1        0,4328
                            2        0,1001
                            3       -0,0839
                            4       -0,4489
cat (effect)                1       -0,5570
                            2       -0,0606
                            3        0,2858
                            4        0,3318
nobuy (effect)              0        0,4390
                            1       -0,4390
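A quick arithmetic check on the Latent Gold part-worths above may help
when comparing the two programs. The sketch below only adds up the
reported values within each effect-coded attribute (decimal commas
written as decimal points, as Stata requires); nothing in it comes from
the Stata model.

```stata
* Sum of the reported part-worths within each effect-coded attribute
* p_drm
display 0.4179 - 0.2083 + 0.2317 - 0.4412
* fl_drm
display 0.5906 + 0.3027 - 0.2918 - 0.6015
* ad
display 0.4328 + 0.1001 - 0.0839 - 0.4489
* cat
display -0.5570 - 0.0606 + 0.2858 + 0.3318
* nobuy
display 0.4390 - 0.4390
```

Each sum is within 0,0001 of zero, which is what effect coding implies:
the part-worth of one level per attribute equals minus the sum of the
others, so a 4-level attribute would carry only three free parameters
even though four values are reported.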
The same thing happens using nlogit:
nlogit decision p_ppd p_fl _Ip_drm_1 _Ip_drm_2 _Ip_drm_3 _Ip_drm_4
_Ifl_drm_1 _Ifl_drm_2 _Ifl_drm_3 _Ifl_drm_4 _Iad_1 _Iad_2 _Iad_3
_Iad_4 _Icat_1 _Icat_2 _Icat_3 _Icat_4 _Inobuy_1 || type: ||
alternative:, noconst case(case)
note: _Iad_2 dropped because of collinearity
note: _Icat_2 dropped because of collinearity
note: the model specified for level 2 has collinear variables;
convergence may not be achieved
I assume that this difference is related to the coding of the
effect-coded attributes, as the model is otherwise specified
identically in Stata and Latent Gold.
Note that the model includes alternative-specific attributes (e.g.,
price I & II), which means that some attributes are NOT included in all
alternatives (i.e., there are empty cells in the data set if an
attribute is not included in an alternative). Also, all cells of the
attribute levels are 'empty' in the lines representing the 'no-choice'
alternative. Latent Gold treats these cells as 'empty' and returns
the part-worths for each level of the effect-coded attributes without
omitting a level (see above). In Stata, these 'empty' cells are
replaced by zeros and the 0-levels are omitted when the model is
estimated (see above). I would be grateful if you could give some
advice on how to define the effect-coded variables correctly,
especially with regard to the empty cells that are coded as zeros and
the handling of collinearity.
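One possible way to express the "empty cell" idea in Stata is sketched
below. It is an untested suggestion, not a confirmed fix: it assumes
each attribute takes the values 1-4 where it is present and 0 where it
is absent, it picks level 4 as the reference, and the eff_* variable
names are made up. The nobuy variable is entered as in the linear model
above.

```stata
* Sketch only, to be run from a do-file (/// continues a line).
* Assumes each attribute is coded 1-4 where present and 0 where absent
* (other alternatives and the no-choice rows).
foreach v of varlist p_drm fl_drm ad cat {
    forvalues k = 1/3 {
        * +1 for level k, -1 for level 4 (reference), 0 otherwise,
        * so rows with an empty cell (value 0) get 0 in every column
        generate byte eff_`v'_`k' = (`v' == `k') - (`v' == 4)
    }
}

asclogit decision p_ppd p_fl eff_p_drm_* eff_fl_drm_* eff_ad_* eff_cat_* nobuy, ///
    noconst case(case) alternatives(alternative)
```

With this coding there are three columns per 4-level attribute instead
of four, and the part-worth of level 4 would be minus the sum of the
three estimated coefficients. Whether this removes the collinearity or
reproduces the Latent Gold estimates would still need to be checked on
the actual data.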
Thanks & best,
Nils
On 04.12.2010 at 01:31, Michael Mitchell wrote:
Dear Nils
When I wrote -xi3-, I don't think these models existed (or if they
did, I did not have them in mind). So, unfortunately, -xi3- does not
immediately work with programs like -xtmixed- or -nlogit-, because it
gets confused by the pipes -||-. However, you can still use -xi3- if
you do it in a two-step process. For example, here is -xi3- used in a
one-step process to run a regression:
. sysuse auto
(1978 Automobile Data)
. xi3: regress price e.rep78 g.foreign
e.rep78           _Irep78_1-5         (naturally coded; _Irep78_1 omitted)
g.foreign         _Iforeign_0-1       (naturally coded; _Iforeign_0 omitted)
      Source |       SS       df       MS              Number of obs =      69
-------------+------------------------------           F(  5,    63) =    0.19
       Model |  8372481.37     5  1674496.27           Prob > F      =  0.9670
    Residual |   568424478    63  9022610.75           R-squared     =  0.0145
-------------+------------------------------           Adj R-squared = -0.0637
       Total |   576796959    68  8482308.22           Root MSE      =  3003.8

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   _Irep78_2 |    188.879   1024.352     0.18   0.854    -1858.124    2235.882
   _Irep78_3 |   646.8116    710.873     0.91   0.366    -773.7548    2067.378
   _Irep78_4 |   274.3754   799.3802     0.34   0.733    -1323.059    1871.809
   _Irep78_5 |   104.1799   1036.513     0.10   0.920    -1967.126    2175.486
 _Iforeign_1 |    36.7572   1010.484     0.04   0.971    -1982.533    2056.048
       _cons |   5797.125    581.597     9.97   0.000     4634.896    6959.353
------------------------------------------------------------------------------
Instead, we can first use -xi3- to create the coded variables....
. xi3 e.rep78 g.foreign
e.rep78           _Irep78_1-5         (naturally coded; _Irep78_1 omitted)
g.foreign         _Iforeign_0-1       (naturally coded; _Iforeign_0 omitted)
And then we can include the coded variables into the model, as shown
below.
. regress price _Irep78_2 _Irep78_3 _Irep78_4 _Irep78_5 _Iforeign_1
      Source |       SS       df       MS              Number of obs =      69
-------------+------------------------------           F(  5,    63) =    0.19
       Model |  8372481.37     5  1674496.27           Prob > F      =  0.9670
    Residual |   568424478    63  9022610.75           R-squared     =  0.0145
-------------+------------------------------           Adj R-squared = -0.0637
       Total |   576796959    68  8482308.22           Root MSE      =  3003.8

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   _Irep78_2 |    188.879   1024.352     0.18   0.854    -1858.124    2235.882
   _Irep78_3 |   646.8116    710.873     0.91   0.366    -773.7548    2067.378
   _Irep78_4 |   274.3754   799.3802     0.34   0.733    -1323.059    1871.809
   _Irep78_5 |   104.1799   1036.513     0.10   0.920    -1967.126    2175.486
 _Iforeign_1 |    36.7572   1010.484     0.04   0.971    -1982.533    2056.048
       _cons |   5797.125    581.597     9.97   0.000     4634.896    6959.353
------------------------------------------------------------------------------
I know it is a kludge, but I hope it works for you.
Best regards,
Michael N. Mitchell
Data Management Using Stata - http://www.stata.com/bookstore/dmus.html
A Visual Guide to Stata Graphics - http://www.stata.com/bookstore/vgsg.html
Stata tidbit of the week - http://www.MichaelNormanMitchell.com
On Fri, Dec 3, 2010 at 4:07 PM, Nils Wlömert <[email protected]> wrote:
Dear listers,
I would like to use effect-coding (via xi3) with nlogit:
xi3: nlogit decision p_ppd p_fl e.p_drm e.fl_drm e.ad e.cat e.nobuy
|| type:
|| alternative:, noconst case(case_1)
However, I get the following error:
"|" invalid name
r(198);
Does xi3 work with nested models at all?
Estimation works fine without effect-coding via xi3.
Many thanks!
Nils
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/