--"Russo, Giovanni" <[email protected]> wrote,
> [...]
> I have mlogit type of data. One observation per decision maker, the
> variables record the characteristics of the decision maker (her age, income,
> educational level, occupation....) plus one variable recording the chosen
> alternative.
>
> I would like to know whether someone has developed a routine to estimate a
> nested logit model from this type of data.
I am not aware of a programmed routine, but here is an example of how to do it
manually using -reshape-:
----------------------------Begin Example----------------------------------
A dataset suitable for mlogit has one observation for each individual, and a
dependent variable indicates the choice for each individual. For example, we
have data on 295 consumers and their choice of automobile. In this dataset,
car is the dependent variable and indicates the origin of the car:
        id        car     income        sex
         1     Europe       46.7       male
         2   American       26.1       male
         3   American       32.7       male
         4      Japan       49.2     female
         5   American       24.3       male
         6   American         39     female
         7   American         33       male
         8   American       20.3       male
         9      Japan         38       male
        10   American       60.4     female
...
A dataset suitable for nlogit has one observation for each alternative within
each individual, and a dichotomous dependent variable indicates whether the
alternative is chosen. Based on the above example, the long form dataset would
look like
        id     choice        car     income        sex
         1          0   American       46.7       male
         1          0      Japan       46.7       male
         1          1     Europe       46.7       male
         2          1   American       26.1       male
         2          0      Japan       26.1       male
         2          0     Europe       26.1       male
         3          1   American       32.7       male
         3          0      Japan       32.7       male
         3          0     Europe       32.7       male
...
where choice is the dependent variable.
To reshape a dataset from mlogit (wide) form to nlogit (long) form, first
expand the dataset by the number of alternatives, and then generate a variable
that indicates the alternatives. The dependent variable indicates whether the
alternative is chosen.
Let's start with the wide form of our automobile purchasing example and
convert it to long form, which is suitable for nlogit:
. list id car income sex in 1/9, nolab
            id        car     income        sex
  1.         1          3       46.7          1
  2.         2          1       26.1          1
  3.         3          1       32.7          1
  4.         4          2       49.2          0
  5.         5          1       24.3          1
  6.         6          1         39          0
  7.         7          1         33          1
  8.         8          1       20.3          1
  9.         9          2         38          1
. expand 3
(590 observations created)
. sort id
. list id car income sex in 1/9, nolab
            id        car     income        sex
  1.         1          3       46.7          1
  2.         1          3       46.7          1
  3.         1          3       46.7          1
  4.         2          1       26.1          1
  5.         2          1       26.1          1
  6.         2          1       26.1          1
  7.         3          1       32.7          1
  8.         3          1       32.7          1
  9.         3          1       32.7          1
. by id: gen temp = _n
. gen choice = (temp==car)
. drop car
. rename temp car
. list id choice car income sex in 1/9
            id     choice        car     income        sex
  1.         1          0          1       46.7       male
  2.         1          0          2       46.7       male
  3.         1          1          3       46.7       male
  4.         2          1          1       26.1       male
  5.         2          0          2       26.1       male
  6.         2          0          3       26.1       male
  7.         3          1          1       32.7       male
  8.         3          0          2       32.7       male
  9.         3          0          3       32.7       male
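Before moving on, it may be worth confirming that the reshaped data have the structure nlogit expects. The following is only a minimal sketch, assuming the three alternatives used in this example; nchosen is a hypothetical helper variable that is not part of the original session.

```stata
* Each consumer should contribute exactly 3 rows (one per alternative).
bysort id: assert _N == 3

* Exactly one of those rows should be flagged as the chosen alternative.
* nchosen is a hypothetical helper name introduced for this check only.
by id: egen nchosen = total(choice)
assert nchosen == 1
drop nchosen
```

If either assert fails, Stata stops with an error, which usually points to a coding problem in the original car variable (for example, values outside 1, 2, 3).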
Now, we have the dataset in long form.  How can we run nlogit from here?
Suppose a consumer makes a decision by first deciding whether to buy a
domestic car (car=1) or a foreign car (car=2, 3), and then, if a foreign car
is chosen, the consumer chooses whether to buy a Japanese or European car.
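The two-stage decision described above corresponds to the usual nested logit factorization of the choice probability. The block below is a sketch in one common parameterization; the symbols V_j (utility index of car j), W_k (utility index of type k), tau_k (inclusive-value parameter of type k) and I_k (inclusive value) are notation introduced here, not taken from the original example, and the exact scaling that nlogit applies may differ across Stata versions.

```latex
\begin{align*}
P(j) &= P(k)\,P(j \mid k), \qquad j \in k,\\
P(j \mid k) &= \frac{\exp(V_j/\tau_k)}{\sum_{l \in k}\exp(V_l/\tau_k)},\\
I_k &= \ln \sum_{l \in k}\exp(V_l/\tau_k),\\
P(k) &= \frac{\exp(W_k + \tau_k I_k)}{\sum_{m}\exp(W_m + \tau_m I_m)}.
\end{align*}
```

Here k is domestic or foreign, and j is American, Japan, or Europe. Setting every tau_k = 1 collapses these expressions to an ordinary conditional (multinomial) logit, which is the restriction examined by the "iv = 1" LR test reported at the bottom of the nlogit output.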
. nlogitgen type=car(domestic:1, foreign: 2 3)
new variable type is generated with 2 groups
lb_type:
           1 domestic
           2 foreign
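To run nlogit, we must interact the attributes with dummy variables for the
choices, because sex and income do not vary across the alternatives within an
individual (American, car=1, serves as the base alternative):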
. gen sexJap = sex*(car==2)
. gen sexEur = sex*(car==3)
. gen incJap = income*(car==2)
. gen incEur = income*(car==3)
. gen consFor = (type==2)
. nlogit choice (car=sexJap sexEur incJap incEur) (type=consFor),group(id) nolog
tree structure specified for the nested logit model
         top-->bottom
        type         car
------------------------
    domestic    American
     foreign       Japan
                  Europe
Nested logit
Levels             =          2                 Number of obs      =       885
Dependent variable =     choice                 LR chi2(6)         =  170.2271
Log likelihood     = -238.97708                 Prob > chi2        =    0.0000
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
car          |
      sexJap |  -.2050568   .1849569    -1.11   0.268    -.5675657    .1574521
      sexEur |   .4757615   .4718509     1.01   0.313    -.4490493    1.400572
      incJap |   .0088892   .0038311     2.32   0.020     .0013804     .016398
      incEur |  -.0187893   .0066684    -2.82   0.005    -.0318592   -.0057194
-------------+----------------------------------------------------------------
type         |
     consFor |   21.11492   18.99008     1.11   0.266    -16.10495     58.3348
-------------+----------------------------------------------------------------
(IV params)  |
             |
type         |
    /domesti |         .5          .        .       .            .           .
    /foreign |  -32.87729   32.42436    -1.01   0.311    -96.42787    30.67329 
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(1)=   29.50    Prob > chi2 = 0.0000
------------------------------------------------------------------------------
-----------------------------End Example-----------------------------------
Weihua Guan <[email protected]>
Stata Corp.