--"Russo, Giovanni" <[email protected]> wrote,
> [...]
> I have mlogit type of data. One observation per decision maker, the
> variables record the characteristics of the decision maker (her age, income,
> educational level, occupation....) plus one variable recording the chosen
> alternative.
>
> I would like to know whether someone has developed a routine to estimate a
> nested logit model from this type of data.
I am not aware of a programmed routine, but here is an example of how to do it
manually by reshaping the data with -expand-:
----------------------------Begin Example----------------------------------
A dataset suitable for mlogit has one observation for each individual, and a
dependent variable indicates the choice for each individual. For example, we
have data on 295 consumers and their choice of automobile. In this dataset,
car is the dependent variable and indicates the origin of the car:
  id       car   income      sex
   1    Europe     46.7     male
   2  American     26.1     male
   3  American     32.7     male
   4     Japan     49.2   female
   5  American     24.3     male
   6  American       39   female
   7  American       33     male
   8  American     20.3     male
   9     Japan       38     male
  10  American     60.4   female
 ...
A dataset suitable for nlogit has one observation for each alternative within
each individual, and a dichotomous dependent variable indicates whether the
alternative is chosen. Based on the above example, the long form dataset would
look like
  id   choice       car   income      sex
   1        0  American     46.7     male
   1        0     Japan     46.7     male
   1        1    Europe     46.7     male
   2        1  American     26.1     male
   2        0     Japan     26.1     male
   2        0    Europe     26.1     male
   3        1  American     32.7     male
   3        0     Japan     32.7     male
   3        0    Europe     32.7     male
 ...
where choice is the dependent variable.
To reshape a dataset from mlogit (wide) form to nlogit (long) form, first
expand the dataset by the number of alternatives, and then generate a variable
that indicates the alternatives. The dependent variable indicates whether the
alternative is chosen.
Let's start with the wide form of our automobile purchasing example and
convert it to long form, which is suitable for nlogit:
. list id car income sex in 1/9, nolab
            id        car     income        sex
  1.         1          3       46.7          1
  2.         2          1       26.1          1
  3.         3          1       32.7          1
  4.         4          2       49.2          0
  5.         5          1       24.3          1
  6.         6          1         39          0
  7.         7          1         33          1
  8.         8          1       20.3          1
  9.         9          2         38          1
. expand 3
(590 observations created)
. sort id
. list id car income sex in 1/9, nolab
            id        car     income        sex
  1.         1          3       46.7          1
  2.         1          3       46.7          1
  3.         1          3       46.7          1
  4.         2          1       26.1          1
  5.         2          1       26.1          1
  6.         2          1       26.1          1
  7.         3          1       32.7          1
  8.         3          1       32.7          1
  9.         3          1       32.7          1
. by id: gen temp = _n
. gen choice = (temp==car)
. drop car
. rename temp car
. list id choice car income sex in 1/9
            id     choice        car     income        sex
  1.         1          0          1       46.7       male
  2.         1          0          2       46.7       male
  3.         1          1          3       46.7       male
  4.         2          1          1       26.1       male
  5.         2          0          2       26.1       male
  6.         2          0          3       26.1       male
  7.         3          1          1       32.7       male
  8.         3          0          2       32.7       male
  9.         3          0          3       32.7       male
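Before going further, it can be worth confirming that the expansion produced
what we expect. The following is a minimal sketch and was not part of the
session above. The value label name origin and the helper variable nchosen are
hypothetical, the sketch assumes no value label called origin already exists,
and -egen-'s total() function assumes a reasonably recent Stata (older releases
call it sum()). If the dataset still holds the value label that was attached to
the original car variable, reattaching that one with -label values- would do
just as well.

```stata
* Attach names to the rebuilt alternative variable
* (codes follow the listings above: 1 = American, 2 = Japan, 3 = Europe;
*  "origin" is a hypothetical label name)
label define origin 1 "American" 2 "Japan" 3 "Europe"
label values car origin

* Each decision maker should now contribute exactly three rows ...
bysort id: assert _N == 3

* ... and exactly one of those rows should be the chosen alternative
bysort id: egen nchosen = total(choice)
assert nchosen == 1
drop nchosen
```

If either -assert- fails, Stata stops with an error, which signals that the
long-form data are not yet suitable for estimation.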
Now, we have the dataset in long form. How can we run nlogit from here?
Suppose a consumer first decides whether to buy a domestic car (car=1) or a
foreign car (car=2 or 3) and then, if a foreign car is chosen, decides whether
to buy a Japanese or a European car.
. nlogitgen type = car(domestic: 1, foreign: 2 | 3)
new variable type is generated with 2 groups
lb_type:
1 domestic
2 foreign
Because sex and income are constant within each individual, they can enter
nlogit only through interactions with dummy variables for the choices:
. gen sexJap = sex*(car==2)
. gen sexEur = sex*(car==3)
. gen incJap = income*(car==2)
. gen incEur = income*(car==3)
. gen consFor = (type==2)
. nlogit choice (car=sexJap sexEur incJap incEur) (type=consFor), group(id) nolog

tree structure specified for the nested logit model

 top-->bottom

     type           car
------------------------
 domestic      American
  foreign         Japan
                 Europe
Nested logit
Levels             =          2                 Number of obs    =        885
Dependent variable =     choice                 LR chi2(6)       =   170.2271
Log likelihood     = -238.97708                 Prob > chi2      =     0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
car          |
      sexJap |  -.2050568   .1849569    -1.11   0.268    -.5675657    .1574521
      sexEur |   .4757615   .4718509     1.01   0.313    -.4490493    1.400572
      incJap |   .0088892   .0038311     2.32   0.020     .0013804     .016398
      incEur |  -.0187893   .0066684    -2.82   0.005    -.0318592   -.0057194
-------------+----------------------------------------------------------------
type         |
     consFor |   21.11492   18.99008     1.11   0.266    -16.10495     58.3348
-------------+----------------------------------------------------------------
(IV params)  |
             |
type         |
    /domesti |         .5          .        .       .            .           .
    /foreign |  -32.87729   32.42436    -1.01   0.311    -96.42787    30.67329
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(1)= 29.50      Prob > chi2 = 0.0000
------------------------------------------------------------------------------
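
One way to check that the interaction terms are set up correctly is to use the
fact that a multinomial logit is equivalent to a conditional logit on the long
data, provided each individual characteristic is interacted with the
alternative dummies and alternative-specific constants are included. The
sketch below was not part of the session above. It assumes the interaction
variables generated earlier are still in memory, and consJap and consEur are
hypothetical names introduced here.

```stata
* Alternative-specific constants; American (car==1) serves as the base
gen byte consJap = (car == 2)
gen byte consEur = (car == 3)

* Conditional logit on the long data, grouped by decision maker
clogit choice sexJap sexEur incJap incEur consJap consEur, group(id)
```

The coefficients should match those from -mlogit car sex income- on the
original wide data with American as the base outcome. No output is shown here
because the model was not run for this example.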
-----------------------------End Example-----------------------------------
Weihua Guan <[email protected]>
Stata Corp.