Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: st: RE: RE: RE: RE: Discrete choice in MATA
From
Henk-Wim de Boer <[email protected]>
To
"'[email protected]'" <[email protected]>
Subject
RE: st: RE: RE: RE: RE: Discrete choice in MATA
Date
Thu, 4 Jul 2013 14:06:06 +0200
Dear Mat,
Again, thanks a lot!
Yesterday, I did succeed in setting up the clogit in Mata and to derive the first derivative analytically (still struggling with the second order derivatives though).
Here, I have used the mixed logit code in the Stata journal "Adaptive Markov Chain Monte Carlo sampling and estimation in Mata" as an example. My code now looks as follow:
void logistic(todo, b, y, X, fv, g, H)
{
U=rowsum(X:*b)
expU = exp(U)
U = colshape(U,6)
num = exp(rowsum(U:*colshape(y,6)))
den = rowsum(exp(U))
fv = ln(num:/den)
}
Kind regards,
Henk-Wim
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Matthew Baker
Sent: 03 July 2013 17:33
To: [email protected]
Subject: Re: st: RE: RE: RE: Discrete choice in MATA
Henk-Wim --
I see; the conditional logit poses some different challenges the a
regular multinomial logit, I guess. I think there are two potential
things you might want to explore, and it depends upon how expansive
your data set is. I've tried to give two examples below, but I'm not
sure if these are right in all the details of how a conditional logit
should be estimated, and they give slightly different results than
using stata's clogit, but hopefully the ideas are clear. My example
works with a small number of observations (6, to be exact!)
Anyways - if you have a large data set with a lot of groups, you might
want to avoid looping over individuals/groups, and still employ a
reshape-wide type solution. As a first example, where the data are
reshaped first:
/* Begin example */
clear all
use "http://www.stata-press.com/data/r11/clogitid"
bysort id: egen fullcount=max(count)
drop count
keep if fullcount==6
bysort id: gen num=_n
reshape wide y x1 x2, i(id) j(num)
mata
void clog_with_6_alts_wide(M,b,fv)
{
beta1=moptimize_util_xb(M,b,1)
beta2=moptimize_util_xb(M,b,2)
Y=moptimize_util_userinfo(M,1)
X1=moptimize_util_userinfo(M,2)
X2=moptimize_util_userinfo(M,3)
eXb=J(rows(Y),6,.)
for (i=1;i<=6;i++) eXb[,i]=exp(beta1*X1[,i]+beta2*X2[,i])
numer=rowsum(Y:*eXb)
denom=rowsum(eXb)
fv=ln(numer:/denom)
}
st_view(y=.,.,"y1 y2 y3 y4 y5 y6")
st_view(X1=.,.,"x11 x12 x13 x14 x15 x16")
st_view(X2=.,.,"x21 x22 x23 x24 x25 x26")
M=moptimize_init()
moptimize_init_evaluator(M,&clog_with_6_alts_wide())
moptimize_init_evaluatortype(M,"lf")
moptimize_init_userinfo(M,1,y)
moptimize_init_userinfo(M,2,X1)
moptimize_init_userinfo(M,3,X2)
moptimize_init_eq_indepvars(M,1,"")
moptimize_init_eq_indepvars(M,2,"")
moptimize(M)
moptimize_result_display(M)
end
/* End Example */
However, if you don't have a huge amount of observations, you might
just loop over each group or panel. This is pretty easy when using
Mata's "panelsetup" tools. As a second example, using panelsetup:
/* Begin Example */
clear all
use "http://www.stata-press.com/data/r11/clogitid"
bysort id: egen fullcount=max(count)
drop count
keep if fullcount==6
mata:
void clog_with_6_alts_long(M,todo,b,fv,g,H)
{
Y=moptimize_util_depvar(M,1)
X=moptimize_util_xb(M,b,1)
panels=moptimize_util_userinfo(M,1)
fv=J(rows(panels),1,.)
for (i=1;i<=rows(panels);i++) {
Yp=panelsubmatrix(Y,i,panels)
Xp=panelsubmatrix(X,i,panels)
eXp=exp(Xp)
numerp=colsum(Yp:*eXp)
denomp=colsum(eXp)
fv[i]=ln(numerp/denomp)
}
}
M=moptimize_init()
moptimize_init_evaluator(M,&clog_with_6_alts_long())
moptimize_init_evaluatortype(M,"v0")
moptimize_init_depvar(M,1,"y")
moptimize_init_eq_indepvars(M,1,"x1 x2")
moptimize_init_eq_cons(M,1,"off")
st_view(id=.,.,"id")
panels=panelsetup(id,1)
moptimize_init_userinfo(M,1,panels)
moptimize(M)
moptimize_result_display(M)
end
/* End Example */
Hope that helps!
Mat
On Mon, Jul 1, 2013 at 10:38 AM, Henk-Wim de Boer <[email protected]> wrote:
> Dear Matt,
>
> Thanks for the advice!
>
> Unfortunately, I do not see how I can apply this code to my own problem.
> In the example, you estimate 2 equations for 3 alternatives (normalize first alternative) as is common in a multinomial logit.
> I want to estimate a conditional logit, so only 1 equation. There are 6 alternatives and 2 explanatory variables (lny and lnl).
> When I use the wide format, I end up with a matrix (n,12) with n = number of individuals en 2*6 = 12. Here is my updated code:
>
>
> /* Begin example */
> sort strata alternative
> reshape wide lny_1 lnl choice , i(strata) j(alternative)
>
> mata
>
> st_view(Z=0,.,("lny_11", "lny_12", "lny_13", "lny_14", "lny_15", "lny_16" , "lnl1", "lnl2", "lnl3", "lnl4", "lnl5", "lnl6" ))
> st_view(choice=0,.,("choice1","choice2","choice3","choice4","choice5","choice6" ))
>
> void logistic(M, b, fv)
> {
> real scalar i
> real matrix y,xb
>
> y=moptimize_util_depvar(M,1)
>
> xb=moptimize_util_xb(M,b,1)
>
> num=rowsum(y:*exp(xb))
> den=rowsum(exp(xb))
>
> fv=ln(num:/den)
> }
>
> M=moptimize_init()
> moptimize_init_evaluator(M,&logistic())
> moptimize_init_evaluatortype(M,"lf")
> moptimize_init_eq_cons(M, 1, "off")
> moptimize_init_depvar(M,1,choice)
> moptimize_init_eq_indepvars(M,1,Z)
>
> moptimize(M)
> moptimize_result_display(M)
> end
>
> /* End example */
>
> Kind regards,
>
> Henk-Wim
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Matthew Baker
> Sent: 25 June 2013 15:46
> To: [email protected]
> Subject: Re: st: RE: RE: Discrete choice in MATA
>
> Henk-Wim --
>
> I'm not sure if someone else took a crack at this, but I think that
> the following might help a little. First, I might make two suggestions
> that I have found useful in these cases. First, I would work with data
> in the wide form, not the long form, as you have it. You can convert
> it using reshape, or some of the functions that come with the
> "case2alt" package by Scott Long and Jeremy Freese. (net search
> case2alt). I just use reshape in the little example below.
>
> Second, I might suggest working with moptimize instead of optimize, as
> this eases transferring data from stata to mata, and having to parse
> all of the parameters in beta yourself. Here's a little example of a
> multinomial logit model. I "difference" the independent variable and
> normalize the utility from the first option to zero:
>
> /* Begin example */
>
> clear all
> webuse choice
>
> /* reshape data */
> bysort id: gen t=_n
> reshape wide car size choice dealer, i(id) j(t)
>
> /* differences */
> gen dealer2m1=dealer2-dealer1
> gen dealer3m1=dealer3-dealer1
>
> mata
> void logistic(M, b, fv)
> {
> real scalar i
> real matrix y1, y2, y3, xb2, xb3
> y1=moptimize_util_depvar(M,1)
> y2=moptimize_util_depvar(M,2)
> y3=moptimize_util_depvar(M,3)
> xb2=moptimize_util_xb(M,b,1)
> xb3=moptimize_util_xb(M,b,2)
> num=rowsum(y1:+y2:*exp(xb2):+y3:*exp(xb3))
> den=rowsum(1:+exp(xb2):+exp(xb3))
> fv=ln(num:/den)
> }
>
> M=moptimize_init()
> moptimize_init_evaluator(M,&logistic())
> moptimize_init_evaluatortype(M,"lf")
> moptimize_init_depvar(M,1,"choice1")
> moptimize_init_depvar(M,2,"choice2")
> moptimize_init_depvar(M,3,"choice3")
> moptimize_init_eq_indepvars(M,1,"dealer2m1")
> moptimize_init_eq_indepvars(M,2,"dealer3m1")
> moptimize(M)
> moptimize_result_display(M)
> end
>
> /* End example */
>
> Hope that helps!
>
> Matt
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, Jun 21, 2013 at 6:53 AM, Henk-Wim de Boer <[email protected]> wrote:
>> Hi Tim,
>>
>> The reason for this is that I am using simulated maximum likelihood, where I use 10 draws from the wage distribution. In addition, I extended our basic labour supply model with 4 additional alternatives for childcare.
>> I end up with 24 alternatives for singles and 144 alternatives for couples. I estimate the model, using a large rich dataset, with a lot of alternatives and parameters. Consequently, the estimation for couples takes approximately 4 days. In the future, I want to increase the number of draws from the wage distribution as well and therefore I want to speed up the estimation.
>>
>> I tried so by deriving the first and second order derivatives analytically (ml model d1 and d2) but this did not work for multiple draws from the wage distribution. The model only converges if I allow for 1 draw and not for multiple draws. It seems that rounding errors prevent the model from converging (the first derivative is close to 0 and remains close to 0). Now I want to program it in Mata.
>>
>> Henk-Wim
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Timothy Mak
>> Sent: 21 June 2013 04:12
>> To: [email protected]
>> Subject: st: RE: Discrete choice in MATA
>>
>> Hi,
>>
>> Discrete choice models are well-covered by Stata commands such as -clogit- and -asclogit-.
>> http://www.ats.ucla.edu/stat/stata/seminars/stata10/choice_models.htm
>>
>> Why do you want to implement them yourself?
>>
>> Tim
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Henk-Wim de Boer
>> Sent: 20 June 2013 22:37
>> To: '[email protected]'
>> Subject: st: Discrete choice in MATA
>>
>> Hello,
>>
>> I am relatively new at MATA and I am trying to program maximum likelihood estimation for my discrete choice model.
>> The labour supply model allows for 6 labour supply alternatives at the individual level and an example is included below for an individual:
>>
>> Id h y_1 l
>> 1 0 2 80
>> 1 8 4 72
>> 1 16 6 64
>> 1 24 8 56
>> 1 32 10 48
>> 1 40 12 40
>>
>> The model can be estimated by maximum likelihood, where the log likelihood is as follows: ln(L) = ln{ exp(xb)/sum(exp(xb)) } over all individuals.
>> This works well in STATA by using ml model but I now want to program it in MATA. The difficulty here is constructing the denominator, i.e the summation of exp(xb) over all alternatives.
>>
>> My code is as follows:
>>
>> mata
>>
>> st_view(X=0,.,("lny_1", "lnl"))
>> st_view(y=0,.,("choice"))
>>
>> void logistic(todo, p, y, X, lf, g, H)
>> {
>> b = p[1, (1::cols(X))]' /* Transpose such that b is a column vector*/
>>
>> denom = colsum(exp(X*b)) /*HERE I NEED TO SUM OVER THE ALTERNATIVES PER INDIVIDUAL , WHICH DOES NOT WORK FOR ME*/
>>
>> lf = y' * ln(exp(X*b)/denom)
>>
>> }
>>
>>
>> S = optimize_init()
>> optimize_init_evaluator(S, &logistic())
>> optimize_init_params(S, (0, 0)) /* One extra element. Starting values are (0, 0)*/
>> optimize_init_argument(S, 1, y)
>> optimize_init_argument(S, 2, X)
>> p = optimize(S)
>>
>> p
>>
>> end
>>
>> Can anyone please help me out?
>>
>> Henk-Wim de Boer
>>
>>
>>
>> --
>> ================================================================================
>> Dit bericht kan informatie bevatten die niet voor u is bestemd. Indien u niet
>> de geadresseerde bent of dit bericht abusievelijk aan u is toegezonden, wordt
>> u verzocht dat aan de afzender te melden en het bericht te verwijderen.
>> De Staat aanvaardt geen aansprakelijkheid voor schade, van welke aard ook, die
>> verband houdt met risico's verbonden aan het elektronisch verzenden van
>> berichten.
>>
>> This message may contain information that is not intended for you. If you are
>> not the addressee or if this message was sent to you by mistake, you are
>> requested to inform the sender and delete the message. The State accepts no
>> liability for damage of any kind resulting from the risks inherent in the
>> electronic transmission of messages.
>> ================================================================================
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
>>
>> --
>> ================================================================================
>> Dit bericht kan informatie bevatten die niet voor u is bestemd. Indien u niet
>> de geadresseerde bent of dit bericht abusievelijk aan u is toegezonden, wordt
>> u verzocht dat aan de afzender te melden en het bericht te verwijderen.
>> De Staat aanvaardt geen aansprakelijkheid voor schade, van welke aard ook, die
>> verband houdt met risico's verbonden aan het elektronisch verzenden van
>> berichten.
>>
>> This message may contain information that is not intended for you. If you are
>> not the addressee or if this message was sent to you by mistake, you are
>> requested to inform the sender and delete the message. The State accepts no
>> liability for damage of any kind resulting from the risks inherent in the
>> electronic transmission of messages.
>> ================================================================================
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
>
>
>
> --
> Dr. Matthew J. Baker
> Department of Economics
> Hunter College and the Graduate Center, CUNY
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
>
> --
> ================================================================================
> Dit bericht kan informatie bevatten die niet voor u is bestemd. Indien u niet
> de geadresseerde bent of dit bericht abusievelijk aan u is toegezonden, wordt
> u verzocht dat aan de afzender te melden en het bericht te verwijderen.
> De Staat aanvaardt geen aansprakelijkheid voor schade, van welke aard ook, die
> verband houdt met risico's verbonden aan het elektronisch verzenden van
> berichten.
>
> This message may contain information that is not intended for you. If you are
> not the addressee or if this message was sent to you by mistake, you are
> requested to inform the sender and delete the message. The State accepts no
> liability for damage of any kind resulting from the risks inherent in the
> electronic transmission of messages.
> ================================================================================
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
--
Dr. Matthew J. Baker
Department of Economics
Hunter College and the Graduate Center, CUNY
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
--
================================================================================
Dit bericht kan informatie bevatten die niet voor u is bestemd. Indien u niet
de geadresseerde bent of dit bericht abusievelijk aan u is toegezonden, wordt
u verzocht dat aan de afzender te melden en het bericht te verwijderen.
De Staat aanvaardt geen aansprakelijkheid voor schade, van welke aard ook, die
verband houdt met risico's verbonden aan het elektronisch verzenden van
berichten.
This message may contain information that is not intended for you. If you are
not the addressee or if this message was sent to you by mistake, you are
requested to inform the sender and delete the message. The State accepts no
liability for damage of any kind resulting from the risks inherent in the
electronic transmission of messages.
================================================================================
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/