The dimensionality of my parameter space is 15; my sample is pretty
large. I have about 50,000 observations (with lot of variation in all
the regressors). Hence, I do not expect any problems on that count. Of
course, my model might be unidentified for some parameters (though I
don't think that is the case). I will think further about the
identifiability issue.
I have removed all temporary variables from the program now. I have used
scalars for all the temporary variables.
But I cannot clean up the expression for the log likelihood any further.
As I had said earlier, the log likelihood for each observation is of the
following form: ln(a+b); where 'a' and 'b' are huge expressions
involving exponentials, etc. I don't see how I can simplify it any
further; if you can suggest something, that will be great.
As far I can see, all the expressions in the re-worked likelihood
evaluator (given below) are well-behaved. I might be wrong, but let me
just go over each:
q1...q14 are defined in such a manner that they are always within 0 and
1, and sub-groups of them sum to 1. For instance, q1+q2=1; q3+q4+q5=1:
and so on.
x1 is bounded since ML_y1 lies between 2 and 6 in the dataset.
(Additionally, ML_y2...ML_y16 all lies between 0 and 1.)
x2 is just the factorial of ML_y1, and so it is bounded.
x3 is the product of x1 and x2.
x4 and x6 lie between 0 and 1; the limit of x4 is 0 when `theta1' goes
to + or - infinity, and the limit is (1/e) when `theta1' goes to 0.
Additionally, both are continuous functions of `theta1'. So, these
should not cause problems.
x5 is the normal cdf; hence it is always between 0 and 1.
So, the only way the log likelihood (i.e., `lnf') could blow up (to
-infinity) is if the expression to be logged comes very close to 0. But
if we start with initial values of 0 for all the parameters, we are
bounded away from 0 because of the first expression within the log
likelihood (which is (1/x3)*x4*x5). x3 is a finite number; x4 should be
the reciprocal of 'e' and x5 should be 0.5.
I don't see where the log likelihood could blow up. Any suggestions are
welcome. I give the re-worked program below.
___________________________________________________
clear
capture program drop gender1_lf
program gender1_lf
version 8.1
* set mem 100m
#delimit ;
args lnf theta1 theta2 theta3
theta5 theta6 theta7 theta8 theta9
theta10 theta11 theta12 theta13 theta14;
#delimit cr
quietly {
scalar q1 = 1/(1 + exp(`theta5'))
scalar q2 = exp(`theta5')/(1 + exp(`theta5'))
scalar q3 = 1/(1 + exp(`theta6') + exp(`theta7'))
scalar q4 = exp(`theta6')/(1 + exp(`theta6') + exp(`theta7'))
scalar q5 = exp(`theta7')/(1 + exp(`theta6') + exp(`theta7'))
scalar q6 = 1/(1 + exp(`theta8') + exp(`theta9') +
exp(`theta10'))
scalar q7 = exp(`theta8')/(1 + exp(`theta8') + exp(`theta9') +
exp(`theta10'))
scalar q8 = exp(`theta9')/(1 + exp(`theta8') + exp(`theta9') +
exp(`theta10'))
scalar q9 = exp(`theta10')/(1 + exp(`theta8') + exp(`theta9') +
exp(`theta10'))
scalar q10 = 1/(1 + exp(`theta11') + exp(`theta12') +
exp(`theta13') + exp(`theta14'))
scalar q11 = exp(`theta11')/(1 + exp(`theta11') + exp(`theta12') +
exp(`theta13') + exp(`theta14'))
scalar q12 = exp(`theta12')/(1 + exp(`theta11') + exp(`theta12') +
exp(`theta13') + exp(`theta14'))
scalar q13 = exp(`theta13')/(1 + exp(`theta11') + exp(`theta12') +
exp(`theta13') + exp(`theta14'))
scalar q14 = exp(`theta14')/(1 + exp(`theta11') + exp(`theta12') +
exp(`theta13') + exp(`theta14'))
scalar x1 = 2^($ML_y1)
scalar x2 = exp(lnfact($ML_y1))
scalar x3 = x1*x2
scalar x4 = ((exp(`theta1'))^($ML_y1)/exp(exp(`theta1')))
scalar x5 = norm(-`theta3')
scalar x6 =
((exp(`theta1'+`theta2'))^($ML_y1)/exp(exp(`theta1'+`theta2')))
}
#delimit ;
quietly replace `lnf' = ln(((1/x3)*x4*x5) +
((1-x5)*x6*( $ML_y2$ML_y3*q1
+ $ML_y4*q2 +
$ML_y5*q3 + $ML_y6*q4 + $ML_y7*q5 +
$ML_y8*q6 + $ML_y9*q7+ $ML_y10*q8 +
$ML_y11*q9 +
$ML_y12*q10 + $ML_y13*q11 + $ML_y14*q12
+
$ML_y15*q13 + $ML_y16*q14)));
#delimit cr
end
use india2.dta
#delimit ;
ml model lf gender1_lf (one: dfsize p21 p31 p32 p41 p42 p43 p51 p52 p53
p54 = v012 v025, nocons)
(two: )
(three: p61 p62 p63 p64 p65 = v012 v130, nocons)
(four:) (five:) (six:) (sev:) (eight:) (nine:) (ten:) (eleven:)
(twelve:) (thirteen:), tech(bhhh nr);
#delimit cr
____________________________________________________
On Wed, 2006-06-14 at 08:10 +0100, Nick Cox wrote:
> The limit you are approaching may just be that your
> model is not well suited to your data. Very simply,
> what is the dimensionality of your parameter space
> and are your data adequate for what you are trying
> to do?
>
> Nevertheless it seems that you have only partially acted on
> the earlier comments. Glancing at the code again
> I seem to see expressions which are logarithms of
> products, logarithms of ratios, logarithms
> of exponentials, etc. so there may be much more
> tbat you can do by way of cleaning up the expression.
>
> Also, it looks very much as if your thetas
> are all single-valued parameters, in which
> case scalars rather than variables are appropriate
> for what you are calling q1 ...
>
> Nick
> [email protected]
>
> Deepankar Basu
>
> > Nick,
> >
> > Thanks for your comments. I accept your point about a positive
> > likelihood. Actually, there was an error in one of the
> > statements; I had
> > put a "+" in place of a "*". so, now I have negative values of the log
> > likelihood. But, I still don't have convergence.
> >
> > About calculating the log likelihood directly: in my case, the log
> > likelihood for each observation is the log of a sum (of two
> > quantities).
> > Hence, I cannot simply things much using the log function
> > beyond taking
> > the log of the whole expression. That is what I have done
> > now. So, now I
> > am working with the log likelihood directly.
> >
> > Additionally, I have removed some of the temporary variables
> > that I was
> > using to define the log likelihood and have instead used the
> > expressions
> > directly. Since some of the temporary variables (like x3, x4
> > or x6) were
> > exp functions, there was a potential for them to become too large.
> > Hence, I have removed them. The temporary variables that I
> > use now (x1,
> > x2 and x5 apart from q*) are all bounded.
> >
> > Now, when I issue the - ml max - command (after ml check
> > shows that all
> > is well), I get the following error:
> >
> > initial: log likelihood = -2841.7221
> > rescale: log likelihood = -1595.8146
> > rescale eq: log likelihood = -1452.1748
> > (setting optimization to BHHH)
> > numerical derivatives are approximate
> > flat or discontinuous region encountered
> > numerical derivatives are approximate
> > flat or discontinuous region encountered
> > numerical derivatives are approximate
> > flat or discontinuous region encountered
> > could not calculate numerical derivatives
> > missing values encountered
> > r(430);
> >
> > Comments and suggestions are welcome. I give the re-worked do-file
> > below. Thanks,
> > Deepankar
> >
> > ____________________________________________________________
> >
> > clear
> > capture program drop gender1_lf
> > program gender1_lf
> > version 8.1
> > * set mem 100m
> > #delimit ;
> > args lnf theta1 theta2 theta3
> > theta5 theta6 theta7 theta8 theta9
> > theta10 theta11 theta12 theta13 theta14;
> >
> > tempvar q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14
> > x1 x2 x5;
> > #delimit cr
> >
> > quietly {
> > gen double `q1' = 1/(1+exp(`theta5'))
> > gen double `q2' = exp(`theta5')/(1+exp(`theta5'))
> >
> > gen double `q3' = 1/(1+exp(`theta6')+exp(`theta7'))
> > gen double `q4' = exp(`theta6')/(1
> > +exp(`theta6')+exp(`theta7'))
> > gen double `q5' = exp(`theta7')/(1
> > +exp(`theta6')+exp(`theta7'))
> >
> > gen double `q6' = 1/(1
> > +exp(`theta8')+exp(`theta9')+exp(`theta10'))
> > gen double `q7' = exp(`theta8')/(1
> > +exp(`theta8')+exp(`theta9')+exp(`theta10'))
> > gen double `q8' = exp(`theta9')/(1
> > +exp(`theta8')+exp(`theta9')+exp(`theta10'))
> > gen double `q9' = exp(`theta10')/(1
> > +exp(`theta8')+exp(`theta9')+exp(`theta10'))
> >
> > gen double `q10' = 1/(1
> > +exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))
> > gen double `q11' = exp(`theta11')/(1
> > +exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))
> > gen double `q12' = exp(`theta12')/(1
> > +exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))
> > gen double `q13' = exp(`theta13')/(1
> > +exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))
> > gen double `q14' = exp(`theta14')/(1
> > +exp(`theta11')+exp(`theta12')+exp(`theta13')+ exp(`theta14'))
> >
> > gen double `x1' = 2^(-$ML_y1)
> > gen double `x2' = exp(lnfact($ML_y1))
> >
> > gen double `x5' = norm(-`theta3')
> >
> >
> > }
> > #delimit ;
> > quietly replace `lnf' =
> > ln((`x1'/`x2')*(exp(-(exp(`theta1'))))*(exp($ML_y1*`theta1'))*`x5' +
> >
> > ((1-`x5')*(exp(-(exp(`theta2'))))*((exp($ML_y1*`theta2'))))*( $ML_y2 +
> > $ML_y3*`q1' + $ML_y4*`q2' +
> > $ML_y5*`q3' + $ML_y6*`q4'+
> > $ML_y7*`q5'
> > +
> > $ML_y8*`q6' + $ML_y9*`q7'+
> > $ML_y10*`q8'+$ML_y11*`q9' +
> > $ML_y12*`q10' + $ML_y13*`q11' +
> > $ML_y14*`q12'+ $ML_y15*`q13'+$ML_y16*`q14'));
> > #delimit cr
> >
> > end
> >
> >
> > use india2.dta
> >
> > #delimit ;
> > ml model lf gender1_lf (one: dfsize p21 p31 p32 = v012 v025, nocons)
> > (two: p41 p42 p43 p51 p52 p53 p54 = v012 v025)
> > (three: p61 p62 p63 p64 p65 = v012 v130, nocons)
> > (four:) (five:) (six:) (sev:) (eight:) (nine:) (ten:) (eleven:)
> > (twelve:) (thirteen:), tech(bhhh nr);
> >
> > #delimit cr
> >
> >
> >
> > _____________________________________________________________________
> >
> >
> >
> >
> > On Tue, 2006-06-13 at 17:46 +0100, Nick Cox wrote:
> > > I am not an -ml- expert, but I sense some basic
> > > misunderstandings.
> > >
> > > A positive likelihood is not problematic as such.
> > > Although a probability cannot exceed 1,
> > > this is not true of a probability density. Whenever
> > > probability densities are typically above 1, their product
> > > will be too.
> > >
> > > However, your program calculates the likelihood directly,
> > > before logging it, and the numbers become far too big to handle.
> > > exp(35000) is big!!!
> > >
> > > Thus it is the likelihood that becomes missing, not your
> > > original data. Stata would just ignore any data missings.
> > >
> > > You must rewrite your program so that you
> > > are working in terms of the log likelihood throughout.
> > >
> > > A side-issue with your program is that you appear to
> > > be putting lots of constants into variables. Using
> > > scalars is much better style and less wasteful
> > > of storage.
> > >
> > > Nick
> > > [email protected]
> > >
> > > DEEPANKAR BASU
> > > >
> > > > I am facing problems while trying to do a maximum likelihood
> > > > estimation. I have written out the likelihood estimator; then
> > > > I have used "ml check". The "ml check" shows that my
> > > > likelihood evaluator has passed all tests. Then when I issue
> > > > the "ml max" command, I get the following error message:
> > > >
> > > > initial: log likelihood = 31058.016
> > > > rescale: log likelihood = 31058.016
> > > > rescale eq: log likelihood = 34639.752
> > > > (setting optimization to BHHH)
> > > > could not calculate numerical derivatives
> > > > missing values encountered
> > > > r(430);
> > > >
> > > > First thing that worries me is that the log likelihood is
> > > > positive; is that a sign that my program has some problem?
> > > > And second, the maximisation is interrupted before completion
> > > > because it could not calculate numerical derivatives (missing
> > > > values encountered). I re-checked my dataset; there are no
> > > > missing values for the variables that I am using. Probably
> > > > the error message means something else.
> > > >
> > > > I have tried (1) switching between techniques BHHH and NR;
> > > > (2) ml max, difficult. I still get the same error message
> > > > (though the log likelihood values where the execution of the
> > > > program stops are different in each case).
> > > >
> > > >
> > > > Below I give the do-file that has my likelihood evaluator
> > > > program and also a "ml model ..." command (as suggested in
> > > > the book by Gould, Pitbaldo and Sribney: page 165); any
> > > > suggestions are welcome.
> > > >
> > > > If someone can detect some error immediately, that will be
> > > > great. But, if someone needs to run the program, I can send a
> > > > small subset of my dataset for the purpose. Thanks in advance.
> > > >
> > > > Deepankar Basu
> > > >
> > > > ____________________________________________
> > > >
> > > >
> > > > clear
> > > > capture program drop gender_lf
> > > > program gender_lf
> > > > version 8.1
> > > > set mem 100m
> > > > #delimit ;
> > > > args lnf theta1 theta2 theta3
> > > > theta5 theta6 theta7 theta8 theta9
> > > > theta10 theta11 theta12 theta13 theta14;
> > > >
> > > > tempvar q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14
> > > > x x1 x2 x3 x4 x5 x6 x7;
> > > > #delimit cr
> > > >
> > > > quietly {
> > > > gen double `q1' = 1/(1+exp(`theta5'))
> > > > gen double `q2' = exp(`theta5')/(1+exp(`theta5'))
> > > >
> > > > gen double `q3' = 1/(1+exp(`theta6')+exp(`theta7'))
> > > > gen double `q4' =
> > > > exp(`theta6')/(1+exp(`theta6')+exp(`theta7'))
> > > > gen double `q5' =
> > > > exp(`theta7')/(1+exp(`theta6')+exp(`theta7'))
> > > >
> > > > gen double `q6' =
> > > > 1/(1+exp(`theta8')+exp(`theta9')+exp(`theta10'))
> > > > gen double `q7' =
> > > > exp(`theta8')/(1+exp(`theta8')+exp(`theta9')+exp(`theta10'))
> > > > gen double `q8' =
> > > > exp(`theta9')/(1+exp(`theta8')+exp(`theta9')+exp(`theta10'))
> > > > gen double `q9' =
> > > > exp(`theta10')/(1+exp(`theta8')+exp(`theta9')+exp(`theta10'))
> > > >
> > > > gen double `q10' =
> > > > 1/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')+
> > exp(`theta14'))
> > > > gen double `q11' =
> > > > exp(`theta11')/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')
> > > > + exp(`theta14'))
> > > > gen double `q12' =
> > > > exp(`theta12')/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')
> > > > + exp(`theta14'))
> > > > gen double `q13' =
> > > > exp(`theta13')/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')
> > > > + exp(`theta14'))
> > > > gen double `q14' =
> > > > exp(`theta14')/(1+exp(`theta11')+exp(`theta12')+exp(`theta13')
> > > > + exp(`theta14'))
> > > >
> > > > gen double `x1' = 2^(-$ML_y1)
> > > > gen double `x2' = exp(lnfact($ML_y1))
> > > > gen double `x3' = exp(`theta1')
> > > > gen double `x4' = exp($ML_y1*`theta1')
> > > > gen double `x5' = norm(-`theta3')
> > > > gen double `x6' = exp(`theta2')
> > > > gen double `x7' = exp($ML_y1*`theta2')
> > > > #delimit ;
> > > > gen double `x' = `x1'*(1/`x2')*exp(-`x3')*`x4'*`x5' +
> > > > (1-`x5')*exp(-`x6')*`x7' +
> > > > $ML_y2 + ($ML_y3*`q1' + $ML_y4*`q2') +
> > > > ($ML_y5*`q3' + $ML_y6*`q4'+
> > $ML_y7*`q5') +
> > > > ($ML_y8*`q6' + $ML_y9*`q7'+
> > > > $ML_y10*`q8'+$ML_y11*`q9') +
> > > > ($ML_y12*`q10' + $ML_y13*`q11' +
> > > > $ML_y14*`q12'+ $ML_y15*`q13'+$ML_y16*`q14');
> > > > #delimit cr
> > > >
> > > > }
> > > >
> > > > quietly replace `lnf' = ln(`x')
> > > > end
> > > >
> > > >
> > > > use india1.dta
> > > >
> > > > #delimit ;
> > > > ml model lf gender_lf (one: dfsize p21 p31 p32 = v012
> > v025, nocons)
> > > > (two: p41 p42 p43 p51 p52 p53 p54 = v012 v025)
> > > > (three: p61 p62 p63 p64 p65 = v012 v025 v130, nocons)
> > > > (four:) (five:) (six:) (sev:) (eight:) (nine:) (ten:)
> > > > (eleven:) (twelve:) (thirteen:), tech(bhhh nr);
> > > >
> > > > #delimit cr
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/