Antoine Terracol <[email protected]> followed up with some
timing comparisons between -ml- using the -d2- evaluator I posted and
-optimize()- using the -v2- evaluator:
> Playing with your code, I noticed that -optimize- is indeed faster than
> -ml- for small sample sizes, but that the reverse seems to be true for
> larger N
>
> I ran a few simulations (-simulate-, 100 replications) with varying N.
>
> Below are the results using -v2- and -d2- (the same pattern holds for
> v1/d1 and v0/d0):
>
> N=100
>     Variable |  Obs      Mean   Std. Dev.     Min     Max
> -------------+-------------------------------------------
>     optimize |  100    .00423    .0069947       0    .016
>           ml |  100    .05278    .0082298    .046    .078
>
> N=1000
>     Variable |  Obs      Mean   Std. Dev.     Min     Max
> -------------+-------------------------------------------
>     optimize |  100    .03333    .0061727    .015    .062
>           ml |  100    .07604    .0079085    .062    .094
>
> N=5000
>     Variable |  Obs      Mean   Std. Dev.     Min     Max
> -------------+-------------------------------------------
>     optimize |  100    .16598     .015475     .14    .218
>           ml |  100    .18982     .014276    .171     .25
>
> N=10000
>     Variable |  Obs      Mean   Std. Dev.     Min     Max
> -------------+-------------------------------------------
>     optimize |  100    .33812    .0282104    .296    .422
>           ml |  100    .33324    .0244937    .296    .406
>
> N=50000
>     Variable |  Obs      Mean   Std. Dev.     Min     Max
> -------------+-------------------------------------------
>     optimize |  100   2.00641    .2400445   1.625   2.875
>           ml |  100   1.73035    .2316291   1.406   2.719
I forgot to mention something in my previous email, which Bill touched upon in
his reply. The type -v- evaluators of -optimize()- (v0, v1, and v2) are not
directly comparable with the type -d- evaluators of -ml- because they carry
extra overhead. This overhead is principally due to the parameter-level score
matrix: as the dataset gets longer and wider, the score matrix grows too, and
each iteration requires the evaluator to produce a new score matrix.
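To make the overhead concrete, here is a minimal sketch of what each
evaluator type must return to -optimize()-. This is my own illustration,
not part of the code below; the function and argument names are made up.

mata:
// A type-v2 evaluator fills in an N x 1 vector of observation-level
// values and an N x k score matrix S at every iteration:
void sketch_v2(todo, beta, y, X, v, S, H)
{
	// v: N x 1 observation-level log likelihoods
	// S: N x k parameter-level scores -- grows with N and k
	// H: k x k Hessian
	return
}
// A type-d2 evaluator returns only a scalar, a 1 x k gradient,
// and a k x k Hessian -- no per-observation storage:
void sketch_d2(todo, beta, y, X, lnf, g, H)
{
	// lnf: scalar log likelihood
	// g:   1 x k gradient
	// H:   k x k Hessian
	return
}
end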
I originally stayed with the type -v- evaluator for -optimize()- because I
assumed that Antoine might want to use the -bhhh- technique or produce robust
variance estimates (neither of which is possible with the type -d- evaluators
in -optimize()-). -ml- can employ the -bhhh- technique and produce robust
variance estimates with -d1- and -d2- evaluators that return the
equation-level scores, because it uses tools that apply the chain rule to the
linear predictions of each equation. As Bill said, -optimize()- has no
concept of equations and thus cannot apply the chain rule for us.
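To illustrate the chain-rule step (again my own sketch with made-up names,
not part of the code below): for a single linear-index equation, multiplying
each observation's equation-level score by its covariate row yields the
parameter-level scores, and summing those rows gives the gradient.

mata:
// Sketch only: chain rule from equation-level to parameter-level
// scores for a toy probit.  All names here are illustrative.
N  = 1000
X  = rnormal(N, 2, 0, 1), J(N, 1, 1)	// two covariates plus a constant
b  = (1, 1, 1)
y  = (X*b' + rnormal(N, 1, 0, 1)) :> 0
pm = 2*y :- 1
xb = X*b'
s  = pm :* normalden(xb) :/ normal(pm:*xb)	// equation-level scores
S  = s :* X			// chain rule: N x 3 parameter-level scores
cross(s, X) \ colsum(S)		// the gradient two ways -- identical rows
end

Once the parameter-level score matrix S exists, both the -bhhh- technique and
the robust (sandwich) VCE build on cross(S, S); that is the step -optimize()-
cannot perform for us from a type -d- evaluator, because it never sees the
equation-level scores.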
I have a type -d- evaluator that Antoine can use with -optimize()- to get a
better timing comparison between -ml- and -optimize()- (when you do not need
the -bhhh- technique or a robust VCE); see the Stata code at the end of this
email. This probit likelihood evaluator employs the chain rule to produce the
gradient, much as -ml-'s -mlvecsum- command does.
Here are some results from a single timing run on my machine; timer 1 is the
-optimize()- run and timer 2 is the -ml- run, and the -evaltype- local at the
top of the code selects d0, d1, or d2:
For d2:
. timer list
1: 1.48 / 1 = 1.4780
2: 1.70 / 1 = 1.7000
For d1:
. timer list
1: 4.32 / 1 = 4.3170
2: 5.07 / 1 = 5.0660
For d0:
. timer list
1: 6.08 / 1 = 6.0790
2: 6.82 / 1 = 6.8190
--Jeff
[email protected]
***** BEGIN:
clear all
local evaltype d1
set mem 20m
set obs 50000
drawnorm x1 x2 eps
gen cons=1
gen y=(1+x1+x2+eps>0)
timer clear 1
timer clear 2
mata:
void lnprobit_d2(
	real scalar	todo,
	real rowvector	beta,
	real colvector	y,
	real matrix	X,
	real scalar	lnf,
	real rowvector	g,
	real matrix	H
	)
{
	real colvector	pm
	real colvector	xb
	real colvector	lj
	real colvector	dllj
	real colvector	d2llj

	pm = 2*(y :!= 0) :- 1			// +1 if y==1, -1 if y==0
	xb = X*beta'				// linear predictions
	lj = normal(pm:*xb)			// observation-level likelihoods
	if (any(lj :== 0)) {
		lnf = .
		return
	}
	lnf = quadcolsum(ln(lj))		// overall log likelihood
	if (todo == 0 | missing(lnf)) return

	dllj = pm :* normalden(xb) :/ lj	// equation-level scores
	if (missing(dllj)) {
		lnf = .
		return
	}
	g = cross(dllj, X)			// chain rule: 1 x k gradient
	if (todo == 1) return

	d2llj = dllj :* (dllj + xb)		// minus d2 lnf/d(xb)^2, by obs
	if (missing(d2llj)) {
		lnf = .
		return
	}
	H = - cross(X, d2llj, X)		// Hessian w.r.t. beta
}
timer_on(1)
y = x = .
st_view(y, ., "y")
st_view(x, ., ("x1","x2","cons"))

S = optimize_init()
optimize_init_evaluator(S, &lnprobit_d2())
optimize_init_evaluatortype(S, "`evaltype'")
optimize_init_params(S, (0.1, 0.5, 0))
optimize_init_argument(S, 1, y)
optimize_init_argument(S, 2, x)

betahat  = optimize(S)
sigmahat = sqrt(diagonal(optimize_result_V_oim(S)))'
tstat    = betahat :/ sigmahat
pval     = 2 :* (1 :- normal(abs(tstat)))
(betahat \ sigmahat \ tstat \ pval)'	// b, se, t, p-value table
timer_off(1)
end
program myprobit_d2
	version 10
	args todo b lnf g negH g1
	tempvar xb lj
	mleval `xb' = `b'
	quietly {
		// observation-level likelihoods
		gen double `lj' = normal( `xb') if $ML_y1 == 1
		replace    `lj' = normal(-`xb') if $ML_y1 == 0
		mlsum `lnf' = ln(`lj')
		if (`todo'==0 | `lnf' >= .) exit

		// equation-level scores
		replace `g1' =  normalden(`xb')/`lj' if $ML_y1 == 1
		replace `g1' = -normalden(`xb')/`lj' if $ML_y1 == 0
		mlvecsum `lnf' `g' = `g1', eq(1)
		if (`todo'==1 | `lnf' >= .) exit

		// negative Hessian
		mlmatsum `lnf' `negH' = `g1'*(`g1'+`xb'), eq(1,1)
	}
end
ml model `evaltype' myprobit_d2 (y=x1 x2)
mat I = (0.1, 0.5, 0)
ml init I, copy
timer on 2
ml max, search(off)
timer off 2
timer list
***** END: