Dear Stata listers,
I am currently playing around with the new factor variable syntax and
the new margins command. I have re-specified my regression models by
using the new syntax (i.education, etc.) so that I can use the margins
command to compute average marginal effects. I have found that by doing
this, the computation is very slow compared, for example, to the margeff
command. Here is an example (margeff must be installed first, via
ssc install margeff):
. timer clear
. webuse union, clear
. probit union age grade i.not_smsa i.south i.black
. timer on 1
. margins, dydx(*)
. timer off 1
. probit union age grade not_smsa south black
. timer on 2
. margeff, dummies(not_smsa \ south \ black)
. timer off 2
. timer list
This produces the following output (some output omitted):
OUTPUT MARGINS
------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0019934   .0003908     5.10   0.000     .0012274    .0027593
       grade |   .0114783   .0010614    10.81   0.000      .009398    .0135585
  1.not_smsa |  -.0157848   .0057749    -2.73   0.006    -.0271033   -.0044663
     1.south |   -.140847   .0051475   -27.36   0.000     -.150936   -.1307581
     1.black |   .1496103   .0066016    22.66   0.000     .1366714    .1625493
------------------------------------------------------------------------------
OUTPUT MARGEFF
------------------------------------------------------------------------------
    variable |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0019933   .0003908     5.10   0.000     .0012274    .0027593
       grade |   .0114771    .001061    10.82   0.000     .0093975    .0135567
    not_smsa |  -.0157848   .0057019    -2.77   0.006    -.0269603   -.0046093
       south |   -.140847   .0042921   -32.82   0.000    -.1492593   -.1324347
       black |   .1496103   .0071747    20.85   0.000     .1355482    .1636724
------------------------------------------------------------------------------
. timer list [64-bit Stata/MP (4 cores) (WINXP)]
1: 18.38 / 1 = 18.3760 [TIME TO RUN MARGINS]
2: 0.55 / 1 = 0.5470 [TIME TO RUN MARGEFF]
While the point estimates are almost identical, the standard errors
differ slightly, which raises the question of which command computes
the "correct" ones. I understand that the margins command is more
convenient when computing marginal effects of interaction terms, but is
there another advantage of using the slower margins command instead of
the margeff command? Is there a way to speed up the margins command?
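One way to narrow down where the time goes, as a minimal sketch only: time
margins once as above and once with its nose option, which skips the
standard-error computation. This assumes the delta-method standard errors
are a plausible source of the slowdown; no timings for this comparison are
reported here. It is meant to be run from a do-file.

```stata
* Sketch: check how much of the margins run time is spent on standard errors
timer clear
webuse union, clear
probit union age grade i.not_smsa i.south i.black

timer on 1
margins, dydx(*)          // point estimates plus delta-method standard errors
timer off 1

timer on 2
margins, dydx(*) nose     // nose: do not compute standard errors
timer off 2

timer list
```

If timer 2 is far smaller than timer 1, most of the cost lies in the
standard errors rather than in the marginal effects themselves.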
A related question (probably targeted at the Stata employees on this
list):
Is there a command (maybe undocumented) that creates a set of "real"
variables from factor variable statements like i.education or
i.agegroups, so that users do not have to create the variables
themselves when using older commands that do not support the new syntax?
If the answer is no, I would be interested in how the estimation commands
that support the new syntax work under the hood. Do those commands
create "temporary" variables before performing the estimation? I am
particularly interested in how user-written commands would handle the
new syntax.
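For concreteness, the manual step referred to above looks roughly like the
following sketch, which uses tabulate with the generate() option on the same
example data. The stub name grade_ is chosen purely for illustration, and
treating grade as categorical is an assumption made only for this example.

```stata
* Sketch of the manual workaround: create indicator variables by hand
webuse union, clear

* One 0/1 variable per level of grade, named grade_1, grade_2, ...
* (grade_ is a hypothetical stub name used only for illustration)
tabulate grade, generate(grade_)

* Drop the first indicator so that level serves as the base category
drop grade_1

* Pass the remaining indicators explicitly to the estimation command
probit union age grade_* not_smsa south black
```

The question is whether this bookkeeping can be derived automatically from
a specification such as i.grade.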
OFF-TOPIC: It would be nice if margeff would support factor variables.
Tamas, what do you think?
Cheers,
Markus
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/