B) On why the difference between the standard errors of the APEs and the
marginal effects is so large, giving completely different results
regarding the significance of the estimates.
auto is kind of a wimpy dataset and I've found that it often doesn't
work too well with multi-equation models, as the data are being
spread too thin. Indeed, if you substitute oprobit for gologit2 in
your example, you find that the model doesn't converge. I'd suggest
using a larger dataset for your testing. In the following example,
you'll see that the differences between the APEs and the marginal
effects are much smaller than what you found. I don't know if that
is usually the case, so you might try a couple more examples. Note
that I use my mfx2 command at the end, which gives you all the
marginal effects (albeit at a painfully slow pace compared to margeff).