Regarding the estimation of 1) a single-observation logistic model, and
2) a two-observation logistic model in binomial form, with y the
binomial numerator and n the denominator:

When you use -cii-, or in any simple case where the estimated
coefficient or odds ratio is computed directly from the binomial PDF,
you are of course more likely to get a meaningful result. Maximum
likelihood entails assumptions that are not met in such a situation. In
fact, in the single-observation case you cannot even get results from
exact logistic regression via the -exlogistic- command. On the other
hand, -exlogistic- does estimate the second situation, where you have
two observations, each with response y, binomial denominator n, and
binary predictor x. However, you do not get exact point estimates, but
rather median unbiased estimates.
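For illustration, the single-observation case for the first row of the
data below (10 successes in 100 trials) is handled directly by -cii-,
which returns an exact binomial confidence interval straight from the
binomial distribution, with no likelihood maximization involved:

. cii 100 10
* exact (Clopper-Pearson) 95% CI for a proportion of 10/100;
* with these data the interval runs from roughly .05 to .18

No iteration, no convergence issues; the interval is well defined
regardless of how the successes fall.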
 y     n    x
--------------
10    100   1
 0    100   0
Model the above using -exlogistic-:
. input y n x
     y    n    x
  1. 10 100 1
  2. 0 100 0
  3. end

. exlogistic y x, binomial(n) coef estc
Enumerating sample-space combinations:
observation 1: enumerations = 11
observation 2: enumerations = 101
observation 3: enumerations = 10201
note: CMLE estimate for x is +inf; computing MUE
note: CMLE estimate for _cons is -inf; computing MUE
note: .975 quantile estimate for _cons failed to bracket the value
Exact logistic regression                     Number of obs =         200
Binomial variable: n                          Model score   =    10.47368
                                              Pr >= score   =      0.0015

---------------------------------------------------------------------------
           y |      Coef.    Suff.  2*Pr(Suff.)     [95% Conf. Interval]
-------------+-------------------------------------------------------------
           x |  2.722305*       10       0.0015      .8727845        +Inf
       _cons |         0*       10       0.0000          -Inf        +Inf
---------------------------------------------------------------------------
(*) median unbiased estimates (MUE)
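Exponentiating the MUE for x gives the corresponding odds-ratio
estimate (median unbiasedness carries through the monotone
transformation; -exlogistic- without the -coef- option would report the
odds ratio directly):

. di exp(2.722305)
* about 15.2 -- the median unbiased odds ratio of x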
I requested estimation of a constant (the -estconstant- option),
although it is obviously not meaningful in such a situation.

Compare the above with the clearly mistaken "estimated coefficients" in
the output you provided:
. glm y x, fam(bin n)

Generalized linear models                     No. of obs      =          2
Optimization     : ML                         Residual df     =          0
                                              Scale parameter =          1
Deviance         =  2.00000e-08               (1/df) Deviance  =          .
Pearson          =  1.00000e-08               (1/df) Pearson   =          .

Variance function: V(u) = u*(1-u/n)           [Binomial]
Link function    : g(u) = ln(u/(n-u))         [Logit]

                                              AIC             =   4.025974
Log likelihood   = -2.025973987               BIC             =   2.00e-08

------------------------------------------------------------------------------
             |                 OIM
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   23.87722      10000     0.00   0.998    -19575.76    19623.52
       _cons |  -26.07444      10000    -0.00   0.998    -19625.71    19573.56
------------------------------------------------------------------------------
These coefficients indicate a problem with convergence. Exponentiate to
obtain an odds ratio:
. di %12.0f exp(23.87722)
23428521860
We have an odds ratio here of some 23.4 billion. No surprise.
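The raw data show why. The sample odds in the x==0 group are 0/100 = 0,
so the empirical odds ratio is infinite, and ML simply chases it toward
+inf:

. di (10/90)/(0/100)
.

* the x==1 odds are 10/90; division by the zero odds of the x==0 group
* yields missing in Stata -- the sample odds ratio does not exist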
The problem is that the assumptions upon which ML estimation is based
are not met here. I tried
your examples with several other commercial applications, as well as R,
with the same results.
The bottom line is that there is nothing wrong with -glm- here.
Joseph Hilbe