Re: st: too good to be true : lr test in mlogit?
From: John Litfiba <[email protected]>
To: [email protected]
Subject: Re: st: too good to be true : lr test in mlogit?
Date: Mon, 16 May 2011 12:02:51 +0200
Dear Maarten,
Thank you very much again for your support!
Well, when I run xtlogit (I first type xtset id, where id is the
unique identifier for each individual in my database) on one year of
data (about 1 million observations), I get the message below. Yvar is
categorical (Yes=1, No=0) and Xvar is also categorical (type1=1,
type2=2).
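In case the exact setup matters, it is essentially this (a sketch; the
factor-variable notation i.Xvar assumes Stata 11 or later, and id is the
individual identifier):

xtset id                  // declare id as the panel variable
xtlogit Yvar i.Xvar, re   // i.Xvar enters the categorical regressor as an indicator

With Xvar entered as a plain regressor, the output is: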
*********************************************************************************************************
. xtlogit Yvar Xvar, re
Fitting comparison model:
Iteration 0: log likelihood = -699882.93
Iteration 1: log likelihood = -669440.74
Iteration 2: log likelihood = -669402.92
Iteration 3: log likelihood = -669402.89
Fitting full model:
tau = 0.0 log likelihood = -669402.89
tau = 0.1 log likelihood = -460383.3
tau = 0.2 log likelihood = -425672.08
tau = 0.3 log likelihood = -409117.91
tau = 0.4 log likelihood = -398842.25
tau = 0.5 log likelihood = -391638.85
tau = 0.6 log likelihood = -386752.42
tau = 0.7 log likelihood = -384063.23
tau = 0.8 log likelihood = -383816.98
initial values not feasible
r(1400);
***********************************************************************************************
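One thing that sometimes helps with r(1400) is giving the optimizer
explicit starting values and more quadrature points; a sketch using the
standard from() and intpoints() options (I have not verified that this
rescues this particular fit):

logit Yvar Xvar                                      // pooled logit for starting values
matrix b0 = e(b)
xtlogit Yvar Xvar, re from(b0, skip) intpoints(30)   // more points than the default 12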
and when I run the fixed-effects model I get
**************************************************************************************************
. xtlogit Yvar Xvar, fe
note: multiple positive outcomes within groups encountered.
note: 18475 groups (170046 obs) dropped because of all positive or
all negative outcomes.
Iteration 0: log likelihood = -1.#INF
Iteration 1: log likelihood = -1.#IND
Hessian is not negative semidefinite
r(430);
************************************************************************************************************************
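Since xtlogit, fe is just conditional logistic regression, fitting the
same model directly with clogit can help isolate the problem (an
equivalent formulation, not a guaranteed fix; id is the panel identifier
from xtset above):

clogit Yvar Xvar, group(id)   // numerically the same model as xtlogit, fe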
However, let's say I keep only the last 100,000 observations of my
sample; then I get
************************************************************************************************************************
. xtlogit Yvar Xvar, fe
note: multiple positive outcomes within groups encountered.
note: 11791 groups (49177 obs) dropped because of all positive or
all negative outcomes.
Iteration 0: log likelihood = -22470.418
Iteration 1: log likelihood = -22218.949
Iteration 2: log likelihood = -22218.885
Iteration 3: log likelihood = -22218.885
Conditional fixed-effects logistic regression   Number of obs      =     69669
Group variable: id2                             Number of groups   =      3794

                                                Obs per group: min =         2
                                                               avg =      18.4
                                                               max =       876

                                                LR chi2(1)         =   5266.26
Log likelihood  = -22218.885                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
        Yvar |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        Xvar |   5.315414   .1412066    37.64   0.000     5.038654    5.592174
------------------------------------------------------------------------------
and for the random-effects model I get:
***************************************************************************************************
. xtlogit Yvar Xvar, re
Fitting comparison model:
Iteration 0: log likelihood = -79872.579
Iteration 1: log likelihood = -70108.483
Iteration 2: log likelihood = -69952.535
Iteration 3: log likelihood = -69950.066
Iteration 4: log likelihood = -69950.06
Fitting full model:
tau = 0.0 log likelihood = -69950.06
tau = 0.1 log likelihood = -55891.467
tau = 0.2 log likelihood = -51186.623
tau = 0.3 log likelihood = -48260.258
tau = 0.4 log likelihood = -46086.379
tau = 0.5 log likelihood = -44358.837
tau = 0.6 log likelihood = -42957.577
tau = 0.7 log likelihood = -41790.563
tau = 0.8 log likelihood = -40944.535
Iteration 0: log likelihood = -41603.261
Iteration 1: log likelihood = -39231.257
Iteration 2: log likelihood = -38979.35
Iteration 3: log likelihood = -38947.091
Iteration 4: log likelihood = -38947.091 (backed up)
Iteration 5: log likelihood = -38947.026
Iteration 6: log likelihood = -38947.026
Random-effects logistic regression              Number of obs      =    118846
Group variable: id2                             Number of groups   =     15585

Random effects u_i ~ Gaussian                   Obs per group: min =         1
                                                               avg =       7.6
                                                               max =      1255

                                                Wald chi2(1)       =   2339.74
Log likelihood  = -38947.026                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
        Yvar |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        Xvar |   9.349703   .1932923    48.37   0.000     8.970857    9.728549
       _cons |  -8.300978   .1855708   -44.73   0.000     -8.66469   -7.937266
-------------+----------------------------------------------------------------
    /lnsig2u |   2.728813   .0338054                      2.662555     2.79507
-------------+----------------------------------------------------------------
     sigma_u |   3.913399    .066147                      3.785877    4.045216
         rho |   .8231687   .0049208                      .8133168    .8326077
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) = 6.2e+04 Prob >= chibar2 = 0.000
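The fe estimate of Xvar (5.32) and the re estimate (9.35) are quite far
apart, so perhaps a Hausman-type comparison of the two fits is worth
looking at; a sketch using the standard estimates store / hausman
workflow (whether its assumptions hold here is another matter):

xtlogit Yvar Xvar, fe          // refit on the subsample
estimates store fixed
xtlogit Yvar Xvar, re
estimates store random
hausman fixed random           // compares the common coefficient(s)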
Best Regards
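P.S. Regarding your point below about sampling higher-level units:
drawing a random sample of whole panels, rather than keeping the last
observations, could look like this (a sketch; the seed and the 10%
fraction are arbitrary):

set seed 12345                          // arbitrary, for reproducibility
by id, sort: gen byte first = _n == 1   // flag one observation per panel
gen u = runiform() if first             // one uniform draw per panel
by id: replace u = u[1]                 // copy the draw to the whole panel
keep if u < 0.10                        // keep roughly 10% of panels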
On 16 May 2011 09:49, Maarten Buis <[email protected]> wrote:
> On Sat, May 14, 2011 at 11:31 AM, John Litfiba wrote:
>> 1) The log likelihood doesn't converge when I try to fit a random- or
>> fixed-effects model with xtlogit on my entire dataset.
>> I have to choose a very "small" sample (well, compared to the total
>> size of the dataset) of about 10,000 observations in order to see the
>> results... otherwise I get an error message after 3 or 4 iterations.
>
> If you do not tell us what the error message is, then we obviously
> cannot help you. We need to know exactly what you typed and what Stata
> told you in return.
>
>> 2) The idea of running, let's say, M regressions over randomly chosen
>> samples could be a solution, but is it statistically valid? I mean, if
>> I obtain the distribution of the parameters across my M simulations,
>> can I infer something about the parameters of the estimation that
>> would have been run on the entire dataset?
>
> No, but if you sample correctly, a single random sample of higher-level
> units will be just as valid a sample from your population as your
> large sample, just with a smaller N. The added value of additional
> observations tends to decrease with sample size, so going from 10 to
> 11 observations will have a much bigger effect on your inference than
> moving from 100 to 101 observations. There are many estimates for
> which the difference between 10,000 and 10,000,000 observations is just
> negligible (but there are estimates where it will matter, for example
> higher-order interaction terms or a categorical variable containing a
> rarely occurring category).
>
> Hope this helps,
> Maarten
>
> --------------------------
> Maarten L. Buis
> Institut fuer Soziologie
> Universitaet Tuebingen
> Wilhelmstrasse 36
> 72074 Tuebingen
> Germany
>
>
> http://www.maartenbuis.nl
> --------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/