Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: too good to be true : lr test in mlogit?
From: John Litfiba <[email protected]>
To: [email protected]
Subject: Re: st: too good to be true : lr test in mlogit?
Date: Mon, 16 May 2011 12:15:55 +0200
A small correction: Xvar is also binary (type1=0, type2=1), of course ;-)
On 16 May 2011 12:02, John Litfiba <[email protected]> wrote:
> Dear Marten,
>
> Thank you very much again for your support!
>
> Well, when I run xtlogit (I first type xtset id, where id is the
> unique id for each individual in my database) on one year of
> data (1 million observations), I get the message below.
>
> Yvar is categorical (Yes=1, No=0) and Xvar is also categorical
> (type1=1, type2=2)
>
> *********************************************************************************************************
> . xtlogit Yvar Xvar, re
>
> Fitting comparison model:
>
> Iteration 0: log likelihood = -699882.93
> Iteration 1: log likelihood = -669440.74
> Iteration 2: log likelihood = -669402.92
> Iteration 3: log likelihood = -669402.89
>
> Fitting full model:
>
> tau = 0.0 log likelihood = -669402.89
> tau = 0.1 log likelihood = -460383.3
> tau = 0.2 log likelihood = -425672.08
> tau = 0.3 log likelihood = -409117.91
> tau = 0.4 log likelihood = -398842.25
> tau = 0.5 log likelihood = -391638.85
> tau = 0.6 log likelihood = -386752.42
> tau = 0.7 log likelihood = -384063.23
> tau = 0.8 log likelihood = -383816.98
>
> initial values not feasible
> r(1400);
> ***********************************************************************************************
> and when I run a fixed effects model I get
>
> **************************************************************************************************
> . xtlogit Yvar Xvar, fe
> note: multiple positive outcomes within groups encountered.
> note: 18475 groups (170046 obs) dropped because of all positive or
> all negative outcomes.
>
> Iteration 0: log likelihood = -1.#INF
> Iteration 1: log likelihood = -1.#IND
> Hessian is not negative semidefinite
> r(430);
> ************************************************************************************************************************
>
> However, let's say I only keep the last 100 000 observations of my
> sample and then I get
>
> ************************************************************************************************************************
> xtlogit Yvar Xvar, fe
>
> note: multiple positive outcomes within groups encountered.
> note: 11791 groups (49177 obs) dropped because of all positive or
> all negative outcomes.
>
> Iteration 0: log likelihood = -22470.418
> Iteration 1: log likelihood = -22218.949
> Iteration 2: log likelihood = -22218.885
> Iteration 3: log likelihood = -22218.885
>
> Conditional fixed-effects logistic regression   Number of obs      =     69669
> Group variable: id2                             Number of groups   =      3794
>
>                                                 Obs per group: min =         2
>                                                                avg =      18.4
>                                                                max =       876
>
>                                                 LR chi2(1)         =   5266.26
> Log likelihood  = -22218.885                    Prob > chi2        =    0.0000
>
> ------------------------------------------------------------------------------
>         Yvar |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>         Xvar |   5.315414   .1412066    37.64   0.000     5.038654    5.592174
> ------------------------------------------------------------------------------
>
>
> and for the random effects model I get:
>
> ***************************************************************************************************
>
> . xtlogit Yvar Xvar, re
>
> Fitting comparison model:
>
> Iteration 0: log likelihood = -79872.579
> Iteration 1: log likelihood = -70108.483
> Iteration 2: log likelihood = -69952.535
> Iteration 3: log likelihood = -69950.066
> Iteration 4: log likelihood = -69950.06
>
> Fitting full model:
>
> tau = 0.0 log likelihood = -69950.06
> tau = 0.1 log likelihood = -55891.467
> tau = 0.2 log likelihood = -51186.623
> tau = 0.3 log likelihood = -48260.258
> tau = 0.4 log likelihood = -46086.379
> tau = 0.5 log likelihood = -44358.837
> tau = 0.6 log likelihood = -42957.577
> tau = 0.7 log likelihood = -41790.563
> tau = 0.8 log likelihood = -40944.535
>
> Iteration 0: log likelihood = -41603.261
> Iteration 1: log likelihood = -39231.257
> Iteration 2: log likelihood = -38979.35
> Iteration 3: log likelihood = -38947.091
> Iteration 4: log likelihood = -38947.091 (backed up)
> Iteration 5: log likelihood = -38947.026
> Iteration 6: log likelihood = -38947.026
>
> Random-effects logistic regression              Number of obs      =    118846
> Group variable: id2                             Number of groups   =     15585
>
> Random effects u_i ~ Gaussian                   Obs per group: min =         1
>                                                                avg =       7.6
>                                                                max =      1255
>
>                                                 Wald chi2(1)       =   2339.74
> Log likelihood  = -38947.026                    Prob > chi2        =    0.0000
>
> ------------------------------------------------------------------------------
>         Yvar |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>         Xvar |   9.349703   .1932923    48.37   0.000     8.970857    9.728549
>        _cons |  -8.300978   .1855708   -44.73   0.000     -8.66469   -7.937266
> -------------+----------------------------------------------------------------
>     /lnsig2u |   2.728813   .0338054                      2.662555     2.79507
> -------------+----------------------------------------------------------------
>      sigma_u |   3.913399    .066147                      3.785877    4.045216
>          rho |   .8231687   .0049208                      .8133168    .8326077
> ------------------------------------------------------------------------------
> Likelihood-ratio test of rho=0: chibar2(01) = 6.2e+04 Prob >= chibar2 = 0.000
>
>
> Best Regards
>
> On 16 May 2011 09:49, Maarten Buis <[email protected]> wrote:
>> On Sat, May 14, 2011 at 11:31 AM, John Litfiba wrote:
>>> 1) The log likelihood doesn't converge when I try to fit a random or
>>> fixed effects model with xtlogit on my entire dataset.
>>> I have to choose a very "small" sample (well, small compared to the total
>>> size of the data) of about 10000 observations in order to see the results;
>>> otherwise I get an error message after 3 or 4 iterations.
>>
>> If you do not tell us what the error message is, then we obviously
>> cannot help you. We need to know exactly what you typed and what Stata
>> told you in return.
>>
>>> 2) The idea of running, let's say, M regressions over randomly chosen
>>> samples could be a solution, but is it statistically valid? I mean, if
>>> I obtain the distribution of the parameters across my M simulations, can
>>> I infer something about the parameters of the estimation that should have
>>> been done on the entire dataset?
>>
>> No, but if you sample correctly, a single random sample of higher level
>> units will be just as valid a sample from your population as your
>> large sample, just with a smaller N. The added value of additional
>> observations tends to decrease with sample size, so going from 10 to
>> 11 observations will have a much bigger effect on your inference than
>> moving from 100 to 101 observations. There are many estimates for
>> which the difference between 10000 and 10000000 observations is just
>> negligible (but there are estimates where it will matter, for example
>> higher order interaction terms or a categorical variable containing a
>> rarely occurring category).
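
The diminishing-returns point can be made concrete with the simplest possible case. As an illustration only, assume independent observations with standard deviation \sigma and take the sample mean as the estimate (this is not the xtlogit setting, where observations are clustered within individuals and the number of groups matters as much as the number of rows):

```latex
\mathrm{SE}(\bar{x}_N) = \frac{\sigma}{\sqrt{N}},
\qquad
\frac{\mathrm{SE}(\bar{x}_{N+1})}{\mathrm{SE}(\bar{x}_N)} = \sqrt{\frac{N}{N+1}}
```

For N = 10 the ratio is about 0.953, so the extra observation shrinks the standard error by roughly 4.7%; for N = 100 it is about 0.995, a reduction of roughly 0.5%.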
>>
>> Hope this helps,
>> Maarten
>>
>> --------------------------
>> Maarten L. Buis
>> Institut fuer Soziologie
>> Universitaet Tuebingen
>> Wilhelmstrasse 36
>> 72074 Tuebingen
>> Germany
>>
>>
>> http://www.maartenbuis.nl
>> --------------------------
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
>>
>
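
One way to draw the kind of sample Maarten describes (a random sample of higher level units, i.e. whole individuals rather than single rows) might look like the sketch below. The variable names id, Yvar and Xvar are taken from the thread; the seed and the 10% sampling fraction are arbitrary assumptions, and nothing here is claimed to fix the convergence errors shown above.

```stata
* Sketch only: keep a random 10% of individuals, with all of their rows.
* The seed and the 10% fraction are arbitrary choices for illustration.
set seed 20110516
preserve

egen byte pick = tag(id)                  // flags exactly one row per id
generate double u = runiform() if pick    // one uniform draw per individual
bysort id (u): replace u = u[1]           // missing sorts last, so u[1] is the draw
keep if u < 0.10                          // drops or keeps whole individuals

xtset id
xtlogit Yvar Xvar, re

restore
```

Sampling on id rather than on rows keeps each individual's full history together, which the panel estimators need; keeping "the last 100 000 observations" instead truncates some individuals and is not a random sample of them.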