
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: quasi-complete separation


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: quasi-complete separation
Date   Sun, 28 Aug 2011 17:29:02 +0100

I don't know where 48.5 comes from so I can't comment on that.

. input y   x

             y          x
  1.  0   1
  2.  0   2
  3.  0   3
  4.  0   4
  5.  1   1
  6.  1   2
  7.  1   3
  8.  1   4
  9.  1   5
 10.  1   6
 11.  1   7
 12. end

. logit y x

Iteration 0:   log likelihood = -7.2102995
Iteration 1:   log likelihood = -6.3453449
Iteration 2:   log likelihood =   -6.31452
Iteration 3:   log likelihood =  -6.314268
Iteration 4:   log likelihood =  -6.314268

Logistic regression                               Number of obs   =         11
                                                  LR chi2(1)      =       1.79
                                                  Prob > chi2     =     0.1807
Log likelihood =  -6.314268                       Pseudo R2       =     0.1243

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .5126268   .4305072     1.19   0.234    -.3311519    1.356405
       _cons |  -1.076253   1.436463    -0.75   0.454    -3.891669    1.739162
------------------------------------------------------------------------------

In my ignorance I was not aware until now of the terminology of
"quasi-complete separation" although Googling reveals several long
discussions. Evidently there are datasets which are difficult or
impossible to model with -logit- or -probit-. So, what else is new?
Whether it helps to use this terminology I don't know. It just sounds
like giving the problem a name to me. Others may be able to add deeper
comments.
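
A minimal sketch of one way to look at the overlap directly, assuming the same y and x as entered above (these two commands are an illustration, not output from the original exchange). With a single predictor, quasi-complete separation would mean the two outcome groups share at most one boundary value of x; here the listed data show y = 0 for x from 1 to 4 and y = 1 for x from 1 to 7, so the groups overlap on 1 through 4, which is consistent with -logit- converging.

```stata
* Range and count of x within each outcome group:
* overlapping min/max ranges mean x does not separate y
tabstat x, by(y) statistics(min max n)

* Cross-tabulation showing which values of x occur with both outcomes
tabulate x y
```

If the ranges met at only a single value of x, or did not overlap at all, that would be the situation in which very large coefficients or nonconvergence might be expected.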

Nick


On Sun, Aug 28, 2011 at 5:01 PM, Sabrina Helmut <[email protected]> wrote:
> Nick,
> thanks! You are right: -logit- works, but the coefficient for the variable concerned is extremely high (48.5..), and I will need an explanation for this. Do you think my example shows quasi-complete separation, which could explain the high coefficient?
>
> ----------------------------------------
>> Date: Sun, 28 Aug 2011 16:43:36 +0100
>> Subject: Re: st: quasi-complete separation
>> From: [email protected]
>> To: [email protected]
>>
>> Sabrina, and indeed anybody else: Please do not send, or attempt to
>> send, attachments to Statalist.
>> See http://www.stata.com/support/faqs/res/statalist.html#toask where
>> this is explained, twice over.
>>
>> Sabrina: -logit y x- will work with this dataset, but there is only a
>> weak relationship.
>>
>> Nick
>>
>> On Sun, Aug 28, 2011 at 4:24 PM, Sabrina Helmut <[email protected]> wrote:
>> > I am sorry, the scatter was not sent. So, here is an example for you:
>> >
>> > binary dependent variable y
>> > continuous variable x
>> >
>> > y   x
>> > 0   1
>> > 0   2
>> > 0   3
>> > 0   4
>> > 1   1
>> > 1   2
>> > 1   3
>> > 1   4
>> > 1   5
>> > 1   6
>> > 1   7
>> >
>> > Thus, values of the independent variable higher than 4 occur only with y=1.
>> > So, is this a problem of quasi-complete separation? Thank you very much.
>> >
>> >
>> > ----------------------------------------
>> >> From: [email protected]
>> >>
>> >> I provided a scatter for you. Am I right in assuming that it shows the problem of quasi-complete separation? Thanks.
>>
