Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: quasi-complete separation
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: quasi-complete separation
Date: Sun, 28 Aug 2011 17:29:02 +0100
I don't know where 48.5 comes from so I can't comment on that.
. input y   x
             y          x
  1.  0   1
  2.  0   2
  3.  0   3
  4.  0   4
  5.  1   1
  6.  1   2
  7.  1   3
  8.  1   4
  9.  1   5
 10.  1   6
 11.  1   7
 12. end
. logit y x
Iteration 0:   log likelihood = -7.2102995
Iteration 1:   log likelihood = -6.3453449
Iteration 2:   log likelihood =   -6.31452
Iteration 3:   log likelihood =  -6.314268
Iteration 4:   log likelihood =  -6.314268
Logistic regression                               Number of obs   =         11
                                                  LR chi2(1)      =       1.79
                                                  Prob > chi2     =     0.1807
Log likelihood =  -6.314268                       Pseudo R2       =     0.1243
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .5126268   .4305072     1.19   0.234    -.3311519    1.356405
       _cons |  -1.076253   1.436463    -0.75   0.454    -3.891669    1.739162
------------------------------------------------------------------------------
In my ignorance I was not aware until now of the terminology of
"quasi-complete separation" although Googling reveals several long
discussions. Evidently there are datasets which are difficult or
impossible to model with -logit- or -probit-. So, what else is new?
Whether it helps to use this terminology I don't know. It just sounds
like giving the problem a name to me. Others may be able to add deeper
comments.
Nick
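
A minimal sketch, not part of the original exchange, for checking the example data for separation directly rather than reading it off a plot. It assumes the usual single-predictor reading of the terms: complete separation if the x ranges of the two outcome groups do not overlap at all, quasi-complete separation if they touch at just one boundary value. The local macro names are illustrative only.

```stata
* Sketch only: re-enter the example data from this thread
clear
input y x
0 1
0 2
0 3
0 4
1 1
1 2
1 3
1 4
1 5
1 6
1 7
end

* Range of x within each outcome group
tabstat x, by(y) statistics(min max n)

* Store the group-specific ranges (local macro names are illustrative)
summarize x if y == 0, meanonly
local min0 = r(min)
local max0 = r(max)
summarize x if y == 1, meanonly
local min1 = r(min)
local max1 = r(max)

* Cross-tabulate the x values that fall inside both groups' ranges
* (no such values: complete separation; a single boundary value: quasi-complete)
tabulate x y if inrange(x, max(`min0', `min1'), min(`max0', `max1'))
```

For these data x runs from 1 to 4 when y = 0 and from 1 to 7 when y = 1, and both outcomes occur at every x from 1 to 4. No single cutoff on x divides the two groups, which is consistent with -logit- converging to finite estimates in the output above.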
On Sun, Aug 28, 2011 at 5:01 PM, Sabrina Helmut <[email protected]> wrote:
> Nick,
> thanks! You are right, logit works, but the coefficient for the variable concerned is extremely high (48.5...). I will need an explanation for this. So, do you think my example shows quasi-complete separation, which could explain the high coefficient?
>
> ----------------------------------------
>> Date: Sun, 28 Aug 2011 16:43:36 +0100
>> Subject: Re: st: quasi-complete separation
>> From: [email protected]
>> To: [email protected]
>>
>> Sabrina, and indeed anybody else: Please do not send, or attempt to
>> send, attachments to Statalist.
>> See http://www.stata.com/support/faqs/res/statalist.html#toask where
>> this is explained, twice over.
>>
>> Sabrina: -logit y x- will work with this dataset, but there is only a
>> weak relationship.
>>
>> Nick
>>
>> On Sun, Aug 28, 2011 at 4:24 PM, Sabrina Helmut <[email protected]> wrote:
>> > I am sorry, the scatter plot was not sent. Here is an example instead:
>> >
>> > binary dependent variable y
>> > continuous variable x
>> >
>> > y   x
>> > 0   1
>> > 0   2
>> > 0   3
>> > 0   4
>> > 1   1
>> > 1   2
>> > 1   3
>> > 1   4
>> > 1   5
>> > 1   6
>> > 1   7
>> >
>> > Thus, values of the independent variable greater than 4 occur only when y = 1.
>> > So, is this a problem of quasi-complete separation? Thank you very much.
>> >
>> >
>> > ----------------------------------------
>> >> From: [email protected]
>> >>
>> >> I provided a scatter plot for you. Am I right in assuming that it shows the problem of quasi-complete separation? Thanks.
>>