Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: problem with predicted probabilities
From: Richard Goldstein <[email protected]>
To: [email protected]
Subject: Re: st: problem with predicted probabilities
Date: Tue, 04 Feb 2014 08:17:22 -0500
My view of the classification table is slightly different. I certainly
agree that automatically using a cutoff of .5 is not always a good idea;
in particular, if the prevalence of the event in the data is very
different from .5 (e.g., .12 or .88), it is a bad idea. As an
alternative, use the cutoff() option of -estat classification- and start
by using the prevalence as the cutoff (here 370/2339, or about .158).
Substantive experts in your area may suggest other reasonable cutoffs.
Rich
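
A minimal sketch of the prevalence-as-cutoff suggestion follows. It is not
the original poster's analysis: the lbw example dataset and its model are
assumptions, used only so the commands run as written. With your own data,
fit your own model first and keep the last three commands.

```stata
* Stand-in data and model (assumption): Stata's lbw example dataset
webuse lbw, clear
logistic low age lwt i.race smoke ptl ht ui

* Observed prevalence of the outcome in the estimation sample
quietly summarize `e(depvar)' if e(sample)
local prev = r(mean)
display "Prevalence = " %6.4f `prev'

* Classification table with the prevalence as the cutoff instead of .5
estat classification, cutoff(`prev')
```

Using `e(depvar)' rather than a typed variable name keeps the prevalence
tied to whatever outcome the last model actually used.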
On 2/4/14, 12:04 AM, Witness Chirinda wrote:
> Thanks Nick and Richard for your help!
>
> On Sun, Feb 2, 2014 at 11:23 PM, Richard Williams
> <[email protected]> wrote:
>> Getting every case classified as 0 (or 1) is not unusual. For relatively
>> rare events, the highest predicted probability for every case may be less
>> than .5, so every case gets classified as 0. My own experience is that the
>> classification table tends not to be that helpful, especially for events
>> that are very rare or very common.
>>
>>
>> At 04:58 AM 2/2/2014, Witness Chirinda wrote:
>>>
>>> Dear Statalist
>>> I want to obtain some predicted probabilities after logistic
>>> regression, as attached. I want to use the predicted probabilities in
>>> my next step instead of the observed prevalence, since the former are
>>> adjusted for other (socio-demographic) factors.
>>> My problem is that when I run -estat classification- it gives 0s
>>> for the + classification. I am sure I am doing something wrong somewhere.
>>> Please see the output below. All variables used in the model have been
>>> recoded to be binary 1/0.
>>>
>>> Thanks for any help!
>>> ------------------
>>>
>>>
>>> . logistic Health_stat age maried wealth educat place sex
>>>
>>> Logistic regression                               Number of obs   =       2339
>>>                                                   LR chi2(6)      =      50.61
>>>                                                   Prob > chi2     =     0.0000
>>> Log likelihood = -996.02516                       Pseudo R2       =     0.0248
>>>
>>> ------------------------------------------------------------------------------
>>>  Health_stat | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
>>> -------------+----------------------------------------------------------------
>>>          age |   1.109083   .0342696     3.35   0.001     1.043909    1.178326
>>>       maried |     1.2134   .1962535     1.20   0.232     .8837556    1.666004
>>>       Wealth |   1.430957   .1784661     2.87   0.004     1.120641    1.827203
>>>       educat |   1.670411   .2010455     4.26   0.000     1.319397     2.11481
>>>        place |   .9334522   .1223134    -0.53   0.599     .7220318    1.206779
>>>          sex |   1.129008   .1324642     1.03   0.301     .8970722    1.420911
>>>
>>> . estat class
>>>
>>> Logistic model for poorSRHS
>>>
>>>               -------- True --------
>>> Classified |         D            ~D  |      Total
>>> -----------+--------------------------+-----------
>>>      +     |         0             0  |          0
>>>      -     |       370          1969  |       2339
>>> -----------+--------------------------+-----------
>>>    Total   |       370          1969  |       2339
>>>
>>> Classified + if predicted Pr(D) >= .5
>>> True D defined as poorSRHS != 0
>>> --------------------------------------------------
>>> Sensitivity                     Pr( +| D)    0.00%
>>> Specificity                     Pr( -|~D)  100.00%
>>> Positive predictive value       Pr( D| +)       .%
>>> Negative predictive value       Pr(~D| -)   84.18%
>>> --------------------------------------------------
>>> False + rate for true ~D        Pr( +|~D)    0.00%
>>> False - rate for true D         Pr( -| D)  100.00%
>>> False + rate for classified +   Pr(~D| +)       .%
>>> False - rate for classified -   Pr( D| -)   15.82%
>>> --------------------------------------------------
>>> Correctly classified                        84.18%
>>> --------------------------------------------------
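
An all-negative classification table like the one quoted above is what you
would expect when no fitted probability reaches .5. One way to check this,
and to get the predicted probabilities the original question asks for, is
sketched below. The lbw example dataset and its model are stand-ins
(assumptions), and phat is a hypothetical variable name; in your own data,
run your own model and then the last three commands.

```stata
* Stand-in data and model (assumption): Stata's lbw example dataset
webuse lbw, clear
logistic low age lwt i.race smoke ptl ht ui

* Fitted probabilities Pr(D) for the estimation sample
predict double phat if e(sample), pr

* If the maximum is below .5, every case is classified as "-" at the default cutoff
summarize phat, detail
count if phat >= .5 & e(sample)
```

If the count is 0, the "+" row of the table is empty by construction, not
because the model was fit incorrectly.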