The problem with such an unequal distribution of the dependent variable is
that you will have a hard time beating the naive model, i.e. predicting that
no one ever tests for an STD, without reference to any covariates. That rule
already classifies 98.4% (4688 of 4764) of your sample correctly.
BTW, whether you model "test" or "no test" makes no difference to the fit:
it simply reverses the signs of the coefficients in -logit-
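*************
clear *
* cell counts from the crosstab: std (0/1) by resid (1/2)
input std resid freq
0 1 419
0 2 4269
1 1 46
1 2 30
end
* model "ever tested" (std == 1)
logit std resid [fweight = freq]
* swap the outcome coding so that 1 means "never tested"
recode std (1=0) (0=1)
* same fit, coefficients with the opposite sign
logit std resid [fweight = freq]
*************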
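To put numbers on both points, here is a rough sketch using the same cell
counts (not tailored to your full data set, so treat it as an illustration
only). The first -display- gives the share that the naive "no one tests" rule
gets right, about 0.984. The second computes the log odds ratio by hand from
the 2x2 table, which should match the coefficient on resid in the first
-logit- above. -estat classification- then shows how many cases the fitted
model classifies correctly at the default cutoff of 0.5, for comparison with
the naive rule. I pass the frequency weights explicitly to be on the safe
side.
*************
clear *
input std resid freq
0 1 419
0 2 4269
1 1 46
1 2 30
end
* share classified correctly by the naive rule "no one tests"
display %6.4f 4688/4764
* log odds ratio by hand: odds of std==1 at resid==2 vs. resid==1
display %6.3f ln((30/4269)/(46/419))
logit std resid [fweight = freq]
* classification table at the default cutoff of 0.5
estat classification [fweight = freq]
*************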
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Chao Yawo
Sent: Sunday, March 8, 2009 14:34
To: [email protected]
Subject: st: Logistic Regression_Unequal Ns (outcomes)
Hello, I'm preparing to run a logit model predicting the odds of NOT
testing for an STD. As you can see from the table below, 4688 (about
98%) of respondents have my outcome of interest (i.e., have not tested
for an STD). Because of these unequal groups, all crosstabulations
show higher proportions within the untested category. I suspect that
this could bias my estimates. For example, given the unequal groups,
I think I am restricted to modeling failure to test (the zero
outcome), as modeling ever tested (1) could lead to unstable
estimates. So my question is: what impact will this have on my model,
and what can I do about it? Thanks - Chao
     Ever |
     been |  Type of place of
   tested |      residence
  for STD |        1          2      Total
----------+--------------------------------
        0 |    7.973      92.03        100
          |      419       4269       4688
          |
        1 |     62.5       37.5        100
          |       46         30         76
----------+--------------------------------
    Total |    8.806      91.19        100
          |      465       4299       4764
-------------------------------------------
Key: row percentages
     number of observations
------------------------------------------------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/