st: Significance of categorical variables in Logistic Regression
From: Marcello Pagano <[email protected]>
To: <[email protected]>
Subject: st: Significance of categorical variables in Logistic Regression
Date: Thu, 21 Mar 2013 10:29:12 -0400
For Michael:
I am running logistic regressions with a number of categorical
independent variables. I am building the model by starting with variables
that have a pval < .15 when used in a single-variable regression.
Then I put them all together, and weed out variables that are not
significant in the multi-variable regression. When I get to a "core"
model of significant predictors, I add back excluded variables one at a
time to see if they are significant in the context of the smaller set of
predictors.
If a categorical variable has at least one category that is significant,
I keep the whole variable.
I have excluded a categorical variable, but noticed that if I base the
variable on a different category than the default category, I suddenly
see significant categories in the regression.
For example:
logistic yvar xvar1 xvar2 i.xvar3
results in every category of xvar3 having a high pval, but:
logistic yvar xvar1 xvar2 ib2.xvar3
results in several of xvar3's categories having a pval near 0.
From looking at marginsplot output I understand how this can happen, but I
would like to know if there's a way of detecting this during model
building without looking at marginsplot output.
Many thanks,
Michael Cook
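
A minimal sketch of one possible check, not a definitive answer. It assumes the placeholder names yvar, xvar1, xvar2 and xvar3 from the commands above. Each per-category p-value compares one category with the base category, so those p-values change when the base changes. A joint Wald test of all of xvar3's indicators does not depend on the base, and pwcompare lists every pairwise contrast in one table.

```stata
* Sketch only: yvar, xvar1, xvar2 and xvar3 are the placeholder names used above
logistic yvar xvar1 xvar2 i.xvar3

* Joint Wald test that all xvar3 category coefficients are zero;
* the result is the same whichever category is chosen as the base
testparm i.xvar3

* All pairwise comparisons between xvar3 categories, with test statistics and p-values
pwcompare xvar3, effects
```

With many categories the pairwise table involves many tests, so a multiple-comparison adjustment may be worth considering before reading individual p-values.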