I can't see that using a probit would make any difference at all.
Despite the appeal to latent normality, the response is still
binary, and in any case the focus here is on the predictors.
Nick
Soremekun, Seyi
Because it is a logistic and not a probit regression you are attempting,
it does not actually matter whether your variables are normal. The
main assumptions you might want to test are that the relationship
between the logit of your outcome and your predictor variables is linear
and that all the relevant predictors are included. linktest is a basic
way of testing this: the predicted value (_hat) should be significant
while its square (_hatsq) should not be, if you have specified the right
link and variables. But if the Box-Tidwell test tells you the same thing,
I wouldn't worry about the normality issue.
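
As a rough sketch of the linktest check described above, something
like the following could be run. The auto dataset and the model
(foreign on mpg and weight) are stand-ins chosen only for
illustration, not the poster's data. boxtid is a user-written
command that has to be installed separately, so it is left
commented out here.

```stata
* Illustrative only: auto.dta and this model are assumptions,
* not the data or model discussed in the thread.
sysuse auto, clear

* Fit the logistic regression.
logit foreign mpg weight

* Refit the outcome on the linear prediction (_hat) and its square (_hatsq).
* With a suitable link and specification, _hat is expected to be
* significant and _hatsq is not.
linktest

* Box-Tidwell check for nonlinearity in the continuous predictors.
* boxtid is user-written (try: findit boxtid), so it is commented out.
* boxtid logit foreign mpg weight
```

A significant _hatsq points to a problem with the link or the
specification, but it does not say which variable is responsible.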
Cheers,
Brendan
I am working with a dataset containing 30000 observations. Some of the
explanatory variables are continuous. If I perform the usual tests for
normality, the number of observations is too great for swilk or for
sfrancia, and if I use sktest, the result is "absurdly" large values
and rejection of the hypothesis of a normal distribution. The frequency
histogram, cumulative frequency plot and normal plot all look normal
with no outliers. I presume that with such large numbers even very
small deviations from normal will lead to a significant result. The
Box-Tidwell test indicates that the model relationship is linear for
all these continuous variables. Is it safe to ignore the sktest results?
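
The presumption above, that with 30000 observations sktest can
pick up very small departures from normality, could be explored
with a simulation along these lines. The variable name x, the
seed, and the choice of a t distribution with 20 degrees of
freedom (close to normal, with slightly heavier tails) are all
assumptions made for illustration.

```stata
* Illustrative simulation: x, the seed and the t(20) choice are assumptions.
clear
set seed 12345
set obs 30000

* Draw from a t distribution with 20 df: nearly normal, slightly heavy tails.
generate x = rt(20)

* Skewness and kurtosis test of normality on the full sample.
sktest x

* Graphical check: normal quantile plot of the same variable.
qnorm x
```

Comparing the sktest result with the quantile plot gives a sense
of how far the formal test and the graphical impression can
diverge at this sample size.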