I need some advice on the use of logistic regression for a large data set. I
have around 20 variables (most of them binary, including the dependent
variable) and one lakh (100,000) observations. What is the right procedure?
Do I use the entire data set or a sample? What about outliers? Will there be
overfitting? How best do I evaluate the fit of the model and the predictions
from it? These are the things on which I need some advice.
Thanking you in advance.
Regards,
Suresh