I generally agree with this. An old article in Biometrics (1961) by
Cochran and Hopkins noted that about 90% of the information is retained
if you cut the variable at 6 points (equidistant, I think, but my
recollection may be faulty).
I am particularly interested in this since I'm looking at some data for
a multiple imputation in which we would like the continuous variables to
be approximately normally distributed. Many are not. In looking for
transformations to normality (boxcox), nothing seems to work. So my
solution has been to group them into 5 or 6 categories and use ologit
for imputation. The problem has been a huge excess of zeros.
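
As a rough illustration of how much information grouping retains, here is a minimal sketch in Python (not Stata, and not from Cochran and Hopkins). It makes several assumptions: the variable is standard normal, the groups are equal-frequency rather than equidistant, and "information retained" is taken to mean the share of variance kept when each value is replaced by its group mean. The function name `retained_information` is made up for this example, and the sketch does not model an excess of zeros.

```python
import random
import statistics


def retained_information(n_groups, n=50_000, seed=2009):
    """Share of variance kept when a simulated standard normal sample
    is cut into n_groups equal-frequency groups and each value is
    replaced by its group mean (assumed definition of 'information')."""
    rng = random.Random(seed)
    x = sorted(rng.gauss(0.0, 1.0) for _ in range(n))

    # Equal-frequency groups are consecutive slices of the sorted sample.
    bounds = [round(i * n / n_groups) for i in range(n_groups + 1)]

    grand_mean = statistics.fmean(x)
    between = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        group = x[lo:hi]
        between += len(group) * (statistics.fmean(group) - grand_mean) ** 2

    total = sum((v - grand_mean) ** 2 for v in x)
    # Between-group sum of squares over total sum of squares.
    return between / total


if __name__ == "__main__":
    for k in (2, 3, 6):
        print(k, "groups:", round(retained_information(k), 3))
```

Under these assumptions a two-group (median) split should keep roughly 2/pi, or about 64%, of the variance, and six groups should keep noticeably more. Skewed data or a spike at zero would change the numbers.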
Tony
Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Tuesday, July 07, 2009 9:52 AM
To: [email protected]
Subject: RE: st: RE: Converting a continuous var into a binary var
I am happy that any Stata Journal columns of mine are useful, but that
really wasn't the point I was making. Dichotomising continuous variables
throws away information. Usually that's a bad, or at least a dubious,
idea.
Nick
[email protected]
Pancho Villa
On Tue, Jul 7, 2009 at 9:35 AM, Nick Cox<[email protected]> wrote:
> That aside, the mechanics of how to do this have been thoroughly
> ventilated, but its meaning has not been.
Yes, I'm reading the column on *for*, which seems to have been written
with me in mind. I'm one of those who've postponed learning about
macros, etc.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/