Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Austin Nichols <austinnichols@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: van der Waerden transformation |
Date | Fri, 13 Apr 2012 12:18:11 -0400 |
Maarten--
A complete answer requires a complete exposition of IRT, but the quick answer is yes, more or less. If you think underlying "achievement" is normally distributed, and you used a reasonably well-designed test, you should convert the scores back into a normal distribution, as is done via more sophisticated methods on virtually every standardized test; the measure of latent "achievement" is typically called theta.

Given that tests do not uniformly cover the difficulty space, there will be skew and other nonnormality in scores, but a perfect test (where the definition of perfect depends on what the test is to be used for) might show a uniform distribution in percent correct from zero to 100, which one could then turn back into a normal distribution easily enough. The distances then might give a reasonable measure of how much harder it is to go from 98 to 99 than from 49 to 50 on this hypothetical perfect test.

I have argued in print elsewhere that "achievement" is not normally distributed, but let's leave that aside for now... as no more objectionable than assumptions in many -xt- commands on normality of e.g. random effects/coefs.

On Fri, Apr 13, 2012 at 3:33 AM, Maarten Buis <maartenlbuis@gmail.com> wrote:
> On Thu, Apr 12, 2012 at 7:01 PM, Austin Nichols <austinnichols@gmail.com> wrote:
>> Maarten--
>> how about test scores?
>
> Why would you want to make up distances between ranks in test scores?
> I can see why many of these do not have a natural unit, so some form
> of standardization is called for, but that does not mean that they
> should be forced into a normal/Gaussian distribution. If you find
> considerable skewness in your raw scores, would the forced-to-be-normal
> variable really be a better representation of what you found?
>
> -- Maarten
>
>>
>> On Thu, Apr 12, 2012 at 12:42 PM, Maarten Buis <maartenlbuis@gmail.com> wrote:
>>> On Thu, Apr 12, 2012 at 6:11 PM, Scott Merryman wrote:
>>>> Isn't the van der Waerden transformation just inverse_normal(rank/(N+1))?
>>>
>>> That sounds like an awful idea. That way you are just "inventing"
>>> distances between ranks that have nothing to do with what you
>>> observed. If you (generally speaking, not Scott specifically) really
>>> want to get rid of the skewness that badly, then just use the
>>> percentile rank and be honest about the fact that you have thrown away
>>> the information on the distances between the ranks rather than making
>>> those distances up. In general, I would _not_ try to get rid of the
>>> skewness, but rather use it. If it is a dependent variable, that might
>>> suggest a -glm- with maybe a log link function. If it is an
>>> independent variable, it might suggest a non-linear effect, possibly to
>>> be modeled with splines (see: -mkspline-).
>>>
>>> I would be interested to hear if someone knows of an application where
>>> this transformation would make some sense. I cannot imagine one, but
>>> that may just be due to my lack of imagination.
>>>
>>> -- Maarten
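For readers who want to see the formula from the quoted thread in action, here is a minimal sketch in Python (not Stata) of inverse_normal(rank/(N+1)). The function names `average_ranks` and `van_der_waerden`, the choice to give tied values their average rank, and the sample scores are all assumptions made for illustration; nothing here is taken from a Stata command or from anyone's posted code.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions.

    Averaging ranks over ties is an assumption of this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def van_der_waerden(values):
    """Map each value to the standard normal quantile of rank / (N + 1)."""
    n = len(values)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(r / (n + 1)) for r in average_ranks(values)]


# Made-up, right-skewed raw scores (hypothetical data).
scores = [12, 15, 15, 40, 98]
vdw = van_der_waerden(scores)

# Gaps between adjacent percentiles once mapped onto the normal scale.
z = NormalDist().inv_cdf
top_gap = z(0.99) - z(0.98)
mid_gap = z(0.50) - z(0.49)

for raw, score in zip(scores, vdw):
    print(f"{raw:>3} -> {score:+.3f}")
print(f"98th to 99th percentile: {top_gap:.3f}; 49th to 50th: {mid_gap:.3f}")
```

The transformed scores depend only on the ranks, which is Maarten's objection: the gap between 40 and 98 in the raw data leaves no trace. The last line illustrates Austin's point: under a normal latent scale, one percentile step near the top covers a much larger distance than one step near the median.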