How percentile ranks (no doubt also known under other names) are
calculated is easy to control, as detailed in
FAQ Calculating percentile ranks or plotting positions
7/02 How can I calculate percentile ranks?
How can I calculate plotting positions?
http://www.stata.com/support/faqs/stat/pcrank.html
Once you have percentile ranks, you need just one more function, the
inverse normal distribution function, to get normal scores.
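
A minimal sketch of that two-step route, assuming the auto data shipped
with Stata and the plotting position (i - 0.5)/n; the variable names
rank_mpg, n_mpg, pcrank and z_mpg are made up for illustration, and
other plotting positions could be substituted on the generate line:

```stata
*--------- begin example ---------
sysuse auto, clear
* rank() gives tied values their mean rank by default
egen rank_mpg = rank(mpg)
egen n_mpg = count(mpg)
* percentile rank (plotting position), here (i - 0.5)/n
generate pcrank = (rank_mpg - 0.5) / n_mpg
* inverse normal turns percentile ranks into normal scores
* (invnormal() is called invnorm() in older versions of Stata)
generate z_mpg = invnormal(pcrank)
summarize mpg pcrank z_mpg
*---------- end example ----------
```

Because the ranks are computed explicitly, the treatment of ties and
the choice of plotting position stay under your control.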
Maarten Buis
--- Mark Lunt <[email protected]> wrote:
> ICE assumes that continuous variables are normally distributed: if
> that is not the case, impossible values can appear. In particular, if
> you have lots of companies with a few employees and a few companies
> with lots of employees, ICE will impute negative numbers of
> employees. One possible solution is to use the "match" option of ICE.
Good point. An alternative would be to take the logarithm of the number
of employees.
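
A rough sketch of the log route, using a small made-up dataset; the
variable names employees, ln_emp and employees_bt are hypothetical, and
the imputation step itself is left as a placeholder comment rather
than a specific ICE call:

```stata
*--------- begin example ---------
clear
* toy data, invented for illustration only
input id employees
1 3
2 5
3 .
4 1200
5 8
end
* ln() of a missing value stays missing, so the gap is preserved
generate ln_emp = ln(employees)
* ... impute ln_emp here with the imputation command of your choice ...
* back-transform: exp() of any real number is positive
generate employees_bt = exp(ln_emp)
list
*---------- end example ----------
```

Since exp() is always positive, imputed values transformed back this
way cannot be negative. Note that ln(0) is missing in Stata, so firms
with zero employees would need separate handling.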
> Alternatively, I have written some ado-files which convert variables
> to normal-scores and back: you can convert to normal scores (which
> are normally distributed), perform the imputation on these
> variables, then convert back to your original distribution.
I have had a quick look at this command, and it seems that you take
the rank of each observation and transform it as if it came from a
normal distribution. I think that is too strong a transformation: you
throw away all information about the distances between values and use
only the rank. This is most clearly visible when two or more
observations have the same value. The way you programmed this
procedure, they are given different ranks, and thus different values on
your new variable:
*--------- begin example ---------
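sysuse auto, clear
* nscore is Mark Lunt's user-written command; it must be installed first
nscore rep78, gen(gauss)
* tied values of rep78 end up with different normal scores
twoway scatter gauss rep78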
*---------- end example ----------
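
The effect of ties can also be sketched without nscore. The example
below is not nscore's own code; it only contrasts two options of
official Stata's egen rank() function, assuming the plotting position
(i - 0.5)/n. The variable names n, r_unique, r_mean, z_unique and
z_mean are made up for illustration:

```stata
*--------- begin example ---------
sysuse auto, clear
egen n = count(rep78)
* unique ranks: ties are broken arbitrarily, so equal values
* of rep78 receive different ranks
egen r_unique = rank(rep78), unique
* default ranks: tied values share their mean rank
egen r_mean = rank(rep78)
generate z_unique = invnormal((r_unique - 0.5) / n)
generate z_mean = invnormal((r_mean - 0.5) / n)
* compare the range of scores within each value of rep78
tabstat z_unique z_mean, by(rep78) statistics(min max)
*---------- end example ----------
```

With unique ranks, each value of rep78 is spread over a range of
scores; with mean ranks, the minimum and maximum within each value
coincide, so tied observations get the same score.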