Steven is correct. This isn't mentioned in -transint-.
-transint- (on SSC) is a slightly unusual package. It is just a help
file written because I wanted my (geography) students to have something
better than the rather poor coverage of transformations in the books
available to them. In fact, I haven't been able to find many accounts of
transformations that were very concise, covered the really important
ideas, but were also light on the mathematics, which is of course a
contradictory desire. Anyway, it then seemed that it might be useful a
little more widely.
Variance-stabilisation is as Steven says the motive for the angular: it
is difficult to imagine it arising except out of an algebraic argument,
which I think goes back to Fisher. So, next time around, that might
merit an explanation.
Nick
Steven Samuels
Not mentioned in -transint- is the variance-stabilizing property of
the angular transformation: it has asymptotic variance 1/(4n) (with the
angle in radians), which is not a function of p (Anscombe, 1948). If the
observed proportion is r/n, Anscombe showed that the arcsine of
[(r + 3/8)/(n + 3/4)]^.5 is even better at stabilizing the variance for
moderate sample sizes. This second version has variance approximately
1/(4n + 2).
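
The two variance claims above can be checked numerically. What follows is
a minimal sketch in Python (not Stata), computing the exact variance of
each transform under a Binomial(n, p) model. The function names, n = 50,
and the grid of p values are illustrative assumptions, not anything taken
from Anscombe's paper or from -transint-.

```python
from math import asin, comb, sqrt


def binom_pmf(r, n, p):
    """Probability of r successes in n Bernoulli(p) trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def exact_variance(transform, n, p):
    """Exact variance of transform(r, n) when r ~ Binomial(n, p)."""
    values = [transform(r, n) for r in range(n + 1)]
    probs = [binom_pmf(r, n, p) for r in range(n + 1)]
    mean = sum(v * w for v, w in zip(values, probs))
    return sum(w * (v - mean) ** 2 for v, w in zip(values, probs))


def angular(r, n):
    """Plain angular transform: arcsine of sqrt(r/n), in radians."""
    return asin(sqrt(r / n))


def anscombe(r, n):
    """Anscombe's adjusted version: arcsine of sqrt((r + 3/8)/(n + 3/4))."""
    return asin(sqrt((r + 3 / 8) / (n + 3 / 4)))


n = 50  # arbitrary illustrative sample size
for p in (0.3, 0.5, 0.7):
    # Ratios near 1 mean the stated approximation holds at this n and p.
    ratio_plain = exact_variance(angular, n, p) * (4 * n)
    ratio_adjusted = exact_variance(anscombe, n, p) * (4 * n + 2)
    print(p, round(ratio_plain, 4), round(ratio_adjusted, 4))
```

Each printed ratio is the exact variance divided by its claimed
approximation, so values close to 1 across different p are what
variance stabilization predicts. Values of p near 0 or 1 with small n
would be expected to behave less well.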
The arcsine-transformation used to be recommended because transformed
proportions could be analyzed via standard ANOVA programs. I once
found it useful in a variance components analysis. The 'error'
variance was a mixture of a between-sample and within-sample
(binomial) variance. With the arcsine transformation, I could
subtract out the part attributable to binomial variation.
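
A minimal sketch of that kind of subtraction, in Python (not Stata). It
assumes every sample has the same size n, uses the asymptotic binomial
variance 1/(4n) on the angular scale, and floors the result at zero. The
function name and the counts are hypothetical, and this is not
necessarily how the original analysis was carried out.

```python
from math import asin, sqrt
from statistics import variance


def between_sample_variance(counts, n):
    """Split the variance of angular-transformed proportions.

    counts: successes r in each sample; n: common sample size (assumed equal).
    Returns (between-sample estimate, total variance, binomial part).
    """
    transformed = [asin(sqrt(r / n)) for r in counts]
    total = variance(transformed)  # sample variance across samples
    binomial_part = 1 / (4 * n)    # asymptotic within-sample variance, radians
    # Method-of-moments difference, floored at zero because a negative
    # variance estimate is not meaningful.
    return max(total - binomial_part, 0.0), total, binomial_part


# Hypothetical counts out of n = 40, for illustration only.
between, total, binomial_part = between_sample_variance([12, 18, 25, 9, 30, 21], 40)
print(between, total, binomial_part)
```

The subtraction works only because the binomial part is a known constant
on the transformed scale. On the raw proportion scale it would be
p(1 - p)/n and so would differ from sample to sample.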
-Steve
FJ Anscombe 1948. The transformation of Poisson, Binomial, and
negative-binomial data. Biometrika 35:246-254
On Mar 10, 2008, at 6:02 PM, Nick Cox wrote:
> By arcsin I guess you mean the angular transformation (arcsine of
> square root).
> Its use seems to have faded dramatically in recent years.
>
> Tukey showed that this is very close to p^0.41 - (1 - p)^0.41. That
> makes it weaker than the logit. My guess is that it would be an unusual
> dataset in which the angular was much better than leaving data as is
> and also much better than the logit. It could happen, but it seems to
> be rare.
>
> The Tukey reference is given in -transint- from SSC.
>
> Nick
>
> David Airey
>
> Maybe I should not have said it was pilot data! I won't disagree, but
> when cluster number is too small (< 20) to invoke xtgee or xtmelogit
> on the observed yes/no data, or glm on the summary statistics with
> binomial family and logit link, what do you do? It seems to me there
> is a sample size between 10 and 30 clusters of yes/no data that may be
> better suited to some of the older approaches, like arcsin-transformed
> proportions followed by ttest or ANOVA/regress. I guess that was my
> question.