Well, sure, there are a lot of possible transformations, e.g.
arctangent or cube root, but what is the purpose of the
transformation? Are you regressing y on X and worried that the errors
won't be normal? In that case, you may not want to transform y at all,
since it is the errors, not y itself, that need to look normal.
Also, have you considered that the obs with y close to 0 might be
somehow qualitatively different? Note that the sd of return should be
conditioned on size of investment, at least...
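clear all
set seed 1
set obs 1000
* four groups of 250 obs; s = 1,...,4 also serves as a scale factor
g s=ceil(_n/250)
* heavy-tailed example: tangent of a uniform-based angle whose range widens with s
g x=tan(_pi*uniform()*s-.5)
tw kdensity x, name(x)
* arctangent pulls the tails into (-_pi/2, _pi/2)
g tx=atan(x)
tw kdensity tx, name(atan)
* signed cube root, so negative values are handled too
g cx=sign(x)*abs(x)^(1/3)
tw kdensity cx, name(croot)
* second example: cubed normal with sd s, peaked near 0 with long tails
g x2=(invnormal(uniform())*s)^3
tw kdensity x2, name(x2)
g tx2=atan(x2)
tw kdensity tx2, name(atan2)
* the cube root undoes the cubing, giving back a normal within each group
g cx2=sign(x2)*abs(x2)^(1/3)
tw kdensity cx2, name(croot2)
* the spread still differs across groups
tw kdensity cx2, by(s)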
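On the point about errors versus y, here is a rough sketch with simulated data. The variable names y, x, and e, the graph names, and the coefficient 2 are made up for illustration, and the setup is only an assumption about what such a case might look like: y is far from normal because x is, while the regression errors are normal by construction, so the residuals are the thing to inspect.

```stata
clear all
set seed 1
set obs 1000
* hypothetical regressor: cubed normal, peaked near 0 with long tails
g x=invnormal(uniform())^3
* y inherits the shape of x, but its errors are normal by construction
g y=2*x+invnormal(uniform())
regress y x
* the normality assumption concerns the residuals, not y itself
predict e, residuals
tw kdensity y, name(y_dens)
tw kdensity e, name(e_dens)
* skewness/kurtosis test on each, for comparison
sktest y e
```

If the residual density looks reasonable here, transforming y would most likely make things worse rather than better.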
On Tue, Sep 8, 2009 at 4:27 PM, Dalhia wrote:
> Dear Statalist,
>
> I have a dependent variable, Return on Assets, which is highly skewed - a high peak (I have lots of companies having ROA close to zero) and highly dispersed (I have about 200 values that are greater than 2.00 or less than -3.00). None of the usual corrections for skewed data (log, inverse, square) work because they pull in the values, hence increasing the peak.
>
> Is there some kind of data transformation available in Stata that will make this data more normal?
>
> Thanks for your help. I appreciate it.
> dalhia