Re: st: transformations for highly skewed dependent variable

From	Austin Nichols <[email protected]>
To	[email protected]
Subject	Re: st: transformations for highly skewed dependent variable
Date	Wed, 9 Sep 2009 10:50:36 -0400

Well, sure, there are a lot of possible transformations, e.g.
arctangent or cube root, but what is the purpose of the
transformation?  Are you regressing y on X and worried that the errors
won't be normal?  In that case, you may not want to transform y.
Also, have you considered that the observations with y close to 0
might be qualitatively different?  Note that the sd of return should
be conditioned on size of investment, at least...

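* Simulate 1000 observations in four groups of 250 (s = 1,...,4)
clear all
set seed 1
set obs 1000
g s=ceil(_n/250)
* x: tangent of a uniform draw, so very heavy tails
g x=tan(_pi*uniform()*s-.5)
tw kdensity x, name(x)
* arctangent transformation of x
g tx=atan(x)
tw kdensity tx, name(atan)
* signed cube root of x (defined for negative values too)
g cx=sign(x)*abs(x)^(1/3)
tw kdensity cx, name(croot)
* x2: cube of a normal draw whose sd is s
g x2=(invnormal(uniform())*s)^3
tw kdensity x2, name(x2)
g tx2=atan(x2)
tw kdensity tx2, name(atan2)
g cx2=sign(x2)*abs(x2)^(1/3)
tw kdensity cx2, name(croot2)
* density of the cube root of x2 within each group s
tw kdensity cx2, by(s)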
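
The density plots above can be backed up with numbers. What follows is only a sketch, not part of the original example: it rebuilds the x2 variable with the same commands, then uses -summarize, detail- to compare skewness and kurtosis before and after the signed cube root, and -tabstat- to show the sd within each group s. The local macro names k_raw and k_croot are made up for illustration.

```stata
* Sketch only: rebuild the x2 example, then check it numerically
clear all
set seed 1
set obs 1000
g s=ceil(_n/250)
g x2=(invnormal(uniform())*s)^3
g cx2=sign(x2)*abs(x2)^(1/3)

* skewness and kurtosis of the raw variable
quietly summarize x2, detail
local k_raw = r(kurtosis)
display "x2:  skewness = " %9.3f r(skewness) "  kurtosis = " %9.3f r(kurtosis)

* skewness and kurtosis after the signed cube root
quietly summarize cx2, detail
local k_croot = r(kurtosis)
display "cx2: skewness = " %9.3f r(skewness) "  kurtosis = " %9.3f r(kurtosis)

* spread of the transformed variable within each group s
tabstat cx2, by(s) statistics(n mean sd)
```

If the sd of cx2 differs a lot across the groups, that suggests the spread depends on scale, and pooling all the groups into one density will still look peaked even after transforming.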

On Tue, Sep 8, 2009 at 4:27 PM, Dalhia<[email protected]> wrote:
> Dear Statalist,
>
> I have a dependent variable, Return on Assets, which is highly skewed - a high peak (I have lots of companies having ROA close to zero) and highly dispersed (I have about 200 values that are greater than 2.00 or less than -3.00). None of the usual corrections for skewed data (log, inverse, square) work because they pull in the values, hence increasing the peak.
>
> Is there some kind of data transformation available in Stata that will make this data more normal?
>
> Thanks for your help. I appreciate it.
> dalhia

