Re: st: RE: Yeo-Johnson Power Transformation
Kit, Nick, and others:
Thanks for your suggestions as well as patience. I am a complete
novice with Stata (and with this arena of statistics -- my earlier
work has mainly been based on SEM and case studies), and please
excuse my ignorance.
Based on how the percentages are computed [100*(x-y)/(w+x+y)], Yeo-
Johnson transformation does seem appropriate.
I follow most of what Kit suggested. Thanks again. However, based on
the Weisberg paper on the Yeo-Johnson transformation
(www.stat.umn.edu/arc/yjpower.pdf), I have a different interpretation
on four aspects.
1. I believe I should be using 2-`theta' instead of 2*`theta' at both
places toward the end of the code you suggested.
2. I believe Equation 2 on page 1 of the above PDF file is the one
being modeled. This includes two possibilities for y<0, one when
lambda <> 2 (I believe this is captured in the line two above else in
your suggested code), and the other when lambda = 2 (which I don't
think is captured).
3. There should be a negative sign prior to
( ((abs($ML_y1)+1)^(2-`theta')-1)/(2-`theta') )
4. In the line after else, I believe there should be a +1 within
parentheses.
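For reference, the four branches of the Yeo-Johnson family can be written out as below. This is a restatement of the standard definition, not a quotation of the linked PDF; lambda here plays the role of `theta' in the code, and y the role of $ML_y1. Note that for y < 0, abs(y)+1 equals -y+1, which is where the 2-lambda exponent, the leading negative sign, and the +1 inside the logarithm in points 1, 3, and 4 come from.

```latex
\psi(\lambda, y) =
\begin{cases}
\dfrac{(y+1)^{\lambda} - 1}{\lambda}, & y \ge 0,\ \lambda \ne 0, \\[1.5ex]
\log(y+1), & y \ge 0,\ \lambda = 0, \\[1.5ex]
-\dfrac{(-y+1)^{2-\lambda} - 1}{2-\lambda}, & y < 0,\ \lambda \ne 2, \\[1.5ex]
-\log(-y+1), & y < 0,\ \lambda = 2.
\end{cases}
```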
Assuming I am right on the above points, should the last block of
code be as follows?
qui gen double `yt' = .
if `diffL' > 1e-10 {
    qui replace `yt' = (($ML_y1+1)^`theta'-1)/`theta' if $ML_y1 >= 0
    qui replace `yt' = -((abs($ML_y1)+1)^(2-`theta')-1)/(2-`theta') if ($ML_y1 < 0 & `diffL' != 2)
    qui replace `yt' = -ln(abs($ML_y1)+1) if ($ML_y1 < 0 & `diffL' == 2)
}
else {
    qui replace `yt' = ln($ML_y1+1)
}
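As a cross-check on the branch logic, here is a minimal sketch that applies the transformation to an ordinary variable for a fixed lambda, outside any -ml- evaluator. The names y, yt, lambda, and tol are hypothetical and do not come from Kit's code, and the sketch does not estimate lambda. Each branch tests the sign of y and compares lambda itself with 0 or 2, using a small tolerance rather than exact equality.

```stata
* Sketch only: y, yt, lambda, and tol are hypothetical names.
* Applies the Yeo-Johnson transformation for a fixed lambda; no estimation.
local lambda = 0.5
local tol = 1e-10

generate double yt = .

* y >= 0, lambda != 0
replace yt = ((y + 1)^`lambda' - 1)/`lambda' if y >= 0 & abs(`lambda') > `tol'

* y >= 0, lambda == 0
replace yt = ln(y + 1) if y >= 0 & abs(`lambda') <= `tol'

* y < 0, lambda != 2
replace yt = -((1 - y)^(2 - `lambda') - 1)/(2 - `lambda') if y < 0 & abs(`lambda' - 2) > `tol'

* y < 0, lambda == 2
replace yt = -ln(1 - y) if y < 0 & abs(`lambda' - 2) <= `tol'
```

Because every branch carries its own condition on both y and lambda, no outer if/else block is needed, and negative values of y never reach ln(y + 1).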
Please advise.
Thanks, and best wishes,
Rajiv
On Jan 21, 2007, at 12:20 PM, Nick Cox wrote:
Kit Baum has already replied on the assumption that
you want to estimate the parameter in this transformation
by maximum likelihood, in which case his advice is, in
effect, to change the parts of -boxcox- that do not apply
to this transformation until they do apply.
On a quite different point: I think the assumption in
your post is dubious. Given the percent flavour, you
may need a generalisation of the logit-and-folded-power
family, not a generalisation of the power-and-logarithm family.
But that depends on your data generation process.
Nick
Rajiv Sabherwal
How can I perform a Yeo-Johnson power transformation in Stata? It is
similar to the Box-Cox transformation, but it can be used with negative
values as well, unlike the Box-Cox transformation, which can only be
used for positive values. Please see
www.stat.umn.edu/arc/yjpower.pdf and
rweb.stat.umn.edu/R/library/alr3/html/powtran.html.
My dependent variable is a percentage that varies from -100 to +100,
and hence Box-Cox transformation would be inappropriate, but Yeo-
Johnson Power transformation would be perfect.
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/