I agree with Joseph's general stance on this question.
Also, consider the alternative of a non-identity link
and a -glm-, which often offers the advantages of
a transformation without the disadvantages.
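
As a rough sketch of what that might look like (the variable names y,
x1 and x2 are placeholders, not taken from the original question), a
Gaussian family with a log link models the log of the mean response
rather than the mean of the logged response. Note the assumption that
the fitted means are positive: with a sizeable share of negative values
in the response, a log link may well fail to converge or be
inappropriate, and a different link would have to be considered.

```stata
* Placeholder variable names: y (response), x1 and x2 (predictors).
* Log link: log(E[y]) is linear in x1 and x2; y itself is not transformed.
glm y x1 x2, family(gaussian) link(log)

* For comparison, the identity link gives the ordinary linear model.
glm y x1 x2, family(gaussian) link(identity)
```

This sketch ignores the panel structure of the data, which would need to
be handled separately.
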
However, more generally, I will add a plug for
my package -transint- from SSC, which is just
a help file with various comments on transformations.
You can install it using -ssc-.
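
For reference, installing and then reading the help file would look
something like this:

```stata
* Download the package from SSC, then open its help file
ssc install transint
help transint
```
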
Nick
Joseph Coveney
> Woong Chung wrote:
>
> I need help with the following. I have a panel dataset for estimating a
> simple linear equation. The problem is that all my variables are skewed
> and have large variation (large standard deviations). In particular, the
> dependent variable and one of the independent variables have negative
> skewness, while all the other independent variables have positive
> skewness. My first intention was to log-transform all variables, but that
> seems not to be a good idea, since all variables have negative values
> (around 20%). Besides, all variables except one of the independent
> variables are ratios, so that idea would make things worse.
>
> I would be so glad if anyone has suggestions to solve this problem.
>
> --------------------------------------------------------------------------------
>
> It's not clear that you actually have a problem.
>
> It shouldn't be a problem that your independent variables are skewed or
> have a wide distribution. There isn't any assumption about their
> distribution, and it is considered better for them to cover more ground.
> They're only assumed not to comprise a linear combination within machine
> precision. (There are other assumptions about them, in particular, about
> their relation to the random effects, but that's another matter.)
>
> Fit the model as-is. Examine the residuals and empirical Bayes
> predictions. If these do not have a reasonably normal-appearing
> distribution, then transform the dependent variable with a view to
> shaping up their distributions, and not the dependent variable's
> distribution per se.
>
> Also, from your description, it seems that your dependent variable is a
> ratio. Consider sticking its denominator in the model as a predictor and
> using its numerator as the dependent variable.
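
A minimal sketch of the workflow described in the quoted advice, under
several assumptions: the variable names (id, year, y, x1, x2, num, den)
are hypothetical, a random-effects model fitted with -xtreg, re- stands
in for whatever panel model is actually intended, and the -u- prediction
after -xtreg, re- is taken here as playing the role of the empirical
Bayes predictions. On Stata versions without -xtset-, -tsset id year-
would be the equivalent declaration.

```stata
* Hypothetical names: id (panel identifier), year (time variable),
* y (response), x1 and x2 (predictors).
xtset id year

* Fit the model as-is, with no transformation.
xtreg y x1 x2, re

* Predicted panel-level effects (u_i) and idiosyncratic residuals (e_it).
predict double u_hat, u
predict double e_hat, e

* Normal quantile plot of the residuals.
qnorm e_hat

* u_hat is constant within panel, so plot one observation per panel.
egen byte onepp = tag(id)
qnorm u_hat if onepp

* If y is a ratio num/den (hypothetical names), the alternative is to
* model the numerator with the denominator as a predictor.
xtreg num den x1 x2, re
```

Only if the quantile plots depart markedly from a straight line would a
transformation of the dependent variable come into consideration.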