I (again) have a question about rescaling. I have a panel data set in which
two variables (the dependent variable & one independent variable) are
expressed in thousands of dollars and the other independent variables are all
index numbers or percentages. Because I am taking logs and several values
are negative, I rescaled the whole dataset by adding a constant to all
variables so that the most negative value becomes 1. The problem was that I
had to add a very large constant (over 23367 thousand) to all variables, so
that after taking logs all variables were nearly the same, as the index
numbers were mostly below 1. As a result, I could not test for the
endogeneity of one of the variables because of collinearity problems.
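
To illustrate what I mean by "nearly the same", here is a minimal Python
sketch with made-up numbers (not my actual data). The shift of 23368 and the
index and percentage values are assumptions chosen only to mimic the
situation described above.

```python
import math

# Hypothetical constant, roughly the size of the one I had to add.
SHIFT = 23368.0

# Made-up index and percentage values, mostly below 1 as in my data.
index_values = [0.20, 0.50, 0.80, 0.95]
pct_values = [0.05, 0.10, 0.60, 0.90]

def log_shifted(values, shift):
    """Return log(v + shift) for every v in values."""
    return [math.log(v + shift) for v in values]

def spread(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)

log_index = log_shifted(index_values, SHIFT)
log_pct = log_shifted(pct_values, SHIFT)

# Both logged series barely move away from log(SHIFT), so each one is
# almost a constant and almost indistinguishable from the other.
print("log(SHIFT)        :", math.log(SHIFT))
print("spread, log index :", spread(log_index))
print("spread, log pct   :", spread(log_pct))
```

With a shift this large, each logged variable has almost no variation left,
which seems to be what produced the collinearity.
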
I then went back to the original dataset and converted the variables
expressed in thousands into millions. As a result, the most negative value
now occurred in a different variable, and I only had to add 45 to each
variable to make all values positive (so that I could log them). When I
then ran the regression, I got different results. In particular, the
suspected endogenous variable became insignificant, which more or less made
the endogeneity test redundant (I ran it anyway, but the F-test for the
overall regression became insignificant, which is not surprising, I guess).
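
My guess as to why the results differ: for strictly positive data, a change
of units only shifts the log by a constant, but once a constant is added
before logging, the two versions are no longer a constant apart. A minimal
Python sketch with made-up numbers (not my actual data; the values and the
shifts 23367 and 45 are used purely for illustration):

```python
import math

# Made-up values in thousands of dollars, including a large negative one.
x_thousands = [-23366.0, 100.0, 5000.0, 40000.0]
# The same values converted to millions.
x_millions = [v / 1000.0 for v in x_thousands]

# First approach: add 23367 to the data in thousands, then log.
log_a = [math.log(v + 23367.0) for v in x_thousands]
# Second approach: add 45 to the data in millions, then log.
log_b = [math.log(v + 45.0) for v in x_millions]

# If the two versions differed only by a constant, these gaps would all
# be equal. They are not, so the regressor itself has changed shape.
gaps = [a - b for a, b in zip(log_a, log_b)]
print("gaps with added constants:", gaps)

# Comparison case: positive values only, no constant added.
positive = [100.0, 5000.0, 40000.0]
pure_gaps = [math.log(v) - math.log(v / 1000.0) for v in positive]
print("gaps for pure unit change:", pure_gaps)  # each equals log(1000)
```

So the two rescalings are not equivalent transformations of the same
variable, which may be why the coefficients and significance levels changed.
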
My question is whether my second approach to the rescaling is acceptable or
whether it is not a valid thing to do.
Sorry for the rather long mail, but I don't know how to describe it more
briefly.