Age and age-squared need not be very highly correlated. If you calculate
agex = age - mean_age, and then calculate agexsq = agex*agex, then agex and
agexsq will be weakly correlated, if at all (the correlation is exactly zero
when the ages are distributed symmetrically about their mean). Maybe that
was already done in the dataset you are using.
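
To illustrate the centering idea, here is a minimal sketch in Python rather than Stata. The ages are made up (integers 20 to 60, evenly spread), and the helper pearson() is a hypothetical name written out only so the example needs nothing beyond the standard library. With real, skewed age data the centered correlation will usually not be exactly zero, only much smaller than the raw one.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ages (an assumption for illustration): 20..60, symmetric about the mean.
age = list(range(20, 61))
agesq = [a * a for a in age]

# Center age on its mean, then square the centered variable.
mean_age = sum(age) / len(age)
agex = [a - mean_age for a in age]
agexsq = [a * a for a in agex]

r_raw = pearson(age, agesq)          # raw age vs. raw age squared
r_centered = pearson(agex, agexsq)   # centered age vs. its square

print(f"corr(age, agesq)   = {r_raw:.3f}")
print(f"corr(agex, agexsq) = {r_centered:.3f}")
```

On these made-up ages the raw pair is almost perfectly correlated, while the centered pair is essentially uncorrelated.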
HTH
Sam
On Thu, 17 May 2007, Michael Blasnik wrote:
> ...
>
> I view this issue as more about interpretability of the coefficient(s). You
> don't really need to worry about highly correlated terms, like age and age
> squared, if they are control variables in your model, but you would if either is
> the primary effect of interest.
>
> Models that include age and age squared are typically interested in controlling
> for age and want to allow for some nonlinearity. The analyst is not usually
> trying to interpret either of the coefficients, but is actually interested in
> other coefficients in the model. On the other hand, if you have highly
> correlated terms about the primary effects of interest, then collinearity can be
> a significant problem -- you can't really measure the effect of either of the
> correlated terms very well and the standard errors should show this. How you
> should proceed depends on the subject matter -- you may want to think about
> somehow combining the two terms or you *could* just drop one. If you drop a
> term, you need to think about how to interpret the remaining term since the
> coefficient now will include the effects of the dropped term as well.
>
> Michael Blasnik
>
> ----- Original Message -----
> From: "Alexandra Wilson" <[email protected]>
> To: <[email protected]>
> Sent: Thursday, May 17, 2007 1:05 AM
> Subject: st: Correlation b/w independent variables in xtlogit
>
>
> > Dear Statalisters.
> > I have a simple question: if the answer is well known to everyone but me,
> > apologies, but I am living in Tanzania where there is a dearth of
> > statisticians and stats books, and I have trawled the internet and the
> > statalist archives to no avail.
> > I am running a panel regression with a dichotomous variable using xtlogit.
> > I was getting strange (unexpected) results, and realized 2 of my independent
> > variables were highly correlated (correlation coefficient 0.92). So I
> > omitted one and the results were much more in line with other tests. But in
> > my list of independent variables I still have a variable for age (of panel
> > subject) and a variable for the square of age. These 2 variables are, of
> > course, also highly correlated. So why is it correct to leave both these
> > highly correlated variables in the regression, and yet to exclude the other
> > highly correlated variable?
> > Any enlightenment much appreciated.
> > Alexandra Wilson
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>