Michael I. Lichter
> Could Nick or someone else explain (a) what is meant by
> "continua," and
> (b) how you justify & properly handle
> categorical/dichotomous variables
> in PCA? Thanks.
>
> -ml
>
> Quoting Nick Cox <[email protected]>:
>
> > 1. You are interested, naturally enough, in correlations
> > among predictors. Whether observations are clustered
> > together is a different issue. It is easy to think
> > of continua with high correlations, continua with
> > low correlations, cluster structure with high
> > correlations and cluster structure with low
> > correlations.
> >
> > 2. If cluster structure exists, it will be evident
> > in plots of the first few principal components.
> > The fact that some of your variables
> > are categorical or binary would complicate a PCA without
> > making it impossible.
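"Continua" is the plural of "continuum". I mean
data in which observations are spread out
more or less smoothly, as opposed to data in
which they fall into distinct clusters.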
PCA to me is a transformation procedure.
What is "justified" or "proper" in PCA
may differ for you if you have different
expectations of what it can achieve.
Some analysts seem determined to try
to turn it into a modelling or inferential
procedure.
I don't see why e.g. a 0-1 variable
can't be an input variable to PCA. This to
me is no more and no less problematic
than putting such a variable as a predictor
into a regression model. I would imagine
that a purely nominal variable is best
entered as a set of indicator (0-1) variables.
However, I do think that PCA works best
when all variables are measured.
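
As a rough illustration of the point about indicators, here is a
minimal sketch in Python with NumPy (not Stata). Everything in it is
made up for illustration: the simulated data, the variable names
(x1, x2, binary, region) and the choice to drop one level of the
nominal variable so that the indicators are not exactly collinear.
It runs PCA on the correlation matrix, treating it purely as a
transformation.

```python
import numpy as np

# Hypothetical data: two measured variables, one 0-1 variable and one
# nominal variable with three levels (all simulated, for illustration only).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=n)
binary = (rng.random(n) < 0.4).astype(float)
region = rng.integers(0, 3, size=n)

# Nominal variable entered as a set of indicators. One level is dropped
# here (an assumption of this sketch) to avoid an exact linear dependency.
indicators = (region[:, None] == np.arange(3)).astype(float)[:, 1:]

X = np.column_stack([x1, x2, binary, indicators])

# Standardize, then take the eigen-decomposition of the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
corr = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)

# eigh returns eigenvalues in ascending order; put the largest first.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Z @ eigvecs                  # principal component scores
explained = eigvals / eigvals.sum()   # share of total variance per component

print(np.round(explained, 3))
```

Plotting the first two columns of scores against each other is then
one way to look for the cluster structure mentioned in point 2 above.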
Nick