I didn't mean much more than I said. I've done this a few times:
1. Given a bunch of variables, sometimes it helps to look at loadings
and scores on the first few principal components and identify subsets of
variables. Ideally, but not necessarily, those subsets have similar or
overlapping interpretations.
2. Then re-order the variables so that the same groupings are more
evident in shuffled correlation and scatter plot matrices.
3. Now discard the PC results, but proceed with some regression-type
modelling, better informed about which variables do or don't deserve
protection.
This is at best an informal method calling on scientific judgment. It is
not automatic or programmable.
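The judgment cannot be coded, but the mechanical part of steps 1 and 2
can be sketched. What follows is only an illustration under stated
assumptions: it is in Python rather than Stata, the data and the variable
names (a1, b1, ...) are invented, and sorting on first-PC loadings alone
is a crude stand-in for looking at the first few components.

```python
import math
import random


def correlation_matrix(columns):
    """Pearson correlations between equal-length lists of numbers."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        dev = [x - mean for x in col]
        norm = math.sqrt(sum(d * d for d in dev))
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(u, v)) for v in scaled] for u in scaled]


def first_pc_loadings(corr, iterations=500):
    """Leading eigenvector of the correlation matrix, by power iteration."""
    k = len(corr)
    vec = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        new = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    return vec


# Hypothetical data: three variables driven by one latent factor (a1-a3)
# and two driven by another (b1, b2), deliberately listed interleaved.
random.seed(1)
n = 300
f = [random.gauss(0, 1) for _ in range(n)]
g = [random.gauss(0, 1) for _ in range(n)]


def noisy(factor):
    """Copy of a latent factor with independent noise added."""
    return [x + random.gauss(0, 0.5) for x in factor]


names = ["a1", "b1", "a2", "b2", "a3"]
columns = [noisy(f), noisy(g), noisy(f), noisy(g), noisy(f)]

# Step 1: loadings on the first principal component.
corr = correlation_matrix(columns)
loadings = first_pc_loadings(corr)

# Step 2: re-order the variables by loading, then show the correlation
# matrix in that order so that groups sit next to each other.
order = sorted(range(len(names)), key=lambda i: loadings[i], reverse=True)
ordered_names = [names[i] for i in order]

print(" " * 4 + "".join(f"{name:>7}" for name in ordered_names))
for i in order:
    row = "".join(f"{corr[i][j]:7.2f}" for j in order)
    print(f"{names[i]:<4}{row}")
```

After that the PC results have done their job and can be thrown away.
Deciding what the groups mean, and which variables to protect in the
model, remains a matter for the analyst.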
Others may have had loosely similar experiences from quite different
fields. (There is probably some psychological name for this.)
For example, once you have climbed a hill or mountain one way you
realise that there are better ways to get to the top. (Usually, the
guidebooks did tell you that.)
Once you have found one algebraic derivation, you can then see your way
to an even easier one.
Nick
Michael I. Lichter 23 December 2009 18:47
Thanks Jay and Nick for your helpful comments and references.
If I can continue beating this poor, dead horse just a little longer ...
Back in August, Nick described his method of "disposable principal
component analysis" (see list of steps below), which he concluded with
"discard PC results and proceed with modeling." Clearly, he wouldn't
have done the PCA if it didn't guide his modeling in some way. Does he
use it to determine which variables to retain in his model and which to
discard? Just curious.
Nick Cox
> I've found occasional use of PCA in the following way.
>
> 1. Plot the data.
> 2. Calculate correlations, etc.
> 3. Look at the results: get some ideas.
> 4. Calculate PCs.
> 5. Use PCs to help structure understanding of #1 and #2 in terms of
> variables that go together, variables that are singletons, etc.
> Sometimes, results of #1 and #2 now make more sense in their own
> terms. (For example, a reordering of a scatter plot matrix or
> correlation matrix makes it easier to see what is going on.) Often it
> is useful here to look at a table of correlations between original
> variables and new PCs. -cpcorr- from SSC helps with that.
> 6. Now discard PC results and proceed with modelling.
>