Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nick Cox <n.j.cox@durham.ac.uk> |
To | "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu> |
Subject | RE: st: principal component analysis-creating linear combinations |
Date | Thu, 10 Mar 2011 16:12:19 +0000 |
What was wrong? In your own calculation, you were using unstandardised variables, but you needed to standardise them. However, as said, none of it is necessary, as -predict- does all the work for you.

Nick
n.j.cox@durham.ac.uk

Nick Cox

This is largely superseded by answers already sent. But yes, something was wrong with your home-made attempt to create PC1, as the correlation with the PC1 from -predict- is not even 1.

James Wu

Thank you, but "-predict-" generates only the first component scores.

(1) By the way, would it be wrong to construct the linear combinations as I described earlier? For example, Y1=0.3894*x1+0.4517*x2+0.5733*x3+0.5619*x4. Here is the comparison:

. predict pc1
(output omitted)

. sum pc1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         pc1 |       659    2.97e-09    1.558505   -3.00555   6.801751

. gen Y1=0.3894*x1+0.4517*x2+0.5733*x3+0.5619*x4

. pwcorr pc1 Y1

             |      pc1       Y1
-------------+------------------
         pc1 |   1.0000
          Y1 |   0.9724   1.0000

(2) As one can see from the original PCA, the second component has positive signs on x1 and x2. So I want to create the second component scores. How can I obtain them (if I do not create them by Y2=0.8726*x1+0.0966*x2-0.3179*x3-0.3580*x4)?

On Thu, Mar 10, 2011 at 9:46 AM, Maarten buis <maartenbuis@yahoo.co.uk> wrote:
> --- On Thu, 10/3/11, James Wu wrote:
>> Suppose we ran pca on four variables, x1, x2, x3, x4 as
>> follows:
>> Now, suppose that you decide to retain the first two
>> principal components, and then you want to create two
>> variables that are linear combinations of the original
>> four variables.
>
> Then you need to use -predict-, see help -pca postestimation-.
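
For anyone who wants to check the point about standardising, here is a minimal sketch. It is not taken from the thread: the bundled auto dataset and its variables price, mpg, weight and length are assumptions standing in for x1-x4, and the names Y1, Y2 and z1-z4 are made up for illustration. It asks -predict- for two scores at once and then rebuilds both by hand from standardised variables and the loadings that -pca- leaves in e(L).

```stata
* Sketch only: the auto dataset stands in for the poster's x1-x4.
sysuse auto, clear
local xvars price mpg weight length

* PCA on the correlation matrix (the default for -pca-)
pca `xvars'

* Naming two new variables returns scores for the first two components
predict pc1 pc2, score

* Home-made version: standardise each variable, then weight by the loadings
matrix L = e(L)
generate double Y1 = 0 if e(sample)
generate double Y2 = 0 if e(sample)
local j = 0
foreach v of local xvars {
    local ++j
    * z1-z4 are hypothetical names for the standardised variables
    egen double z`j' = std(`v') if e(sample)
    quietly replace Y1 = Y1 + L[`j',1]*z`j'
    quietly replace Y2 = Y2 + L[`j',2]*z`j'
}

* With standardised inputs, each pair should correlate at 1 up to rounding
pwcorr pc1 Y1
pwcorr pc2 Y2
```

The restriction to e(sample) is there so that the means and standard deviations used by -egen- come from the same observations that -pca- used. If -pca- had been run with the covariance option, standardising would not be the right step, so this sketch applies to the default correlation-based analysis only.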