RE: st: principal component analysis-creating linear combinations
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: RE: st: principal component analysis-creating linear combinations
Date: Thu, 10 Mar 2011 16:12:19 +0000
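What was wrong? In your own calculation you used unstandardised variables, but you needed to standardise them first: by default -pca- works from the correlation matrix, so the loadings apply to standardised variables. However, as already said, none of that is necessary, as -predict- does all the work for you.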
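To make the point concrete, here is a minimal sketch of the hand calculation done on standardised variables. It assumes x1-x4 are in memory and that -pca- was run with its default (correlation-based) setting; the names z_x1-z_x4 and Y1std are made up for illustration, and the loadings are the rounded values quoted in this thread.

```stata
* Sketch only: assumes x1-x4 exist and -pca- uses its default correlation matrix
pca x1 x2 x3 x4
predict pc1, score

* Standardise each variable over the estimation sample (mean 0, sd 1)
foreach v of varlist x1 x2 x3 x4 {
    quietly summarize `v' if e(sample)
    generate double z_`v' = (`v' - r(mean)) / r(sd) if e(sample)
}

* Same loadings as in the post, now applied to the standardised variables
generate double Y1std = 0.3894*z_x1 + 0.4517*z_x2 + 0.5733*z_x3 + 0.5619*z_x4

* The correlation should now be 1 up to rounding of the loadings
correlate pc1 Y1std
```

Any remaining discrepancy should come only from the loadings being rounded to four decimal places.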
Nick 
[email protected] 
Nick Cox wrote:
This is largely superseded by answers already sent. But yes, something was wrong with your home-made attempt to create PC1, as the correlation with the PC1 from -predict- is not even 1. 
James Wu wrote:
Thank you, but "-predict-" generates only the first component scores.
(1) By the way, would it be wrong to construct the linear combinations
as I described earlier?
such as, Y1=0.3894*x1+0.4517*x2+0.5733*x3+0.5619*x4.
Here is the comparison:
. predict pc1
(output omitted)
. sum pc1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         pc1 |       659    2.97e-09    1.558505   -3.00555   6.801751
. gen Y1=0.3894*x1+0.4517*x2+0.5733*x3+0.5619*x4
. pwcorr  pc1 Y1
             |      pc1       Y1
-------------+------------------
         pc1 |   1.0000
          Y1 |   0.9724   1.0000
(2) As one can see from the original PCA, the second component has
positive signs on x1 and x2.
So I want to create the second component scores.
How can I obtain them (if I do not create them by
Y2=0.8726*x1+0.0966*x2-0.3179*x3-0.3580*x4)?
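For reference, -predict- after -pca- accepts more than one new variable name and creates one score variable per name, in component order. A minimal sketch, assuming x1-x4 are in memory and that the names pc1 and pc2 are not already in use (drop the earlier pc1 first if it exists):

```stata
* Sketch only: assumes x1-x4 exist and pc1, pc2 are free variable names
pca x1 x2 x3 x4, components(2)

* Two names give scores for the first two components
predict pc1 pc2, score

* Component scores should be uncorrelated with each other
summarize pc1 pc2
correlate pc1 pc2
```

The components(2) option only limits how many components are retained; leaving it out and still asking -predict- for two variables should give the same two score variables.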
On Thu, Mar 10, 2011 at 9:46 AM, Maarten Buis <[email protected]> wrote:
> --- On Thu, 10/3/11, James Wu wrote:
>> Suppose we ran pca on four variables, x1, x2, x3, x4 as
>> follows:
>> Now, suppose that you decide to retain the first two
>> principal components, and then you want to create two
>> variables that are linear combinations of the original
>> four variables.
>
> Then you need to use -predict-, see help -pca postestimation-.