Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: principal component analysis-creating linear combinations
From
Nick Cox <[email protected]>
To
"'[email protected]'" <[email protected]>
Subject
RE: st: principal component analysis-creating linear combinations
Date
Thu, 10 Mar 2011 16:12:19 +0000
What was wrong? In your own calculation, you were using unstandardised variables, but you needed to standardise them. However, as said, none of it is necessary as -predict- does all the work for you.
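A minimal sketch of both routes, assuming x1-x4 are in memory and that -pca- was run with its default (the correlation matrix). The names z_x1-z_x4, Y1 and Y2 are only illustrative, and the loadings are the rounded values quoted in this thread, so the correlations should come out very close to 1 rather than exactly 1. If pc1 already exists from an earlier -predict-, -drop- it first.

```stata
* Route 1: let -predict- do the work. Listing two new variable
* names yields scores for the first two components.
pca x1 x2 x3 x4
predict pc1 pc2, score

* Route 2: the hand calculation, with each variable standardised
* to mean 0 and SD 1 before the loadings are applied.
foreach v of varlist x1 x2 x3 x4 {
    egen z_`v' = std(`v')
}
gen Y1 = 0.3894*z_x1 + 0.4517*z_x2 + 0.5733*z_x3 + 0.5619*z_x4
gen Y2 = 0.8726*z_x1 + 0.0966*z_x2 - 0.3179*z_x3 - 0.3580*z_x4

* Compare the hand-made scores with those from -predict-.
pwcorr pc1 Y1
pwcorr pc2 Y2
```

One caveat: -pca- drops observations with any missing value among x1-x4, whereas -egen, std()- standardises each variable over its own nonmissing observations, so the two routes can differ slightly if the data have missing values.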
Nick
[email protected]
Nick Cox
This is largely superseded by answers already sent. But yes, something was wrong with your home-made attempt to create PC1, as the correlation with the PC1 from -predict- is not even 1.
James Wu
Thank you, but "-predict-" generates only the first component scores.
(1) By the way, would it be wrong to construct the linear combinations
as I described earlier?
For example, Y1=0.3894*x1+0.4517*x2+0.5733*x3+0.5619*x4.
Here is the comparison:
. predict pc1
(output omitted)
. sum pc1
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         pc1 |       659    2.97e-09    1.558505   -3.00555   6.801751
. gen Y1=0.3894*x1+0.4517*x2+0.5733*x3+0.5619*x4
. pwcorr pc1 Y1
             |      pc1       Y1
-------------+------------------
         pc1 |   1.0000
          Y1 |   0.9724   1.0000
(2) As one can see from the original PCA, the second component has
positive signs on x1 and x2.
So I want to create the second component scores.
How can I obtain them (if I do not create them by
Y2=0.8726*x1+0.0966*x2-0.3179*x3-0.3580*x4)?
On Thu, Mar 10, 2011 at 9:46 AM, Maarten Buis <[email protected]> wrote:
> --- On Thu, 10/3/11, James Wu wrote:
>> Suppose we ran pca on four variables, x1, x2, x3, x4 as
>> follows:
>> Now, suppose that you decide to retain the first two
>> principal components, and then you want to create two
>> variables that are linear combinations of the original
>> four variables.
>
> Then you need to use -predict-, see help -pca postestimation-.