Statalist



st: RE: PCA


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: PCA
Date   Fri, 16 Jan 2009 18:03:43 -0000

I don't know what Nm3 are, even guessing that m3 means cubic metres. But the key point is that your variables are measured in different units. That being so, -pca- on the covariance matrix is likely to be meaningless, because its results depend on the units chosen. Even with the correlation matrix, for this kind of data you may need to consider gross skewness and whether there are trace zeros.
Skewness suggests taking logarithms, but zeros make that difficult!
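
The point about units can be illustrated with a small sketch. It uses Python's standard library rather than Stata, and the two series `a` and `b` are made-up numbers for two hypothetical congeners, not data from this thread. For two variables the eigenvalues of the 2x2 covariance or correlation matrix have a closed form, so the share of variance on the first component can be computed directly, before and after re-expressing one variable in units 1000 times smaller.

```python
import math
import statistics


def cov(x, y):
    """Sample covariance with the usual n - 1 divisor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return sum((u - mx) * (v - my) for u, v in zip(x, y)) / (len(x) - 1)


def first_share(sxx, syy, sxy):
    """Share of total variance carried by the first principal component
    of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]]."""
    trace = sxx + syy
    gap = math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2)
    return (trace + gap) / 2 / trace


def shares(x, y):
    """Return (covariance-based share, correlation-based share)."""
    sxx, syy, sxy = cov(x, x), cov(y, y), cov(x, y)
    r = sxy / math.sqrt(sxx * syy)
    return first_share(sxx, syy, sxy), first_share(1.0, 1.0, r)


# Made-up concentrations for two hypothetical congeners (illustration only)
a = [1.2, 0.8, 2.5, 1.9, 3.1, 0.4]
b = [0.5, 0.9, 0.7, 1.4, 1.1, 0.3]

cov_share, corr_share = shares(a, b)

# Same measurements, with b re-expressed in units 1000 times smaller
b_rescaled = [1000 * v for v in b]
cov_share_rescaled, corr_share_rescaled = shares(a, b_rescaled)

print("covariance PCA, first share: ", cov_share, "->", cov_share_rescaled)
print("correlation PCA, first share:", corr_share, "->", corr_share_rescaled)
```

Under the covariance matrix the rescaled variable swamps the first component simply because its numbers are larger, whereas the correlation-based share does not move. That is a statement about scale only; it does not address skewness or zeros.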

Nick 
[email protected] 

[email protected]

I want to run a PCA on a dataset of 17 variables (congeners) measured in 120 food and air samples.
The concentrations are expressed in different units of measurement (ng/Nm3, ng/kg).
Is it correct to assign to each variable a value that corresponds to the concentration measured for that congener, expressed as a proportion of the sum of the 17 congener concentrations?

I used the covariance matrix in the PCA, with this output:

    --------------------------------------------------------------------------
       Component |   Eigenvalue   Difference         Proportion   Cumulative
    -------------+------------------------------------------------------------
           Comp1 |     .0474947     .0380502             0.6869       0.6869
           Comp2 |    .00944454    .00531611             0.1366       0.8234
           Comp3 |    .00412843   .000944856             0.0597       0.8831
           Comp4 |    .00318357    .00184526             0.0460       0.9292
           Comp5 |    .00133831   .000309329             0.0194       0.9485
           Comp6 |    .00102898   .000329951             0.0149       0.9634
           Comp7 |    .00069903   .000190709             0.0101       0.9735
           Comp8 |   .000508321    .00011746             0.0074       0.9809
           Comp9 |   .000390861  .0000988767             0.0057       0.9865
          Comp10 |   .000291985   .000103161             0.0042       0.9908
          Comp11 |   .000188823  .0000422023             0.0027       0.9935
          Comp12 |   .000146621  .0000242402             0.0021       0.9956
          Comp13 |   .000122381  .0000341471             0.0018       0.9974
          Comp14 |  .0000882337  .0000235538             0.0013       0.9986
          Comp15 |  .0000646799  .0000359076             0.0009       0.9996
          Comp16 |  .0000287723  .0000287712             0.0004       1.0000
          Comp17 |  1.03334e-09            .             0.0000       1.0000
    --------------------------------------------------------------------------
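
The near-zero eigenvalue for Comp17 is what one would expect if the 17 variables entered into the PCA were shares of the total, since shares sum to 1 in every sample and the covariance matrix is then singular. Whether that is exactly what happened here is an assumption; the output alone does not say. A minimal sketch in Python (standard library only, with made-up numbers for three hypothetical congeners) shows the mechanism: every row of the covariance matrix of the shares sums to zero, so the vector (1, 1, ..., 1) is an eigenvector with eigenvalue zero.

```python
import statistics

# Made-up raw concentrations: 5 samples x 3 hypothetical congeners
raw = [
    [2.0, 1.0, 7.0],
    [0.5, 3.0, 1.5],
    [4.0, 4.0, 2.0],
    [1.0, 0.2, 0.8],
    [6.0, 1.0, 3.0],
]

# Express each congener as a share of its sample total (each row sums to 1)
shares = [[v / sum(row) for v in row] for row in raw]

cols = list(zip(*shares))            # one tuple per congener
means = [statistics.fmean(c) for c in cols]
n = len(shares)
p = len(cols)


def cov(i, j):
    """Sample covariance between congener shares i and j."""
    return sum(
        (cols[i][k] - means[i]) * (cols[j][k] - means[j]) for k in range(n)
    ) / (n - 1)


cov_matrix = [[cov(i, j) for j in range(p)] for i in range(p)]

# Row sums equal the covariance of each share with the constant total,
# which is zero apart from floating-point error.
row_sums = [sum(row) for row in cov_matrix]
print(row_sums)
```

So one of the components of such closed data carries no information by construction, which is worth bearing in mind when reading the table.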


The first two principal components account for a total of 82% of the variance in the data set.


Is that correct?
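
On the arithmetic alone, the 82% figure can be checked from the eigenvalues in the table: each proportion is an eigenvalue divided by the sum of all 17 eigenvalues, and the cumulative column is the running total. The short Python sketch below (standard library only; the eigenvalues are copied from the output above) does just that. It confirms the calculation, not that a covariance-based PCA is an appropriate analysis for these data.

```python
# Eigenvalues copied from the -pca- output above (Comp1 to Comp17)
eigenvalues = [
    .0474947, .00944454, .00412843, .00318357, .00133831, .00102898,
    .00069903, .000508321, .000390861, .000291985, .000188823,
    .000146621, .000122381, .0000882337, .0000646799, .0000287723,
    1.03334e-09,
]

total = sum(eigenvalues)

# Proportion of variance for each component
proportions = [ev / total for ev in eigenvalues]

# Running total of the proportions
cumulative = []
running = 0.0
for prop in proportions:
    running += prop
    cumulative.append(running)

for k, (prop, cum) in enumerate(zip(proportions, cumulative), start=1):
    print(f"Comp{k:<2d}  proportion {prop:.4f}  cumulative {cum:.4f}")
```

The first two lines of the printout should agree with the Proportion and Cumulative columns for Comp1 and Comp2, up to rounding of the displayed eigenvalues.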



