@Jay:
The theoretical reason for this aggregation is that the different variables
indicate different types of health knowledge.
The following are the results of tetrachoric correlation:
          Var1        Var2        Var3        Var4
Var1      1
Var2      .1819233    1
Var3      .3699331    .25242738   1
Var4      .18371493   .27407531   .40299934   1
I was specifically asked whether I could justify my choice of a single
factor on the basis of the variance explained. Following your reasoning, I
could argue that a model with more than one factor would be unidentified.
Just to be sure about the procedure I am following, I tried requesting
4 factors:
factormat R, n(6926) ipf factor(4)

Factor analysis/correlation                  Number of obs    =     6926
    Method: iterated principal factors       Retained factors =        3
    Rotation: (unrotated)                    Number of params =        6

--------------------------------------------------------------------------
      Factor |   Eigenvalue   Difference        Proportion   Cumulative
-------------+------------------------------------------------------------
     Factor1 |      1.28200      1.06199            0.8049       0.8049
     Factor2 |      0.22001      0.12912            0.1381       0.9431
     Factor3 |      0.09089      0.09108            0.0571       1.0001
     Factor4 |     -0.00019            .           -0.0001       1.0000
--------------------------------------------------------------------------
Could I state that the first factor explains 80% of the common variance?
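As a quick check on what the Proportion column measures, here is a minimal Python sketch (not Stata output) that recomputes it from the eigenvalues printed above. It assumes that each proportion is the eigenvalue divided by the sum of all four eigenvalues of the reduced correlation matrix, which reproduces the printed figures. The comparison against the total variance assumes four standardized indicators, so the total is 4.

```python
# Eigenvalues copied from the factormat output above.
eigenvalues = [1.28200, 0.22001, 0.09089, -0.00019]

# Assumption: Proportion = eigenvalue / sum of all eigenvalues
# (the estimated common variance).
common_variance = sum(eigenvalues)
proportions = [ev / common_variance for ev in eigenvalues]

# Running total of the proportions (the Cumulative column).
cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

for i, (ev, p, c) in enumerate(zip(eigenvalues, proportions, cumulative), start=1):
    print(f"Factor{i}  {ev:9.5f}  {p:8.4f}  {c:8.4f}")

# For contrast: share of the TOTAL variance of four standardized indicators.
n_indicators = 4
share_of_total = eigenvalues[0] / n_indicators
print(f"Factor1 share of total variance: {share_of_total:.4f}")
```

Under these assumptions the 0.80 figure is a share of the estimated common variance only. Relative to the total variance of the four indicators, the first factor accounts for about 0.32.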
Finally, I tried adding one or two further indicators to improve the
analysis. However, I had some theoretical doubts about including these
variables, and the factor analysis on the tetrachoric correlations gave
them loadings well below 0.1, so I decided to use only the 4 variables.
Thanks,
Francesco
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Verkuilen, Jay
Sent: Monday, 21 December 2009 19:54
To: '[email protected]'
Subject: st: RE: RE: Factor Analysis: which explained variance?
Nick Cox wrote:
> P.P.S. the whole notion of variance is perhaps a little suspect when the
> originals are indicator variables.
@Nick: I don't know, you have variances, they're just functions of the mean
(proportion)! However, there are covariances that aren't redundant.
@The original poster:
With four indicators, you really can only afford a one-dimensional factor
analysis. Anything of higher dimension will be, essentially, unidentified,
and thus even more indeterminate than usual for factor analysis. Three
indicators is exactly identified. Four indicators with correlated factors
that have two indicators per factor is also identified, but if the solution
says that you have three indicators on one factor and one on the other,
you're really out of luck.
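The counting argument behind this can be sketched in a few lines of Python. This is only the standard degrees-of-freedom count for an exploratory factor model, df = ((p - m)^2 - (p + m)) / 2 for p indicators and m factors. The function name is hypothetical, and the confirmatory model with two indicators on each of two correlated factors is not covered. A non-negative count is a necessary condition for identification, not a sufficient one.

```python
def efa_degrees_of_freedom(p: int, m: int) -> int:
    """Distinct covariance elements minus free parameters for an
    exploratory m-factor model on p indicators (hypothetical helper)."""
    # (p - m)**2 and (p + m) always have the same parity, so the
    # numerator is even and integer division is exact.
    numerator = (p - m) ** 2 - (p + m)
    return numerator // 2


for p, m in [(3, 1), (4, 1), (4, 2)]:
    df = efa_degrees_of_freedom(p, m)
    if df > 0:
        status = "over-identified by the counting rule"
    elif df == 0:
        status = "exactly identified by the counting rule"
    else:
        status = "under-identified"
    print(f"p={p}, m={m}: df={df} ({status})")
```

Three indicators with one factor give df = 0, four indicators with one factor give df = 2, and four indicators with two exploratory factors give df = -1.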
Without knowing the tetrachoric correlation matrix (these are indicators,
i.e., binary, so polychoric is just tetrachoric anyhow) it's very hard to
say anything on statistical grounds.
Is there a theoretical reason to form a sum score from these indicators? For
instance, do they operate like items on a quiz where you want to know the
total score?
Jay
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/