Stata’s factor command allows you to fit common-factor models; see also principal components.
By default, factor produces estimates using the principal-factor method (communalities set to the squared multiple-correlation coefficients). Alternatively, factor can produce iterated principal-factor estimates (communalities re-estimated iteratively), principal-components factor estimates (communalities set to 1), or maximum-likelihood factor estimates.
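Each of these estimation methods is selected with an option of factor. The lines below are a minimal sketch, assuming the eight symptom variables joints through throat from the example further down are already in memory; the option names pf, ipf, pcf, and ml are the ones we understand factor to accept, so check help factor in your release.

```stata
* Assumes the variables joints through throat are in memory.
* principal factor (the default)
factor joints-throat, pf
* iterated principal factor
factor joints-throat, ipf
* principal-components factor
factor joints-throat, pcf
* maximum likelihood
factor joints-throat, ml
```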
After you fit a factor model, Stata allows you to rotate the factor-loading matrix using the varimax (orthogonal) and promax (oblique) methods. Stata can score a set of factor estimates using either rotated or unrotated loadings. Both regression and Bartlett scorings are available.
Below we fit a maximum-likelihood factor model on eight medical symptoms from a medical outcomes study (Tarlov et al. 1989) using three factors:
    . factor joints-throat, ml factors(3) protect(5)
    (obs=3046)

    Likelihood verification 0, maximum = -21.8257
    Likelihood verification 1, maximum = -21.8257
    Likelihood verification 2, maximum = -21.8257
    Likelihood verification 3, maximum = -18.4300
    Likelihood verification 4, maximum = -21.8257
    Likelihood verification 5, maximum = -18.4300
    Differing maxima obtained.

    Iteration 0:   log likelihood = -1925.2187
    Iteration 1:   log likelihood = -40.623068
    Iteration 2:   log likelihood =  -27.38831
    Iteration 3:   log likelihood = -26.291917
    Iteration 4:   log likelihood =  -18.49983
    Iteration 5:   log likelihood =  -18.43281
    Iteration 6:   log likelihood = -18.430164
    Iteration 7:   log likelihood = -18.429999
    Iteration 8:   log likelihood = -18.429988
    Iteration 9:   log likelihood = -18.429988
    Iteration 10:  log likelihood = -18.429988

    (maximum likelihood factors; 3 factors retained)

      Factor     Variance   Difference   Proportion   Cumulative
    ------------------------------------------------------------------
         1        2.36049      1.64310       0.6892       0.6892
         2        0.71739      0.37019       0.2095       0.8986
         3        0.34720            .       0.1014       1.0000

    Test: 3 vs. no factors.    Chi2( 24) = 4718.59, Prob > chi2 = 0.0000
    Test: 3 vs. more factors.  Chi2(  7) =   36.79, Prob > chi2 = 0.0000

                 Factor Loadings
     Variable |        1          2          3    Uniqueness
    ----------+-------------------------------------------
       joints |  0.62749   -0.07856    0.26240      0.53124
        cough |  0.29859    0.14908    0.05009      0.88611
     backache |  0.82633   -0.33130   -0.11018      0.19530
       nausea |  0.49540    0.49656   -0.25307      0.44396
     indigest |  0.46711    0.39728   -0.06671      0.61953
      hvyfeel |  0.57369    0.21220    0.42173      0.44798
     headache |  0.50816    0.25731   -0.12097      0.66092
       throat |  0.37922    0.25219    0.05205      0.78988
To obtain these results, we typed
factor joints-throat, ml factors(3) protect(5)
Stata commands share a common syntax: the command name is followed by a list of variables and then, optionally, a comma and any options. For regression-type commands, the list is the dependent variable followed by the independent variables; for factor, it is simply the variables to be analyzed. We specified factor's ml option, producing estimates by maximum likelihood. We also typed factors(3) to indicate that we wanted to retain the first three factors.
This is an interesting problem because the likelihood has two distinct local maxima, shown in the verification lines as -21.8257 and -18.4300. Stata's protect() option guards against settling on a local rather than the global maximum by optimizing from different starting points to search out different solutions. protect(5) requested that this search be performed five times; the reported solution is the one with the larger log likelihood, -18.429988.
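We find that most of the explained variance can be attributed to the first factor (a proportion of 0.6892). In the Uniqueness column, Stata also shows the unique variance of each variable, that is, the share of its variance not accounted for by the retained factors.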
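These quantities can be checked by hand from the loadings table. The sketch below uses Stata's display as a calculator with numbers copied from the output above: a variable's uniqueness is 1 minus the sum of its squared loadings, a factor's variance is the sum of the squared loadings in its column, and the proportion is that variance divided by the total over the three retained factors.

```stata
* Uniqueness of joints: 1 minus the sum of its squared loadings
display 1 - (0.62749^2 + (-0.07856)^2 + 0.26240^2)

* Variance of factor 1: sum of the squared loadings in column 1
display 0.62749^2 + 0.29859^2 + 0.82633^2 + 0.49540^2 + 0.46711^2 + 0.57369^2 + 0.50816^2 + 0.37922^2

* Proportion for factor 1: its variance over the total of the three factors
display 2.36049 / (2.36049 + 0.71739 + 0.34720)
```

Up to rounding in the printed loadings, the three results should agree with the table entries 0.53124, 2.36049, and 0.6892.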
The researcher actually fitting this model interpreted the first factor as a measure of the general level of sickness and the second factor as a difference between musculoskeletal problems and other types of problems. If he had wanted to rotate the factor loadings to search for different interpretations, he could now type rotate to examine an orthogonal varimax rotation; rotate, promax to examine an oblique promax rotation; or, for instance, rotate, promax(4) to examine a promax rotation with promax power 4 (producing simpler loadings but at a cost of more correlation between factors).
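Put together, the rotations just described could be tried one after another. This is a sketch only, assuming the variables joints through throat are in memory and that each rotate call replaces the rotation stored by the previous one.

```stata
* Assumes the variables joints through throat are in memory.
factor joints-throat, ml factors(3) protect(5)
* orthogonal varimax rotation
rotate
* oblique promax rotation
rotate, promax
* promax rotation with promax power 4
rotate, promax(4)
```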
Stata’s score command produces estimates of the factors after factor or rotate:
    . score f1
    (based on unrotated factors)
    (2 scorings not used)

       Scoring Coefficients
     Variable |        1
    ----------+----------
       joints |  0.15644
        cough |  0.04463
     backache |  0.56038
       nausea |  0.14779
     indigest |  0.09986
      hvyfeel |  0.16960
     headache |  0.10183
       throat |  0.06359
Typing score f1 produced estimates of the first factor. Typing score f1 f2 would produce estimates of the first two factors, and typing score f1 f2 f3 (or score f1-f3) would produce estimates of the first three factors. The names f1, f2, etc., are arbitrary; the score command allows you to create new variables that could then be used in analysis.
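The session above uses the score command. As far as we are aware, more recent Stata releases obtain factor scores with predict after factor or rotate instead, with regression scoring as the default and Bartlett scoring available through an option. A sketch under that assumption follows; the variable names r1, r2, r3, and b1 are arbitrary, and help factor postestimation should be checked for your release.

```stata
* Assumes factor (and optionally rotate) has just been run.
* regression scores (assumed default) for the three retained factors
predict r1 r2 r3
* Bartlett score for the first factor
predict b1, bartlett
```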
Stata also has a command for Cronbach’s alpha, providing a simpler way of combining the eight symptoms, assuming that all have equal weight:
    . alpha joints-throat, generate(symplev)

    Scale = sum(unstandardized variables)

    Average interitem covariance:     .3783125
    Number of items in the scale:            8
    Scale reliability coefficient:      0.7591

    . summarize f1 symplev

    Variable |     Obs        Mean   Std. Dev.        Min        Max
    ---------+-----------------------------------------------------
          f1 |    3046    4.86e-10   .9314048   -1.254182     3.1028
     symplev |    3320    2.021112   .7290644           1          5

    . correlate f1 symplev
    (obs=3046)

            |       f1  symplev
    --------+------------------
          f1|   1.0000
     symplev|   0.9343   1.0000

The scale created by alpha and the first factor score estimate turn out to be highly correlated with each other (0.9343).
See New in Stata 18 to learn about what was added in Stata 18.