Re: st: More re factor loadings

From	[email protected] (Kristin MacDonald, StataCorp LP)
To	[email protected]
Subject	Re: st: More re factor loadings
Date	Tue, 01 Oct 2013 08:31:27 -0500
Dave Garson <[email protected]> asks how to obtain the coefficients that
SPSS refers to as "factor score weights" and that SAS labels "latent variable
scores regression coefficients".  

First, let me discuss the terminology we use in our documentation.  I
recognize that different groups may call coefficients by different names, so I
want to make sure that there is no confusion.  When we use the term "factor
loading", we are referring to the coefficients on paths from latent variables
to observed variables.  These may be in the standardized or unstandardized
metric.

I believe that Dave would instead like the coefficients that can be used to
create a linear combination of the observed variables corresponding to the
predicted value of the latent variable.  In Stata, we call these "scoring
coefficients" in the '[MV] factor postestimation' manual entry where we
discuss predictions of factors with exploratory factor analysis.

There is no option to automatically obtain a matrix of regression scoring
coefficients after fitting a model with -sem-.  However, if Dave is interested
in obtaining the predicted factor scores, he can use -predict- with the
-latent()- option.  For example, 

  webuse sem_1fmm, clear
  sem (X -> x1 x2 x3 x4)
  predict xpred, latent(X)

This creates a new variable, xpred, containing the predicted value of X.

If Dave is interested in the actual coefficients used in the linear
combination that produces these predictions, he can create them manually using
the matrices returned by -estat framework- after -sem-.  In the case of a
standard CFA model, the coefficients are a function of the -r(Sigma)- matrix:
the inverse of the covariance block for the observed variables, multiplied by
the covariances between the observed variables and the latent variable.
These coefficients are applied to the observed variables after they have been
centered.  The -r(mu)- matrix contains the mean of each variable, which we can
use to center the observed variables.  The code below demonstrates how to
predict the value of the latent variable X manually, for the above model:


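  * Retrieve the fitted means and covariances of observed and latent variables
  estat framework, fitted
  mat mu = r(mu)
  mat sigma = r(Sigma)

  * Covariance block of the observed variables x1-x4, and its inverse
  mat sigma_zz = sigma[1..4,1..4]
  mat inv_sigma_zz = syminv(sigma_zz)

  * Covariances between the latent variable X (row 5) and x1-x4
  mat sigma_zl = sigma[5,1..4]

  * Scoring coefficients, stored as a 4 x 1 column vector
  mat scoef = inv_sigma_zz*sigma_zl'
  mat list scoef

  * Center each observed variable at its fitted mean
  forvalues i = 1/4 {
    gen x`i'_cent = x`i' - mu[1,`i']
  }

  * Linear combination of the centered variables
  gen mypred = scoef[1,1]*x1_cent + scoef[2,1]*x2_cent + ///
               scoef[3,1]*x3_cent + scoef[4,1]*x4_cent

  * Compare with the values from -predict-
  list xpred mypred in 1/10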


The coefficients are stored in the scoef matrix and are then used to predict
the value of X in a new variable called mypred.  These match the values
produced by the -predict- command above, apart from small rounding differences
in the last displayed digit.  The output for the full set of commands is given
below my signature.
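
As a rough way to check the agreement over the whole dataset rather than the
first ten observations, something like the following could be run after the
commands above.  This is only a sketch: it assumes xpred and mypred already
exist, and the variable name absdiff is made up for illustration.

```stata
* Assumes xpred (from -predict-) and mypred (from the manual calculation)
* are already in memory; absdiff is a hypothetical name used only here.
gen double absdiff = abs(xpred - mypred)

* The maximum should be very small relative to the scale of xpred;
* any remaining difference is expected to be storage rounding.
summarize absdiff
```

If the maximum reported by -summarize- is not close to zero, the most likely
cause is that the row and column positions used to subset sigma do not match
the model that was fit.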

More complicated models containing structural paths not included in a CFA
model will require more matrix calculations that involve the fitted structural
path coefficients. 
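
For reference, the CFA-only calculation above can be written compactly in
matrix notation.  The symbols here are my own shorthand, not notation from the
manuals: z is the vector of observed variables (x1-x4), \Sigma_{zz} corresponds
to sigma_zz, \Sigma_{zX} to the transpose of sigma_zl, and \mu_z to the first
four elements of r(mu).

```latex
\[
  b = \Sigma_{zz}^{-1}\,\Sigma_{zX},
  \qquad
  \hat{X}_i = b^{\top}\,(z_i - \mu_z)
\]
% In this one-factor model the fitted mean of X is 0 (see kappa in the
% -estat framework- output), so no constant is added to the prediction.
% Here Sigma_{zX} equals Gamma*Phi, which can be read off the output:
% for example, 1.172364 * 118.2068 is approximately 138.5813.
\[
  \Sigma_{zX} = \Gamma\,\Phi
\]
```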

--Kristin
[email protected]



. use sem_1fmm, clear
(single-factor measurement model)

. sem (X -> x1 x2 x3 x4)

Endogenous variables

Measurement:  x1 x2 x3 x4

Exogenous variables

Latent:       X

Fitting target model:

Iteration 0:   log likelihood = -2081.0258  
Iteration 1:   log likelihood =  -2080.986  
Iteration 2:   log likelihood = -2080.9859  

Structural equation model                       Number of obs      =       123
Estimation method  = ml
Log likelihood     = -2080.9859

 ( 1)  [x1]X = 1
------------------------------------------------------------------------------
             |                 OIM
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Measurement  |
  x1 <-      |
           X |          1  (constrained)
       _cons |   96.28455   1.271963    75.70   0.000     93.79155    98.77755
  -----------+----------------------------------------------------------------
  x2 <-      |
           X |   1.172364   .1231777     9.52   0.000     .9309398    1.413788
       _cons |   97.28455   1.450053    67.09   0.000      94.4425    100.1266
  -----------+----------------------------------------------------------------
  x3 <-      |
           X |   1.034523   .1160558     8.91   0.000     .8070579    1.261988
       _cons |   97.09756   1.356161    71.60   0.000     94.43953    99.75559
  -----------+----------------------------------------------------------------
  x4 <-      |
           X |   6.886044   .6030898    11.42   0.000     5.704009    8.068078
       _cons |   690.9837   6.960137    99.28   0.000     677.3421    704.6254
-------------+----------------------------------------------------------------
    var(e.x1)|   80.79361   11.66414                      60.88206    107.2172
    var(e.x2)|   96.15861   13.93945                      72.37612    127.7559
    var(e.x3)|   99.70874   14.33299                      75.22708    132.1576
    var(e.x4)|   353.4711   236.6847                      95.14548    1313.166
       var(X)|   118.2068   23.82631                      79.62878    175.4747
------------------------------------------------------------------------------
LR test of model vs. saturated: chi2(2)   =      1.78, Prob > chi2 = 0.4111

. predict xpred, latent(X)

. 
. 
. estat framework, fitted

Endogenous variables on endogenous variables

                 | observed                                   
            Beta |        x1         x2         x3         x4 
    -------------+--------------------------------------------
    observed     |                                            
              x1 |         0                                  
              x2 |         0          0                       
              x3 |         0          0          0            
              x4 |         0          0          0          0 
    ----------------------------------------------------------

Exogenous variables on endogenous variables

                 | latent    
           Gamma |         X 
    -------------+-----------
    observed     |           
              x1 |         1 
              x2 |  1.172364 
              x3 |  1.034523 
              x4 |  6.886044 
    -------------------------

Covariances of error variables

                 | observed                                   
             Psi |      e.x1       e.x2       e.x3       e.x4 
    -------------+--------------------------------------------
    observed     |                                            
            e.x1 |  80.79361                                  
            e.x2 |         0   96.15861                       
            e.x3 |         0          0   99.70874            
            e.x4 |         0          0          0   353.4711 
    ----------------------------------------------------------

Intercepts of endogenous variables

                 | observed                                   
           alpha |        x1         x2         x3         x4 
    -------------+--------------------------------------------
           _cons |  96.28455   97.28455   97.09756   690.9837 
    ----------------------------------------------------------

Covariances of exogenous variables

                 | latent    
             Phi |         X 
    -------------+-----------
    latent       |           
               X |  118.2068 
    -------------------------

Means of exogenous variables

                 | latent    
           kappa |         X 
    -------------+-----------
            mean |         0 
    -------------------------

Fitted covariances of observed and latent variables

                 | observed                                   | latent    
           Sigma |        x1         x2         x3         x4 |         X 
    -------------+--------------------------------------------+-----------
    observed     |                                            |           
              x1 |  199.0004                                  |           
              x2 |  138.5813   258.6263                       |           
              x3 |  122.2876   143.3656   226.2181            |           
              x4 |  813.9769   954.2769   842.0779   5958.551 |           
    -------------+--------------------------------------------+-----------
    latent       |                                            |           
               X |  118.2068   138.5813   122.2876   813.9769 |  118.2068 
    ----------------------------------------------------------------------

Fitted means of observed and latent variables

                 | observed                                   | latent    
              mu |        x1         x2         x3         x4 |         X 
    -------------+--------------------------------------------+-----------
              mu |  96.28455   97.28455   97.09756   690.9837 |         0 
    ----------------------------------------------------------------------

. mat mu = r(mu)

. mat sigma = r(Sigma)

. mat sigma_zz = sigma[1..4,1..4]

. mat inv_sigma_zz = syminv(sigma_zz)

. mat sigma_zl = sigma[5,1..4]

. 
. mat scoef = inv_sigma_zz*sigma_zl'

. mat list scoef

scoef[4,1]
                latent:
                     X
observed:x1  .06875754
observed:x2  .06772851
observed:x3  .05763739
observed:x4  .10822142

. 
. forvalues i = 1/4 {
  2.   gen x`i'_cent = x`i' - mu[1,`i']
  3. }

. 
. gen mypred = scoef[1,1]*x1_cent + scoef[2,1]*x2_cent + ///
>              scoef[3,1]*x3_cent + scoef[4,1]*x4_cent

. list xpred mypred in 1/10

     +-----------------------+
     |     xpred      mypred |
     |-----------------------|
  1. | -26.55233   -26.55233 |
  2. |  11.92044    11.92044 |
  3. |  8.319204    8.319203 |
  4. |  -7.50836    -7.50836 |
  5. |  -3.87875   -3.878749 |
     |-----------------------|
  6. |  .9258427    .9258427 |
  7. | -4.445202   -4.445201 |
  8. |  3.599469    3.599469 |
  9. | -4.307086   -4.307086 |
 10. |  6.506975    6.506975 |
     +-----------------------+

