Principal components

Stata’s pca allows you to estimate parameters of principal-component models.

. webuse auto
(1978 Automobile Data)

. pca price mpg rep78 headroom weight length displacement

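Principal components/correlation            Number of obs    =        69
                                            Number of comp.  =         7
                                            Trace            =         7
    Rotation: (unrotated = principal)       Rho              =    1.0000

    --------------------------------------------------------------------------
       Component |   Eigenvalue   Difference         Proportion   Cumulative
    -------------+------------------------------------------------------------
           Comp1 |      4.34052      3.32161             0.6201       0.6201
           Comp2 |      1.01891      .184366             0.1456       0.7656
           Comp3 |      .834546      .443705             0.1192       0.8849
           Comp4 |      .390842      .116964             0.0558       0.9407
           Comp5 |      .273877      .162216             0.0391       0.9798
           Comp6 |      .111662     .0820227             0.0160       0.9958
           Comp7 |     .0296392            .             0.0042       1.0000
    --------------------------------------------------------------------------

Principal components (eigenvectors)

    --------------------------------------------------------------------------
        Variable |    Comp1     Comp2     Comp3     Comp4     Comp5     Comp6
    -------------+------------------------------------------------------------
           price |   0.2761    0.6781   -0.2652    0.5810   -0.1570    0.1653
             mpg |  -0.4186    0.0202    0.1017    0.3700    0.7906    0.2281
           rep78 |  -0.2222    0.7039    0.4923   -0.4419    0.0433   -0.1222
        headroom |   0.2713   -0.2016    0.8172    0.4367   -0.1624    0.0068
          weight |   0.4660    0.0442   -0.0304   -0.1611    0.2893    0.1408
          length |   0.4525   -0.0128    0.0808   -0.3368    0.2070    0.6132
    displacement |   0.4513    0.0388   -0.0420    0.0116    0.4421   -0.7141
    --------------------------------------------------------------------------

    ----------------------------------------
        Variable |    Comp7 | Unexplained
    -------------+----------+-------------
           price |   0.0632 |           0
             mpg |   0.0050 |           0
           rep78 |  -0.0259 |           0
        headroom |  -0.0293 |           0
          weight |  -0.8065 |           0
          length |   0.5063 |           0
    displacement |   0.2961 |           0
    ----------------------------------------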

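We typed pca price mpg ... displacement. Stata commands share a common syntax: the command's name is followed by the names of the variables (for commands that have a dependent variable, the dependent variable comes first and the independent variables follow), and these are optionally followed by a comma and any options. pca has no dependent variable; it treats all the listed variables symmetrically. In this case, we did not specify any options.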
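As a sketch of the "variables, then a comma, then options" pattern described above, the following do-file lines rerun the same analysis with options. components() and covariance are documented pca options; the choice of two components is only an illustration, not a recommendation for these data.

```stata
* Load the example dataset used above
webuse auto, clear

* Same variable list, but retain only the first two components
pca price mpg rep78 headroom weight length displacement, components(2)

* Same variable list, analyzing the covariance matrix
* instead of the default correlation matrix
pca price mpg rep78 headroom weight length displacement, covariance
```

With covariance, variables measured on large scales (such as price) tend to dominate the leading components, which is one reason the correlation-based analysis is the default.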

Having estimated the principal components, we can at any time type pca by itself to redisplay the principal-component output. We can also type screeplot to obtain a scree plot of the eigenvalues, and we can use the predict command to obtain the components themselves.
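A minimal sketch of what "redisplay at any time" looks like in a do-file, assuming the same dataset and variable list as above. It also lists the stored results that the output table is built from; the matrix name Ev is a local copy introduced here for illustration.

```stata
webuse auto, clear
quietly pca price mpg rep78 headroom weight length displacement

* Typing pca by itself redisplays the most recent results
pca

* Inspect what pca left behind in e()
ereturn list
matrix list e(Ev)        // eigenvalues
matrix list e(L)         // eigenvectors (the component loadings)

* The Proportion column is each eigenvalue divided by the trace
matrix Ev = e(Ev)
display Ev[1,1] / e(trace)

* Number of observations actually used (those with no missing values)
count if e(sample)
```

The last command is a quick way to see why the header reports fewer observations than the dataset contains: observations with a missing value in any listed variable are left out of the estimation sample.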

screeplot, typed by itself, graphs the eigenvalues against the component number:

. screeplot

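Typing screeplot, yline(1) ci(het) adds a horizontal line at an eigenvalue of 1 and adds heteroskedastic bootstrap confidence intervals. Because this analysis is based on the correlation matrix, the seven eigenvalues sum to 7 and so average 1; the line separates the components whose eigenvalues are above that average from those below it.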

. screeplot, yline(1) ci(het)

We can obtain the first two components by typing

. predict pc1 pc2, score
(5 components skipped)

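Scoring coefficients
    sum of squares(column-loading) = 1

    --------------------------------------------------------------------------
        Variable |    Comp1     Comp2     Comp3     Comp4     Comp5     Comp6
    -------------+------------------------------------------------------------
           price |   0.2761    0.6781   -0.2652    0.5810   -0.1570    0.1653
             mpg |  -0.4186    0.0202    0.1017    0.3700    0.7906    0.2281
           rep78 |  -0.2222    0.7039    0.4923   -0.4419    0.0433   -0.1222
        headroom |   0.2713   -0.2016    0.8172    0.4367   -0.1624    0.0068
          weight |   0.4660    0.0442   -0.0304   -0.1611    0.2893    0.1408
          length |   0.4525   -0.0128    0.0808   -0.3368    0.2070    0.6132
    displacement |   0.4513    0.0388   -0.0420    0.0116    0.4421   -0.7141
    --------------------------------------------------------------------------

    -------------------------
        Variable |    Comp7
    -------------+-----------
           price |   0.0632
             mpg |   0.0050
           rep78 |  -0.0259
        headroom |  -0.0293
          weight |  -0.8065
          length |   0.5063
    displacement |   0.2961
    -------------------------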

The score option tells Stata's predict command to compute the component scores, and pc1 and pc2 are the names we chose for the two new variables. We could have obtained the first three components by typing, for example, predict pc1 pc2 pc3, score.
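The two ways of naming the new score variables can be sketched as follows. This assumes a fresh session with the same dataset and variable list; the stub name comp is a hypothetical choice made here so that the second predict does not collide with pc1 through pc3.

```stata
webuse auto, clear
quietly pca price mpg rep78 headroom weight length displacement

* Name the new variables explicitly: scores for the first three components
quietly predict pc1 pc2 pc3, score

* Or give a stub and let predict number the variables:
* this creates comp1, comp2, ... for the retained components
quietly predict comp*, score

* The scores are ordinary variables; observations outside the
* estimation sample (any missing input) get missing scores
summarize pc1 pc2 pc3
describe comp*
```

Prefixing predict with quietly suppresses the scoring-coefficient table, which is convenient in a do-file once you have already looked at it.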

An important feature of Stata is that it does not have modes or modules. We typed pca to estimate the principal components. We then typed screeplot to see a graph of the eigenvalues; we did not have to save the data and change modules. Similarly, we typed predict pc1 pc2, score to obtain the first two components. The new variables, pc1 and pc2, are now part of our data and are ready for use; for example, we could now use regress to fit a regression model with them as regressors.
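As an illustration of using the scores in a later command, the sketch below regresses a variable that was not part of the principal-component analysis on the two scores. The choice of turn (turning circle, a variable in the auto dataset) as the outcome is an assumption made here purely for the example; the original text does not name a model.

```stata
webuse auto, clear
quietly pca price mpg rep78 headroom weight length displacement
quietly predict pc1 pc2, score

* pc1 and pc2 behave like any other variables in the dataset
regress turn pc1 pc2
```

Because the scores are missing wherever an input variable was missing, regress uses only the observations that were in the pca estimation sample.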

By construction, the two components should be uncorrelated. The correlate command, like every other Stata command, is always available, so to verify that the correlation between pc1 and pc2 is zero, we type

. correlate pc1 pc2
(obs=69)

             |      pc1      pc2
-------------+------------------
         pc1 |   1.0000
         pc2 |   0.0000   1.0000
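In a do-file, the same check can be made programmatic rather than visual. The sketch below assumes the same dataset and variable list as above; the tolerance of 1e-4 is an arbitrary choice made here to allow for rounding in the stored scores.

```stata
webuse auto, clear
quietly pca price mpg rep78 headroom weight length displacement
quietly predict pc1 pc2, score

* correlate leaves the correlation of the first two variables in r(rho)
quietly correlate pc1 pc2
assert abs(r(rho)) < 1e-4

* The covariance matrix also shows the variance of each score;
* compare the diagonal with the first two eigenvalues in the pca output
correlate pc1 pc2, covariance
```

If the scores are scaled as expected, the diagonal entries should be close to the first two eigenvalues (4.34052 and 1.01891); treat that as something to confirm on your own run rather than a guarantee.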
