The following material is based on a question and answer that appeared on Statalist.
Title:   Stata 6: Simulating multivariate normal observations
Author:  William Gould, StataCorp
Pretend one had random variates C = (c1, c2, ..., ck) that are all orthogonal. One rotates this space via
    Y = AC,    where    A = ( a11  a12  ...  a1k )
                            ( a21  a22  ...  a2k )
                            (  .    .          . )
                            ( ak1  ak2  ...  akk )
where A is an arbitrary but full-rank matrix. Thus we could
. gen c1 = invnorm(uniform())
. gen c2 = invnorm(uniform())
  etc.
and then
. gen y1 = a11*c1 + a12*c2 + ... + a1k*ck
. gen y2 = a21*c1 + a22*c2 + ... + a2k*ck
  etc.
if only we knew the values of A. We want to find values of A such that
    cov(y1,y2) = as desired
    cov(y1,y3) = as desired
    etc.
Let P be the desired covariance matrix. The value of A we seek is the square root (Cholesky decomposition) of P because we want
    P = A*Var(C)*A' = A*A'

since Var(C) = I when the c's are independent standard normal variates.
P can just as well be the desired correlation matrix: a correlation matrix is simply a covariance matrix for variables with unit variances, so it has 1s on the diagonal.
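For example, write r for the desired correlation between two variables. In the 2-by-2 case,

    P = ( 1  r )      A = cholesky(P) = ( 1        0        )
        ( r  1 )                        ( r   sqrt(1 - r^2) )

and

    A*A' = ( 1            r      )  =  P
           ( r   r^2 + (1 - r^2) )

With r = .5, sqrt(1 - r^2) = sqrt(.75) = .8660254, which is where the second coefficient in the example below comes from.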
Thus I could generate 2 variables with correlation .5:
. clear

. matrix P = (1, .5 \ .5, 1)

. mat A = cholesky(P)

. mat list A

A[2,2]
           c1         c2
r1          1          0
r2         .5   .8660254

. set obs 4000
obs was 0, now 4000

. gen c1 = invnorm(uniform())

. gen c2 = invnorm(uniform())

. gen y1 = c1

. gen y2 = .5*c1 + .8660254*c2

. corr y1 y2
(obs=4000)

        |       y1       y2
--------+------------------
     y1 |   1.0000
     y2 |   0.5079   1.0000
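If desired, one can verify within Stata that this A reproduces P up to roundoff; the matrix name check below is arbitrary:

. * check holds A*A', which should match P
. matrix check = A*A'
. mat list check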
Now the names c1, c2, ..., ck for the orthogonal random variates were not chosen at random. Because that is the way Stata, by default, labels the columns of matrices, we could have generated y1 and y2 using the matrix score command:
. matrix a1 = A[1,.]

. matrix score y1 = a1

. matrix a2 = A[2,.]

. matrix score y2 = a2
This is handy because (1) we do not have to type the coefficients, and (2) how much we have to type on a line is not a function of the number of correlated variables we are generating. In fact, given a matrix P, we could make a crude program to generate y1, y2, ...:
program define mkcorr
        version 4.0
        local k = rowsof(P)
        matrix A = cholesky(P)
        local i 1
        quietly {
                /* generate k independent standard normal working variables */
                while `i' <= `k' {
                        gen c`i' = invnorm(uniform())
                        local i = `i' + 1
                }
                /* form each output variable as row i of A times (c1, ..., ck) */
                local i 1
                while `i' <= `k' {
                        matrix row = A[`i',.]
                        matrix score y`i' = row
                        local i = `i' + 1
                }
                /* drop the working variables */
                local i 1
                while `i' <= `k' {
                        drop c`i'
                        local i = `i' + 1
                }
        }
end
This program is crude: the name of the covariance (correlation) matrix is hardcoded (it must be P); the names of the output variables cannot be changed (they are y1, y2, ...); and the names of the working variables are fixed (they are c1, c2, ...).
It does, however, work.
. clear

. matrix P = (1, .25, .5 \ .25, 1, .75 \ .5, .75, 1)

. set obs 4000
obs was 0, now 4000

. mkcorr

. summ

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
      y1 |    4000    .0095433   .9973219  -3.053706   4.167702
      y2 |    4000   -.0059072   .9962081  -3.341381   3.419445
      y3 |    4000   -.0086885   1.009847  -3.448567   4.026851

. corr
(obs=4000)

        |       y1       y2       y3
--------+---------------------------
     y1 |   1.0000
     y2 |   0.2664   1.0000
     y3 |   0.5206   0.7441   1.0000
Obviously, someone could make mkcorr fancier.
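One possible direction is sketched below: a variant (here called mkcorr2; the name and calling convention are just one choice) that takes the matrix name and a stub for the output variable names as arguments. It still assumes, as mkcorr does, that the matrix was created with Stata's default column names c1, c2, ..., since matrix score matches coefficients to variables by column name, and it still uses c1, c2, ... as the working variable names.

program define mkcorr2
        version 6.0
        /* usage:  mkcorr2 matname stub
           matname:  desired covariance (correlation) matrix, assumed to
                     carry Stata's default column names c1, c2, ...
           stub:     prefix for the output variables stub1, stub2, ...   */
        args P stub
        local k = rowsof(`P')
        tempname A row
        matrix `A' = cholesky(`P')
        local i 1
        quietly {
                while `i' <= `k' {
                        gen c`i' = invnorm(uniform())
                        local i = `i' + 1
                }
                local i 1
                while `i' <= `k' {
                        matrix `row' = `A'[`i',.]
                        matrix score `stub'`i' = `row'
                        local i = `i' + 1
                }
                local i 1
                while `i' <= `k' {
                        drop c`i'
                        local i = `i' + 1
                }
        }
end

With the 3x3 matrix P defined above, typing mkcorr2 P z would then create z1, z2, and z3.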