Statalist The Stata Listserver

Re: st: RE: What have I forgotten...?

From   Herb Smith <[email protected]>
To   [email protected]
Subject   Re: st: RE: What have I forgotten...?
Date   Fri, 20 Oct 2006 13:49:42 -0400 (EDT)


	Wonderful explanation!  I will study again at greater length, but
am relieved as well as informed...



Professor of Sociology and
Director, Population Studies Center
230 McNeil Building
3718 Locust Walk CR
University of Pennsylvania
Philadelphia, PA  19104-6298

[email protected]

215.898.7768 (office)
215.898.2124 (fax)

On Fri, 20 Oct 2006, German Rodriguez wrote:

> Herb,
> The short answer is that there's nothing wrong with your code, and the
> regression coefficients need just the right standardization to evolve into
> partial correlations.
> Let (y,x,z) ~ MVN(m,V). Partition m=(m1\m2) and V=(V11, V12 \ V21, V22) so y
> has mean m1 and variance V11 and the column vector x\z has mean m2 and
> variance V22.
> Then the conditional distribution of y|x\z is MVN with mean m1 + V12 V22^-1
> (x\z-m2) and variance V11 - V12 V22^-1 V21 [nicer-looking formulas in
> Wikipedia, see link below].
> We can do these calculations in Mata. In your example the unconditional
> means are all zero so we work just with V
> : V = (1, .2, .5 \ .2, 1, .2 \ .5, .2,  1)
> : b = V[1,(2,3)] * invsym( V[(2\3),(2,3)] )
> : b
>                  1             2
>     +-----------------------------+
>   1 |  .1041666667   .4791666667  |
>     +-----------------------------+
> The two regression coefficients are .104 and .479, just like your simulation
> shows. So the question now is why one agrees with the partial correlation
> and the other doesn't.
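
As a cross-check outside Stata, the same coefficients can be reproduced with a few lines of plain Python. This is only a sketch: the function name `regression_coefs` is made up for illustration, and the 2x2 inverse is written out by hand instead of calling a matrix library.

```python
# Correlation matrix in the order (y, x, z), copied from the Mata session.
V = [[1.0, 0.2, 0.5],
     [0.2, 1.0, 0.2],
     [0.5, 0.2, 1.0]]

def regression_coefs(V):
    """Coefficients of x and z in the regression of y on (x, z): V12 * inv(V22)."""
    v_yx, v_yz = V[0][1], V[0][2]
    v_xx, v_xz, v_zz = V[1][1], V[1][2], V[2][2]
    det = v_xx * v_zz - v_xz * v_xz          # determinant of the 2x2 block V22
    b_x = (v_yx * v_zz - v_yz * v_xz) / det  # first element of V12 * inv(V22)
    b_z = (v_yz * v_xx - v_yx * v_xz) / det  # second element of V12 * inv(V22)
    return b_x, b_z

b_x, b_z = regression_coefs(V)
print(round(b_x, 6), round(b_z, 6))
```

The two printed values should agree with the Mata output for b to the displayed precision.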
> The partial correlation yx.z comes from the conditional distribution of y
> and x given z, which has variance (I'll type rather than extract the values
> for clarity)
> : gz = (1, .2 \ .2, 1) - (.5 \ .2) * (.5 , .2)
> : gz
> [symmetric]
>          1     2
>     +-------------+
>   1 |  .75        |
>   2 |   .1   .96  |
>     +-------------+
> : corr(gz)[1,2]
>   .1178511302
> So the partial correlation is indeed 0.118. Note that given z the
> (conditional) variances of y and x are different.
> Now look at yz.x, which requires a different conditional distribution
> : gx = (1, .5 \ .5, 1) - (.2 \ .2) * (.2 , .2)
> : gx
> [symmetric]
>          1     2
>     +-------------+
>   1 |  .96        |
>   2 |  .46   .96  |
>     +-------------+
> : corr(gx)[1,2]
>   .4791666667
> And the partial correlation is indeed .479. Note that given x, the
> (conditional) variances of y and z happen to be the same. And therein lies a
> clue.
> Suppose we standardize the regression coefficients by the ratio of the
> standard deviation of the predictor to that of the outcome, both taken
> given the other predictor.
> For yz.x we do nothing because the ratio is one. For yx.z we compute
> : b[1] * sqrt(gz[2,2]/gz[1,1])
>   .1178511302
> And we have the partial correlation! So all is well.
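
The rescaling argument can be checked for both coefficients with a short Python sketch. The helper names (`cond_cov`, `slope`, `partial_corr`) are hypothetical. The sketch assumes the standard result that, with a single conditioning variable k, the conditional covariance of i and j is V[i][j] - V[i][k] V[k][j] / V[k][k].

```python
import math

# Correlation matrix in the order (y, x, z), as in the Mata session.
V = [[1.0, 0.2, 0.5],
     [0.2, 1.0, 0.2],
     [0.5, 0.2, 1.0]]
Y, X, Z = 0, 1, 2

def cond_cov(V, i, j, k):
    """2x2 covariance matrix of variables i and j given variable k."""
    def c(a, b):
        return V[a][b] - V[a][k] * V[k][b] / V[k][k]
    return [[c(i, i), c(i, j)], [c(j, i), c(j, j)]]

def slope(V, i, j, k):
    """Coefficient of j in the regression of i on (j, k)."""
    g = cond_cov(V, i, j, k)
    return g[0][1] / g[1][1]

def partial_corr(V, i, j, k):
    """Partial correlation of i and j given k."""
    g = cond_cov(V, i, j, k)
    return g[0][1] / math.sqrt(g[0][0] * g[1][1])

gz = cond_cov(V, Y, X, Z)
# Rescale b_yx.z by sd(x | z) / sd(y | z), the ratio used in the thread.
rescaled = slope(V, Y, X, Z) * math.sqrt(gz[1][1] / gz[0][0])
print(rescaled, partial_corr(V, Y, X, Z))            # the two should agree
print(slope(V, Y, Z, X), partial_corr(V, Y, Z, X))   # ratio is one, so they agree as is
```

Writing the slope as cov(y,x|z)/var(x|z) and the partial correlation as cov(y,x|z)/sqrt(var(y|z) var(x|z)) makes the factor sd(x|z)/sd(y|z) between them explicit.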
> As an aside, my favorite way of computing partial correlations like yx.z is
> to regress y on z and compute residuals y.z, then regress x on z and compute
> residuals x.z (read the dot as 'net of'). If you regress y.z on x.z you get
> a constant of zero and a slope equal to the coefficient of x in the
> regression of y on both x and z. And the correlation between y.z and x.z is
> the same as the partial correlation yx.z.
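
The residual-regression recipe can be tried on simulated data. The following is a rough Python sketch, not the code used in the thread: the sample size, the seed, and the helper names (`simulate`, `simple_fit`, `corr`) are all assumptions made for illustration, and y is generated from its population regression on x and z so that the target correlations hold.

```python
import math
import random

def simulate(n, seed=2006):
    """Draw n triples (y, x, z): unit variances, corr(y,x)=.2, corr(x,z)=.2, corr(y,z)=.5."""
    rng = random.Random(seed)
    b_x, b_z = 0.1 / 0.96, 0.46 / 0.96            # population slopes of y on x and z
    s_e = math.sqrt(1 - (0.2 * b_x + 0.5 * b_z))  # residual SD of y given x and z
    y, x, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        xi = 0.2 * zi + math.sqrt(0.96) * rng.gauss(0, 1)
        yi = b_x * xi + b_z * zi + s_e * rng.gauss(0, 1)
        y.append(yi)
        x.append(xi)
        z.append(zi)
    return y, x, z

def mean(v):
    return sum(v) / len(v)

def simple_fit(y, x):
    """OLS of y on x with a constant; returns (intercept, slope, residuals)."""
    my, mx = mean(y), mean(x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b = sxy / sxx
    a0 = my - b * mx
    return a0, b, [yi - a0 - b * xi for xi, yi in zip(x, y)]

def corr(u, v):
    """Pearson correlation of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

y, x, z = simulate(20000)
_, _, y_z = simple_fit(y, z)            # y net of z
_, _, x_z = simple_fit(x, z)            # x net of z
const, b_yx_z, _ = simple_fit(y_z, x_z)
r_yx_z = corr(y_z, x_z)
print(const, b_yx_z, r_yx_z)
```

With a sample this large the slope should land near .104 and the correlation near .118, while the constant is zero up to rounding error. The exact figures depend on the seed.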
> Cheers,
> Germán
> P.S. for more readable MVN formulas see
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Herb Smith
> Sent: Friday, October 20, 2006 7:11 AM
> To: [email protected]
> Subject: st: What have I forgotten...?
> I have simulated three variables, X, Y, and Z, with means of 0, variances
> of 1, and a correlation matrix of
>         Y     Z
> X      .2    .2
> Y            .5
> I calculate (pen and paper, or -dis-) partial correlations of r_sub_yz.x =
> .479167 and r_sub_yx.z = .117851
> If I generate a large enough sample, I can reproduce my correlation matrix
> with -corr- and the anticipated partial correlations with -pcorr- (not to
> mention the anticipated means and standard deviations, as per -summ-)
> But, when I -regress- y x z (with or without -, beta-) I get
> b_sub_yz.x ~ .479 (as I rather imagined I would), but
> b_sub_yx.z ~ .104 (not ~.118)
> I am forgetting something elementary about the (non?)-correspondence
> between partial correlation coefficients and standardized regression
> coefficients (I should think); else there is something weird in my code...
> Thanks in advance,
> --Herb
> Herbert L. Smith
> Professor of Sociology and
> Director, Population Studies Center
> 230 McNeil Building
> 3718 Locust Walk CR
> University of Pennsylvania
> Philadelphia, PA  19104-6298
> [email protected]
> 215.898.7768 (office)
> 215.898.2124 (fax)
> *
> *   For searches and help try:
> *
> *
> *
> *
