Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Strange behaviour of -correlate- command
From: Zurab Sajaia <[email protected]>
To: statalist <[email protected]>
Subject: st: Strange behaviour of -correlate- command
Date: Thu, 9 Dec 2010 19:23:38 -0500
Dear all,
I've run into a problem that I can't explain so far: I seem to be getting wrong covariance estimates. The result from the -correlate- command differs from what I get when I do the calculation manually. (I also exported the data to Excel and used the COVAR() function there, and Excel seems to be on my side.)
So I was wondering whether something is indeed wrong in Stata, or whether I'm doing it incorrectly (perhaps it's time to stop working and go home?)...
So here's the deal. I've uploaded an example dataset to the web (30kb):
. use http://www.adeptanalytics.org/download/temp/corr_bug.dta, clear
. corr y r, c
(obs=2419)

             |        y        r
-------------+------------------
           y |  2.8e+07
           r |  1142.05  .083368
But if I do it manually:
. summarize y, meanonly
. generate double y1 = y - r(mean)
. summarize r, meanonly
. generate double r1 = r - r(mean)
. generate double prod = y1 * r1
. summarize prod

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        prod |      2419    1141.579    2152.761  -53.76514   47015.59
I get the same result (1141.579) using Excel's COVAR() function.
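
One possible explanation worth checking, offered as an assumption rather than a confirmed diagnosis: -correlate- with the covariance option reports the sample covariance, which divides by n - 1, whereas the mean of the products divides by n (as does Excel's COVAR()). The figures above are consistent with that, since 1141.579 * 2419/2418 is approximately 1142.05. Below is a minimal sketch of the check on simulated data; the seed, the distributions, and the scalar names cov_stata, cov_n and cov_n1 are illustrative and not taken from the uploaded dataset.

```stata
* Sketch only: simulated data stand in for the uploaded dataset.
clear
set seed 12345
set obs 2419
generate double y = rnormal(5000, 5000)
generate double r = runiform() + y / 50000

* Covariance as reported by -correlate- (assumed divisor: n - 1)
quietly correlate y r, covariance
scalar cov_stata = r(cov_12)

* Manual calculation: mean of the products of deviations (divisor: n)
quietly summarize y, meanonly
generate double y1 = y - r(mean)
quietly summarize r, meanonly
generate double r1 = r - r(mean)
generate double prod = y1 * r1
quietly summarize prod, meanonly
scalar cov_n  = r(mean)

* Rescale the manual figure from divisor n to divisor n - 1
scalar cov_n1 = r(mean) * r(N) / (r(N) - 1)

display "correlate, covariance : " %14.8f cov_stata
display "manual, divisor n     : " %14.8f cov_n
display "manual, divisor n - 1 : " %14.8f cov_n1
```

If the assumption holds, the first and third lines should agree to within rounding, and the second should be smaller in absolute value by the factor (n - 1)/n.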
Do you have any idea what might be happening here?
Thanks,
Zurab