Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: AW: correlate lag variables
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: st: RE: AW: correlate lag variables
Date: Mon, 10 May 2010 11:38:30 +0100
The reason for differences is that -correlate- will only correlate variables for observations for which _all_ variables specified are non-missing. As Martin is implying, -pwcorr- is more indulgent, which is not necessarily a feature.
The output for -correlate- made it clear that different numbers of observations were being used: 225 with both lags, 265 with the first lag only.
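As a minimal sketch of this point, the following uses simulated data (the variable names, seed, and sample size are illustrative assumptions, not Julia's data). It shows that restricting the two-variable call to the observations that also have a second lag gives the same sample size and the same first-lag correlation as the three-variable call.

```stata
* Hypothetical simulated series; nothing here comes from Julia's data
clear
set seed 12345
set obs 100
gen time = _n
tsset time
gen y = rnormal()

* Three variables: casewise deletion keeps only observations complete on all three
quietly correlate y L.y L2.y
scalar n_three = r(N)
scalar rho_three = r(rho)

* Two variables: more observations are complete, so the sample is larger
quietly correlate y L.y
scalar n_two = r(N)
scalar rho_two = r(rho)

* Two variables, restricted to the sample used by the three-variable call
quietly correlate y L.y if !missing(L2.y)
scalar n_restricted = r(N)
scalar rho_restricted = r(rho)

display n_three, n_two, n_restricted
display rho_three, rho_two, rho_restricted
```

Here r(rho) is the correlation between the first two variables listed, so rho_three and rho_restricted refer to the same pair on the same sample.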
At a guess, Julia's data are panel data, so every extra lag bites hard: for any increase in lag by 1, one more observation is necessarily lost in each panel. The first observation in each panel has no lag-one value, the first two have no lag-two value, and so forth.
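The panel argument can be checked with a small simulated example. This is only a sketch under assumed values: 5 panels of 10 periods each, with hypothetical variable names id, time, and y. Each extra lag should remove one observation per panel from the casewise sample.

```stata
* Hypothetical balanced panel: 5 panels, 10 periods each, no gaps
clear
set seed 2010
set obs 50
gen id = ceil(_n/10)
bysort id: gen time = _n
xtset id time
gen y = rnormal()

* Lag one: one observation per panel drops out (50 - 5)
quietly correlate y L.y
scalar n_lag1 = r(N)

* Lags one and two: two observations per panel drop out (50 - 10)
quietly correlate y L.y L2.y
scalar n_lag2 = r(N)

display "obs with lag 1:        " n_lag1
display "obs with lags 1 and 2: " n_lag2
```

With gaps in the time variable, or with unbalanced panels, the losses would generally be larger than one per panel per lag.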
Nick
[email protected]
Martin Weiss
Try -pwcorr- instead:
*************
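clear*
set obs 100
* simulate an autoregressive series with coefficient 0.6
gen y=1
replace y =.6*y[_n-1]+rnormal() in 2/l
gen byte time=_n
tsset time
* -correlate- uses only observations complete on all listed variables
corr y L.y L2.y
* -pwcorr- uses all available observations for each pair
pwcorr y L.y
pwcorr y L.y L2.y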
*************
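To see how many observations -pwcorr- uses for each pair, its obs option can be added. The sketch below again uses simulated data with assumed names and seed, and cross-checks the pairwise counts with -count-.

```stata
* Hypothetical simulated series, for illustration only
clear
set seed 42
set obs 100
gen time = _n
tsset time
gen y = rnormal()

* Show the number of observations behind each pairwise correlation
pwcorr y L.y L2.y, obs

* Cross-check: observations available for each pair
quietly count if !missing(y, L.y)
scalar n_y_l1 = r(N)
quietly count if !missing(y, L2.y)
scalar n_y_l2 = r(N)
quietly count if !missing(L.y, L2.y)
scalar n_l1_l2 = r(N)

display n_y_l1, n_y_l2, n_l1_l2
```

Because each entry in the -pwcorr- matrix can rest on a different sample, the entries are not strictly comparable with one another.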
Julia
I would like to calculate the correlation between a variable and its
past values. Thus, I use the following command:
. correlate BI L1.BI L2.BI
(obs=225)
             |                L.       L2.
             |       BI       BI       BI
-------------+---------------------------
          BI |
         --. |   1.0000
         L1. |   0.0111   1.0000
         L2. |   0.0647   0.0161   1.0000
However, if I ask only for the correlation with the first lag, my result
differs:
. correlate BI L1.BI
(obs=265)
             |                L.
             |       BI       BI
-------------+------------------
          BI |
         --. |   1.0000
         L1. |   0.0174   1.0000
Why does excluding the second lag affect the correlation between the
variable and its first lag?