Re: st: Question on Dfactor and Gaps in Time Series
From: Richard Gates <[email protected]>
To: [email protected]
Subject: Re: st: Question on Dfactor and Gaps in Time Series
Date: Mon, 11 Oct 2010 16:01:16 -0500
Degas Wright is getting the "gaps in the time series" error from -dfactor-.
Degas wrote that
> I am using the dfactor command and have run into the gap in time series
> error. My data is price (p), volume (v) and earnings yield (ep) and I am
> trying to develop a dynamic factor model using the dfactor command. My
> code is:
>
> tsset
> time variable: date, 2008w25 to 2010w40
> delta: 1 week
>
> . dfactor(D.(p v ep)=,noconstant)(f=,ar(1/2))
> gaps in the time series are not allowed
> r(459);
>
-dfactor- cannot use datasets that contain gaps in the data.
A gap in the data occurs when there is a missing observation in the middle of a
time series.
-tsfill- fills in gaps in the time variable, not in the other variables in the
dataset. I explain the difference below.
Degas can use -dfactor- if he is willing to impose an additional assumption
that removes the gaps, as explained below.
Now I fill in the details.
I have some simulated data. The variables have the same names as those in
Degas' example, but the values are arbitrary.
I begin by using the data and running -tsset- on the time variable -t-.
. use mydata
. tsset
time variable: t, 2008w25 to 2010w40, but with gaps
delta: 1 week
The output from -tsset- informs us that there are gaps in the data. We just
happen to know that the missing observations occur in week 52 of each year.
(In my simulated data, the research team takes vacation the last week of the
year, so there is no data for week 52.) We use this knowledge to list out the
data around the missing observations.
. list t if week(dofw(t)) > 50 | week(dofw(t)) < 2 , separator(2)
     +---------+
     |       t |
     |---------|
 27. | 2008w51 |
 28. |  2009w1 |
     |---------|
 78. | 2009w51 |
 79. |  2010w1 |
     +---------+
(The dofw() function converts a weekly date to the daily date of the first day
of that week. The week() function returns the week of the year for a daily
date.)
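If we did not already know where the missing weeks were, we could locate them from the time variable itself. The following is only a sketch on a small made-up dataset (the ten-week range and the variable name -step- are my own assumptions, not part of Degas' data): it compares each value of t with the previous one and lists the observations that follow a jump of more than one week.

```stata
* Toy data: ten consecutive weeks starting at 2008w48 (hypothetical range)
clear
set obs 10
generate t = tw(2008w48) + _n - 1
format t %tw
* Mimic the missing week-52 observation
drop if week(dofw(t)) == 52
tsset t
* -tsreport- summarizes the gaps in the time variable
tsreport
* step is the distance in weeks from the previous observation;
* a value above 1 marks the first observation after a gap
generate long step = t - t[_n-1]
list t step if step > 1 & !missing(step)
```

On real data the same two lines (-generate- and -list-) can be run right after -tsset-, since -tsset- leaves the data sorted by the time variable.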
We cannot use -dfactor- on this data because there are gaps in the data.
If we use -tsfill- on this data, it inserts observations for the missing
periods, but only the time variable will be nonmissing. We illustrate this
point below.
. tsfill, full
. list t p v ep if week(dofw(t)) > 50 | week(dofw(t)) < 2 , separator(2)
     +----------------------------------------------+
     |       t           p           v           ep |
     |----------------------------------------------|
 27. | 2008w51   11.143581   16.772175   -9.7874321 |
 28. | 2008w52           .           .            . |
     |----------------------------------------------|
 29. |  2009w1   11.894791   19.141332   -10.752946 |
 79. | 2009w51   11.682969   28.361535    -13.24529 |
     |----------------------------------------------|
 80. | 2009w52           .           .            . |
 81. |  2010w1   10.297958   27.970307   -13.730162 |
     +----------------------------------------------+
. tsset
time variable: t, 2008w25 to 2010w40
delta: 1 week
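-tsset- no longer reports gaps in the time variable, but p, v, and ep are
missing in the inserted observations. The series themselves still have gaps,
so we still cannot use -dfactor- on this data.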
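A quick way to confirm that -tsfill- added rows without adding data is to count the observations in which a series is missing. This is a sketch on made-up data, not on Degas' dataset: the ten-week range and the values of p are assumptions chosen only so the example runs on its own.

```stata
* Toy data: ten consecutive weeks with an arbitrary series p
clear
set obs 10
generate t = tw(2008w48) + _n - 1
format t %tw
generate p = _n
* Mimic the missing week-52 observation, then fill the time variable
drop if week(dofw(t)) == 52
tsset t
tsfill
* The row that -tsfill- inserted has a valid t but a missing p
count if missing(p)
list t p if missing(p)
```

With several series, missing(p, v, ep) flags any observation in which at least one of them is missing.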
Now suppose that week 1 immediately follows week 51, so that no period is
skipped. In my example, this assumption holds because the researchers take off
the last week of the year.
I implement this assumption by (1) dropping the two observations for which the
week is 52, (2) creating a new time variable that goes from 1 to the number of
observations in the sample, and (3) using -tsset- to declare the new time
variable. By construction, there are no missing time periods.
. drop if week(dofw(t)) == 52
(2 observations deleted)
. generate t2 = _n
. tsset t2
time variable: t2, 1 to 118
delta: 1 unit
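It may help to see what this assumption does to the differenced series that enter the model. The sketch below uses made-up data (the ten-week range, the values of p, and the names dp_gap and dp_nogap are all assumptions for illustration). It computes D.p once under the weekly time variable with the gap and once under the renumbered time variable.

```stata
* Toy data: ten consecutive weeks with an arbitrary series p
clear
set obs 10
generate t = tw(2008w48) + _n - 1
format t %tw
generate p = _n^2
* Mimic the missing week-52 observation
drop if week(dofw(t)) == 52
* With the gap in t, D.p is missing for the first week after the gap
tsset t
generate dp_gap = D.p
* With consecutive t2, D.p is the change from week 51 to week 1
generate t2 = _n
tsset t2
generate dp_nogap = D.p
list t t2 p dp_gap dp_nogap
```

Keeping the original t in the dataset lets each value of t2 be mapped back to its calendar week after estimation.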
Having removed the gaps in the data by imposing an additional assumption on
our model, we can use -dfactor- to estimate the parameters.
. dfactor(D.(p v ep)=,noconstant)(f=,ar(1/2))
searching for initial values ...........
(setting technique to bhhh)
Iteration 0: log likelihood = -329.57244
Iteration 1: log likelihood = -324.04945
Iteration 2: log likelihood = -322.84128
Iteration 3: log likelihood = -322.38332
Iteration 4: log likelihood = -322.10539
(switching technique to nr)
Iteration 5: log likelihood = -322.05329
Iteration 6: log likelihood = -321.89974
Iteration 7: log likelihood = -321.89735
Iteration 8: log likelihood = -321.89735
Refining estimates:
Iteration 0: log likelihood = -321.89735
Iteration 1: log likelihood = -321.89735
Dynamic-factor model
Sample: 2 - 118                                   Number of obs   =        117
                                                  Wald chi2(5)    =     378.91
Log likelihood = -321.89735                       Prob > chi2     =     0.0000
------------------------------------------------------------------------------
             |                 OIM
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
f            |
           f |
         L1. |   1.302608    .205203     6.35   0.000     .9004174    1.704798
         L2. |  -.5836402   .1829143    -3.19   0.001    -.9421456   -.2251347
-------------+----------------------------------------------------------------
D.p          |
           f |  -.1543837   .0445771    -3.46   0.001    -.2417532   -.0670142
-------------+----------------------------------------------------------------
D.v          |
           f |  -.0692728   .0441276    -1.57   0.116    -.1557613    .0172156
-------------+----------------------------------------------------------------
D.ep         |
           f |   .1194667   .0516863     2.31   0.021     .0181635    .2207699
-------------+----------------------------------------------------------------
   var(De.p) |   .1336773   .0247502     5.40   0.000     .0851679    .1821868
   var(De.v) |   .4937034    .066499     7.42   0.000     .3633677    .6240391
  var(De.ep) |   .4552322   .0655476     6.95   0.000     .3267612    .5837031
------------------------------------------------------------------------------
Note: Tests of variances against zero are conservative and are provided only
      for reference.
(Of course the parameter estimates are for our simulated data.)
I hope this helps.
-Rich
[email protected]