Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Nick Cox <n.j.cox@durham.ac.uk>
To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Binary time series
Date: Wed, 22 Sep 2010 15:45:04 +0100

Bob Yaffee did allude to some of the literature on irregular time series, and there's plenty more. For example, astronomers and others have a separate literature on getting spectra out of irregular series.

But if this were my problem I wouldn't go that way. I've a gut feeling that a simple regression-like model could work quite well for 30 data points, whereas any time series model you care to name would work less well. Time series models seem more data-hungry even when they work.

The researcher's question appears to hinge on looking at seasonality. Month as such I imagine to be quite arbitrary and artificial for tadpoles (unless lunar cycles are important, and if they are, you would be modelling them directly). Also, if you have a parameter per month, you are spreading the information pretty thinly. I would work with Fourier series picking up dependence on time of year and then check for error structure.

There is Stata-based literature at

SJ-6-4  st0116  . . . . Speaking Stata: In praise of trigonometric predictors
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
        Q4/06   SJ 6(4):561--579                                (no commands)
        discusses the use of sine and cosine as predictors in
        modeling periodic time series and other kinds of
        periodic responses

SJ-6-3  gr0025  . . . . . . . . . . . . Speaking Stata: Graphs for all seasons
        (help cycleplot, sliceplot if installed)  . . . . . . . . . N. J. Cox
        Q3/06   SJ 6(3):397--419
        illustrates producing graphs showing time-series seasonality

which may help in one way or another. Both papers are accessible via the Stata Journal.

You have a response that is a proportion. See for a review

SJ-8-2  st0147  . . . . . . . . . . . . . . Stata tip 63: Modeling proportions
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C. F. Baum
        Q2/08   SJ 8(2):299--303                                (no commands)
        tip on how to model a response variable that appears
        as a proportion or fraction

In addition, converting time of year to a circular scale might help.
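
A minimal sketch of the Fourier-series idea above, for a response that is a count positive out of a number tested. Everything here is an assumption for illustration: the data are simulated, the variable names (visitdate, ntested, npos, sin1, cos1) are hypothetical, and a binomial logit via -glm- is only one of several ways to model a proportion.

```stata
* Sketch only: simulated data standing in for the 30 visits.
* All variable names are hypothetical.
clear
set seed 2010
set obs 30
generate visitdate = mdy(1, 1, 2007) + 36 * (_n - 1)
format visitdate %td
generate ntested = 5

* Time of year on a circular scale (radians), then one sine/cosine pair
generate double theta = 2 * _pi * doy(visitdate) / 365.25
generate double sin1 = sin(theta)
generate double cos1 = cos(theta)

* Simulated number positive, with a made-up annual cycle
generate npos = rbinomial(ntested, invlogit(0.5 + 1.5 * sin1))

* Binomial logit: number positive out of number tested at each visit
glm npos sin1 cos1, family(binomial ntested) link(logit) vce(robust)

* Joint test of the annual cycle: 2 parameters, not one per month
testparm sin1 cos1

* Fitted prevalence; after -glm- with a binomial denominator,
* mu is the expected count, so divide by the number tested
predict double mu_hat, mu
generate double p_hat = mu_hat / ntested
```

The point of the design is parsimony: a single sine and cosine pair uses two parameters to describe an annual cycle, against eleven for month indicators. Residuals from such a fit could then be inspected for any remaining error structure.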
There is a bundle of circular statistics programs in -circular- on SSC.

At home we have tadpoles sometimes in a small pond in our garden, but I have no data to share.

I don't know what Baum 2006 is. (But then Bob Yaffee didn't even give years in his "references"....)

Nick
n.j.cox@durham.ac.uk

John Morton

I am seeking advice on analysis of a time series dataset in Stata. The same site was visited irregularly 30 times over 3 years (median interval between visits 35 days, range 18 to 68 days). At each visit, usually 5 tadpoles (but sometimes 6 or 9) were sampled (numbers were limited because this is an endangered species). Different tadpoles were sampled at each visit. Each tadpole was tested and categorised as test positive or test negative. Apparent prevalences were 1.00 at about half of the visits and 0.00 at about 25% of visits.

The researcher's question is whether prevalence varies by month (ie Jan, Feb, Mar etc) or by season.

The features of these data that seem important are that the errors would be expected to be serially correlated over time, the dependent variable is binary, prevalences of 0 and 1 were common, very small numbers of tadpoles were sampled at each visit, and these are not panel data (ie different tadpoles were sampled at each visit).

I have done some exploratory modelling treating prevalence as a continuous dependent variable (using -regress-) after declaring the data to be time-series data (with sequential visit number rather than day number as the time variable, using -tsset-). With a null model, tests for serial correlation (Durbin-Watson test (-estat dwatson-), Durbin's alternative (h) test (-estat durbinalt-), Breusch-Godfrey test (-estat bgodfrey,lag(6)-), Portmanteau (Q) test (-wntestq-) and the autocorrelogram (-ac-); all from Baum 2006) indicate serial correlation. In contrast, after fitting month as a fixed effect, these tests do not support rejecting the null hypothesis that no serial correlation exists.
However treating prevalence (a proportion) as a continuous dependent variable (using -regress-) is inappropriate.
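
For reference, the exploratory checks described in the quoted message could be run roughly as follows. This is a sketch under stated assumptions, not the original analysis: the data are simulated, and the variable names (visit, monthnum, prev, res0, res1) are hypothetical.

```stata
* Sketch only: simulated data; all variable names are hypothetical.
clear
set seed 2010
set obs 30
generate visit = _n
generate visitdate = mdy(1, 1, 2007) + 36 * (_n - 1)
generate monthnum = month(visitdate)
generate ntested = 5
generate double theta = 2 * _pi * doy(visitdate) / 365.25
generate double prev = rbinomial(ntested, invlogit(0.5 + 1.5 * sin(theta))) / ntested

* Sequential visit number as the time variable, as in the message
tsset visit

* Null model, then the serial-correlation checks named in the message
regress prev
estat dwatson
estat durbinalt
estat bgodfrey, lags(6)
predict double res0, residuals
wntestq res0
ac res0

* Month as a fixed effect, then the same checks on the new residuals
regress prev i.monthnum
estat dwatson
estat durbinalt
estat bgodfrey, lags(6)
predict double res1, residuals
wntestq res1
ac res1
```

Because visit number is the time variable, the lags here count visits rather than days, so with irregular spacing a given lag does not correspond to a fixed time interval.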