Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Marc Peters <[email protected]>
To: [email protected]
Subject: Re: st: Splines
Date: Thu, 21 Feb 2013 09:53:25 -0600
Dear Nick,
Thank you for these clarifications.
Best,
Marc
On Thu, Feb 21, 2013 at 3:19 AM, Nick Cox <[email protected]> wrote:
> Thanks for filling out the details. I've not read that paper. But in
> any case I don't know what you mean by "dealing with temporal
> dependence". Dependence in time series can mean anything from
>
> dependence in error structure which is regarded as a nuisance or
> complication in regression-type models
>
> to
>
> dependence treated as the main feature by some kind of time series
> modelling, such as binary time series modelling or Markov chains.
>
> It seems, however, that what you have in mind is something else,
> roughly in between those extremes.
>
> It seems that this is most likely to be carried forward by people
> familiar with the literature now identified. Alternatively, if this is
> a widely used method, there should be guides somewhere on how to do it
> in Stata.
>
> Nick
>
> On Thu, Feb 21, 2013 at 2:16 AM, Marc Peters <[email protected]> wrote:
>> Dear Nick,
>>
>> Thank you for your prompt answer. I am very sorry for being imprecise.
>>
>>
>> The reference I am talking about is Beck, Nathaniel; Jonathan N. Katz
>> and Richard Tucker. 1998. "Taking Time Seriously: Time-Series
>> Cross-Section Analysis with a Binary Dependent Variable." American
>> Journal of Political Science, 42(4) 1260-1288.
>>
>>
>> BTSCS is the term they use for time-series cross-section analysis with
>> a binary dependent variable. In their article they replicate a study
>> of militarized conflict, in which a country dyad either does or does
>> not have a conflict in a given year. As a conflict can persist for a
>> number of consecutive years, the data structure is quite similar to
>> mine. Your point about lowess is well taken, but if I understand you
>> correctly you would not recommend using splines for any analyses with
>> repeated events? Would you recommend another strategy for dealing with
>> temporal dependence? As I have understood it, a lagged dependent
>> variable is insufficient.
>>
>>
>> Once again, thank you for your help
>
> On Wed, Feb 20, 2013 at 7:28 PM, Nick Cox <[email protected]> wrote:
>
>>> You were asked to read the FAQ before posting. That explains that you
>>> are asked not to give minimal name (date) references. Also, BTSCS
>>> looks to me like jargon from your field. It is difficult not to use
>>> jargon on a list like this, but unexplained jargon nevertheless cuts
>>> down the number of people who might both read and reply to your posts.
>>>
>>> In terms of your question, running -lowess- and calling the smooth a
>>> spline does not make it a spline. There are many classes of spline,
>>> but I doubt that there's any definition that generous.
>>>
>>> The most common kinds of splines are linear and cubic. -mkspline-
>>> creates either kind. My best advice is to read the manual entry on
>>> -mkspline- and run through the examples in the help.
>>>
>>> I can't easily follow what you are trying to do otherwise. If you are
>>> saying that your response (dependent variable, in your terms) flips
>>> between states of 0 and states of 1, it sounds quite unsuitable for
>>> splines. But you seem to be trying to model it as a function of
>>> duration, not time; sorry, but you lost me on that.
>>>
>>> My bottom line is that -lowess- is _not_ a spline method.
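A minimal sketch of the -mkspline- route mentioned above, on simulated data. The variable names (duration, X, Y), the knot at 5, the choice of 4 knots and the simulated coefficients are all assumptions made for illustration; none of them comes from the thread. The point it tries to show is only that -mkspline- builds its new variables from the predictor alone, whereas -lowess ..., generate()- stores a smooth of the response.

```stata
* Simulated data; every name and number here is hypothetical
clear
set seed 12345
set obs 1000
generate duration = floor(20*runiform())   // integer years, 0 to 19
generate X = rnormal()
generate Y = runiform() < invlogit(-1 + 0.5*X - 0.1*duration)

* Linear spline with one knot at 5: creates lin1 and lin2
mkspline lin1 5 lin2 = duration

* Restricted cubic spline with 4 knots: creates rcs1, rcs2 and rcs3
mkspline rcs = duration, cubic nknots(4)

* The spline variables are functions of duration only, never of Y
logit Y X rcs1 rcs2 rcs3
```

Either set of variables (lin1 lin2, or rcs1 to rcs3) can go on the right-hand side; using both sets at once would be redundant.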
>
> On Thu, Feb 21, 2013 at 1:08 AM, Marc Peters <[email protected]> wrote:
>
>>>> I have never used splines before and have a rather silly question. I
>>>> am running a BTSCS model and have read up on my Beck, Katz and Tucker
>>>> (1998) and understood that I should use either temporal dummies or
>>>> splines to adjust for temporal dependence.
>>>>
>>>> The data is structured as duration data, with events coded as 1 and
>>>> non-events as 0. The dependent variable is measured at discrete
>>>> intervals (years) and an event can go on for several years (it often
>>>> does).
>>>>
>>>> From the data I have created a variable (duration) counting the number
>>>> of years since the last event. The variable is coded as 0 as long as
>>>> the event is ongoing.
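One way such a counter might be built, as a sketch on a toy panel. The names id, year and Y are assumed, and so is the convention that a unit whose first observed year has no event starts at 1; the original post does not say how that case is coded.

```stata
* Toy panel; id, year and Y are hypothetical names
clear
input id year Y
1 2000 0
1 2001 0
1 2002 1
1 2003 1
1 2004 0
1 2005 0
2 2000 1
2 2001 0
2 2002 0
end

* Start at 0 everywhere, so event years stay at 0
bysort id (year): generate duration = 0

* -replace- works through observations in order, so duration[_n-1]
* is already updated when the next non-event year is processed
by id: replace duration = cond(_n == 1, 1, duration[_n-1] + 1) if Y == 0

* Checks on the toy data
assert duration == 0 if Y == 1
assert duration == 2 if id == 1 & year == 2005

list, sepby(id)
```

Note that with this coding duration is 0 exactly when Y is 1, so any function of duration separates events from non-events.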
>>>>
>>>> From this variable I create lowess splines using
>>>>
>>>> lowess Y duration, gen (spline)
>>>>
>>>> and then:
>>>>
>>>>
>>>> logit Y X spline, cluster(id)
>>>>
>>>>
>>>> I have understood that this is what you are supposed to do, but since
>>>> the spline is defined on the dependent variable, the spline variable
>>>> always takes on a high value when duration=0 (i.e. there is an event).
>>>> Consequently, when I run the logit command I receive the following
>>>> message:
>>>>
>>>>
>>>> spline > .4679623 predicts data perfectly
>>>>
>>>>
>>>> I would be very grateful if anyone could help me with what it is I am
>>>> doing wrong. In the end, I should probably use cubic splines but first
>>>> I want to understand the simple principle.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/