Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


SV: st: Analysis of event history data


From   "Kristian Thor Jakobsen" <KRJ@dm.dk>
To   <statalist@hsphsun2.harvard.edu>
Subject   SV: st: Analysis of event history data
Date   Wed, 21 Mar 2012 13:30:13 +0100

That's so true, because I have one more...

My data are now sorted as shown below, where the variable status is a dummy indicating whether a person exited from a specific programme during a spell of unemployment. I now need to calculate the average number of weeks that the person was employed during, e.g., the two years before exiting this programme (for example, employment[_n-1] to employment[_n-104] if status[_n]==1), but I can't get Stata to do the correct calculation (it takes the mean of each person's total employment).

Id      Status  Employment  Time
1       0       1           1
1       1       0           2
1       0       0           3
2       0       1           1
2       0       1           2
2       0       0           3
3       0       0           1
3       0       1           2
3       1       1           3

Any idea of how to get around this?

Really appreciate the help.
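
One way to approach the calculation described above, offered only as a minimal sketch: build a running total of employment within each person, then difference it to get the sum over the preceding window. It assumes one observation per person per week with no gaps in time, and that "average" means the mean of the employment dummy over the weeks actually available in the window. The names w, cumemp, empsum, nweeks and meanemp are made up for the illustration; the toy data are the nine rows listed above.

```stata
* Toy data copied from the listing above.
clear
input id status employment time
1 0 1 1
1 1 0 2
1 0 0 3
2 0 1 1
2 0 1 2
2 0 0 3
3 0 0 1
3 0 1 2
3 1 1 3
end

* Weeks to look back; 104 is taken as roughly two years (an assumption).
local w = 104

* Running total of employed weeks within each person, in time order.
bysort id (time): gen long cumemp = sum(employment)

* Employed weeks in observations _n-w to _n-1, truncated at the first week.
by id: gen long empsum = cond(_n > 1, cumemp[_n-1], 0) - cond(_n > `w' + 1, cumemp[_n-`w'-1], 0)

* Number of weeks actually available in that window.
by id: gen long nweeks = min(_n - 1, `w')

* Mean employment over the window, only at exit weeks with some history.
gen double meanemp = empsum / nweeks if status == 1 & nweeks > 0

list id time status employment meanemp, sepby(id)
```

Because the subscripts are evaluated under -by id:-, the window never reaches back into the previous person's observations. If weeks can be missing from the data, the observation-count window would no longer equal a calendar window, and the data would need to be filled in first (for example with -tsset- and -tsfill-).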

-----Original message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Nick Cox
Sent: 20 March 2012 15:14
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Analysis of event history data

Never say "one final question"!

-help egen- shows that there are -egen- functions -anycount()-, -anymatch()-, and -anyvalue()-. So

egen ones = anycount(y_*), values(1)
keep if ones

Even if those functions did not exist, you could do this

gen ones = 0

quietly foreach v of var y_* {
      replace ones = ones + (`v' == 1)
}

keep if ones
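
As a further illustration (a sketch, not part of the original reply), the -anymatch()- function mentioned above yields the 0/1 indicator directly, so no count is needed. The toy data are the three rows from the first message in the thread; the variable name any1 is made up for the example.

```stata
* Toy data: three weekly indicators for three people.
clear
input id y_1001 y_1002 y_1003
1 0 1 0
2 0 0 0
3 1 1 0
end

* anymatch() is 1 if any variable in the list equals a listed value, else 0.
egen byte any1 = anymatch(y_*), values(1)

* Keep only people with at least one y_* equal to 1, then drop the helper.
keep if any1
drop any1

list
```

Here person 2, who has no 1 in any y_* variable, is dropped before any -reshape-.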

Nick

On Tue, Mar 20, 2012 at 1:28 PM, Kristian Thor Jakobsen <KRJ@dm.dk> wrote:

> Thanks again, Nick. I figured it out with your help. But I have one final question. Given that my dataset consists of several million observations, I would like to trim the dataset down before I do the -reshape- command in order to avoid wasting time on observations that I would subsequently throw out. Say that I want to keep those observations where y_* is equal to 1 in one or more cases:
>
>  Id      y_1001  y_1002  y_1003 ...     y_1101  area_10  area_11
>  1       1       1       0              1       10      5
>
> I guess I could do the following:
>
> keep if y_1001==1| y_1002==1 etc.
>
> But given that I have around 1000 variables to check for this condition, that would be quite tedious. Is there a smart way to get around this?

Nick Cox

> Do spend some time studying the resources for -reshape- including FAQs.
>
> First off, your -y_- cannot be an identifier! It doesn't identify observations.
>
> Second off, you can include -area- in the -reshape- but I guess you 
> will need some extra surgery before and after. I would try a -rename- 
> of the -area*- such as
>
> foreach v of var area* {
> rename `v' `v'01
> }
>
> and then there will be some fill-in afterwards.
>
> Nick
>
> On Mon, Mar 19, 2012 at 12:30 PM, Kristian Thor Jakobsen <KRJ@dm.dk> wrote:
>> Thanks, Nick. -reshape- is a big help. But what if I have time-varying variables that I would like to carry over as well, but not at the same intervals? For example:
>>
>> Id      y_1001  y_1002  y_1003 ...     y_1101  area_10  area_11
>> 1       1       1       0              0       10       5
>>
>> If I do -reshape- using y_ as the identifier I would get something like:
>>
>> Id      j       y_      area_10 area_11
>> 1       1001    1       10      5
>> 1       1002    1       10      5
>> 1       1003    0       10      5
>> .
>> .
>> .
>> 1       1101    0       10      5
>>
>> But I would like to have something like:
>>
>> Id      j       y_      area
>> 1       1001    1       10
>> 1       1002    1       10
>> 1       1003    0       10
>> .
>> .
>> .
>> 1       1101    0       5
>>
>> Is that possible with -reshape-? Or would I have to convert the yearly time-varying variables into weekly first?
>>
>> Thanks again,
>> Kristian
>>
>> -----Original message-----
>> From: owner-statalist@hsphsun2.harvard.edu
>> [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Nick Cox
>> Sent: 19 March 2012 12:43
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: Analysis of event history data
>>
>> For most Stata purposes your data would indeed be better reshaped to a long data structure or shape or form (some people do say "format", but in a Stata context format implies -format-, etc.).
>>
>> reshape long y_ , i(id) j(time)
>> rename y_ status
>>
>> should do it. See also -tsspell- (SSC) and
>>
>> SJ-7-2  dm0029  . . . . . . . . . .  Speaking Stata: Identifying spells
>>         . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>>         Q2/07   SJ 7(2):249--265                        (no commands)
>>         shows how to handle spells with complete control over
>>         spell specification
>>
>> as well as the literature on survival analysis with which you are evidently familiar.
>>
>> Nick
>>
>> On Mon, Mar 19, 2012 at 11:32 AM, Kristian Thor Jakobsen <KRJ@dm.dk> wrote:
>>
>>> I am trying to do an analysis of transition in and out of public 
>>> income transfers. My data is organized roughly the following way:
>>>
>>> Id      y_1001  y_1002  y_1003
>>> 1       0       1       0
>>> 2       0       0       0
>>> 3       1       1       0
>>>
>>> This means that I have the weekly status of each individual from
>>> 1991 to 2011. But in order to do any sort of analysis (for example,
>>> survival analysis), I would guess that I have to convert the data
>>> into the following layout instead:
>>>
>>> Id      Status  Time
>>> 1       0       1
>>> 1       1       2
>>> 1       0       3
>>> 2       0       1
>>> 2       0       2
>>> 2       0       3
>>> 3       1       1
>>> 3       1       2
>>> 3       0       3
>>>
>>> Is that correct, and if so, does there exist a smart way to convert 
>>> the data from one format into the other? Or can I perhaps use the 
>>> data as given?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


