I take it that -ID- is an identifier for episodes.
I don't know what -X1- is doing here.
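Each episode is to be represented by one observation for each calendar
year it touches. Taking months 1-12 as year 1, as in your example, that
number is I think
floor((end - 1)/12) - floor((start - 1)/12) + 1
Both terms use month - 1 so that months 12, 24, ... fall in the year
they close. So
gen nexpand = floor((end - 1)/12) - floor((start - 1)/12) + 1
expand nexpand
bysort ID : replace end = 12 * (_n + floor((start - 1)/12)) if nexpand > 1 & _n < _N
by ID : replace start = end[_n-1] + 1 if nexpand > 1 & _n > 1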
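As a check, here is a minimal sketch that runs these steps on your two
example episodes and adds the -year- variable you asked for. It assumes
that months 1-12 make up year 1 (which is what your desired output
implies) and that -ID- identifies episodes uniquely. Under that
assumption it uses floor((start - 1)/12) for the starting year
throughout, and -year- is my guess at the definition you want.

```stata
clear
input ID start end X1
1  1 11 2
2 13 37 3
end

* number of calendar years each episode touches (months 1-12 = year 1, an assumption)
gen nexpand = floor((end - 1)/12) - floor((start - 1)/12) + 1

* one copy of the episode per calendar year; other variables are duplicated
expand nexpand

* every copy but the last ends in December of its year
bysort ID : replace end = 12 * (_n + floor((start - 1)/12)) if nexpand > 1 & _n < _N

* every copy but the first starts the month after the previous copy ends
by ID : replace start = end[_n-1] + 1 if nexpand > 1 & _n > 1

* calendar year counted from 1, matching the example
gen year = floor((start - 1)/12) + 1

list ID start end year X1, sepby(ID)
```

On the example this should give 1-11 in year 1 for ID 1, and 13-24,
25-36 and 37-37 in years 2, 3 and 4 for ID 2. If the same -ID- can
have several episodes, you would need an episode identifier in the
-bysort- instead.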
Nick
Ola Sjöberg
I have a dataset with entry- and exit-month into different labour market
statuses (n=2500; episodes=10500), where entry- and exit-month are
measured as calendar months (0=January 1900). Now, I would like to split
this dataset according to calendar year (I have 30 years), i.e. episodes
that stretch over two or more calendar years should be split and
all information (i.e. other independent variables) about the episode
should be duplicated. Splitting according to the duration of episodes
seems easy, but I have trouble splitting according to calendar year.
So, this is what I have:
ID  start  end  X1
 1      1   11   2
 2     13   37   3
And this is what I would like to have:
ID  start  end  year  X1
 1      1   11     1   2
 2     13   24     2   3
 2     25   36     3   3
 2     37   37     4   3