I take it that -ID- is an identifier for episodes. 
I don't know what -X1- is doing here. 
Each episode is to be represented by a number of observations,
which is, I think,
floor((end - 1)/12) - floor((start - 1)/12) + 1
given that in your example months 1-12 make up year 1. So
gen nexpand = floor((end - 1)/12) - floor((start - 1)/12) + 1
expand nexpand
bysort ID : replace end = 12 * (_n + floor((start - 1)/12)) if nexpand > 1 & _n < _N
by ID: replace start = end[_n-1] + 1 if nexpand > 1 & _n > 1
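For checking, here is a minimal self-contained sketch on the two example episodes from the question. It assumes, as in the example data, that months 1-12 make up year 1, so that the year of month m is floor((m - 1)/12) + 1. It is a variant of the steps above rather than the same code: after -expand- it clips start and end to each year with max() and min(). The names firstyear and lastyear are invented for illustration, and year is the variable the question asks for.

```stata
* Sketch only: the two example episodes from the question
clear
input ID start end X1
1  1 11 2
2 13 37 3
end

* year index of the first and last month (assumes months 1-12 = year 1)
gen firstyear = floor((start - 1)/12) + 1
gen lastyear  = floor((end - 1)/12) + 1

* one observation per calendar year touched by the episode
gen nexpand = lastyear - firstyear + 1
expand nexpand

* copies of an episode are identical, so their order within ID is immaterial
bysort ID : gen year = firstyear + _n - 1

* clip each copy to the months of its own year
replace start = max(start, 12 * (year - 1) + 1)
replace end   = min(end, 12 * year)

sort ID start
list ID start end year X1, sepby(ID)
```

On these two episodes this should list the four rows asked for (1-11, 13-24, 25-36, 37-37). Like the code above, it takes ID to identify episodes, not persons.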
Nick 
Ola Sjöberg
I have a dataset with entry and exit months for different labour market
statuses (n=2500; episodes=10500), where entry and exit months are
measured as calendar months (0=January 1900). Now, I would like to split
this dataset according to calendar year (I have 30 years), i.e. episodes
that stretch over two or more calendar years should be split and
all information (i.e. other independent variables) about the episode
should be duplicated. Splitting according to the duration of episodes
seems easy, but I have trouble splitting according to calendar year.
So, this is what I have
ID   start   end   X1
1        1    11    2
2       13    37    3
And this is what I would like to have
ID   start   end   year   X1
1        1    11      1    2
2       13    24      2    3
2       25    36      3    3
2       37    37      4    3