Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: expanding data set by variable
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: expanding data set by variable
Date: Wed, 9 May 2012 00:04:54 +0100
I'd turn this round and ask:
1. What are the date variables (string, numeric, or numeric with a date format)?
2. Why do you think the second data structure is going to be a good one?
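Question 1 matters because the conversion to a Stata daily date depends on how the variable is stored. The following is only a minimal sketch: the variable names start_num, start_str, d_num and d_str are made up for illustration, and it assumes the values follow the yyyymmdd pattern shown in the sample data. A variable that is already numeric with a %td format would need no conversion at all.

```stata
clear
* Hypothetical example: the same two dates held as a number and as a string
input double start_num str8 start_str
20071001 "20071001"
20080214 "20080214"
end

* describe reports the storage type and display format of each variable
describe start_num start_str

* Numeric yyyymmdd: convert to a string first, then parse as a daily date
gen d_num = date(string(start_num, "%12.0f"), "YMD")

* String yyyymmdd: parse directly
gen d_str = date(start_str, "YMD")

format d_num d_str %td

* Both routes should yield the same daily date
assert d_num == d_str
list
```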
If this were my data, I would get a different structure this way:
clear
input ID double (start end) user str1 type
1 20071001 20071010 1 A
2 20071003 20071231 1 A
3 20071009 20080214 1 A
4 20080117 20080117 1 B
5 20070306 20070308 2 A
6 20070314 20070319 2 A
7 20070314 20070316 2 A
end
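* Convert the numeric yyyymmdd values to Stata daily dates
gen mydate = date(string(start, "%12.0f"), "YMD")
gen mydate2 = date(string(end, "%12.0f"), "YMD")
format mydate %td
* Number of days each spell covers, counting both endpoints
gen length = mydate2 - mydate + 1
* One observation per day of each spell
expand length
bysort ID : replace mydate = mydate + _n - 1
drop mydate2 length
* Inspect the result in the Data Editor
edit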
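
On the narrower question of how to copy an observation a number of times given by the value of a variable: that is what -expand- does in the code above. A minimal sketch in isolation, with made-up variable names (id, n, copy) and made-up values:

```stata
clear
* Hypothetical data: n is the number of copies wanted for each observation
input id n
1 3
2 1
3 2
end

* expand replaces each observation with n copies of itself
* (an observation with n == 1 is left as it is)
expand n

* Number the copies within each id, e.g. to offset a date afterwards
bysort id : gen copy = _n

list, sepby(id)
```

After -expand- the copies are identical, so a within-group counter such as copy (or _n under -bysort-, as above) is what lets you change start and end values copy by copy.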
Nick
On Tue, May 8, 2012 at 9:49 PM, KOTa <[email protected]> wrote:
> I need help creating several observations from each observation,
> according to the values of its start and end variables, i.e. splitting
> observations so that there is no partial time overlap between them
> (full overlap is OK), while keeping all other variables the same.
>
> sample of data:
>
> ID start end user type
> 1 20071001 20071010 1 A
> 2 20071003 20071231 1 A
> 3 20071009 20080214 1 A
> 4 20080117 20080117 1 B
> 5 20070306 20070308 2 A
> 6 20070314 20070319 2 A
> 7 20070314 20070316 2 A
>
> The result I need (from the first 4 observations) is:
>
> ID start end user type
> 1 20071001 20071003 1 A
> 1 20071003 20071009 1 A
> 1 20071009 20071010 1 A
> 2 20071003 20071009 1 A
> 2 20071009 20071010 1 A
> 2 20071010 20071231 1 A
> 3 20071009 20071010 1 A
> 3 20071010 20071231 1 A
> 3 20071231 20080214 1 A
> 4 20080117 20080117 1 B
>
> I was thinking about counting how many overlaps there are for each
> observation (e.g. 3 for ID 1), saving that count in an additional
> variable, and then using expand and replacing the start/end values.
>
> But I didn't find a way to copy an observation a number of times
> given by the value of a variable.