Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: expanding data set by variable
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: expanding data set by variable
Date: Wed, 9 May 2012 08:51:14 +0100
1. The help for -expand- looks clear enough to me, but that's not a
test of much. You could write to StataCorp explaining what you find
unclear.
2. I think a structure in which each observation is a person-day (or a
person-day-activity) is going to make your calculations easiest, which
is why I suggest it.
3. What
bysort ID : replace mydate = mydate + _n - 1
does can be worked out by looking at what it does: within each block of
-ID- the observation number _n runs 1, 2, 3, ..., so _n - 1 runs
0, 1, 2, ...; the result is an increasing sequence of daily dates. On
-by:- and _n within -by:- see a tutorial at
SJ-2-1 pr0004 . . . . . . . . . . Speaking Stata: How to move step by: step
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
Q1/02 SJ 2(1):86--102 (no commands)
explains the use of the by varlist : construct to tackle
a variety of problems with group structure, ranging from
simple calculations for each of several groups to more
advanced manipulations that use the built-in _n and _N
Or see something similar within section 7 of
FAQ . . . . . . . . . . . . . . . . . . . . . . . Replacing missing values
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
2/03 How can I replace missing values with previous or
following nonmissing values?
http://www.stata.com/support/faqs/data/missing.html
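To make point 3 concrete, here is a minimal sketch on made-up data. The two requests and their dates are invented for illustration; the variable names -ID-, -start-, -end-, -mydate-, -mydate2- and -length- follow the code quoted below, and the assumption is that -start- and -end- are numbers like 20120501.

```stata
* toy data: two requests (dates are hypothetical, stored as YYYYMMDD numbers)
clear
set obs 2
gen ID = _n
gen long start = cond(ID == 1, 20120501, 20120502)
gen long end = 20120503

* convert to Stata daily dates and count the days each request covers
gen mydate = date(string(start, "%12.0f"), "YMD")
gen mydate2 = date(string(end, "%12.0f"), "YMD")
format mydate mydate2 %td
gen length = mydate2 - mydate + 1

* one observation per day covered: ID 1 gets 3 observations, ID 2 gets 2
expand length

* within each ID, _n is 1, 2, 3, ... so _n - 1 adds 0, 1, 2, ... days
bysort ID : replace mydate = mydate + _n - 1

list ID mydate mydate2, sepby(ID)
```

After -expand- every copy within an -ID- still carries the same start date, so adding _n - 1 is what turns the copies into consecutive days.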
On Wed, May 9, 2012 at 1:18 AM, KOTa <[email protected]> wrote:
> thanks for quick response, Nick
>
>> 1. What the date variables are (string, numeric, numeric with a date format)?
>
> Sorry, I just didn't think to mention this, because they can easily be
> converted among the formats you listed (and I actually have the dates
> in all 3 of them).
>
>> 2. Why you think the second data structure is going to be a good one?
>
> What I am trying to do is count the time spent on each "type" of
> activity, which I have already figured out how to do (with help from
> Statalist).
> The problem is that if activities overlap in days for the same person and I
> don't account for this, I over-count them both.
> So what I tried to do is split the time equally among requests (ID)
> that happened at the same time (for the same user). I managed to do this
> for requests (ID) that start at the same time, but could not find a
> way to do it when they start at different times (and there can be overlap
> between more than 2 requests). The approach I thought of taking is to
> recode the data so that each observation is split into
> non-overlapping periods.
>
>> If this were my data, I would get a different structure this way:
>> ...
>> gen mydate = date(string(start, "%12.0f"), "YMD")
>> gen mydate2 = date(string(end, "%12.0f"), "YMD")
>> format mydate %td
>> gen length = mydate2 - mydate + 1
>
> that is how i started
>
>> expand length
>
> That is what I wanted to do, but I could not find in the help or examples
> whether "expand" can be used this way.
>
>> bysort ID : replace mydate = mydate + _n - 1
>
> 1. I forgot to mention that the count has to be by activity type. So,
> correct me if I am wrong, the bysort should then be "bysort ID type ...".
> 2. I didn't understand the logic of replace mydate = mydate + _n - 1
>
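On the over-counting problem described in the quoted message: once each observation is a person-day, one possible way to split a day equally among overlapping requests is sketched below. This is only an illustration under assumptions, not a tested solution for the real data. The variables -user- and -type-, the new variables -share- and -days-, and all the values are hypothetical, and each request ID is assumed to have a single type.

```stata
* toy data: one user, two overlapping requests of different (hypothetical) types
clear
set obs 2
gen user = 1
gen ID = _n
gen str1 type = cond(ID == 1, "A", "B")
gen mydate = cond(ID == 1, mdy(5, 1, 2012), mdy(5, 2, 2012))
gen mydate2 = mdy(5, 3, 2012)
format mydate mydate2 %td
gen length = mydate2 - mydate + 1

* person-day structure, as suggested in the thread
expand length
bysort ID : replace mydate = mydate + _n - 1

* _N within user and day is the number of requests active that day,
* so each request is credited with 1/_N of the day
bysort user mydate : gen share = 1 / _N

* total days credited to each type for each user
collapse (sum) days = share, by(user type)
list
```

In this toy case request 1 runs 1-3 May and request 2 runs 2-3 May, so type A is credited with 2 days and type B with 1, and the total matches the 3 calendar days covered.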