Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: st: Merging/summing daily values across time periods in (unbalanced) panel dataset
From
Nick Cox <[email protected]>
To
"'[email protected]'" <[email protected]>
Subject
RE: st: Merging/summing daily values across time periods in (unbalanced) panel dataset
Date
Wed, 9 Mar 2011 13:11:24 +0000
Assuming that your -date_from- and -date_to- are really daily date variables, you can start with
gen duration = date_to - date_from + 1
expand duration
on your master and then use the logic in
FAQ . . . . . . . . . . . . . . . . . . . . . . . Replacing missing values
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
2/03 How can I replace missing values with previous or
following nonmissing values?
http://www.stata.com/support/faqs/data/missing.html
to replace with sequences of dates.
Nick
[email protected]
Amanda Dwelley
Thanks, good point! The same daily x1 x2 apply to everyone in the
master dataset - to start, a "merge" ID could be 1 for everyone in the
master data set and all of the using data set.
Master data set looks like:
id date_from date_to merge_id
2 07apr2009 09jun2009 1
2 10jun2009 03jul2009 1
3 16may2009 12jun2009 1
3 13jun2009 10jul2009 1
(~50,000 observations; time periods at irregular intervals)
Daily data could look like:
merge_id date x1 x2
1 02jun2009 5.7 0
1 03jun2009 2.3 0
1 04jun2009 0 2
1 05jun2009 0 3.5
(3 yrs of daily values that are same on each day for each person in
master data set)
For example, for first observation in master data, looking for sum of
x1 from April 7-June 9, and sum of x2 within same dates.
On Tue, Mar 8, 2011 at 8:47 PM, Nick Cox <[email protected]> wrote:
> I see no information on identifiers in your daily dataset.
>
> Nick
>
> On Tue, Mar 8, 2011 at 10:15 PM, Amanda Dwelley <[email protected]> wrote:
>
>> I'm looking for suggestions on how to join and sum daily x values from one
>> dataset into time periods defined by a start date and end date in another
>> (panel) data set.
>>
>> Master data set looks like:
>> id date_from date_to
>> 1 07apr2009 09jun2009
>> 1 10jun2009 03jul2009
>> 2 13jun2009 10jul2009
>> 2 11jul2009 05aug2009
>> (~50,000 observations; time periods at irregular intervals)
>>
>> Daily data looks like:
>> date x1 x2
>> 02jun2009 5.7 0
>> 03jun2009 2.3 0
>> 04jun2009 0 2
>> 05jun2009 0 3.5
>> (3 yrs of daily values)
>>
>> As you can see the using data could be in wide or long format.
>> If I merge the wide format daily data into the master dataset, is there any
>> way to reference/look for the dates in variable names that are between the
>> start/end date, and sum associated variables? If using the long format of
>> the daily data, is there a way to sum values between start/end dates during
>> a join or merge?
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/