Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: manual weighted average variable in panel data set
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: manual weighted average variable in panel data set
Date: Fri, 16 Nov 2012 09:22:21 +0000
Those curious about a shorter version could think about
. bysort id : gen meanc = sum(conc * resid_time) / sum(resid_time)
. by id : replace meanc = meanc[_N]
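For anyone who wants to try this, here is a minimal, self-contained sketch that rebuilds the seven observations listed in this thread and checks the two-line version against group totals from -egen-. The names num, den and meanc2 are introduced here purely for illustration; they are not part of the original question.

```stata
* Sketch only: re-enter the seven observations shown in this thread.
clear
input long id double conc int resid_time
20059 15.96 380
20059 21.17 100
20059 18.07 480
20060 30    181
20060 16.68 292
20061 23.78 269
20061 18.07 103
end

* Within each id, sum() gives running totals, so this ratio is the
* running weighted mean up to and including each observation.
bysort id : gen double meanc = sum(conc * resid_time) / sum(resid_time)

* The last value in each group is the overall weighted mean; copy it
* to every observation of that id.
by id : replace meanc = meanc[_N]

* Cross-check against group totals (num, den, meanc2 are illustrative names).
by id : egen double num = total(conc * resid_time)
by id : egen double den = total(resid_time)
gen double meanc2 = num / den
assert reldif(meanc, meanc2) < 1e-12

list id conc resid_time meanc, sepby(id)
```

One thing to keep in mind: -sum()- treats missing values as zero, so an observation with missing conc but nonmissing resid_time would still add to the denominator. The example data have no missing values, so that does not arise here.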
Nick
On Fri, Nov 16, 2012 at 2:36 AM, hind lazrak <[email protected]> wrote:
> The best things are the simplest, aren't they?
> Thanks for the help!
>
> Best,
> Hind
>
> On Thu, Nov 15, 2012 at 4:56 PM, Nick Cox <[email protected]> wrote:
>> No loops are needed.
>>
>> . l id conc resid_time
>>
>>      +--------------------------+
>>      |    id    conc   resid_~e |
>>      |--------------------------|
>>   1. | 20059   15.96        380 |
>>   2. | 20059   21.17        100 |
>>   3. | 20059   18.07        480 |
>>   4. | 20060      30        181 |
>>   5. | 20060   16.68        292 |
>>      |--------------------------|
>>   6. | 20061   23.78        269 |
>>   7. | 20061   18.07        103 |
>>      +--------------------------+
>>
>> . bysort id : gen sumw = sum(resid_time)
>>
>> . by id : gen sumwc = sum(conc * resid_time)
>>
>> . by id : gen meanc = sumwc[_N] / sumw[_N]
>>
>> . l
>>
>>      +-------------------------------------------------------+
>>      |    id    conc   resid_~e   sumw      sumwc      meanc |
>>      |-------------------------------------------------------|
>>   1. | 20059   15.96        380    380     6064.8   17.55771 |
>>   2. | 20059   21.17        100    480     8181.8   17.55771 |
>>   3. | 20059   18.07        480    960    16855.4   17.55771 |
>>   4. | 20060      30        181    181       5430   21.77708 |
>>   5. | 20060   16.68        292    473   10300.56   21.77708 |
>>      |-------------------------------------------------------|
>>   6. | 20061   23.78        269    269    6396.82   22.19901 |
>>   7. | 20061   18.07        103    372    8258.03   22.19901 |
>>      +-------------------------------------------------------+
>>
>> This code is longer than it needs to be, but it shows what I believe you
>> want. You can also generate the running weighted mean, sumwc / sumw, if
>> you wish.
>>
>> I didn't try to follow your code, but you're missing how -sum()- does
>> cumulative sums.
>>
>> See also _gwtmean from SSC (David Kantor). (I don't agree with his
>> advice to make it a substitute for Stata's code for -gmean()-.)
>>
>> Nick
>>
>> On Fri, Nov 16, 2012 at 12:33 AM, hind lazrak <[email protected]> wrote:
>>
>>> I have a panel data set with the following variables: id resid_time conc
>>> The number of repeated observations varies from 1 to 5 for each
>>> individual (id). Each observation has a time period (resid_time) during
>>> which a pollutant concentration occurs (conc).
>>> Here is an excerpt of the data:
>>>
>>> list id conc resid_time counter in 1/7, sepby(id)
>>>
>>>      +------------------------------------+
>>>      |    id    conc   resid_~e   counter |
>>>      |------------------------------------|
>>>   1. | 20059   15.96        380         1 |
>>>   2. | 20059   21.17        100         2 |
>>>   3. | 20059   18.07        480         3 |
>>>      |------------------------------------|
>>>   4. | 20060      30        181         1 |
>>>   5. | 20060   16.68        292         2 |
>>>      |------------------------------------|
>>>   6. | 20061   23.78        269         1 |
>>>   7. | 20061   18.07        103         2 |
>>>      +------------------------------------+
>>>
>>> I need to create a variable from resid_time and conc that holds the
>>> time-weighted average concentration for each id.
>>>
>>> So far I have created a variable product (equal to resid_time * conc),
>>> and I have been trying to write a loop that does the following: for each
>>> person and each observation, compute the time-weighted average
>>> concentration = (sum of product) / (total resid_time), both accumulated
>>> up to the time at which the observation occurs.
>>>
>>> Here's my code:
>>> ***************************************
>>> bysort id: gen counter = _n
>>> bysort id: gen product= resid_time*conc
>>>
>>> bysort id: gen time = resid_time if _n==1 | counter[_N]==1
>>> bysort id: gen twa= conc if _n==1 | counter[_N]==1
>>> qui su counter
>>> forval i=1/`r(max)' {
>>> bysort id: replace product= product+ product[_n-`i'] if _n!=1
>>> bysort id: replace time= resid_time+ resid_time[_n-`i'] if _n=1
>>> }
>>> bysort id: gen twa = product/time