Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: FW: Aggregating values back to a certain row
From
Nick Cox <[email protected]>
To
"[email protected]" <[email protected]>
Subject
Re: st: FW: Aggregating values back to a certain row
Date
Thu, 20 Mar 2014 13:47:38 +0000
More technique:
clear
input invest_id firm_id value year
1 1 1.5 2001
2 1 2.7 2001
3 1 3.4 2002
4 2 0.9 1998
5 2 5.5 2003
end
collapse (count) count=value (sum) sum=value, by(firm_id year)
tsset firm_id year
tsfill
replace count = 0 if missing(sum)
replace sum = 0 if missing(sum)
bysort firm_id (year) : gen prev2 = max(0, sum[_n-1]) + max(0, sum[_n-2])
Nick
[email protected]
On 20 March 2014 13:34, Nick Cox <[email protected]> wrote:
> This is a nice idea, although the example data showed that
>
> 1. two or more items can be observed for the same firm in the same
> year (still, what that implies is just a prior -collapse-, which might
> be a good move any way)
>
> 2. there can be gaps (again, there are remedies)
>
> Nick
> [email protected]
>
>
> On 20 March 2014 13:14, Jeph Herrin <[email protected]> wrote:
>>
>> If the window is really a moderate 5 or so years, you might get by with
>> 'reshaping' the data first:
>>
>> forv y=1/5 {
>> bys firm_id (year): gen value_`y'=value[_n-`y'] ///
>> if inrange(year[_n-`y'],year,year-5)
>> }
>> egen value_agg = rowtotal(value_1-value_5)
>>
>>
>> You will have (at most 5) new variables populated on each row, containing
>> the values from the last (at most) 5 years. Then just aggregate across the
>> row.
>>
>> J
>>
>>
>>
>> On 3/19/2014 4:03 PM, Nick Cox wrote:
>>>
>>> This kind of question is often asked here.
>>>
>>> Your code can't easily be rescued. A loop over observations doesn't
>>> really mix with a -while- loop. Also note, for example, that -if-
>>> qualifiers are illegal and indeed make no sense in defining -local-s.
>>>
>>> You don't spell it out, and it's not even tacit in your code, but I am
>>> guessing that all calculations must be for the same firm.
>>>
>>> Consider this:
>>>
>>> 1. Initialise all aggregate values to 0.
>>>
>>> gen firm_value_agg = 0
>>>
>>> 2. Loop over all the observations.
>>>
>>> quietly forvalues n=1/`=_N' {
>>>
>>> 3. Summarize value if the firm is the same as that in the present
>>> observation and investment was in the previous 5 years.
>>>
>>> Here previous five years is interpreted as meaning for 2013 (say)
>>> 2013, 2012, 2011, 2010, 2009.
>>>
>>> su value if firm == firm[`n'] & inrange(year, year[`n'] - 4,
>>> year), meanonly
>>>
>>> If you want to exclude the present year, it would be
>>>
>>> su value if firm == firm[`n'] & inrange(year, year[`n'] - 5,
>>> year[`n'] - 1), meanonly
>>>
>>> The -meanonly- option is helpful to speed up what is likely to be slow
>>> code. Despite its name, -meanonly- does include calculation of the
>>> sum.
>>>
>>> 4. Replace the aggregated value in the current observation.
>>>
>>> replace firm_value_agg = r(sum) in `n'
>>> }
>>>
>>> Counts would mean using r(N) not r(sum).
>>>
>>> There's a lengthier sermon in
>>> http://www.stata-journal.com/sjpdf.html?articlenum=pr0033 which you
>>> should cite if you find it useful. (Many people will cite a paper for
>>> just mentioning one idea they allude to, but not a paper that saved
>>> them hard work in data management.)
>>>
>>> Nick
>>> [email protected]
>>>
>>>
>>> On 19 March 2014 19:03, Daniel Stefan Hain <[email protected]> wrote:
>>>>
>>>> Dear Stata users,
>>>>
>>>> I currently have a problem I find myself unable to solve.
>>>>
>>>> I have longitudinal data on investments made by companies, each
>>>> investment a single row. It looks sort of like:
>>>>
>>>> invest_id firm_id value
>>>> year
>>>> 1 1 1,5
>>>> 2001
>>>> 2 1 2,7
>>>> 2001
>>>> 3 1 3,4
>>>> 2002
>>>> 4 2 0,9
>>>> 1998
>>>> 5 2 5,5
>>>> 2003
>>>>
>>>> I want to introduce a "investment experience", which aggregates the
>>>> number (or value) of previous investments over the last X (lets say 5) years
>>>> previous to the current one. Firms have very different investment activity,
>>>> so might the one have 100, the next only one per year. I up to now tried
>>>> different versions of something like
>>>>
>>>>
>>>> local N =_N
>>>> forvalues n=1/`N' {
>>>> local i=1
>>>> local t=0
>>>> while t=0 {
>>>> replace firm_value_agg = firm_value_agg
>>>> + value[_n - `i']) if `n'=_n
>>>> local i = `i' + 1
>>>> local t = 1 if year[_n] - 5
>>>> == year[_n - `i']
>>>> }
>>>> }
>>>>
>>>> Here I try to write a loop that goes over every row (first forvalues),
>>>> and then (in theory) aggregates previous investment values in
>>>> "firm_value_agg", as long till it comes to a row where the year is smaller
>>>> than the current year minus 5 (since I want to aggregate the last 5 years).
>>>> I tried some different versions, but always get some error messages like "no
>>>> weights allowed", or the code just produces unintended results. For every
>>>> piece of advice how to make this work I would be very grateful.
>>>>
>>>> Best wishes
>>>>
>>>>
>>>> Daniel S. Hain
>>>>
>>>> Ph.D. Fellow, M.Sc. Economics, Dipl.-Wirt.-Ing.
>>>> Department of Business and Management | IKE | DRUID | EIS
>>>> T: (+45) 9940 2724 | Email: [email protected]
>>>> Web: www.ike.aau.dk | www.eis-all.dk | www.innoresource.org
>>>> Aalborg University | Fibigerstræde 11, Room 91 | 9220 Aalborg, Denmark
>>>>
>>>>
>>>> *
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> * http://www.ats.ucla.edu/stat/stata/
>>>
>>>
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/
>>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/