Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nick Cox <njcoxstata@gmail.com> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | Re: st: FW: Aggregating values back to a certain row |
Date | Thu, 20 Mar 2014 13:47:38 +0000 |
More technique: clear input invest_id firm_id value year 1 1 1.5 2001 2 1 2.7 2001 3 1 3.4 2002 4 2 0.9 1998 5 2 5.5 2003 end collapse (count) count=value (sum) sum=value, by(firm_id year) tsset firm_id year tsfill replace count = 0 if missing(sum) replace sum = 0 if missing(sum) bysort firm_id (year) : gen prev2 = max(0, sum[_n-1]) + max(0, sum[_n-2]) Nick njcoxstata@gmail.com On 20 March 2014 13:34, Nick Cox <njcoxstata@gmail.com> wrote: > This is a nice idea, although the example data showed that > > 1. two or more items can be observed for the same firm in the same > year (still, what that implies is just a prior -collapse-, which might > be a good move any way) > > 2. there can be gaps (again, there are remedies) > > Nick > njcoxstata@gmail.com > > > On 20 March 2014 13:14, Jeph Herrin <info@flyingbuttress.net> wrote: >> >> If the window is really a moderate 5 or so years, you might get by with >> 'reshaping' the data first: >> >> forv y=1/5 { >> bys firm_id (year): gen value_`y'=value[_n-`y'] /// >> if inrange(year[_n-`y'],year,year-5) >> } >> egen value_agg = rowtotal(value_1-value_5) >> >> >> You will have (at most 5) new variables populated on each row, containing >> the values from the last (at most) 5 years. Then just aggregate across the >> row. >> >> J >> >> >> >> On 3/19/2014 4:03 PM, Nick Cox wrote: >>> >>> This kind of question is often asked here. >>> >>> Your code can't easily be rescued. A loop over observations doesn't >>> really mix with a -while- loop. Also note, for example, that -if- >>> qualifiers are illegal and indeed make no sense in defining -local-s. >>> >>> You don't spell it out, and it's not even tacit in your code, but I am >>> guessing that all calculations must be for the same firm. >>> >>> Consider this: >>> >>> 1. Initialise all aggregate values to 0. >>> >>> gen firm_value_agg = 0 >>> >>> 2. Loop over all the observations. >>> >>> quietly forvalues n=1/`=_N' { >>> >>> 3. Summarize value if the firm is the same as that in the present >>> observation and investment was in the previous 5 years. >>> >>> Here previous five years is interpreted as meaning for 2013 (say) >>> 2013, 2012, 2011, 2010, 2009. >>> >>> su value if firm == firm[`n'] & inrange(year, year[`n'] - 4, >>> year), meanonly >>> >>> If you want to exclude the present year, it would be >>> >>> su value if firm == firm[`n'] & inrange(year, year[`n'] - 5, >>> year[`n'] - 1), meanonly >>> >>> The -meanonly- option is helpful to speed up what is likely to be slow >>> code. Despite its name, -meanonly- does include calculation of the >>> sum. >>> >>> 4. Replace the aggregated value in the current observation. >>> >>> replace firm_value_agg = r(sum) in `n' >>> } >>> >>> Counts would mean using r(N) not r(sum). >>> >>> There's a lengthier sermon in >>> http://www.stata-journal.com/sjpdf.html?articlenum=pr0033 which you >>> should cite if you find it useful. (Many people will cite a paper for >>> just mentioning one idea they allude to, but not a paper that saved >>> them hard work in data management.) >>> >>> Nick >>> njcoxstata@gmail.com >>> >>> >>> On 19 March 2014 19:03, Daniel Stefan Hain <dsh@business.aau.dk> wrote: >>>> >>>> Dear Stata users, >>>> >>>> I currently have a problem I find myself unable to solve. >>>> >>>> I have longitudinal data on investments made by companies, each >>>> investment a single row. It looks sort of like: >>>> >>>> invest_id firm_id value >>>> year >>>> 1 1 1,5 >>>> 2001 >>>> 2 1 2,7 >>>> 2001 >>>> 3 1 3,4 >>>> 2002 >>>> 4 2 0,9 >>>> 1998 >>>> 5 2 5,5 >>>> 2003 >>>> >>>> I want to introduce a "investment experience", which aggregates the >>>> number (or value) of previous investments over the last X (lets say 5) years >>>> previous to the current one. Firms have very different investment activity, >>>> so might the one have 100, the next only one per year. I up to now tried >>>> different versions of something like >>>> >>>> >>>> local N =_N >>>> forvalues n=1/`N' { >>>> local i=1 >>>> local t=0 >>>> while t=0 { >>>> replace firm_value_agg = firm_value_agg >>>> + value[_n - `i']) if `n'=_n >>>> local i = `i' + 1 >>>> local t = 1 if year[_n] - 5 >>>> == year[_n - `i'] >>>> } >>>> } >>>> >>>> Here I try to write a loop that goes over every row (first forvalues), >>>> and then (in theory) aggregates previous investment values in >>>> "firm_value_agg", as long till it comes to a row where the year is smaller >>>> than the current year minus 5 (since I want to aggregate the last 5 years). >>>> I tried some different versions, but always get some error messages like "no >>>> weights allowed", or the code just produces unintended results. For every >>>> piece of advice how to make this work I would be very grateful. >>>> >>>> Best wishes >>>> >>>> >>>> Daniel S. Hain >>>> >>>> Ph.D. Fellow, M.Sc. Economics, Dipl.-Wirt.-Ing. >>>> Department of Business and Management | IKE | DRUID | EIS >>>> T: (+45) 9940 2724 | Email: dsh@business.aau.dk >>>> Web: www.ike.aau.dk | www.eis-all.dk | www.innoresource.org >>>> Aalborg University | Fibigerstræde 11, Room 91 | 9220 Aalborg, Denmark >>>> >>>> >>>> * >>>> * For searches and help try: >>>> * http://www.stata.com/help.cgi?search >>>> * http://www.stata.com/support/faqs/resources/statalist-faq/ >>>> * http://www.ats.ucla.edu/stat/stata/ >>> >>> >>> * >>> * For searches and help try: >>> * http://www.stata.com/help.cgi?search >>> * http://www.stata.com/support/faqs/resources/statalist-faq/ >>> * http://www.ats.ucla.edu/stat/stata/ >>> >> * >> * For searches and help try: >> * http://www.stata.com/help.cgi?search >> * http://www.stata.com/support/faqs/resources/statalist-faq/ >> * http://www.ats.ucla.edu/stat/stata/ * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/faqs/resources/statalist-faq/ * http://www.ats.ucla.edu/stat/stata/