Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nick Cox <njcoxstata@gmail.com> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | Re: Re: st: sum over variables for determinate observations |
Date | Mon, 27 Jan 2014 12:19:15 +0000 |
What you mean by "did not work" is not explained here, but once you -keep- just one observation for each group, scope for accurate calculations of totals of any other variable is lost.

-collapse- is, it seems, what you need here, obviating the need for a loop. It was suggested earlier in this thread, and it's not clear why you are not using it.

Nick
njcoxstata@gmail.com

On 27 January 2014 12:12, Marie-Luise Schmitz <querida-ise@gmx.de> wrote:
> Dear Roberto,
>
> thank you for your suggestion. I used:
>
> bysort province_name ateco_section: egen numero_contribuenti_2005_test = total(numero_contribuenti_2005)
> by province_name ateco_section: keep if _n == 1
> replace numero_contribuenti_2005_test=.a if numero_contribuenti_2005==.a
>
> to do the task for one variable and it worked perfectly. But the data set contains 93 numeric variables. I tried to do a foreach loop but this did not work. Any suggestions on how to do this for many variables?
>
>
> Sent: Sunday, 26 January 2014 at 19:01
> From: "Roberto Ferrer" <refp16@gmail.com>
> To: "Stata Help" <statalist@hsphsun2.harvard.edu>
> Subject: Re: st: sum over variables for determinate observations
> Alternatives are:
>
> /*
> Use -egen, total()-, to compute totals and keep an arbitrary observation
> (here the first one).
> */
>
> bysort provname atecosec: egen snumcontrib = total(numcontrib)
> by provname atecosec: keep if _n == 1
>
>
> /*
> Use -sum- to compute a cumulative sum and keep the last observation
> */
>
> bysort provname atecosec: gen snumcontrib = sum(numcontrib)
> by provname atecosec: keep if _n == _N
>
> The Stata Journal (2002)
> 2, Number 1, pp. 86–102
> Speaking Stata: How to move step by: step
> Nicholas J. Cox
>
> is a helpful reference.
>
> On Sun, Jan 26, 2014 at 1:13 PM, Roberto Ferrer <refp16@gmail.com> wrote:
>> You're right, -collapse- works:
>>
>> *----------- begin code --------------
>>
>> clear all
>> set more off
>>
>> input ///
>> str20 provname provcode str2 lic str1 atecosec str1 atecosec2002 numcontrib
>> AGRIGENTO 84 AG A A 100
>> AGRIGENTO 84 AG A B 50
>> AGRIGENTO 84 AG B C 12
>> AGRIGENTO 84 AG C D 79
>> AGRIGENTO 84 AG O P 34
>> AGRIGENTO 84 AG P Q 0
>> AGRIGENTO 84 AG Z Z 1
>> ALESSANDRIA 6 AL A A 29
>> ALESSANDRIA 6 AL A B 12
>> ALESSANDRIA 6 AL B C 0
>> ALESSANDRIA 6 AL C D 5
>> end
>>
>> list, sepby(provname)
>>
>> collapse (sum) numcontrib, by(provname atecosec)
>>
>> list, sepby(provname)
>>
>> *------------------- end code ------------------------
>>
>> On Sun, Jan 26, 2014 at 11:06 AM, Marie-Luise Schmitz
>> <querida-ise@gmx.de> wrote:
>>> Dear Stata Users,
>>>
>>> I have a data set that looks like this:
>>>
>>> province_name  province_code_107  license_number  ateco_section  ateco_section2002  numero_contribuenti...
>>> AGRIGENTO      84                 AG              A              A                  100
>>> AGRIGENTO      84                 AG              A              B                  50
>>> AGRIGENTO      84                 AG              B              C                  12
>>> AGRIGENTO      84                 AG              C              D                  79
>>> AGRIGENTO      84                 AG              O              P                  34
>>> AGRIGENTO      84                 AG              P              Q                  0
>>> AGRIGENTO      84                 AG              Z              Z                  1
>>> ALESSANDRIA    6                  AL              A              A                  29
>>> ALESSANDRIA    6                  AL              A              B                  12
>>> ALESSANDRIA    6                  AL              B              C                  0
>>> ALESSANDRIA    6                  AL              C              D                  5
>>>
>>> It contains numerous numeric variables following the variable numero_contribuenti.
>>> The variable ateco_section is a redefined version of the variable ateco_section2002 and shows sectors of economic activity. For instance, A = agriculture, B = fishery, etc.
>>> In the redefined variable ateco_section, sectors A and B are summarized by A.
>>> However, the problem is that I want only one entry for sector A for each province; that is, for numeric variables such as numero_contribuenti I want the sum of the previous A and B, hence:
>>>
>>> province_name  province_code_107  license_number  ateco_section  numero_contribuenti .........
>>> AGRIGENTO      84                 AG              A              150
>>> AGRIGENTO      84                 AG              B              12
>>>
>>>
>>> I want to apply that to each province.
>>> I guess this problem may be solved with collapse (sum) but I am totally lost.
>>> Any help is highly appreciated.
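
For the many-variable case raised in the thread, a single -collapse- with a variable list may be all that is needed, with no loop. The following is a minimal sketch, not code posted by anyone in the thread. It assumes the variable names shown in the original post, that province_code_107 and license_number are constant within province_name, and that every other numeric variable in the data set should be totalled. The local macro names idvars and sumvars are hypothetical.

```stata
* Sketch only; variable names are taken from the original post.
* Assumes province_code_107 and license_number are constant within
* each province, so including them in by() does not split any group.
local idvars province_name province_code_107 license_number ateco_section

* Collect all numeric variables, then remove the numeric identifier(s).
ds, has(type numeric)
local sumvars `r(varlist)'
local sumvars : list sumvars - idvars

* One observation per province and (redefined) sector, holding the
* total of every variable in sumvars. Variables named in neither list,
* such as ateco_section2002, are dropped by -collapse-.
collapse (sum) `sumvars', by(`idvars')

list, sepby(province_name)
```

Note that -collapse (sum)- treats missing values as zero, so a group in which a variable is missing throughout (including .a) ends up with a total of 0, not missing. If that distinction matters, as the -replace- line in the earlier attempt suggests, such groups would need to be flagged before collapsing.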