Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Jia Peng" <jiapengcass@gmail.com> |
To | <statalist@hsphsun2.harvard.edu> |
Subject | st: How to count occurrences of specific value |
Date | Wed, 1 May 2013 20:03:54 +0800 |
Dear All,

I have a data set with the following structure:

    id      date        flag
    95001   14jun2000   1
    95001   12apr2000   1
    95001   16mar2000   0
    95001   16nov1999   0
    95001   10may1999   1
    95001   30mar1995   0
    95002   01nov1989   0
    95002   01mar1985   1
    95002   01jun1983   0
    95002   01may1983   1
    95002   01dec1982   0
    95002   01oct1982   0

I would now like to generate a new variable, say temp, which records, for each observation, how many times flag == 1 has occurred within the same id during the five years up to and including that observation's date. For example, for the first observation I want to count how many times flag == 1 has occurred for id 95001 between 14jun1995 and 14jun2000.

I have tried looping over every observation with the following code:

    gen temp = .
    local N = _N
    forvalues i = 1(1)`N' {
        * count flagged records of the same id dated 0 to 5 years before observation i
        count if flag == 1 & id == id[`i'] & (date[`i'] - date)/365.25 <= 5 & (date[`i'] - date)/365.25 >= 0
        replace temp = r(N) in `i'
    }

However, there are half a million observations in the full data set, and the above code takes hours to run. Is there a more efficient way to solve this problem?

I have also tried -egen-, but all I can get is the total number of times flag == 1 has occurred within the same id. Is there any way to take different date ranges into account in this context?

Any thoughts?

Peng Jia
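
One possible way to avoid the observation-by-observation loop described above is to pair each observation with the flag == 1 dates of the same id and then count the pairs that fall inside the five-year window. What follows is only a sketch under assumptions, not a tested solution for the full data: it assumes date is a numeric Stata daily date, that it is run as a do-file, and that the number of flag == 1 records per id is small enough for the pairwise join to fit in memory. The names obs, fdate, hit and flags are illustrative and do not appear in the original post. The window uses the same 365.25-day year as the loop.

```stata
* Sketch only: assumes date is a numeric daily date and is run from a do-file

* 1. Save the dates of all flagged records, one row per (id, flagged date)
preserve
keep if flag == 1
keep id date
rename date fdate
tempfile flags
save "`flags'"
restore

* 2. Tag each original observation so it can be recovered after the join
gen long obs = _n

* 3. Pair every observation with every flagged date of the same id;
*    unmatched(master) keeps ids that have no flagged record at all
joinby id using "`flags'", unmatched(master)

* 4. A pair counts if the flagged date lies 0 to 5 years before the
*    observation's own date (same rule as the loop)
gen byte hit = !missing(fdate) & inrange(date - fdate, 0, 5 * 365.25)

* 5. Sum the hits back to one row per original observation
collapse (sum) temp = hit, by(obs id date flag)
drop obs
```

Because -collapse- keeps only the variables named in by() plus temp, any other variables in the real data would need to be added to by() or merged back afterwards using obs. On the twelve sample observations, this should give temp = 3 for the first record (14jun2000), since the flagged dates 14jun2000, 12apr2000 and 10may1999 all fall within the preceding five years.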