It's a bit naughty posting the same question three times. It was clear
first time, just that nobody was able and willing to answer before now.
I don't understand the last sentence ("If there are no observations over
the past
12 months, the value of the new variable should equal 1"), but otherwise
here goes:
This problem is the subject of a Stata Journal Tip in which the
associated ideas are discussed at excruciating length.
SJ-7-3 pr0033 . . . . . . . . . . . . . . Stata tip 51: Events in
intervals
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N.
J. Cox
Q3/07 SJ 7(3):440--443 (no
commands)
tip for counting or summarizing irregularly spaced
events in intervals
The basic idea explored in that Tip is, however, quite simple.
We want to count how many investors held stock in that company in the
previous 12 months. The Tip analyses it something like this:
Focus on observation 1:
count if Company_no == Company_no[1] & inrange(Date, Date[1] - 365,
Date[1])
To extend this to all observations, we loop over observations and make
sure that we store values as they emerge from a -count-:
local N = _N
gen count = .
qui forval i = 1/`N' {
count if Company_no == Company_no[`i'] & inrange(Date, Date[`i']
- 365, Date[`i'])
replace count = r(N) in `i'
}
Except that that doesn't count each investor just once, as stipulated.
Here's a way of building that in:
local N = _N
gen count = .
qui forval i = 1/`N' {
egen tag = tag(Investor_id) ///
if Company_no == Company_no[`i'] & ///
inrange(Date, Date[`i'] - 365, Date[`i'])
count if tag
replace count = r(N) in `i'
drop tag
}
Warning: untested code. If the precise definition here of previous year
doesn't suit, then you need to change it.
If this problem interests you, here is more of the same:
SJ-7-4 dm0033 . . . . . . Speaking Stata: Counting groups, especially
panels
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N.
J. Cox
Q4/07 SJ 7(4):571--581 (no
commands)
discusses how to count panels through reduction commands
or through tabulation commands and how to overcome
problems that do not yield easily to these approaches
SJ-7-1 pr0029 . . . . . . . . . . . . . . . Speaking Stata: Making it
count
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N.
J. Cox
Q1/07 SJ 7(1):117--130 (no
commands)
discusses count used with a loop over observations
or variables
Nick
[email protected]
Marietta Jones
I have the following dataset:
Company_no Date Investor_id
1 03/01/2000 1
1 04/09/2000 1
1 12/12/2001 2
2 13/12/2000 4
2 07/08/2001 7
3 09/08/2000 4
3 19/03/2001 4
3 02/05/2001 5
3 03/12/2001 6
For each observation, I would like to calculate the number of distinct
investors owning shares in a given company in the previous 12 months.
The new variable (No_investors_past_12_months) will look as follows:
Company_no Date Investor_id No_investors_past_12_months
1 03/01/2000 1 1
1 04/09/2000 1 1
1 12/12/2001 2 1
2 13/12/2000 4 1
2 07/08/2001 7 2
3 09/08/2000 4 1
3 19/03/2001 4 1
3 02/05/2001 5 2
3 03/12/2001 6 3
For example, for company 2 on 07/08/2001 there are two investors who
have owned shares of this company within the past 12 months.
The idea is that the same investor should be counted only once and we
should count all observations over the past 12 months, including the
current observation date. If there are no observations over the past
12 months, the value of the new variable should equal 1.
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/