Note that -inspect- reports the number of unique values as missing when
there are more than 99 distinct values, whereas -distinct- and -unique-
have much higher limits (up to double precision, I believe). You could
replace -inspect- in Scott Merryman's code with -distinct- (install it
with -ssc inst distinct-) and replace r(N_unique) with r(ndistinct), or
you could try out the following code.

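* Build example data: 5 products x 30 sales each
clear
set obs 5
gen productid=_n
expand 30
* Random sale dates (days since 01jan1960) and customer IDs 0-99
gen date=int(uniform()*10000)
format date %d
gen customerid=int(uniform()*100)
gen year=year(date)
su year
local miny=r(min)
local maxy=r(max)
gen ncust=.
* Attach value label "year"; its entries are defined inside the loop
la val year year
* For each center year y, count distinct customers in [y-2, y+2]
forval y=`=`miny'+2'/`=`maxy'-2' {
    qui distinct customerid if year>=`y'-2 & year<=`y'+2
    qui replace ncust=r(ndistinct) if year==`y'
    la def year `y' "`=`y'-2' to `=`y'+2'", modify
}
* Stata 17+: table year if ncust<., statistic(mean ncust)
table year if ncust<., c(mean ncust)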
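
If you would rather not install anything, the same window counts can
probably be had with official Stata alone. What follows is only a
minimal sketch, not a tested replacement: it assumes the example data
above (year, customerid, ncust) are still in memory, and ncust2 and
first are names I made up for the check.

```stata
* Assumes year, customerid and ncust from the example above are in memory.
* ncust2 and first are hypothetical names used only for this check.
su year, meanonly
local miny = r(min)
local maxy = r(max)
gen ncust2 = .
forval y = `=`miny'+2'/`=`maxy'-2' {
    * tag() marks one observation per distinct customerid in the window;
    * observations outside the -if- condition are set to 0
    qui egen first = tag(customerid) if inrange(year, `y'-2, `y'+2)
    qui count if first
    qui replace ncust2 = r(N) if year == `y'
    drop first
}
* Stops with an error if the two counts ever disagree
assert ncust2 == ncust
```

Both -distinct- and -egen, tag()- ignore missing values by default, so
the two counts should agree here; if customerid could be missing in
your data, decide first whether missing should count as a value.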
-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Wednesday, June 29, 2005 2:24 PM
To: [email protected]
Subject: Re: st: Counting unique values over moving time windows
Here is one way, using -inspect- to count the unique values and -forv-
to loop across observations:
<snip>
Hope this helps,
Scott
----- Original Message -----
From: Arik <[email protected]>
Date: Wednesday, June 29, 2005 12:10 pm
Subject: st: Counting unique values over moving time windows
> I am working with panel data and am trying to create a
> variable that will count the number of unique values
> an existing variable has over overlapping moving
> windows (i.e. I have a list of products sold to
> customers and wish to find the total number of
> customers over 5-year moving windows). This seems to
> fall between the stata commands UNIQUE and MOVSUMM,
> but I can't really think of a syntax that will provide
> the results I'm looking for. I would appreciate any
> help you may offer.
> Arik