
st: RE: duplicates tag - and range

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: duplicates tag - and range
Date	Tue, 20 Feb 2007 18:47:45 -0000

As the original author of -duplicates- (which in turn owes
much to earlier joint work with Thomas Steichen)  I have
to say that its behaviour is exactly right here. Indeed 
I would say the same if I had never touched the code. 

The idea of a duplicate in -duplicates- is that observations
are identical (on the variables specified). How could it 
be otherwise? Thus -duplicates- is indeed irrelevant to your problem. 

Your problem is different but is soluble in Stata terms 
if you can give exact rules for what kind of tolerance you allow
_within groups of observations_. As with any kind of clustering
problem, specifying a distance or difference tolerance is only 
part of the problem, as joining or merging rules need to be
specified too. 
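
To make the distinction concrete, here is a minimal sketch of the pairwise
rule as stated in the question. It is written in Python rather than Stata,
purely as an illustration, and rests on assumptions that are not in the
original post: each observation is a (panel id, date) pair, "one year" means
365 days, and the function name tag_annual and its parameters gap and tol
are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from datetime import date


def _any_in(sorted_days, lo, hi):
    """True if at least one value in sorted_days lies in [lo, hi]."""
    return bisect_right(sorted_days, hi) > bisect_left(sorted_days, lo)


def tag_annual(obs, gap=365, tol=3):
    """Tag observations that have a partner in the same panel roughly one year away.

    obs: list of (panel_id, datetime.date) pairs.
    gap: assumed length of "one year" in days (leap years are ignored here).
    tol: allowed deviation in days on either side of gap.
    Returns a list of booleans, one per observation, in input order.
    """
    # Collect the dates of each panel as sorted day numbers.
    days_by_panel = defaultdict(list)
    for panel_id, d in obs:
        days_by_panel[panel_id].append(d.toordinal())
    for days in days_by_panel.values():
        days.sort()

    tags = []
    for panel_id, d in obs:
        days = days_by_panel[panel_id]
        t = d.toordinal()
        # Look for a partner about one year earlier or about one year later.
        earlier = _any_in(days, t - gap - tol, t - gap + tol)
        later = _any_in(days, t + gap - tol, t + gap + tol)
        tags.append(earlier or later)
    return tags


obs = [
    ("a", date(2005, 3, 1)),
    ("a", date(2006, 3, 3)),   # 367 days after the first observation
    ("a", date(2006, 9, 15)),  # no partner about one year away
    ("b", date(2006, 3, 1)),   # different panel, so no partner
]
print(tag_annual(obs))  # [True, True, False, False]
```

Note that this rule is purely pairwise. If A pairs with B and B pairs with C,
all three are tagged even though A and C are about two years apart. Whether
that chaining is wanted is exactly the kind of joining rule that still has
to be specified.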

Nick 
[email protected] 

[email protected]
 
> I am working with a very large panel dataset, and would like to tag
> observations that repeat annually (as opposed to the odd or unscheduled
> observation). My rule for tagging observations is something like: if
> another observation falls exactly one year before or after the current
> observation (-/+ 3 days, to deal with minor deviations due to, say, dates
> that fall on weekends), tag both observations. I explored the use of
> "duplicates" and splitting the dates into year, month, and day to little
> effect (it can be used only for exact matches rather than ranges, and will
> tag similar observations in terms of day and month in non-consecutive
> years).
> Any help would be greatly appreciated.
> 
