
st: RE: RE: is ordering with -bysort- unique?

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: RE: is ordering with -bysort- unique?
Date	Tue, 10 Jun 2003 23:08:06 +0100

Radu Ban

> i'm cleaning a dataset and i encounter repeated ids. i want to keep them
> unique, but the problem is that for some repeated ids the variables differ.
> i want to keep just one of the repeated ids. so i'm using:
>
> bysort id: keep if _n == 1
>
> now i would like to know if this will keep the same id whenever the program
> is run. or does the ordering change?
>
> sorry for such a basic question but right now i don't have access to the
> manuals.

Jose Luis Negrin Muñoz

> I have used the following small program to get rid of duplicates
>
> 	sort ref;
> 	by ref: gen dup=_n;
> 	gsort ref -mesini dup;
> 	drop if dup>1;
>
> where ref is a reference code and mesini is a variable related to time
> (1, 2...j); these are the variables I am sorting my data by. So I will
> keep only the oldest data with each particular reference code

Two comments:

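1. This code appears intended to keep the most recent
observation within each group, not the oldest. Note,
however, that dup is generated before the -gsort-, so
which observation has dup == 1 depends on the order
left by -sort ref-, which is arbitrary within ref.
A direct way to keep the most recent observation is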

bysort ref (mesini) : keep if _n == _N
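
As a minimal sketch of what that one-liner does, here is a toy
example. The variable names ref and mesini come from the quoted
code; the variable value and all the numbers are made up for
illustration.

```stata
* Toy data: ref is the group code, mesini a time index.
* The variable "value" and all numbers are hypothetical.
clear
input ref mesini value
1 1 10
1 3 30
1 2 20
2 5 50
2 4 40
end

* Sort by ref, and by mesini within ref, then keep the last
* observation in each group, i.e. the one with the largest mesini.
bysort ref (mesini) : keep if _n == _N

list, clean
```

Using _n == 1 instead of _n == _N would keep the oldest observation
in each group. Two caveats: if mesini is tied within ref, the order
among the tied observations is not determined, so further variables
would be needed inside the parentheses to make the choice
reproducible; and numeric missing values sort last, so an
observation with missing mesini would be the one kept by _n == _N.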

2. For general handling of duplicates, there
are several programs, as

. findit duplicate

will show.
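
One of these is the official -duplicates- command. The sketch
below assumes a Stata version that includes it; the data and the
variable names id, x and ndup are made up for illustration.

```stata
* Toy data with repeated ids; all values are hypothetical.
clear
input id x
1 10
1 10
2 20
2 25
3 30
end

* Report how many observations share an id with another observation.
duplicates report id

* Tag each observation with the number of other observations
* that have the same id, and inspect the repeats before dropping.
duplicates tag id, generate(ndup)
list if ndup > 0, clean

* Keep one observation per id. The force option is required
* because the dropped observations may differ on other variables.
duplicates drop id, force
```

Inspecting the tagged observations first is usually worthwhile,
since -duplicates drop- with force discards observations that
differ on variables other than id.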

Nick
[email protected]

