st: Duplicate observations

From	emanuele mazzini <[email protected]>
To	[email protected]
Subject	st: Duplicate observations
Date	Mon, 10 Mar 2014 19:30:52 +0100

Hello to everybody,

I have a problem with duplicate observations that I am struggling to solve.
I have data on country pairs by year and I am interested in two
specific variables: a date and a variable that I will call x_1.

Specifically, my data look like this:

reporter  partner  year  date       x_1

Albania   Austria  1980  6dec1980   n_1
Albania   Austria  1980  15nov1980  n_1
.         .        .
.         .        .
.         .        .

As you can see, the observations within a group differ only by date,
and I need to drop all but the most recent one (in this case, the
first one, dated 6dec1980).

I ran the following commands:

duplicates tag reporter partner year, generate(dup)

by reporter partner year (x_1 -date), sort: gen duplicates=_n

so that I can identify duplicates and then, among those with
dup > 0, drop the observations for which duplicates > 1.
This method was suggested in this thread (thanks again for that),
but it does not seem to work for some observations.
Take, for instance, the following example:

reporter  partner  year  date       x_1  dup  duplicates
Albania   Germany  1967  08apr1967  n_1  1    1
Albania   Germany  1967  17dec1967  n_1  1    2

As you can see, Stata assigns duplicates = 2 to the observation dated
17dec1967 (which would therefore be dropped), whereas I expected the
opposite: the 08apr1967 observation should be the one dropped.

Can anyone explain why this happens and, if possible, tell me how to deal with it?

Thank you very much in advance,

Emanuele
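
Editorial sketch, not part of the original message: one common Stata idiom for keeping the latest-dated row in each group is to sort ascending by date within the group and keep the last row. The example below rebuilds the four rows quoted above; the names datestr and date are chosen here for illustration, and it assumes date can be stored as a numeric Stata daily date. A possible explanation for the behaviour described, stated as an assumption and not confirmed by this post, is that by ... (varlist) does not accept the leading minus sign that gsort uses for descending order, so "x_1 -date" may be read as something other than "x_1, then date descending".

```stata
* Rebuild the four example rows from the post (dates entered as strings).
clear
input str10 reporter str10 partner int year str9 datestr str3 x_1
"Albania" "Germany" 1967 "08apr1967" "n_1"
"Albania" "Germany" 1967 "17dec1967" "n_1"
"Albania" "Austria" 1980 "6dec1980"  "n_1"
"Albania" "Austria" 1980 "15nov1980" "n_1"
end

* Convert the string to a numeric daily date so that sorting is chronological.
generate date = daily(datestr, "DMY")
format date %td

* Sort ascending by date within each reporter-partner-year group
* and keep only the last (most recent) row of each group.
bysort reporter partner year (date): keep if _n == _N

list reporter partner year date x_1, noobs
```

With this approach no separate dup or duplicates variable is needed: groups with a single observation are kept unchanged, because their only row is also their last row.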