Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Replacing duplicate values
From: "Pavlos C. Symeou" <[email protected]>
To: [email protected]
Subject: st: Replacing duplicate values
Date: Thu, 01 Apr 2010 16:51:49 +0200
Dear Statalisters,
I would like to ask for your assistance with the following:
I have a dataset which concerns patents. Every patent is assigned a
number of International Patent Classifications (IPCs). However, there
are mistakes in the database and certain IPCs appear more than once for
a single patent, which is meaningless. Examples are the patents with id 6
and id 7 (the variables ipc_1, ipc_2, etc. hold the IPCs assigned to a
single patent). For the patent with id 6, ipc_2 and ipc_3 are
the same. Id 7 illustrates a more general issue: duplicate values need
not be adjacent and may appear more than twice.
id  ipc_1  ipc_2  ipc_3  ipc_4
1   A44B   G09F   H04N
2   A47B   G06F   H05K   E05D
3   A47B   G06F
4   A47B   H04N   H05K
5   A47B
6   A47B   F16M   F16M   H05K
7   A47B   A47B   F16M   A47B
Can you suggest a way to delete the duplicate values (a value may occur
more than twice) and shift the remaining values to the left? For example,
the patents with id 6 and id 7 would then look like this:
id  ipc_1  ipc_2  ipc_3  ipc_4
6   A47B   F16M   H05K
7   A47B   F16M
Best regards,
Pavlos
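
The transformation requested above can be illustrated with a minimal sketch, written in Python rather than Stata purely to make the row-wise logic concrete. It assumes each patent's IPCs are held as a list of strings with empty strings for blank cells; the helper name dedupe_left and the rows dictionary are hypothetical and not part of the original dataset. Within each row, the first occurrence of every code is kept in its original order, later repeats are dropped, and the row is padded on the right with blanks.

```python
def dedupe_left(values):
    """Keep the first occurrence of each non-blank code, in order,
    then pad with blanks so the row keeps its original width."""
    kept = []
    for value in values:
        # Skip blank cells and codes already kept for this patent.
        if value and value not in kept:
            kept.append(value)
    return kept + [""] * (len(values) - len(kept))


# Hypothetical in-memory copy of the two problem rows from the post.
rows = {
    6: ["A47B", "F16M", "F16M", "H05K"],
    7: ["A47B", "A47B", "F16M", "A47B"],
}

cleaned = {patent_id: dedupe_left(ipcs) for patent_id, ipcs in rows.items()}

for patent_id, ipcs in cleaned.items():
    print(patent_id, *ipcs)
```

The same idea carries over to a long-format approach: stack the ipc_* columns into one column per patent, drop repeated (id, IPC) pairs, renumber within id, and spread back to wide.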