Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Replacing duplicate values
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: RE: st: Replacing duplicate values
Date: Thu, 1 Apr 2010 17:59:12 +0100
Not really. You can manage fine without -duplicates- here, as my code sketch implied.
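The code sketch referred to above is not reproduced in this message. What follows is only one possible way to do it without -duplicates-, assuming exactly four string variables ipc_1-ipc_4 and the example data quoted further down; the helper variable move and the loop macros are illustrative names, not taken from the thread.

```stata
clear
input byte id str4(ipc_1 ipc_2 ipc_3 ipc_4)
1 "A44B" "G09F" "H04N" ""
2 "A47B" "G06F" "H05K" "E05D"
3 "A47B" "G06F" ""     ""
4 "A47B" "H04N" "H05K" ""
5 "A47B" ""     ""     ""
6 "A47B" "F16M" "F16M" "H05K"
7 "A47B" "A47B" "F16M" "A47B"
end

* Step 1: blank any value that already appears in an earlier column.
* The first occurrence is never blanked, so later copies are always caught.
forvalues j = 2/4 {
    local prev = `j' - 1
    forvalues i = 1/`prev' {
        replace ipc_`j' = "" if ipc_`j' == ipc_`i'
    }
}

* Step 2: close the gaps by moving values one column to the left.
* With 4 columns, 3 passes are enough for a value to travel from ipc_4 to ipc_1.
forvalues pass = 1/3 {
    forvalues j = 1/3 {
        local next = `j' + 1
        * move is a temporary flag (hypothetical name) marking rows to shift
        generate byte move = ipc_`j' == "" & ipc_`next' != ""
        replace ipc_`j' = ipc_`next' if move
        replace ipc_`next' = "" if move
        drop move
    }
}

list, noobs
```

With more IPC columns the loop bounds (4 and 3 here) would need to change accordingly.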
Nick
[email protected]
Abdel Rahmen El Lahga
The question was not clear enough, and I simply invited Pavlos to try
-duplicates-, which is certainly the unavoidable command for such a
task.
2010/4/1 Martin Weiss <[email protected]>:
> Abdel, how would you do that?
>
>
> *************
> clear*
>
> input byte(id) str4(ipc_1 ipc_2 ipc_3 ipc_4)
> 1 A44B G09F H04N
> 2 A47B G06F H05K E05D
> 3 A47B G06F
> 4 A47B H04N H05K
> 5 A47B
> 6 A47B F16M F16M H05K
> 7 A47B A47B F16M A47B
> end
>
> duplicates report ipc_?
> *************
>
> [mailto:[email protected]] On behalf of Abdel Rahmen El
> Lahga
>
> Type -help duplicates- in Stata and see -duplicates drop-; you will find
> what you are looking for.
> 2010/4/1 Pavlos C. Symeou <[email protected]>:
>> I would like to ask for your assistance with the following:
>>
>> I have a dataset which concerns patents. Every patent is assigned a number
>> of International Patent Classifications (IPCs). However, there are mistakes
>> in the database and certain IPCs appear more than once for a single patent,
>> which is meaningless. Examples are patents with id 6 and id 7 (ipc_1, ipc_2
>> etc list the number of IPCs a single patent is assigned). For the patent
>> with id 6 we can see that ipc_2 and ipc_3 are the same. Id 7 illustrates a
>> more general issue. Duplicate values may not appear sequentially and may
>> appear more than twice.
>>
>> id ipc_1 ipc_2 ipc_3 ipc_4
>> 1 A44B G09F H04N
>> 2 A47B G06F H05K E05D
>> 3 A47B G06F
>> 4 A47B H04N H05K
>> 5 A47B
>> 6 A47B F16M F16M H05K
>> 7 A47B A47B F16M A47B
>>
>> Can you suggest a way to delete the duplicate values, which can be more than
>> two, and move the remaining to the left? For example patents with id 6 and
>> id 7 would look like this:
>>
>> id ipc_1 ipc_2 ipc_3 ipc_4
>> 6 A47B F16M H05K
>> 7 A47B F16M
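The requested result for ids 6 and 7 can also be sketched with -reshape-. This is one possible approach, not code posted in the thread. It assumes the variables are strings named ipc_1-ipc_4, and slot is a hypothetical name for the position index.

```stata
clear
input byte id str4(ipc_1 ipc_2 ipc_3 ipc_4)
6 "A47B" "F16M" "F16M" "H05K"
7 "A47B" "A47B" "F16M" "A47B"
end

* One observation per patent and position; slot records the original column
reshape long ipc_, i(id) j(slot)

* Empty positions carry no information in long form
drop if ipc_ == ""

* Within each patent, keep only the first occurrence of each IPC code
bysort id ipc_ (slot): keep if _n == 1

* Renumber the surviving codes 1, 2, ... in their original order
bysort id (slot): replace slot = _n

* Back to one observation per patent, with values packed to the left
reshape wide ipc_, i(id) j(slot)

list, noobs
```

Two caveats with this design: only as many ipc_ variables come back as the largest number of distinct codes on any patent, and a patent whose IPC fields are all empty would drop out of the data unless it is merged back in afterwards.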