This refers to a program of mine called -dlist- available
from SSC.
It appears that during my absence from base, mainly at the
Boston users' meeting, no one else has had a go at this or
suggested alternative solutions.
I have two basic comments:
1. Even with duplicates in the narrowest sense (observations
occur in pairs), there will not always be enough real estate
on your monitor to show observations side-by-side. For the
general case this will be even more difficult, not to say
futile.
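2. More positively, if observations are identical (other than
their current position in the data), then there seems to be
no benefit in, or need for, repeating that information. So,
-dlist- can be modified to -duplist-, which shows each distinct
set of values once, preceded by the numbers of the observations
that share it: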
. sysuse auto, clear
. duplist for rep78

40 48
foreign   Car type              Domestic
rep78     Repair Record 1978    1

12 17 18 21 22 23 46 52
foreign   Car type              Domestic
rep78     Repair Record 1978    2

1 2 4 6 8 9 10 11 13 14 16 19 25 26 27 28 31 32 34 36 37 39 41 42 44 49 50
foreign   Car type              Domestic
rep78     Repair Record 1978    3

5 15 24 29 30 33 35 38 47
foreign   Car type              Domestic
rep78     Repair Record 1978    4

20 43
foreign   Car type              Domestic
rep78     Repair Record 1978    5

54 60 65
foreign   Car type              Foreign
rep78     Repair Record 1978    3

55 56 58 59 62 63 70 72 73
foreign   Car type              Foreign
rep78     Repair Record 1978    4

53 57 61 66 67 68 69 71 74
foreign   Car type              Foreign
rep78     Repair Record 1978    5

You may or may not prefer the results of the official
command:
. duplicates list foreign rep78
Duplicates in terms of foreign rep78

  +----------------------------------+
  | group:   obs:    foreign   rep78 |
  |----------------------------------|
  |      1     40   Domestic       1 |
  |      1     48   Domestic       1 |
  |      2     12   Domestic       2 |
  |      2     17   Domestic       2 |
  |      2     18   Domestic       2 |
  |----------------------------------|
  |      2     21   Domestic       2 |
  |      2     22   Domestic       2 |
  |      2     23   Domestic       2 |
  |      2     46   Domestic       2 |
  |      2     52   Domestic       2 |
  |----------------------------------|
  |      3      1   Domestic       3 |
  |      3      2   Domestic       3 |
  |      3      4   Domestic       3 |
  |      3      6   Domestic       3 |
  |      3      8   Domestic       3 |
  |----------------------------------|
  |      3      9   Domestic       3 |
  |      3     10   Domestic       3 |
  |      3     11   Domestic       3 |
  |      3     13   Domestic       3 |
  |      3     14   Domestic       3 |
  |----------------------------------|
  |      3     16   Domestic       3 |
  |      3     19   Domestic       3 |
  |      3     25   Domestic       3 |
  |      3     26   Domestic       3 |
  |      3     27   Domestic       3 |
  |----------------------------------|
  |      3     28   Domestic       3 |
  |      3     31   Domestic       3 |
  |      3     32   Domestic       3 |
  |      3     34   Domestic       3 |
  |      3     36   Domestic       3 |
  |----------------------------------|
  |      3     37   Domestic       3 |
  |      3     39   Domestic       3 |
  |      3     41   Domestic       3 |
  |      3     42   Domestic       3 |
  |      3     44   Domestic       3 |
  |----------------------------------|
  |      3     49   Domestic       3 |
  |      3     50   Domestic       3 |
  |      4      5   Domestic       4 |
  |      4     15   Domestic       4 |
  |      4     24   Domestic       4 |
  |----------------------------------|
  |      4     29   Domestic       4 |
  |      4     30   Domestic       4 |
  |      4     33   Domestic       4 |
  |      4     35   Domestic       4 |
  |      4     38   Domestic       4 |
  |----------------------------------|
  |      4     47   Domestic       4 |
  |      5     20   Domestic       5 |
  |      5     43   Domestic       5 |
  |      6      3   Domestic       . |
  |      6      7   Domestic       . |
  |----------------------------------|
  |      6     45   Domestic       . |
  |      6     51   Domestic       . |
  |      7     54    Foreign       3 |
  |      7     60    Foreign       3 |
  |      7     65    Foreign       3 |
  |----------------------------------|
  |      8     55    Foreign       4 |
  |      8     56    Foreign       4 |
  |      8     58    Foreign       4 |
  |      8     59    Foreign       4 |
  |      8     62    Foreign       4 |
  |----------------------------------|
  |      8     63    Foreign       4 |
  |      8     70    Foreign       4 |
  |      8     72    Foreign       4 |
  |      8     73    Foreign       4 |
  |      9     53    Foreign       5 |
  |----------------------------------|
  |      9     57    Foreign       5 |
  |      9     61    Foreign       5 |
  |      9     66    Foreign       5 |
  |      9     67    Foreign       5 |
  |      9     68    Foreign       5 |
  |----------------------------------|
  |      9     69    Foreign       5 |
  |      9     71    Foreign       5 |
  |      9     74    Foreign       5 |
  +----------------------------------+

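If the full listing is more than you want, official -duplicates-
also has more compact subcommands. A minimal sketch, assuming
your Stata is recent enough to include -duplicates-; the variable
name ndup is made up for illustration:

. sysuse auto, clear
. duplicates examples foreign rep78
. duplicates tag foreign rep78, generate(ndup)
. tabulate ndup

-duplicates examples- lists one example observation for each
group of duplicates, and -duplicates tag- records in ndup, for
each observation, how many other observations match it on the
variables named, so that 0 marks an observation without
duplicates.
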
The code for -duplist- is set as an exercise, or,
alternatively, given below my signature.
Nick

*! 1.0.0 NJC 30 July 2006
program duplist, byable(recall)
    version 8.2
    syntax [varlist] [if] [in] ///
    [, noLabel Name(int 32) Varlabel(int 80) Spaces(int 3) noTRim ]

    marksample touse, novarlist
    qui count if `touse'
    if r(N) == 0 error 2000

    // variable name width and variable label width
    local nam = 0
    local var = 0
    foreach v of local varlist {
        local nam = max(`nam', length("`v'"))
        local var = max(`var', length(trim(`"`: variable label `v''"')))
    }

    local nam = min(`name', `nam')
    local var = min(`varlabel', `var')

    // spaces is number of spaces between columns
    local col2 = cond(`nam' == 0, 1, `nam' + `spaces' + 1)
    local col3 = `col2' + cond(`var' == 0, 0, `var' + `spaces' + 1)

    // number the observations and the distinct groups of values
    tempvar which group
    gen long `which' = _n
    qui egen `group' = group(`varlist') if `touse'
    su `group', meanonly
    local max = r(max)

    forval i = 1/`max' {
        // observation numbers in this group, then its values shown once
        qui levels `which' if `group' == `i', local(levels)
        di _n as txt "{p}`levels'{p_end}"
        local l : word 1 of `levels'

        foreach v of local varlist {
            if "`label'" != "" | "`: value label `v''" == "" {
                capture confirm numeric variable `v'
                if _rc == 0 {
                    local show : di `: format `v'' `= `v'[`l']'
                    local show = trim(`"`show'"')
                }
                else {
                    local show : di `: format `v'' `"`= `v'[`l']'"'
                    if "`trim'" == "" local show = trim(`"`show'"')
                }
            }
            else {
                local show `"`: label (`v') `=`v'[`l']''"'
            }

            di as txt cond(`nam' > 0, abbrev("`v'", `nam'), "") ///
            "{col `col2'}" as txt ///
            cond(`var' > 0, abbrev(trim(`"`: var label `v''"'), `var'), "") ///
            "{col `col3'}" as res "`show'"
        }
    }
end

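The core of the program is just grouping followed by listing of
observation numbers. A minimal do-file sketch of that idea on
its own, not a replacement for the program: the variable names
obsno and g and the local macro names are made up for
illustration, and it assumes Stata 9 or later, where -levels-
is available as -levelsof-.

    sysuse auto, clear
    gen long obsno = _n

    * group() leaves out observations with missing values;
    * add its -missing- option if they should form groups too
    egen long g = group(foreign rep78)

    su g, meanonly
    local G = r(max)

    forval i = 1/`G' {
        * collect the observation numbers belonging to group `i'
        quietly levelsof obsno if g == `i', local(obs)
        di as txt "group `i': " as res "`obs'"
    }

Because group() skips missing values by default, the four
observations with missing rep78 (3, 7, 45 and 51) appear in the
-duplicates list- output but not in the -duplist- output above.
Options follow the syntax statement of the program: for example,

. duplist foreign rep78, nolabel spaces(2)

should show numeric values rather than value labels, with two
spaces between columns.
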
Holly Wright 21 July 2006
> I have modified dlist.ado so that I can use it to do a side by side
> comparison of some duplicates in my data. The only hitch is that I can't
> figure out how to format the second record so it lines up nicely.
>
> Nick, would you be so kind?
>
> *! 1.3.0 NJC 8 Feb 2006
> * 1.2.0 NJC 7 Feb 2006
> * 1.1.0 NJC 7 Feb 2006
> * 1.0.0 NJC 7 Feb 2006
> program dlist2, byable(recall)
>     version 8.2
>     syntax [varlist] [if] [in] ///
>     [, noLabel Name(int 32) Varlabel(int 80) Spaces(int 3) ]
>
>     marksample touse, novarlist
>     qui count if `touse'
>     if r(N) == 0 error 2000
>
>     // variable name width and variable label width
>     local nam = 0
>     local var = 0
>     foreach v of local varlist {
>         local nam = max(`nam', length("`v'"))
>         local var = max(`var', length(trim(`"`: variable label `v''"')))
>     }
>
>     local nam = min(`name', `nam')
>     local var = min(`varlabel', `var')
>
>     // spaces is number of spaces between columns
>     local col2 = cond(`nam' == 0, 1, `nam' + `spaces' + 1)
>     local col3 = `col2' + cond(`var' == 0, 0, `var' + `spaces' + 1)
>
>     tempvar which
>     gen long `which' = _n
>     qui levels `which' if `touse', local(levels)
>
>     foreach l of local levels {
>         if mod(`l',2)==1 {
>             di _n as txt "`l'."
>             foreach v of local varlist {
>                 if "`label'" != "" | "`: value label `v''" == "" {
>                     local show : di `: format `v'' `v'[`l'] _skip(20) `: format `v'' `v'[`l'+1]
>                     capture confirm numeric variable `v'
>                     if _rc == 0 local show = trim(`"`show'"')
>                 }
>                 else {
>                     local show `"`: label (`v') `=`v'[`l']' `v'[`l'+1]'"'
>                 }
>
>                 di as txt cond(`nam' > 0, abbrev("`v'", `nam'), "") ///
>                 "{col `col2'}" as txt ///
>                 cond(`var' > 0, abbrev(trim(`"`: var label `v''"'), `var'), "") ///
>                 "{col `col3'}" as res "`show'"
>             }
>         }
>     }
> end