Hi,
Nick Cox already mentioned the 'duplicates' command and it's just a
little twist to use it to find non-duplicates. "duplicates" is easy to
set up and works with different types of vars.
duplicates tag zipcode var1-var5, gen(dup)
"dup" holds, for each case, the number of other cases that are identical
on zipcode and var1-var5 (0 for a unique case).
If var1-var5 are constant within a zipcode group, dup + 1 equals the
number of cases in the group (_N):
bysort zipcode : assert _N == dup+1
If the assert fails, there are many ways to spot and correct the errors,
depending on the size of the dataset, the number of vars to compare and
the possible sources of error. One option is to store _N for each zipcode
group in a variable:
bysort zipcode : gen N = _N
The following code tabulates the non-constant vars for each zipcode with errors:
levelsof zipcode if N != dup + 1, local(ziperror)
foreach x of local ziperror {
    di "Zipcode: `x'"
    foreach y of varlist var1-var5 {
        * run quietly first, only to check if the var has more than one non-missing value
        qui tab `y' if zipcode == "`x'"
        * tabulate the var only if it has more than one value
        if r(r) > 1 & r(r) < . tab `y' if zipcode == "`x'"
    }
}
*** An example with an additional string var and some errors
*** (the assert command is commented out)
clear
input str10 zipcode var1 /*
*/ var2 var3 var4 var5 str1 var6
"0182801" 1252 144 115 113 29 "A"
"0182801" 1253 144 115 123 29 "A"
"0182801" 1253 144 115 113 29 "B"
"0182801" 1253 144 115 113 29 "A"
"0183204" 91 8 8 8 0 "C"
"0183204" 90 8 8 8 0 "D"
"0183331" 772 81 64 62 17 "E"
"0183331" 772 81 64 62 17 "F"
"0183331" 772 81 64 62 17 "E"
"0183505" 1716 262 218 211 44 "A"
"0183505" 1716 262 218 211 44 "A"
end
duplicates tag zipcode var1-var6, gen(dup)
* bysort zipcode : assert _N == dup+1
bysort zipcode : gen N = _N
levelsof zipcode if N != dup + 1, local(ziperror)
foreach x of local ziperror {
    di ""
    di "Zipcode: `x'"
    foreach y of varlist var1-var6 {
        * run quietly first, only to check if the var has more than one value
        qui tab `y' if zipcode == "`x'"
        * show only the vars with more than one value
        if r(r) > 1 & r(r) < . tab `y' if zipcode == "`x'"
    }
}
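As one of the "many ways" to spot the errors, here is a minimal sketch of a different idiom, not the duplicates approach above: sort each var within zipcode and compare the first and last value of the group. The toy data and the names differ_* and anydiff are made up for illustration only.

```stata
* toy data (made up): the first zipcode has an error in var1
clear
input str7 zipcode var1 var2
"0182801" 1252 144
"0182801" 1253 144
"0183505" 1716 262
"0183505" 1716 262
end

* after sorting on `y' within zipcode, a var is constant in a group
* exactly when its first and last values in the group are equal
foreach y of varlist var1 var2 {
    bysort zipcode (`y') : gen byte differ_`y' = `y'[1] != `y'[_N]
}

* flag cases in groups where at least one var is not constant
egen byte anydiff = rowmax(differ_*)
list zipcode var1 var2 differ_* if anydiff, sepby(zipcode)
```

The differ_* flags show which var is non-constant in which zipcode group without looping over the zipcodes. Note that a missing value next to a non-missing one counts as a difference here, unlike in the tab approach, which ignores missings.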
Best wishes
Stefan Gawrich
Dillenburg
Germany