I suggest -append-ing the two datasets, with
a marker variable indicating which is which,
and then using -duplicates- to check that
each record has a duplicate on all the other variables.
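
Here is a rough sketch of that idea, not a tested recipe for your data. The three households and their values are made up (ZZ-DD-123 is hypothetical, added so that one household agrees), and the names source, dup and first are my own; only HHcode, q00 and q01 come from your example.

```stata
* Toy stand-ins for the two keyed files; values are made up
clear
input str9 HHcode q00 q01
"XX-DD-123" 1 1
"YY-DD-123" 2 3
"ZZ-DD-123" 5 5
end
generate byte source = 1
tempfile first
save `first'

clear
input str9 HHcode q00 q01
"XX-DD-123" 1 2
"YY-DD-123" 4 3
"ZZ-DD-123" 5 5
end
generate byte source = 2
append using `first'

* Count surplus copies on every variable except the marker
ds source, not
duplicates tag `r(varlist)', generate(dup)

* dup == 0: this record has no identical twin, so the two entries disagree
sort HHcode source
list HHcode source q00 q01 if dup == 0, sepby(HHcode)
```

With these toy values the listing should show both versions of XX-DD-123 and YY-DD-123, grouped by household, and omit ZZ-DD-123. It flags which households disagree, not which variable, and it assumes that neither file contains repeated records of its own.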
Nick 
Carol Kolb
 
> I am trying to compare 2 versions of a single dataset that is being
> entered by 2 different people (double-data-entry of a large survey). I
> have used the cf3 command to generate the list of matches & 
> mismatches.
> What I would like to do is have these matches/mismatches listed by
> survey/household rather than by variable. Given there are 2000 surveys
> & 100 variables, it would make the correction-process tremendously
> easier for my staff. So, what I would like to see is this:
> 
> Household code: XX-DD-123
> q00  match
> q01  mismatch
>        _q01   q01   HHcode
>            1         2    XX-DD-123
> q02 .... etc.
> 
> Household code: YY-DD-123
> q00 mismatch
>       _q00     q00   HHcode
>           2          4      YY-DD-123
> q01 match
> 
> ---------
> 
> Rather than:
> q00 mismatch
> obs  _q00    q00    HHcode
> 12       2          4       YY-DD-123
> 32       3          2       YD-DE-129
> 
> q01 mismatch
> obs   _q01   q01   HHcode
> 21        1        2      XX-DD-123
> 33        3        3      YD-DE-129
> 
> Any ideas? By commands do not work with cf3 from what I can tell so
> far...