I suggest -append-ing the two datasets, with
a marker variable indicating which is which,
and then using -duplicates- to check that
every record has a duplicate on all the other
variables. Any record without one differs
between the two entries.
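
A minimal sketch of that idea, not a tested solution for this particular survey. The file names entry1 and entry2, the marker variable entry, the tag variable dup and the identifier hhcode are all assumptions made for illustration. It also assumes that neither file contains duplicate records of its own.

```stata
* Hypothetical file and variable names throughout.
use entry1, clear
gen byte entry = 1              // marker: first data entry
append using entry2
replace entry = 2 if missing(entry)

* Collect every variable except the marker
ds entry, not

* dup = number of other records identical on all those variables
duplicates tag `r(varlist)', generate(dup)

* dup == 0: no identical record from the other entry, so a mismatch
sort hhcode entry
list if dup == 0, sepby(hhcode)
```

The listing is grouped by household, so both versions of a mismatched record appear next to each other. It shows which households need attention but not which variables differ within them.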
Nick
Carol Kolb wrote:
> I am trying to compare 2 versions of a single dataset that is being
> entered by 2 different people (double-data-entry of a large survey). I
> have used the cf3 command to generate the list of matches &
> mismatches.
> What I would like to do is have these matches/mismatches listed by
> survey/household rather than by variable. Given there are 2000 surveys
> & 100 variables, it would make the correction-process tremendously
> easier for my staff. So, what I would like to see is this:
>
> Household code: XX-DD-123
>   q00 match
>   q01 mismatch
>       _q01  q01  HHcode
>          1    2  XX-DD-123
>   q02 .... etc.
>
> Household code: YY-DD-123
>   q00 mismatch
>       _q00  q00  HHcode
>          2    4  YY-DD-123
>   q01 match
>
> ---------
>
> Rather than:
>   q00 mismatch
>       obs  _q00  q00  HHcode
>        12     2    4  YY-DD-123
>        32     3    2  YD-DE-129
>
>   q01 mismatch
>       obs  _q01  q01  HHcode
>        21     1    2  XX-DD-123
>        33     3    3  YD-DE-129
>
> Any ideas? The by prefix does not work with cf3, from what I can
> tell so far...