[no subject]

dups provides information about unique and duplicate observations in the
dataset and, optionally, drops all duplicate observations.

varlist is an optional variable list that determines which observations are
duplicates: observations must match exactly on all variables in the list to
be duplicates.  If no varlist is given, then all variables in the dataset
are used to determine duplicates.
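
As a rough sketch of the same kind of check, the example below uses Stata's official duplicates command rather than dups; that substitution is an assumption made here for illustration, since duplicates only appeared in later Stata releases. The toy data and the variable names V1, V2, and dup_id are hypothetical stand-ins for the roster described in the question.

```stata
* Hypothetical toy roster: hhid, persons_id, and two stand-in measurement variables
clear
input hhid persons_id V1 V2
1 1 10 20
1 2 11 21
1 2 11 21
2 1 12 22
2 1 13 23
end

* Is (hhid, persons_id) a unique key? Report how many copies each key has.
duplicates report hhid persons_id

* Flag every observation whose key occurs more than once
* (dup_id = number of other observations sharing the same key)
duplicates tag hhid persons_id, generate(dup_id)
list if dup_id > 0, sepby(hhid)

* Drop records that are identical on ALL variables, keeping one copy of each
drop dup_id
duplicates drop
```

With no varlist, duplicates drop matches on every variable, which parallels the dups default described above. Here the two identical records for person 2 in household 1 collapse to one, while the two records for person 1 in household 2 share an id but differ on V1, so both remain for inspection.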

Sarah

--On Wednesday, August 14, 2002 7:22 PM +0100 "Siyam,AA  (pgr)" 
<[email protected]> wrote:

> Dear Stata-users,
>
> I have a household roster data file which consists of about 20 variables
> measured on household members.  I suspect that persons_id is not unique
> within a household.  Is there a way I can "mass-check" all 20 variables
> between members of the same household to determine duplicate records?
> I thought of the following:
>
> sort hhid persons_id
>
> for var V1-V20: gen DX=X[_n]==X[_n-1]
>
> quietly by hhid: egen DSUM=rsum( DV1. .... DV20)
>
> quietly by hhid: drop if DSUM[n]==20
>
>
> Does that make sense?
>
> Many thanks for your thoughts in advance...
>
> Amani
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



Sarah Mustillo, Ph.D
Center for Developmental Epidemiology
Department of Psychiatry and Behavioral Sciences
Duke University Medical Center

