Generally you should not use -foreach- to loop over observations; use -by-
instead. In your case you can reshape the data to long format and do the
check there. Something like this should work:
. ren person_id id1
. ren mother_id id2
. reshape long id, i(index) j(idtyp)
. sort id idtyp
. by id: gen control = idtyp[1]==2 if id ~= ""
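The variable "control" is 1 for every ID that occurs only as a mother_id and
never as a person_id, so if your data are OK it should be all zero (and
missing for the blank IDs). Note that "index" has to be an existing variable
that uniquely identifies each observation; if you have none, -gen index = _n-
before the reshape will do.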
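
To see the idea at work, here is a minimal sketch on a toy version of the
sample data. Several things in it are my assumptions, not part of the
original data: the IDs are taken to be str12 string variables, "index" is
created as a simple row counter, the helper variable "first" is only there
to list each ID once, and the mother ID 042208201555 is made up so that
there is something to find.

```stata
* Toy data: a few rows from the sample, plus one made-up bad mother_id
* (042208201555) that matches no person_id.
clear
input str12 person_id str12 mother_id
"042208201005" ""
"042208201007" ""
"042208201001" "042208201002"
"042208201002" "042208201005"
"042208201003" "042208201007"
"042208201009" "042208201555"
end

* reshape needs an i() variable that uniquely identifies each row;
* "index" is an assumption, created here as a row counter.
gen long index = _n

rename person_id id1
rename mother_id id2
reshape long id, i(index) j(idtyp)

* Within each id, idtyp == 1 (person) sorts before idtyp == 2 (mother),
* so idtyp[1] == 2 means the id never occurs as a person_id.
sort id idtyp
by id: gen byte control = idtyp[1] == 2 if id != ""

* Flag the first row of each id so every offending id is listed once.
by id: gen byte first = _n == 1
list id if control == 1 & first
```

On this toy data only the made-up ID should be listed. Note that the sketch
checks IDs across the whole file; if the same ID can occur in different
families, you would probably want to add a family identifier to the sort and
the -by- varlist.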
uli
Jisheng Cui wrote:
> Following is a sample data set with two columns giving each person's ID and
> mother's ID within a family. I am looking for the best way to check
> whether each mother's ID is one of the person IDs; if it is not, something
> is wrong with the data. Please note: (1) We do not need to check blank
> mother's IDs.
> (2) There are some duplicate mother's IDs in the family. If a mother's ID is
> one of the person IDs, then we skip its duplicates. (3) There are
> thousands of such families, so the program has to be computationally efficient.
> Be aware that -foreach- does not seem to work with the -by- command.
>
> With best wishes,
>
> Jisheng.
>
>
> person_id mother_id
>
> 042208201006
> 042208201014
> 042208201008
> 042208201099
> 042208201005
> 042208201007
> 042208201097 042208201001
> 042208201098 042208201001
> 042208201001 042208201002
> 042208201094 042208201005
> 042208201093 042208201005
> 042208201002 042208201005
> 042208201095 042208201005
> 042208201096 042208201005
> 042208201003 042208201007
> 042208201009 042208201007
> 042208201010 042208201007
> 042208201011 042208201007
> 042208201013 042208201011
> 042208201012 042208201011
-
http://www.sowi.uni-mannheim.de/lesas