Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: Looping/Searching across Rows and Columns


From   Nanlesta Pilgrim <[email protected]>
To   [email protected]
Subject   st: Looping/Searching across Rows and Columns
Date   Mon, 18 Nov 2013 18:51:58 -0500

Dear all,
I would be grateful for help writing looping code for the following
situation (or for references I could use to write it myself).  I am
working with a large longitudinal data set (approximately 25 years) in
which some participants' ids were not kept constant over time (due to
movement in and out of communities and households).  Ids are based on
the community and household a person was living in at a given survey
round.  I do have, in a separate file, all the ids that a participant
may have had over the period.  However, in a given round of the data,
an individual might be present two or more times under different ids.
I'm trying to reconcile this.  So far, I've used the file containing
all the ids that a person might have had over time to create a unique
id for each individual.  The file looks like this:

newid   currentid   altid1   altid2
7391    01203       01202    01209
7438    01377       01379    01413
7454    01405       01415    01503

newid: unique id I created for each participant after reshaping the
id file from long format to wide format
currentid:  id for the current round
altid1: alternate id used at some previous round
altid2: alternate id used at some previous round

Note that a person can have up to 5 alternate ids.

I can merge this file with the actual survey data on currentid.

Example of the round X data file:
currentid
 01202
 01203
 01377
 01405
 01415

When I merge the file I created with the round X data file on
currentid, I get:

newid   currentid   altid1   altid2
.       01202       .        .
7391    01203       01202    01209
7438    01377       01379    01413
7454    01405       01415    01503
.       01415       .        .

What I would like to create is code that looks across the columns
and rows to identify which records belong to the same person and
flags them by placing "newid" next to that individual.  For example:
set newid=7391 on any row whose currentid is one of that person's
alternate ids (01202 or 01209), essentially searching up to 5 columns
of data across many rows.
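
One way to picture the search described above is as a lookup table
rather than a loop over columns: every id a person has ever had
(current or alternate) points to that person's newid, and each
currentid in the round file is looked up once.  The sketch below is in
Python rather than Stata and is only an illustration; the variable and
function names (crosswalk, round_x, build_lookup) are made up here, and
the rows are copied from the example tables.  Ids are kept as strings
so that leading zeros are not lost.  In Stata the same idea would
amount to reshaping the id file to long form and merging on the single
id column.

```python
# Crosswalk rows copied from the example; names here are hypothetical.
# Ids are strings so leading zeros ("01203") are preserved.
crosswalk = [
    {"newid": 7391, "currentid": "01203", "altids": ["01202", "01209"]},
    {"newid": 7438, "currentid": "01377", "altids": ["01379", "01413"]},
    {"newid": 7454, "currentid": "01405", "altids": ["01415", "01503"]},
]

# currentid values found in the round X data file.
round_x = ["01202", "01203", "01377", "01405", "01415"]


def build_lookup(rows):
    """Map every id a person has had (current or alternate) to newid."""
    lookup = {}
    for row in rows:
        for any_id in [row["currentid"], *row["altids"]]:
            if not any_id:
                continue  # skip empty alternate-id slots
            known = lookup.setdefault(any_id, row["newid"])
            if known != row["newid"]:
                # The same id listed under two people needs manual review.
                raise ValueError(f"id {any_id} maps to {known} and {row['newid']}")
    return lookup


lookup = build_lookup(crosswalk)

# Attach newid to each row of the round; None means no match was found.
matched = [(cid, lookup.get(cid)) for cid in round_x]

for cid, newid in matched:
    print(cid, newid)
# 01202 7391
# 01203 7391
# 01377 7438
# 01405 7454
# 01415 7454
```

Because the lookup is built once, adding more alternate-id columns only
lengthens each "altids" list; the matching step itself does not change.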

Is this feasible?  Is there an alternative solution that would not be
extremely time consuming given the number of rounds of data?
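
As a rough illustration of how this could scale across rounds, the
sketch below (Python, not Stata) reuses a single id-to-newid table for
every round and reports the people who appear under more than one id
within a round.  The function name repeated_people and the "round_y"
data are invented for the example; only the round X ids and the
id-to-newid pairs come from the tables above.

```python
from collections import defaultdict

# Long-form crosswalk: one entry per id a person has ever used.
# Pairs are taken from the example file above.
id_to_newid = {
    "01203": 7391, "01202": 7391, "01209": 7391,
    "01377": 7438, "01379": 7438, "01413": 7438,
    "01405": 7454, "01415": 7454, "01503": 7454,
}

rounds = {
    "round_x": ["01202", "01203", "01377", "01405", "01415"],
    "round_y": ["01209", "01379", "01503"],  # made-up second round
}


def repeated_people(ids_in_round, id_to_newid):
    """Return {newid: [ids]} for people listed under more than one id."""
    seen = defaultdict(list)
    for cid in ids_in_round:
        newid = id_to_newid.get(cid)
        if newid is not None:
            seen[newid].append(cid)
    return {n: ids for n, ids in seen.items() if len(ids) > 1}


# One pass per round; the crosswalk itself is never rebuilt.
report = {name: repeated_people(ids, id_to_newid) for name, ids in rounds.items()}

for name, dupes in report.items():
    print(name, dupes)
# round_x {7391: ['01202', '01203'], 7454: ['01405', '01415']}
# round_y {}
```

Each round costs one pass over its rows, so the work grows with the
number of records rather than with rounds times id columns.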

With thanks!
Nanlesta