Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Looping/Searching across Rows and Columns
From: Sergiy Radyakin <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: Looping/Searching across Rows and Columns
Date: Mon, 18 Nov 2013 20:41:45 -0500
Nanlesta,
May I suggest reshaping your ID reference file from wide to long? I
think you can make it look like the following:
roundid uniqueid
1 201
2 201
3 201
4 202
5 202
....
66 213
66 217
Then, when you merge, match on roundid and pull in uniqueid. That
works as long as your ids were not reused for other people in other
rounds, which I assume is the case. Otherwise you would need some
applicability constraints for your 5 reference ids.
This keeps the coding minimal: you rely on standard Stata commands
(reshape and merge) and avoid explicit looping.
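A rough sketch of the reshape-and-merge idea, assuming the wide reference
file from your message. I keep your variable names, so currentid plays the
role of roundid and newid the role of uniqueid. The stub names id1-id3, the
variable slot, and the tempfile idlong are made up for illustration, and the
ids are entered as strings on the assumption that the leading zeros matter.
With 5 alternate ids you would simply have more id columns to rename.

```stata
* Sketch only: toy copy of the wide reference file (names are assumptions)
clear
input newid str5 currentid str5 altid1 str5 altid2
7391 "01203" "01202" "01209"
7438 "01377" "01379" "01413"
7454 "01405" "01415" "01503"
end

* Put all id columns under one stub so reshape can stack them
rename currentid id1
rename altid1 id2
rename altid2 id3

* wide -> long: one row per (person, id ever used)
reshape long id, i(newid) j(slot)

* People with fewer alternate ids leave empty rows behind
drop if missing(id)
drop slot
rename id currentid

* The same id listed twice for the same person is harmless
duplicates drop newid currentid, force

* Stops with an error if one id belongs to two people (ids were reused)
isid currentid

tempfile idlong
save `idlong'

* Toy copy of the round X file
clear
input str5 currentid
"01202"
"01203"
"01377"
"01405"
"01415"
end

* Each row picks up the newid of whoever ever held that id
merge m:1 currentid using `idlong', keep(master match) nogenerate
sort currentid
list
```

If isid stops with an error, some id is attached to more than one person,
and that is the case where a plain merge on the id is not enough.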
Hope this helps.
Best, Sergiy Radyakin
On Mon, Nov 18, 2013 at 6:51 PM, Nanlesta Pilgrim <[email protected]> wrote:
> Dear all,
> I would be grateful for assistance in how to create looping code for
> the following situation (or references that I can use to create the
> code). I am working with a large longitudinal data set (approx. 25
> years) in which some participants' ids were not kept constant over
> time (due to movement in and out of communities and households). Ids
> are created based on the community and household one is living in at
> that survey round. I do have all the ids that a participant may have
> had over the time period in a separate file. However, in a given round
> of the data, an individual might be present two or more times but
> under different ids. I'm trying to reconcile this issue. Thus far,
> I've linked the file containing all the ids that a person might have
> had over time and have created a unique id for the individual. The
> file looks this way:
>
> newid  currentid  altid1  altid2
> 7391   01203      01202   01209
> 7438   01377      01379   01413
> 7454   01405      01415   01503
>
> newid: unique id for the participant, which I created by taking the
> id file from a long format to a wide format
> currentid: id for the current round
> altid1: alternate id used at some previous round
> altid2: alternate id used at some previous round
>
> Note that a person can have up to 5 alternate ids.
>
> Where the actual data is stored, I can merge on currentid.
>
> Example of round X datafile.
> currentid
> 01202
> 01203
> 01377
> 01405
> 01415
>
> When I merge my created data file with the round X data file using
> currentid, I get:
>
> newid  currentid  altid1  altid2
> .      01202      .       .
> 7391   01203      01202   01209
> 7438   01377      01379   01413
> 7454   01405      01415   01503
> .      01415      .       .
>
> What I would like to create is code that looks across the columns
> and rows to identify who are the same people and indicates
> that they are the same by placing "newid" next to that individual.
> For example: place newid=7391 if any of the alternate ids (01202 or
> 01209) also appears as a currentid, essentially looking over up to 5
> columns of data but many rows.
>
> Is this feasible? Is there an alternative solution that would not be
> extremely time consuming given the number of rounds of data?
>
> With thanks!
> Nanlesta
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/