My code solved your problem as stated! But
I appreciate that missings should be ignored.
Try this. Here -1 will be returned in -same-
if and only if all PIDs are missing; otherwise
-same- is 1 if some non-missing PID is repeated
and 0 if not.
gen long id = _n
reshape long PID, i(id)
bysort id (PID) : gen same = cond(mi(PID), -1, PID == PID[_n-1])
bysort id (same) : replace same = same[_N]
reshape wide
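
As a quick check of the three outcomes, here is a minimal sketch on toy data. The three rows and their values are made up for illustration and are not from the original dataset; the commands after -input- are the same as above.

```stata
* Hypothetical toy data: one row with a repeat, one without, one all missing
clear
input PID1 PID2 PID3
705 12 705
3 4 .
. . .
end

gen long id = _n
reshape long PID, i(id)

* Within id, sort by PID (missings sort last); flag a match with the
* previous value, and mark missing PIDs with -1
bysort id (PID) : gen same = cond(mi(PID), -1, PID == PID[_n-1])

* Spread the largest flag within id to every row of that id
bysort id (same) : replace same = same[_N]

reshape wide
list id PID1 PID2 PID3 same
```

With these rows -same- should come out as 1 for id 1 (705 repeated), 0 for id 2 (no repeat among non-missing values) and -1 for id 3 (all missing).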
Nick
Derek Darves
> Thanks all for the comments.
>
> I had to rewrite Nick's suggestion (see original message below) to
> get this to work. In Nick's original formulation every case was
> rendered a "1". I think the problem is that some of the PID
> variables were missing for nearly every case. So, I added a little
> bit of code. I did some error checking and, for the cases that it
> did mark greater than 1, the data are correct. This does not mean,
> of course, that I did not miss cases. Since my goal is to find a
> repeated (non-missing) value in a varlist, will this code do the
> trick? In other words, does anyone see a way that the code below
> could have missed a repeated value in the varlist? This is the code:
> *Start
> clear
> set mem 1000m
> use data, clear
> keep pid* index
> save safecopy, replace
> // Preparations for easy reshape
> local i 1
> foreach var of varlist pid* {
> ren `var' pid`i++'
> }
>
> // Solution for Problem
> reshape long pid, i(index) j(var)
> by index (pid), sort: gen same = sum(pid==pid[_n-1]) if pid!=.
> replace same = 0 if same ==.
> gen same1=0
> bysort index (same) : replace same1 = same[_N]
> drop same
> reshape wide
>
> save shareddirector, replace
> *end
>
>
> On Oct 13, 2005, at 4:43 AM, Nick Cox wrote:
>
> > This is easier done long.
> >
> > save safecopy
> >
> > gen long id = _n
> > reshape long PID, i(id)
> > bysort id (PID) : gen same = PID == PID[_n-1]
> > bysort id (same) : replace same = same[_N]
> > reshape wide
> >
> > Nick
> >
> > Seb Buechte
> >
> >
> >> you could take a "brute force" approach by comparing each var
> >> with all the other vars using two loops:
> >>
> >> gen interlock=0
> >> foreach var1 of varlist PID1 PID2 .... {
> >> foreach var2 of varlist PID2 PID3.... {
> >> if "`var1'"!="`var2'" { // making sure you do not compare the var with itself
> >> replace interlock=1 if `var1' == `var2'
> >> }
> >> }
> >> }
> >>
> >> I am not too sure how long it will take to run through these loops.
> >>
> >
> > Derek Darves
> >
> >
> >>> I have a group of variables:
> >>>
> >>> PID1 - PID15
> >>>
> >>> PID* takes on values from 1 to 8000, and many are missing.
> >>>
> >>> Basically, I would like to make a new variable, called interlock,
> >>> that is equal to 1 if any of the variables in the list are equal
> >>> to any other variable in the list (not including itself, of
> >>> course). For example, if PID5==705 and PID14==705 I would like
> >>> interlock==1
> >>>
> >>> Likewise, if none of the variables in PID* take on the value of
> >>> any of the other variables in PID*, I would like interlock==0
> >>>
> >>> I tried this:
> >>> egen interlock = group(pid1_a pid1_b pid2_a pid2_b pid3_a
> >>> pid3_b pid4_a pid4_b pid5_a pid5_b pid6_a pid6_b pid7_a
> >>> pid7_b pid8_a pid8_b pid9_a pid9_b pid10_a pid10_b pid11_a
> >>> pid11_b pid12_a pid12_b pid13_a pid13_b pid14_a pid14_b pid15_a)
> >>>
> >>> , but it returned all missing values when I know that some share a
> >>> common value in two of the PID* fields.
> >>>
> >>> Lastly, not that it should matter, but the above is a simplifying
> >>> example. In my actual dataset I have about 130 PID* variables. I
> >>> just mention this in case I am hitting some kind of memory
> >>> limitation (I am not receiving any errors when I run the command,
> >>> though, it just doesn't work).
> >>>