Derek Darves wrote:
> I have a group of variables:
>
> PID1 - PID15
>
> PID* takes on values from 1 to 8000, and many are missing.
>
> Basically, I would like to make a new variable, called interlock,
> that is equal to 1 if any of the variables in the list are equal to
> any other variable in the list (not including itself, of course).
> For example, if PID5==705 and PID14==705 I would like interlock==1
>
> Likewise, if none of the variables in PID* take on the value of
> any of the other variables in PID*, I would like interlock==0
>
> I tried this:
> egen interlock = group(pid1_a pid1_b pid2_a pid2_b pid3_a
> pid3_b pid4_a pid4_b pid5_a pid5_b pid6_a pid6_b pid7_a
> pid7_b pid8_a pid8_b pid9_a pid9_b pid10_a pid10_b pid11_a
> pid11_b pid12_a pid12_b pid13_a pid13_b pid14_a pid14_b pid15_a)
>
> , but it returned all missing values when I know that some share a
> common value in two of the PID* fields.
>
> Lastly, not that it should matter, but the above is a simplifying
> example. In my actual dataset I have about 130 PID* variables. I just
> mention this in case I am hitting some kind of memory limitation (I
> am not receiving any errors when I run the command, though, it just
> doesn't work).
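egen's group() will not do this. It numbers the distinct combinations
of values across observations, and it returns missing for every
observation in which any of the listed variables is missing (unless
you specify the missing option). Since many of your PID* values are
missing, that is why you got all missing values; it is not a memory
problem.

There might be easier solutions, but for problems like this I always
think of reshape followed by by: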
----------------------------------------------
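// Preparations for easy reshape: rename the variables to pid1, pid2, ...
local i 1
foreach var of varlist pid* {
    ren `var' pid`i++'
}
gen index = _n

// Solution for the problem
reshape long pid, i(index) j(var)
// Within each original observation, sort by pid and count adjacent
// equal values. Missing values are excluded, because . == . is true
// and would otherwise count as a match.
by index (pid), sort: gen interlock = sum(pid == pid[_n-1] & !missing(pid))
// Turn the running count into a 0/1 indicator, constant within index
by index: replace interlock = interlock[_N] >= 1
reshape wide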
------------------------------------------------
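
If the reshape feels heavy with about 130 variables, comparing the
variables pairwise in a loop is another option. What follows is only a
sketch on a made-up dataset: the variable names pid1-pid4 and the
values are invented for illustration, and I have not run it on data
like yours. It flags a row as soon as any two nonmissing pid variables
agree.

```stata
clear
input pid1 pid2 pid3 pid4
705   .  12 705
  3   .   .   8
  .   .   .   .
 41  41   .   .
end

// Collect the variable names and count them
unab pids : pid*
local n : word count `pids'

// Start at 0 and switch to 1 whenever a pair of nonmissing values matches
gen byte interlock = 0
forvalues i = 1/`=`n' - 1' {
    local a : word `i' of `pids'
    forvalues j = `=`i' + 1'/`n' {
        local b : word `j' of `pids'
        // !missing() keeps two missing values from counting as a match
        replace interlock = 1 if `a' == `b' & !missing(`a')
    }
}
list
```

With k variables this issues k(k-1)/2 replace commands, so 8,385 of
them for 130 variables. That is more typing for Stata than the
reshape, but it leaves the data in wide form throughout and does not
require renaming anything.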
. by by
Uli
--
+49 (030) 25491-361