You can get the alphabetically first of the two names like this:
gen first = cond(name < twinname, name, twinname)
Then sort on -first-. Here < compares strings in alphanumeric order.
Thus "Helena Harris" < "Joanne Moore", and in both observations for that
pair -first- will be returned as "Helena Harris". This can serve as a twin
id, or you can map it to integers using -egen, group()-.
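
As a minimal sketch of the steps above, here is the example data from the
question typed in by hand. The variable names -name- and -twinname- and the
str20 widths are assumptions about how the data are actually stored.

```stata
clear
input str20 name str20 twinname
"Helena Harris" "Joanne Moore"
"Kevin Dedman"  "Tom Dedman"
"Freya Lamb"    "Sandra Scott"
"Sandra Scott"  "Freya Lamb"
"Tom Dedman"    "Kevin Dedman"
"Joanne Moore"  "Helena Harris"
end

* the alphabetically earlier of the two names identifies the pair
gen first = cond(name < twinname, name, twinname)

* map the distinct values of first to integers 1, 2, 3, ...
egen TwinID = group(first)

sort TwinID name
list name twinname first TwinID, sepby(TwinID)
```

Note that -egen, group()- numbers the pairs in the sorted order of -first-,
so the integers need not match the numbering shown in the question, but
they identify the same pairs.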
This method is not robust to any leading or trailing spaces (use
-trim()- if any are present) or differences in spelling, but neither is
any other automated method.
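
One possible way to tidy the names first and then flag pairs that do not
match up is sketched here. The toy data, including the stray spaces and
the misspelling "Dedmann", are made up for illustration, and -second- and
-npair- are just illustrative variable names.

```stata
clear
input str20 name str20 twinname
"Helena Harris" "Joanne Moore"
"Joanne Moore"  "Helena Harris"
"Kevin Dedman"  "Tom Dedman"
"Tom Dedman"    "Kevin Dedmann"
end

* add stray leading spaces to one value (made up, to show the problem)
replace twinname = "  " + twinname in 2

* remove leading and trailing spaces before comparing
replace name = trim(name)
replace twinname = trim(twinname)

gen first = cond(name < twinname, name, twinname)
gen second = cond(name < twinname, twinname, name)

* a consistent pair appears exactly twice; anything else needs a look
bysort first second: gen npair = _N
list name twinname if npair != 2
```

Whatever is listed by the last command still has to be fixed by eye, as
-trim()- does nothing about spelling differences.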
Nick
Maria Garden
I have the following problem: I have data where people were asked
to provide their name and the name of their twin.
I would like to generate an identifier for each twin pair. How can I
do that without searching for each name by hand? I have a couple of
hundred people.
My data looks like this:
name            Twin name       TwinID
Helena Harris   Joanne Moore    1
Kevin Dedman    Tom Dedman      2
Freya Lamb      Sandra Scott    3
Sandra Scott    Freya Lamb      3
Tom Dedman      Kevin Dedman    2
Joanne Moore    Helena Harris   1
How can I generate the variable TwinID? Can anyone help?