Re: st: how can I group airport markets
From: Nick Cox <[email protected]>
To: [email protected]
Date: Wed, 20 Apr 2011 08:47:57 +0100
This was last asked on 14 April. That shows how consultation of the
archives is a good idea. See
http://www.stata.com/statalist/archive/2011-04/msg00767.html
and the resulting thread.
I will assume string variables. The trick is to realise that LAX MIA
and MIA LAX reduce to the same pair once the two codes within each
pair are put in alphabetical order. So how do we do that? The
functions -min()- and -max()- don't take string arguments, so we turn
instead to -cond()-.
gen first = cond(origin > destination, destination, origin)
gen second = cond(origin < destination, destination, origin)
Note that there is no difficulty about applying > and < to strings --
the expression ("b" > "a") evaluates as true (1), for example.
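A quick way to convince yourself is to try a few comparisons with
-display-. This is only an illustrative sketch; the literal strings
are made up for the purpose. Note that the comparison is
case-sensitive, with upper-case letters sorting before lower-case
ones, so "lax" and "LAX" would not be treated as the same code.

```stata
* string comparisons return 1 (true) or 0 (false)
display ("b" > "a")
display ("LAX" < "MIA")

* case matters: upper-case letters sort before lower-case letters
display ("Z" < "a")
```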
Then you are home and dry with
egen group = group(first second), label
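Put together, a minimal sketch using the example rows from the
original question might look like this. The variable names -origin-
and -destination- and the str3 storage type are assumptions about the
poster's data, and -market- is just the name used in the question.

```stata
clear
input str3 origin str3 destination
"LAX" "MIA"
"LAX" "MIA"
"MIA" "LAX"
"MIA" "LAX"
"MIA" "LAX"
"LAS" "LAX"
"LAX" "LAS"
end

* alphabetically earlier code of each pair
gen first = cond(origin > destination, destination, origin)

* alphabetically later code of each pair
gen second = cond(origin < destination, destination, origin)

* one identifier per unordered pair, labelled with the two codes
egen market = group(first second), label

list origin destination first second market
```

With these rows there should be just two markets: one for LAS and LAX
and one for LAX and MIA, regardless of the direction of the flight.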
The thread started with a reference to
http://www.ats.ucla.edu/stat/stata/faq/dyad_ids.htm
but after reading it I still prefer the method above, which was
earlier documented at
SJ-8-4 dm0043. Tip 71: The problem of split identity, or how to group
dyads. N. J. Cox. Q4/08. SJ 8(4):588--591 (no commands).
Tip on how to handle dyadic identifiers.
If you have numeric variables with value labels, you can still use
exactly the same technique, but the grouping will not necessarily be
in alphabetical order: that depends on your value labels.
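As a sketch of the numeric case, suppose the codes had been created by
-encode- with a shared value label. The names -o-, -d- and -airport-
are hypothetical, chosen only for this illustration. The pairs still
group correctly, but "first" now means the lower numeric code, which
need not be the alphabetically earlier airport.

```stata
clear
input str3 origin str3 destination
"LAX" "MIA"
"MIA" "LAX"
"LAX" "LAS"
end

* encode both variables against one shared value label;
* the second call adds any new codes to the existing label
encode origin, gen(o) label(airport)
encode destination, gen(d) label(airport)

* same technique as for strings, applied to the numeric codes
gen first = cond(o > d, d, o)
gen second = cond(o < d, d, o)
label values first airport
label values second airport

egen market = group(first second), label
list origin destination first second market
```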
Also, it is possible that this method will uncover some typos in your
data, so you would need to fix those and re-do the grouping.
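One possible way to look for such problems is sketched below. It
assumes that valid airport codes are exactly three upper-case letters,
which is an assumption about the data rather than something stated in
the question, and the deliberately mistyped "lax" is invented for the
example.

```stata
clear
input str3 origin str3 destination
"LAX" "MIA"
"MIA" "lax"
"LAX" "MIA"
end

* flag values that are not exactly three upper-case letters
list if !regexm(origin, "^[A-Z][A-Z][A-Z]$") | ///
    !regexm(destination, "^[A-Z][A-Z][A-Z]$")

* one possible fix: trim blanks and force upper case
replace origin = upper(trim(origin))
replace destination = upper(trim(destination))

* build the pair variables and the grouping from the cleaned data
gen first = cond(origin > destination, destination, origin)
gen second = cond(origin < destination, destination, origin)
egen market = group(first second), label
```

Misspellings such as LAZ for LAX cannot be caught by a pattern like
this; tabulating -first- and -second- and reading the list is still
worthwhile.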
Nick
P.S. Does "PhD in Economics" mean you have one, or you hope to get one?
On Wed, Apr 20, 2011 at 6:18 AM, <[email protected]> wrote:
> I have a large dataset on flights with information about origins and
> destinations.
> For example: origin - destination
> LAX - MIA
> LAX - MIA
> MIA - LAX
> MIA - LAX
> MIA - LAX
> LAS - LAX
> LAX - LAS
> ... ...
> How can I group LAX-MIA and MIA-LAX into the same market? If I use
> egen market = group (origin destination), they will be grouped into 2
> different markets. Another way is to sort the data into the same origins
> to the same destinations, but I don't know how to do that either.
> For example: origin - destination
> LAX - MIA
> LAX - MIA
> LAX - MIA
> LAX - MIA
> LAX - MIA
> LAX - LAS
> LAX - LAS
> ... ...
> Thanks!
>
>
> Dan Luo
> PhD in Economics
> University of California, Irvine