Clive Nicholas
> I'm looking to dummy-code (0/1) which party won the ith seat in the
> jth election. Since this is Blighty, there can only be one winner per
> district, but since that n=3452, that's an awful lot of outcomes to
> code manually! There are five outcome categories: conwin; labwin;
> ldmwin; natwin; and othwin.
>
> Now here's the rub: since it's plurality-rule, I need to tell Stata to
> code, say, conwin=1 and labwin-othwin=0 if, say, for district X:
> conpc=35; labpc=31; ldmpc=16; natpc=17; othpc=1. I've tried several
> generates, such as:
>
> -g conwin=0 if conpc > labpc & ldmpc & natpc & othpc-
> -replace conwin=1 if conpc < labpc & ldmpc & natpc & othpc-,
>
> but, although Stata did not return errors at *any* of my 'solutions',
> each kept producing multiple, rather than unique, 1's for each case
> (or n).
>
> Any ideas as to where I'm going wrong?

I'm going to ignore the possibility of ties for first place.

Suppose, contrary to fact, that there were just two parties.
Then -conwin- would be 1 if the Conservatives won and 0 if
Labour won, i.e.

    gen conwin = conpc > labpc

or, more long-windedly,

    gen conwin = 1 if conpc > labpc
    replace conwin = 0 if conpc < labpc

which has 0 and 1 reversed from what you have.

When you bring in other parties, note that your extra conditions

    & ldmpc & natpc & othpc

are read by Stata as

    & (ldmpc != 0) & (natpc != 0) & (othpc != 0)

which is not what you want. Perhaps you are guessing that Stata
will interpret

    & ldmpc & natpc & othpc

as if it meant

    & (conpc > ldmpc) & (conpc > natpc) & (conpc > othpc)

but that's not the way Stata works.

So, in short, you went wrong (1) because 0 and 1 are the wrong
way round and (2) because you're misinterpreting how compound
conditions are handled.

There's some context at
http://www.stata.com/support/faqs/data/trueorfalse.html
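
For the record, spelling out every comparison does give a working indicator for one party. What follows is only a minimal sketch on toy data: the first row uses the figures from the question and the second row is invented. It assumes no ties and no missing values. Stata treats missing as larger than any number, so a missing share for any party would upset these comparisons.

```stata
* Toy data: row 1 uses the figures from the question; row 2 is invented
clear
input conpc labpc ldmpc natpc othpc
35 31 16 17 1
30 45 15  5 5
end

* conwin is 1 only when the Conservative share beats every other share
gen conwin = (conpc > labpc) & (conpc > ldmpc) & (conpc > natpc) & (conpc > othpc)
list
```

The same line would have to be repeated, with the names permuted, for each of the other four parties, which is one reason to look for something more systematic.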
Now Matt Dobra has already given another solution.
Here's another, which is not better, but nevertheless
shows a Stataish approach useful in many other problems.

First, map from your names to others:

    foreach p in con lab ldm nat oth {
        rename `p'pc pc`p'
    }

Second, -reshape- to long:

    reshape long pc, i(district) j(party) string

Third,

    bysort district (pc) : gen win = _n == _N

generates your -win- variable for all parties at once.

This works as follows:

    bysort district (pc) :

sorts the winning party to the end of each block of
observations, and in that context

    gen win = _n == _N

puts win = 1 in the last observation and win = 0 in
the others in each block.

Fourth, -reshape- back:

    reshape wide pc win, i(district) j(party) string

Fifth, if you prefer your names, map backwards:

    foreach p in con lab ldm nat oth {
        rename pc`p' `p'pc
        rename win`p' `p'win
    }

So, given appropriate variable names, the code
boils down to

    reshape long pc , i(district) j(party) string
    bysort district (pc) : gen win = _n == _N
    reshape wide pc win, i(district) j(party) string
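
To see those three lines at work, here is a small self-contained sketch. The data are a toy example: district "X" uses the figures from the question, district "Y" is invented, and the variables are assumed to be already named pccon, pclab and so on, with a string identifier -district-.

```stata
* Toy data: district X uses the question's figures; district Y is invented
clear
input str1 district pccon pclab pcldm pcnat pcoth
"X" 35 31 16 17 1
"Y" 30 45 15  5 5
end

* One observation per district-party combination
reshape long pc, i(district) j(party) string

* Within each district, sort on share; the last observation is the winner
bysort district (pc) : gen win = _n == _N

* Back to one observation per district, now with wincon, winlab, ...
reshape wide pc win, i(district) j(party) string
list district win*
```

One caveat, as an assumption about your data rather than something I know: if parties that did not stand have missing rather than 0 for their share, missing values sort last and would be picked as the winner, so they would need recoding or dropping first.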
If you had a copy, [R] reshape would be
a place to look. As it is, there are still
examples you can look at in the on-line help
and at
http://www.stata.com/support/faqs/data/reshape3.html
Some examples are very close to this problem.
There was a tutorial on -by:- in Stata Journal 2(1)
86-102 (2002).
Nick

P.S. I've supposed in all this that the data concern a
single election. If there were several, then I think
something like this would work (assuming an extra
variable -year-):

    reshape long pc , i(district year) j(party) string
    bysort district year (pc) : gen win = _n == _N
    reshape wide pc win, i(district year) j(party) string

The -reshape- brings a real bonus whenever the "obvious"
wide data structure turns out to be awkward for some
manipulation (although it can be avoided, as in Matt
Dobra's solution, in some cases by using -egen-
functions).
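
For comparison, here is a sketch of what an -egen- route might look like in the wide layout. This is my own illustration and not necessarily what Matt Dobra posted. The figures are invented, -year- is the assumed extra variable, and -maxpc- is a temporary helper name made up for the example.

```stata
* Toy wide data with invented figures, one row per district-election
clear
input str1 district year conpc labpc ldmpc natpc othpc
"X" 1997 35 31 16 17 1
"X" 2001 30 45 15  5 5
end

* Largest share in each row; rowmax() ignores missing values
egen double maxpc = rowmax(conpc labpc ldmpc natpc othpc)

* A party wins where its own share equals the row maximum
foreach p in con lab ldm nat oth {
    gen `p'win = `p'pc == maxpc
}
drop maxpc
list district year *win
```

Unlike the -bysort- approach, a tie for first place here would give 1 to each of the tied parties, so summing the five -win- variables across a row is one way to spot ties.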