Mary Zellmer-Bruhn wrote,
> I am interested in calculating kappa for interrater agreement. I have a
> dataset that contains about 350 individuals (members of about 90 teams).
> The individuals are the rows in the data. Columns are a set of 13 items on
> which the individuals answered yes (1) or no (0). I want to calculate the
> interrater agreement on these 13 items. I think the kappa command should do
> this, but I am not sure about the exact programming necessary. I'd
> appreciate any insights or comments.
If I understand the question, Mary wishes to calculate kappa when there are
more than two raters (the 350 individuals) and two possible ratings (the "yes"
or "no" answers). Kappa will be calculated over the 19 questions.
-kappa- is the command and the syntax for kappa in this case is
kappa pos neg
where variable pos records the number of raters assessing positive ("yes")
and variable neg records the number of raters assessing negative ("no").
In Mary's case, this two-variable dataset would have 19 observations, one for
each question.
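To make the expected layout concrete, here is a minimal sketch of a dataset
already in that form. The counts are invented for illustration (five
hypothetical questions, each answered by 350 raters); they are not Mary's data.

```stata
* Hypothetical counts, invented for illustration only:
* one observation per question, pos = number of "yes", neg = number of "no"
clear
input pos neg
210 140
180 170
300  50
 95 255
260  90
end

* Each row should account for all 350 raters
assert pos + neg == 350

* kappa for nonunique raters, taking the frequency variables as input
kappa pos neg
```

Each observation is one subject being rated (here, a question), and each
variable is one rating category.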
Mary's data do not look like that. She is starting with a 350-observation
dataset that looks like this:
          personid    q1    q2    ...    q19
        --------------------------------------
    1.        100      1     0    ...      1
    2.        105      1     1    ...      0
   ..          ..     ..    ..    ...     ..
  350.       4422      0     1    ...      1
Perhaps her dataset contains other variables as well; it does not matter.
Anyway, assuming the variables are named as I have shown them above, here is
what Mary needs to type:
. reshape long q, i(personid) j(qnum)
. sort qnum
. by qnum: gen pos = sum(q==1)
. by qnum: gen neg = sum(q==0)
. by qnum: keep if _n==_N
. kappa pos neg
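A shorter variant of the same steps, sketched here as an assumption rather
than as part of the original answer, lets -collapse- form the totals instead of
running sums. The toy data below (4 raters, 3 questions) are invented so that
the block can be run on its own.

```stata
* Hypothetical toy data: 4 raters, 3 questions (values invented)
clear
input personid q1 q2 q3
100 1 0 1
105 1 1 0
110 0 0 1
120 1 0 1
end

* Wide to long: one observation per person-question pair
reshape long q, i(personid) j(qnum)

* Indicators for each answer; a missing answer counts toward neither total
generate byte pos = (q == 1)
generate byte neg = (q == 0)

* One observation per question holding the yes and no counts
collapse (sum) pos neg, by(qnum)
list qnum pos neg

kappa pos neg
```

Note that -collapse- drops every variable not named in it, which is fine here
because only pos and neg are needed.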
The key to the solution is the -reshape- command, which allowed me to convert
Mary's wide data to the long form. After -reshape-, Mary's data looks like:
           personid   qnum    q
         ---------------------------
     1.       100       1     1   --+
     2.       100       2     0     |  this was previously obs 1
    ..         ..      ..    ..     |
    19.       100      19     1   --+
    20.       105       1     1   --+
    21.       105       2     1     |  this was previously obs 2
    ..         ..      ..    ..     |
    38.       105      19     0   --+
    ..         ..      ..    ..
  6632.      4422       1     0   --+
  6633.      4422       2     1     |  this was previously obs 350
    ..         ..      ..    ..     |
  6650.      4422      19     1   --+
With the data in this form, I can now order it on question number, typing
-sort qnum-, to obtain
           personid   qnum    q
         ---------------------------
     1.       100       1     1
     2.       105       1     1
    ..         ..      ..    ..
   350.      4422       1     0
   351.       100       2     0
   352.       105       2     1
    ..         ..      ..    ..
   700.      4422       2     1
    ..         ..      ..    ..
  6301.       100      19     1
  6302.       105      19     0
    ..         ..      ..    ..
  6650.      4422      19     1
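Actually, the observations will not be ordered by personid within qnum unless
I type -sort qnum personid-, but that does not matter. Type that if you want.
In any case, I am now in a position to calculate the sums of positive and
negative responses, by question number, and produce the desired 19-observation
dataset. Because -sum()- returns a running sum, the last observation within
each qnum holds the totals, which is why I keep only that observation.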
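To see the running sums at work, here is a small sketch that applies the same
commands to invented toy data (4 raters, 3 questions; the values are
hypothetical) and lists the data before and after the -keep-.

```stata
* Hypothetical toy data: 4 raters, 3 questions (values invented)
clear
input personid q1 q2 q3
100 1 0 1
105 1 1 0
110 0 0 1
120 1 0 1
end

reshape long q, i(personid) j(qnum)
sort qnum personid

* sum() accumulates down the observations within each question
by qnum: generate pos = sum(q == 1)
by qnum: generate neg = sum(q == 0)

* Still one row per person-question pair; pos and neg are running totals
list qnum personid q pos neg, sepby(qnum)

* The last row within each question carries the final totals
by qnum: keep if _n == _N
list qnum pos neg
```

With these toy values the final listing has three observations, one per
question, and pos + neg equals 4 in each.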
-- Bill