st: RE: multiple response to binary

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: multiple response to binary
Date	Fri, 12 Jul 2002 17:57:56 +0100
Lee Sieswerda

> I have a data management problem (WinNT4, Stata v7).
>
> I have data from a questionnaire where some of the questions allow the
> respondent to choose multiple responses. Let's say there are 7 possible
> responses and they could choose any number of them. I would code this as
> a set of 7 binary variables. Unfortunately, the way it was coded was not
> so straightforward. It was coded across 7 variables, but the responses
> were simply entered in the order in which they were given by the
> respondent. So the data look like this:
>
> f4m1      f4m2      f4m3      f4m4      f4m5      f4m6      f4m7
>     1         7         4         .         .         .         .
>     1         .         .         .         .         .         .
>     1         .         .         .         .         .         .
>     1         .         .         .         .         .         .
>     7         3         .         .         .         .         .
>     1         .         .         .         .         .         .
>     1         2         3         4         .         .         .
>     1         2         3         4         6         .         .
>     1         2         7         .         .         .         .
>     1         .         .         .         .         .         .
>
> As you can see, you cannot simply tabulate the number of people who
> responded 1, 2, 3, etc., because the responses are scattered over the 7
> variables in a different order for every person. The folks who provided
> me with this data use SPSS, and they get around this problem by using
> "multiple response sets". In SPSS, you can define a set of variables as
> a multiple response set (in this case, seven variables) and then ask for
> tables of frequencies and crosstabs generated from across the 7
> variables. It works, but I'd much rather use Stata than SPSS. Also, the
> SPSS solution is limited to simple tables and doesn't permit you to get
> chi-square or other statistics.
>
> Now, in Stata I know I can generate dummy variables from this mess like
> this:
> gen dum1 = 0
> replace dum1 = 1 if f4m1==1 | f4m2==1 | f4m3==1 etc.
> replace dum1 = . if f4m1==. & f4m2==. & f4m3==. etc.
> gen dum2 = 0
> etc.
>
> However, this is tedious in the extreme and there are many of these
> multiple response questions in the dataset. I could automate the
> procedure somewhat using -foreach-, but it's still more brute force than
> elegance. Someone told me about a SAS solution to this problem using an
> array procedure. Does anyone have a nice elegant Stata solution to this
> problem?

Is this what you want?

1. Use -tabm- from -tab_chi- on SSC.

Advantage: You keep the same data structure.
Disadvantage: Ugly table.
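
If -tabm- is not yet installed, the package can usually be fetched from within Stata. A minimal sketch, assuming a net-aware Stata with internet access (this line is not part of the original reply):

```stata
* Install the tab_chi package from SSC; -tabm- is one of its commands
ssc install tab_chi
```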

. l

           f4m1        f4m2        f4m3        f4m4        f4m5        f4m6        f4m7
  1.          1           7           4           .           .           .           .
  2.          1           .           .           .           .           .           .
  3.          1           .           .           .           .           .           .
  4.          1           .           .           .           .           .           .
  5.          7           3           .           .           .           .           .
  6.          1           .           .           .           .           .           .
  7.          1           2           3           4           .           .           .
  8.          1           2           3           4           6           .           .
  9.          1           2           7           .           .           .           .
 10.          1           .           .           .           .           .           .

. tabm f4m?

           |                         Values
  Variable |         1          2          3          4          6 |     Total
-----------+-------------------------------------------------------+----------
      f4m1 |         9          0          0          0          0 |        10
      f4m2 |         0          3          1          0          0 |         5
      f4m3 |         0          0          2          1          0 |         4
      f4m4 |         0          0          0          2          0 |         2
      f4m5 |         0          0          0          0          1 |         1
-----------+-------------------------------------------------------+----------
     Total |         9          3          3          3          1 |        22


           |   Values
  Variable |         7 |     Total
-----------+-----------+----------
      f4m1 |         1 |        10
      f4m2 |         1 |         5
      f4m3 |         1 |         4
      f4m4 |         0 |         2
      f4m5 |         0 |         1
-----------+-----------+----------
     Total |         3 |        22

. tabm f4m? , trans

           |                        Variable
    Values |      f4m1       f4m2       f4m3       f4m4       f4m5 |     Total
-----------+-------------------------------------------------------+----------
         1 |         9          0          0          0          0 |         9
         2 |         0          3          0          0          0 |         3
         3 |         0          1          2          0          0 |         3
         4 |         0          0          1          2          0 |         3
         6 |         0          0          0          0          1 |         1
         7 |         1          1          1          0          0 |         3
-----------+-------------------------------------------------------+----------
     Total |        10          5          4          2          1 |        22

2. Or -reshape- to long.

Advantage: Much nicer tables, more control.
Disadvantage: Different data structure.

. gen id = _n
. reshape long f4m , i(id)

. table f4m _j

----------------------------------------
          |              _j
      f4m |    1     2     3     4     5
----------+-----------------------------
        1 |    9
        2 |          3
        3 |          1     2
        4 |                1     2
        6 |                            1
        7 |    1     1     1
----------------------------------------

. table _j f4m

----------------------------------------------
          |                f4m
       _j |    1     2     3     4     6     7
----------+-----------------------------------
        1 |    9                             1
        2 |          3     1                 1
        3 |                2     1           1
        4 |                      2
        5 |                            1
----------------------------------------------
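
If what is wanted in the end is the set of binary variables asked about, one further possibility is to build them directly from the original wide layout. The following is a minimal sketch only, not part of the original reply. It assumes a more recent Stata in which -egen- has the functions anymatch() and rownonmiss() (in older releases such as Stata 7 I believe these were called eqany() and robs()). The names nanswered and dum1-dum7 are made up for illustration, and the example data are retyped so that the block runs on its own.

```stata
* Illustrative sketch: 0/1 indicators from the wide multiple-response layout
clear
input f4m1 f4m2 f4m3 f4m4 f4m5 f4m6 f4m7
1 7 4 . . . .
1 . . . . . .
1 . . . . . .
1 . . . . . .
7 3 . . . . .
1 . . . . . .
1 2 3 4 . . .
1 2 3 4 6 . .
1 2 7 . . . .
1 . . . . . .
end

* Number of nonmissing answers given by each respondent
egen byte nanswered = rownonmiss(f4m1-f4m7)

forvalues k = 1/7 {
    * 1 if any of the seven variables equals `k', 0 otherwise
    egen byte dum`k' = anymatch(f4m1-f4m7), values(`k')
    * Respondents who gave no answer at all are set to missing
    replace dum`k' = . if nanswered == 0
}

tabulate dum1
```

Each indicator can then be fed to -tabulate- against any other variable, with the chi2 option if a test is wanted.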

Nick
[email protected]
