Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Cleaning Survey Data
From: Erika Kociolek <[email protected]>
To: [email protected]
Subject: st: Cleaning Survey Data
Date: Wed, 5 Feb 2014 09:47:28 -0800
When working with survey data - specifically closed-ended, multiple
response questions - datasets are often structured like this:
Q1_R1  Q1_R2  Q1_R3  Q1_ROTHER
1      3      98     "lemons"
2
1      2
I ultimately want to know the number of respondents who selected each
of 1, 2, 3, and 98, so I write code that looks something like this:
local values 1 2 3 98
foreach x of local values {
    generate Q1_`x'_flag = `x' if (Q1_R1 == `x' | Q1_R2 == `x' | Q1_R3 == `x')
}
Is there a better way to get to the goal (what's below)?
label define Q1_1_label 1 "Milk"
label values Q1_1_flag Q1_1_label
label define Q1_2_label 2 "Bread"
label values Q1_2_flag Q1_2_label
label define Q1_3_label 3 "Apples"
label values Q1_3_flag Q1_3_label
label define Q1_98_label 98 "Other"
label values Q1_98_flag Q1_98_label
Q1_1_flag  Q1_2_flag  Q1_3_flag  Q1_98_flag
1                     1          1
           1
1          1
It can be tedious to type out "if (Q1_R1 == | Q1_R2 == | Q1_R3 == |
...)" when different questions have different numbers of variables and
there are many possible responses to a given question (e.g. Q1_R1
through Q1_R17).
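
One possible way to avoid typing the condition by hand is sketched below.
It is only a sketch, built on the toy data from the example above. It
assumes the numeric response variables all match the pattern Q1_R* and
that a 0/1 flag is acceptable (egen's anymatch() returns 0 rather than
missing when nothing matches, and 1 rather than the response code when
something does). The local macro name qvars is made up for illustration.

```stata
* Toy data mirroring the example above
clear
input Q1_R1 Q1_R2 Q1_R3 str10 Q1_ROTHER
1 3 98 "lemons"
2 .  . ""
1 2  . ""
end

* Collect only the numeric Q1_R* variables, so the string
* variable Q1_ROTHER is left out of the comparison
ds Q1_R*, has(type numeric)
local qvars `r(varlist)'

local values 1 2 3 98
foreach x of local values {
    * anymatch() is 1 if any variable in the list equals `x', else 0
    egen byte Q1_`x'_flag = anymatch(`qvars'), values(`x')
}

* Number of respondents who selected each response
foreach x of local values {
    count if Q1_`x'_flag == 1
}
```

Because the variable list is built from a wildcard, the same loop should
carry over to a question with Q1_R1 through Q1_R17 without listing each
variable in the condition.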
Thanks for any advice you have.
Best,
Erika