Title: Apply labels after reshape
Author: Theresa Boswell, StataCorp
After reshaping my dataset using the reshape command, some of the variable and value labels are deleted. Is there a way to retrieve the original labels and apply them to the reshaped dataset?
The reshape command restructures your dataset into wide or long form. It does not have an option to preserve the value or variable labels of the variables that change. (See the FAQ "I am having problems with the reshape command. Can you give further guidance?" or the reshape documentation for more information on the command.) However, you can use macros to save the labels before reshaping and then apply the labels to the new variables. To show how this is done, we will look at an example dataset created as follows:
. input id year answer inc
            id       year     answer        inc
  1. 1 80 0 5000
  2. 1 81 1 5500
  3. 1 82 0 6000
  4. 2 80 1 2000
  5. 2 81 0 2200
  6. 2 82 1 3300
  7. 3 80 0 3000
  8. 3 81 1 2000
  9. 3 82 1 1000
 10. end
. label define answer 0 "Yes" 1 "No"
. label define year 80 "1980" 81 "1981" 82 "1982"
. label values answer answer
. label values year year
. label variable id "Identification"
. label variable year "Year of study"
. label variable answer "Answer to question"
. label variable inc "value of inc"
. describe
Contains data
  obs:             9
 vars:             4
 size:           180 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g                  Identification
year            float  %9.0g       year       Year of study
answer          float  %9.0g       answer     Answer to question
inc             float  %9.0g                  value of inc
-------------------------------------------------------------------------------
Sorted by:
     Note:  dataset has changed since last saved
Using the code below, we save each variable label in a local macro named l`v', where `v' is the name of the variable. Thus the macro lid will contain the string Identification.
foreach v of varlist * {
    local l`v' : variable label `v'
}
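The loop above stores an empty string for any variable that has no label. A slightly more defensive variant is sketched below; the fallback to the variable name and the display line are assumptions added for illustration, not part of the original method. It assumes the example dataset is in memory and that it runs in the same do-file or session as the later steps, because local macros disappear when a do-file ends.

```stata
* Sketch only: assumes the example dataset above is in memory
foreach v of varlist * {
    local l`v' : variable label `v'
    * Assumption: use the variable name when no label was defined
    if `"`l`v''"' == "" {
        local l`v' "`v'"
    }
    * Show what was saved so the macro contents can be checked
    display "`v': `l`v''"
}
```

Printing each saved label at this point is a cheap way to confirm the macros hold what you expect before reshape removes the originals.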
Because not all variables have value labels attached, we will first create a local list of the variables that do.
local list "year answer"
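With many variables, typing this list by hand becomes tedious. As a possible alternative, the list can be built with the ds command, whose has(vallabel) option selects variables that have a value label attached. This is a sketch that assumes the example dataset is in memory; it is not part of the original answer.

```stata
* Sketch only: assumes the example dataset above is in memory
* ds with has(vallabel) selects variables that carry a value label
* and returns their names in r(varlist)
quietly ds, has(vallabel)
local list "`r(varlist)'"
* Inspect the resulting list
display "`list'"
```

For the example data, the list should contain year and answer, the same two variables entered manually above.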
Now we are ready to loop over this list and create macros that contain the value label for each value of all variables in the local list above.
/* save the value labels for variables in local list */
foreach var of local list {
    levelsof `var', local(`var'_levels)        /* create local list of all values of `var' */
    foreach val of local `var'_levels {        /* loop over all values in local list `var'_levels */
        local `var'vl`val' : label `var' `val'    /* create macro that contains label for each value */
    }
}
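The loop above passes the variable name to the label extended macro function as if it were the name of the value label. That works here only because each value label shares its variable's name (answer and year). When the names differ, one option is the (varname) form of the same function, which looks up whichever value label is attached to the variable. The sketch below assumes the example dataset is in memory, that the local macro list already holds the value-labeled variables, and that their values are nonnegative integers so that the resulting macro names are valid.

```stata
* Sketch only: assumes the example data are in memory and that
* local list holds the names of the value-labeled variables
foreach var of local list {
    levelsof `var', local(`var'_levels)
    foreach val of local `var'_levels {
        * The parenthesized form uses the value label attached to the
        * variable, so the label name need not match the variable name
        local `var'vl`val' : label (`var') `val'
    }
}
```

The macros it creates have the same names as before (for example, yearvl80), so the rest of the steps are unchanged.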
If you are confused as to how the macros are named, type macro list to see a list of all macros currently in memory. The above loop creates the following macros.
. macro list
  (output omitted)
_answervl1:     No
_answervl0:     Yes
_yearvl82:      1982
_yearvl81:      1981
_yearvl80:      1980
Thus, if we want the value label for answer when the value is 1, we can refer to the macro as `answervl1', which expands to No. With the labels saved in macros, we can now reshape the data to wide form.
. reshape wide inc answer, i(id) j(year)
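To restore the labels, we first identify the variables that reshape changed, because these are the ones whose labels need to be reapplied. Here these variables are inc and answer. After identifying the variables, we can create a local list containing them.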
/* create local list for variables that we want to add labels */
local variablelist "inc answer"
To apply the new labels, you must decide how you would like the labels to look. Here our new variables are named in the form answeryear and incyear (for example, answer80 and inc80). Thus we want labels that combine the original variable label with the year. Below we create a loop that builds each new variable label from the saved variable label and the saved value label of year.
/* apply the variable & value labels as variable labels */
/* variables are in form answeryear incyear */
foreach variable of local variablelist {      /* loop over list "inc answer" */
    foreach value of local year_levels {      /* loop over list "80 81 82" */
        label variable `variable'`value' "`l`variable'': `yearvl`value''"
    }
}
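The nested macro references in the label variable line can be hard to read. The short sketch below traces one pass of the loop by hand. The first two locals are stand-ins for macros the earlier loops would already have created, repeated here only so the snippet runs on its own.

```stata
* Stand-ins for macros created by the earlier loops
local linc "value of inc"
local yearvl80 "1980"

* One pass of the loop: variable is inc and value is 80
local variable "inc"
local value 80

* The new variable name is the stub followed by the year
display "`variable'`value'"

* Inner macros expand first: l plus inc gives linc, and
* yearvl plus 80 gives yearvl80; the outer macros then expand
display "`l`variable'': `yearvl`value''"
```

The first display prints inc80, and the second prints value of inc: 1980, which is the label the loop assigns to inc80.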
Now our reshaped dataset contains labels that help identify each variable.
. describe
Contains data
  obs:             3
 vars:             7
 size:            96 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
id              float  %9.0g                  Identification
answer80        float  %9.0g       answer     Answer to question: 1980
inc80           float  %9.0g                  value of inc: 1980
answer81        float  %9.0g       answer     Answer to question: 1981
inc81           float  %9.0g                  value of inc: 1981
answer82        float  %9.0g       answer     Answer to question: 1982
inc82           float  %9.0g                  value of inc: 1982
-------------------------------------------------------------------------------
Sorted by:  id