Sometimes, we wish to create variables that are functions of other variables. For example, we may need to calculate body mass index (BMI) using height and weight.
Let's begin by opening and describing an example dataset from the Stata website.
. use https://www.stata.com/users/youtube/rawdata.dta, clear
(Fictitious data based on the National Health and Nutrition Examination Survey)

. describe

Contains data from https://www.stata.com/users/youtube/rawdata.dta
 Observations:         1,268                  Fictitious data based on the National Health and Nutrition Examination Survey
    Variables:            10                  6 Jul 2016 11:17
                                              (_dta has notes)
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
id              str6    %9s                   Identification Number
age             byte    %9.0g
sex             byte    %9.0g                 Sex
race            str5    %9s                   Race
height          float   %9.0g                 height (cm)
weight          float   %9.0g                 weight (kg)
sbp             int     %9.0g                 Systolic blood pressure (mm/Hg)
dbp             int     %9.0g                 Diastolic blood pressure (mm/Hg)
chol            str3    %9s                   serum cholesterol (mg/dL)
dob             str18   %18s
-------------------------------------------------------------------------------
The description tells us that the variable height is measured in centimeters (cm) and the variable weight is measured in kilograms (kg). We wish to calculate BMI, which is defined as weight in kilograms divided by the square of height measured in meters. Let's use Stata's generate command to create a new variable for height measured in meters. We simply divide height by 100 to convert centimeters to meters.
. generate heightm = height/100
Then we can create a variable for BMI using our new heightm variable.
. generate bmi = weight / heightm^2
Let's list the first five observations and summarize bmi to check our work.
. list weight height heightm bmi in 1/5

. summarize bmi

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
         bmi |      1,268    25.77892    5.241681   15.43519   53.11815
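As an additional check, the formula can be verified by hand for a single hypothetical person, and the new variable can be tested for missing values. The sketch below assumes a weight of 70 kg and a height of 175 cm; these are illustrative values, not taken from the dataset, and the result should be about 22.86.

```stata
* Hand check with hypothetical values (70 kg, 175 cm); not from the dataset
display 70 / (175/100)^2

* Stops with an error if any observation has a missing value of bmi
assert !missing(bmi)
```

Because summarize reports 1,268 observations for bmi, the same as the number of observations in the dataset, the assert is expected to pass silently.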
Note that we could have divided height by 100 and created the bmi variable with one generate command.
. generate bmi2 = weight / (height/100)^2

. summarize bmi bmi2

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
         bmi |      1,268    25.77892    5.241681   15.43519   53.11815
        bmi2 |      1,268    25.77892    5.241681   15.43518   53.11815
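The minimums of bmi and bmi2 differ in the last displayed digit. A likely explanation is rounding: generate stores new variables as float by default, so heightm is rounded to float precision before it is squared, whereas the one-line version keeps the intermediate result at higher precision. The sketch below is one way to explore this, assuming the dataset above is in memory; heightm_d, bmi_d, and bmi2_d are hypothetical names introduced only for illustration.

```stata
* Sketch only: heightm_d, bmi_d, and bmi2_d are hypothetical variable names
* Store the intermediate height in double precision instead of float
generate double heightm_d = height/100

* Two-step and one-step versions, both stored as double
generate double bmi_d  = weight / heightm_d^2
generate double bmi2_d = weight / (height/100)^2

* Report how the two versions compare, observation by observation
compare bmi_d bmi2_d
```

If rounding of the intermediate variable is the cause, the two double-precision versions should agree.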
We could also have used Stata's replace command to overwrite an existing variable rather than generate a new one. Below, we replace bmi2 using the same expression, so no values change.
. replace bmi2 = weight / (height/100)^2
(0 real changes made)

. summarize bmi bmi2

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
         bmi |      1,268    25.77892    5.241681   15.43519   53.11815
        bmi2 |      1,268    25.77892    5.241681   15.43518   53.11815
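The distinction between the two commands matters because generate will not overwrite a variable that already exists, while replace only works on a variable that does. A minimal sketch, assuming bmi2 already exists as in the example above:

```stata
* generate refuses to overwrite an existing variable;
* capture suppresses the error so the return code can be inspected
capture generate bmi2 = weight / (height/100)^2

* A nonzero return code indicates that the generate command failed
display _rc

* replace overwrites the existing variable instead
replace bmi2 = weight / (height/100)^2
```

The capture prefix is used here only so the sketch runs to completion in a do-file; typing the generate command on its own would simply stop with an error message.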
You can watch a demonstration of these commands by clicking on the link to the YouTube video below. You can read more about these commands by clicking on the links to the Stata manual entries below.
Watch Data management: How to create a new variable that is calculated from other variables.
Read more in the Stata Data Management Reference Manual; see [D] describe, [D] generate, and [D] save. In the Stata Base Reference Manual, see [R] summarize.