Stata is easy to grow with

Consistent command syntax

Stata's commands are intuitive and easy to learn. Even better, everything you learn about performing a task can be applied to other tasks.

Need to limit your analysis to females? Add if female==1 to any command.

Need standard errors that are robust to violations of many common assumptions? Add vce(robust) to almost any estimation command.

Need to account for sampling weights, clusters, and stratification? Add svy: to the beginning of the command.
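The svy: prefix is the one addition above that depends on a prior setup step, so a short illustration may help. This is a minimal sketch, assuming the nhanes2 example dataset ships with its survey design already declared; with your own data you would first declare the design yourself using svyset. The models shown are illustrative choices.

```stata
// Load the example dataset (assumed to carry its survey design settings)
webuse nhanes2, clear

// Display the declared sampling weights, strata, and sampling units
svyset

// Design-based linear regression: same model specification, plus the prefix
svy: regress bmi age i.region

// The same prefix in front of a different estimator
svy: logistic highbp age i.region

// For survey data, restrict to a subpopulation with subpop()
// rather than an if qualifier
svy, subpop(female): regress bmi age i.region
```

The subpop() option is used in the last command because dropping observations with if can give incorrect design-based standard errors.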

The consistency goes even deeper. What you learn about data management commands often applies to estimation commands, and vice versa.
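As a rough illustration of that carryover, the sketch below applies the same if qualifier and the same by prefix first to descriptive commands and then to an estimation command. The particular variables and commands are chosen here for illustration only.

```stata
// Load the example dataset
webuse nhanes2, clear

// The if qualifier restricts a descriptive command to females...
summarize bmi if female==1
tabulate region if female==1

// ...and restricts an estimation command in exactly the same way
regress bmi age if female==1

// The by prefix repeats a descriptive command within each region...
bysort region: summarize bmi

// ...and repeats an estimation command within each region
bysort region: regress bmi age
```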

There is a full suite of postestimation commands to perform hypothesis tests, form linear and nonlinear combinations, make predictions, form contrasts, and even perform marginal analysis with interaction plots. These commands work the same way after virtually every estimator.
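A few of these postestimation commands can be sketched as follows. This is only an illustration: the specific comparisons between regions 2 and 3 are arbitrary choices, not analyses taken from the text.

```stata
// Load the example dataset and fit a model
webuse nhanes2, clear
regress bmi age i.region

// Hypothesis test: are the region 2 and region 3 effects equal?
test 2.region = 3.region

// Linear combination: difference between the two region effects,
// with its standard error and confidence interval
lincom 2.region - 3.region

// Nonlinear combination: ratio of the two region effects
nlcom (ratio: _b[2.region] / _b[3.region])

// Contrasts: compare each region with the reference region
contrast r.region
```

The same four commands can be typed unchanged after fitting a different model, for example after replacing regress with logistic and bmi with highbp.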

See how it works

First, we load our dataset.

webuse nhanes2

Let's start with linear regression. We fit a variety of models and explore results using the postestimation commands for testing, prediction, and marginal analysis.

// Regression of body mass index (BMI) on age and region indicators
regress bmi age i.region 

// Fit the model for females only 
regress bmi age i.region if female==1 

// Obtain robust standard errors 
regress bmi age i.region, vce(robust) 

// Include a female indicator and its interaction with age 
regress bmi age i.region i.female c.age#i.female 

// Perform a joint test of significance for the region indicators 
testparm i.region 

// Compute the predicted BMI for each person 
predict bmi_hat 

// Obtain the average prediction (potential outcome), treating
// all individuals as if they live in region 1 
margins 1.region 

// Obtain average predictions for all regions 
margins region 

// Obtain average predictions by sex across a range of ages 
margins female, at(age=(20 40 60 80)) 

// Plot this interaction 
marginsplot 

What if we instead have a binary outcome variable, an indicator of whether an individual has high blood pressure? We could fit a logistic regression model. We replace regress in the commands above with logistic, and we use highbp instead of bmi as the dependent variable. Otherwise, the model specification, options, and postestimation commands are almost identical.

// Logistic regression of high blood pressure on age and region indicators 
logistic highbp age i.region 

// Fit the model for females only 
logistic highbp age i.region if female==1 

// Obtain robust standard errors 
logistic highbp age i.region, vce(robust) 

// Include a female indicator and its interaction with age 
logistic highbp age i.region i.female c.age#i.female 

// Perform a joint test of significance for the region indicators 
testparm i.region 

// Compute the predicted probability of high blood pressure
// for each person 
predict prob_hbp 

// Obtain the average predicted probability (potential outcome),
// treating all individuals as if they live in region 1 
margins 1.region 

// Obtain average predicted probability for all regions 
margins region 

// Obtain average predicted probabilities by sex across a range of ages 
margins female, at(age=(20 40 60 80)) 

// Plot this interaction 
marginsplot 

If we have a count outcome such as the number of individuals in the household, we might want to fit a Poisson model. We use the poisson command and houssiz as the dependent variable, and in the interaction model we swap the female indicator for a rural location indicator. Otherwise, the command syntax is the same.

// Poisson regression of household size on age and region indicators 
poisson houssiz age i.region 

// Fit the model for females only 
poisson houssiz age i.region if female==1 

// Obtain robust standard errors 
poisson houssiz age i.region, vce(robust) 

// Include a rural location indicator and its interaction with age 
poisson houssiz age i.region i.rural c.age#i.rural 

// Perform a joint test of significance for the region indicators 
testparm i.region 

// Compute the predicted number of individuals in each household 
predict size 

// Obtain the average predicted household size (potential outcome),
// treating all individuals as if they live in region 1 
margins 1.region 

// Obtain average predicted household size for all regions 
margins region 

// Obtain average predicted household size by rural location
// across a range of ages
margins rural, at(age=(20 40 60 80)) 

// Plot this interaction 
marginsplot 

We could fit many other models: models for ordered and unordered categorical outcomes, multilevel models, models for time-series, panel, or survival data, and models accounting for endogeneity and sample selection. Regardless of the model, we can use the same command structure and, with few exceptions, the same options and postestimation commands that we used above.
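One way to see this interchangeability is to loop over estimators while holding everything else fixed. The sketch below is an illustration under stated assumptions: the three binary-outcome estimators (logit, probit, cloglog) are chosen here for convenience and are not the model families listed above.

```stata
// Load the example dataset
webuse nhanes2, clear

// Fit the same specification with three different estimators;
// only the command name changes on each pass through the loop
foreach cmd in logit probit cloglog {
    // Same dependent variable, covariates, and robust option
    `cmd' highbp age i.region, vce(robust)

    // Same joint test of the region indicators
    testparm i.region

    // Same average predicted probabilities by region
    margins region
}
```

Because the estimator name is the only thing that varies, the model specification, option, and postestimation steps are written once and reused.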