Community corner: Cross-validation in Stata

Evaluating the out-of-sample properties of statistical models is important, especially for predictive modeling and analytics. Steven Brownell and Billy Buchanan’s crossvalidate package makes it easy. It contains xv, an extensible prefix command implementing cross-validation for Stata estimation commands.

The xv prefix and its leave-one-out counterpart, xvloo, split your sample, fit your model to the training sample, predict outcomes on the validation or test sample, and compute fit metrics, all in one command.

For example, use an 80/20 split to evaluate the mean squared error for a linear regression model:

. xv .8, metric(mse): reg price mpg i.foreign
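For intuition, the steps that the prefix automates (split, fit, predict, score) can be approximated by hand with built-in commands. The do-file below is only a sketch of a holdout evaluation, not the package's actual implementation. The seed and the variable names u, train, yhat, and sqerr are arbitrary choices made for illustration, and the random split yields roughly, not exactly, 80% of observations for training.

```stata
* Manual 80/20 holdout sketch using only built-in Stata commands
sysuse auto, clear
set seed 12345                        // arbitrary seed, for reproducibility only
generate double u = runiform()
generate byte train = u < 0.8         // about 80% of observations flagged for training

* Fit on the training observations only
regress price mpg i.foreign if train

* Predict on the held-out observations and compute squared errors
predict double yhat if !train
generate double sqerr = (price - yhat)^2 if !train

* Mean squared error on the held-out sample
summarize sqerr, meanonly
display "Holdout MSE = " r(mean)
```

Because the split is random, the reported value changes with the seed, so a single holdout estimate should be read as approximate.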

Or use a 60/20/20 split with four folds to evaluate accuracy for a logistic regression model:

. xv .6 .2, metric(acc) kfold(4): logit low age lwt i.race smoke ptl ht ui
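To see what a four-fold accuracy estimate involves, here is a hand-rolled sketch using built-in commands. It rests on several assumptions: it uses the lbw dataset fetched with webuse, folds over the full sample rather than reproducing the 60/20/20 split, classifies with a 0.5 probability cutoff, and uses illustrative variable names (u, fold, p, correct). The package may compute accuracy differently.

```stata
* Manual 4-fold cross-validation sketch for classification accuracy
webuse lbw, clear
set seed 12345                        // arbitrary seed, for reproducibility only
generate double u = runiform()
sort u                                // put observations in random order
generate byte fold = mod(_n, 4) + 1   // assign folds 1 through 4
generate byte correct = .

forvalues k = 1/4 {
    * Fit on the three folds other than fold k
    quietly logit low age lwt i.race smoke ptl ht ui if fold != `k'
    * Predict probabilities for the held-out fold
    quietly predict double p if fold == `k', pr
    * Mark a prediction correct when the 0.5-cutoff class matches the outcome
    quietly replace correct = ((p >= 0.5) == low) if fold == `k'
    drop p
}

* Share of held-out observations classified correctly
summarize correct, meanonly
display "4-fold CV accuracy = " r(mean)
```

Each observation is predicted exactly once by a model that never saw it, which is what makes the averaged accuracy an out-of-sample estimate.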

Use one of more than 40 built-in metrics or create your own. You can install these prefix commands and learn more about them and the built-in metrics by typing

. ssc install crossvalidate

. help crossvalidate

. help libxv##classification

To learn more about the crossvalidate package and all of its options, take a look at Billy’s GitHub page and Steven’s talk from the 2024 Stata Conference.
