FAQ: Statistics

Questions are listed below in the following categories:

1. ANOVA and ANCOVA
2. Bayesian analysis
3. Binary outcome qualitative dependent variable models
4. Causal inference/Treatment-effects
5. Cluster/factor analysis
6. Conditional logistic regression
7. Epidemiological tables
8. Instrumental variables and simultaneous equations systems
9. Lasso
10. Linear regression with simple error structures
11. Marginal effects after estimation
12. Meta-analysis
13. Models with endogenous sample selection
14. Models with time-series data
15. Multiple imputation
16. Multiple outcome qualitative dependent variable models
17. Panel-data models
17.1 General questions
17.2 Linear regression with panel data
17.3 Censored linear regression with panel data
17.4 Generalized linear model with panel data
18. Probability distributions
19. Robust variance estimation
20. Simple count dependent variable models
21. Structural equation models
22. Survey-data analysis
23. Survival-time (failure-time) models
24. Tests and CIs
25. General questions
26. FAQs concerning releases before Stata 19

17. Panel-data models

17.1 General questions

How do I obtain bootstrapped standard errors with panel data?

How can I generate a variable relating panel data to a reference panel?

How should I interpret changing quadchk results?

What is the difference between random-effects and population-averaged estimators?

Why don't the decomposed variances in xtsum add up?

I imported a dataset from an Excel file. When I use the xtset command to declare the panel data, why do I receive the following error message?

17.2 Linear regression with panel data

Why does xtgls not report an R² statistic?

How do I test for panel-level heteroskedasticity and autocorrelation?

What is the between estimator?

How does xtgls differ from regression clustered with robust standard errors?

Why does xtreg with the mle option produce different results from xtreg with the re option?

How can there be an intercept in the fixed-effects model estimated by xtreg, fe?

What role does the time variable play in xtgls?

Why isn't the calculation of R² the same for areg and xtreg, fe?

17.3 Censored linear regression with panel data

Why do I obtain different results when executing xttobit on the same data in different sessions?

17.4 Generalized linear model with panel data

Why does xtgee sometimes report that convergence was not achieved?

How can I calculate the pseudo R² for xtprobit?

What are the divisors used in xtgee? (Technical FAQ)

Can Stata estimate a Rasch model?

How does Stata's implementation of GEE differ from other implementations?

18. Probability distributions

How do I get the Euler–Mascheroni constant gamma = 0.57721 ... in Stata?

How do I calculate values of the beta function?

What is the delta method and how is it used to estimate the standard error of a transformed parameter?

How are the chi-squared and F distributions related?

19. Robust variance estimation

Which references should I cite when using the vce(cluster clustvar) option to obtain Stata's cluster-correlated robust estimate of variance?

What are some of the small sample adjustments to the sandwich estimate of variance?

Does test after estimating with regress, vce(robust) perform a Chow test?

How can the standard errors with the vce(cluster clustvar) option be smaller than those without the vce(cluster clustvar) option?

What are the advantages of using the robust variance estimator over the standard maximum-likelihood variance estimator in logistic regression?

How do the ML estimation commands (e.g., logit and probit) compute the model chi-squared test when they estimate robust standard errors on clustered data?

Are the estimates produced by probit and logit with the vce(cluster clustvar) option true maximum likelihood estimates?
Is there a difference between the estimates produced by svy: probit, with the PSU variable specified in svyset, and probit, vce(cluster clustvar) (and, similarly, between svy: logit, with the PSU variable specified in svyset, and logit, vce(cluster clustvar))?

Why should I not do a likelihood-ratio test after an ML estimation (e.g., logit, probit) with clustering or pweights?

20. Simple count dependent variable models

How do you specify the variance function in nbreg to coincide with Cameron and Trivedi's (Regression analysis of count data, page 62) NB1 and NB2 variance functions?
What is the difference between the models fit using nbreg, dispersion(mean) and nbreg, dispersion(constant)?

21. Structural equation models

Why did I get an error saying "no paths from latent variable to observed variables" from sem or gsem?

22. Survey-data analysis

How can I estimate correlation coefficients and their p-values for complex survey data?

What should I do when one of the survey estimators returns an error message, "Missing standard error because of stratum with single sampling unit"?

How is the number of observations computed for subpopulation estimation?

How do I obtain percentiles for survey data?

If we change the order of cluster sampling and stratification when sampling the population, would the svyset command be different?

Is there a way in Stata to do stepwise regression with svy: logit or any of the svy commands?

Do commands used with the svy prefix handle zero weights differently than commands used without the svy prefix?

Why doesn't summarize accept pweights? What does summarize calculate when you use aweights?

23. Survival-time (failure-time) models

What is the relationship between baseline hazard and baseline hazard contribution?

How are the standard errors and confidence intervals computed for hazard ratios (HRs) by stcox and streg?

How do I convert my spell-type data into a survival dataset?
How do I stset my spell-type data?

How do I analyze multiple failure-time data using Stata?

Why does stsum sometimes report missing values for the percentiles of survival time?

Why can't a subject die at time 0?
Why can't a subject enter and die at the same time in the Cox model?

What is the difference between sts list and ltable?

24. Tests and CIs

Is there a way to estimate a nonlinear combination with nlcom, when the error “expression too long” is displayed?

The results from estimation commands display only two-sided tests for the coefficients. How can I perform a one-sided test?

How do I bootstrap a vector of results?

Can you explain Chow tests?

How can I use Monte Carlo simulations to estimate power in Stata?
How can I integrate a simulation program into the power command?

How large should the bootstrapped samples be relative to the total number of cases in the dataset?

How can you specify a term other than residual error as the denominator in a single degree-of-freedom F test after ANOVA?

What are some of the small sample adjustments to the sandwich estimate of variance?

Why does the test command sometimes produce chi-squared and other times F statistics?

Does test after estimating with regress, vce(robust) perform a Chow test?

How can I compute the Chow test statistic?

25. General questions

Why do I get slightly different results when running an ml procedure on Stata/SE and Stata/MP?

Why do I see different p-values, etc., when I change the base level for a factor in my regression?

How can I get an R-squared value when a Stata command does not supply one?

How can I calculate percentile ranks?
How can I calculate plotting positions?

How do I estimate a nonlinear model using ml?

Why does bootstrap give a warning message for non-eclass commands?

How can I get the variance–covariance matrix or coefficient vector?

What are some of the problems with stepwise regression?

Why doesn't summarize accept pweights?
What does summarize calculate when you use aweights?

Why do estimation commands sometimes omit variables?

How do I keep all levels of my categorical variable in my model?
How do I specify a cell means model?

26. FAQs concerning releases before Stata 19

How do I fit a linear regression with interval (inequality) constraints in Stata?

How can I obtain the correlation between the factors after an oblique rotation?

Does Stata provide a test for trend?

What meta-analysis features are available in Stata?

How do you fit a model when the dependent variable is a proportion?

How do I calculate row medians?

How can I estimate correlations and their level of significance with survey data?

How do I obtain the standard error of the predicted probability with logistic regression analysis?

What are the divisors used in xtgee? (Technical FAQ)

How can I form various tests comparing the different levels of a categorical variable after anova or regress?

Why do Stata’s xtgee standard errors differ from those reported by SAS’s PROC GENMOD?

I am using a model with interactions. How can I obtain marginal effects and their standard errors?

I need to run mfx more than once on my dataset, and it's taking a long time. What can I do to make it run as fast as possible?

Can I use mfx on survey data with unweighted means?

I am using mfx after an estimation that has an offset. How does mfx take that into account?

Running mfx on my dataset takes a long time, and I am worried it may have crashed. How can I tell if it is still running?

I am only interested in obtaining a few of the marginal effects for a few independent variables. How can I do that?

When I run mfx, I am getting the warning message "warning: predict() expression unsuitable for standard-error calculation; option nose imposed". What does that mean?

When I run mfx, I am getting the error message "predict() option unsuitable for marginal effects". What does that mean?

When I run mfx, I am getting the warning message "warning: derivative missing; try rescaling variable mpg". What does that mean?

What is the difference between the linear and nonlinear methods that mfx uses?

How do I calculate least-squares means in Stata?

What does “completely determined” mean in my logistic regression output?

How can I produce adjusted means after ANOVA?

Why does stcox sometimes produce missing standard errors?

What are the differences between predict and adjust?

How can I obtain the correlation matrix as a Stata matrix?

Why does my mlogit take so long to converge?

How can I get robust standard errors for tobit?

Why do Stata and SAS differ in the results that they report for the stratified generalized Wilcoxon test for time-to-event data?

Is there any difference between using tsset and iis and tis before xt commands?

How do I estimate a nonlinear model using ml?

Why do I get an "unbalanced data" error message when I run nlogit?

How do you test the equality of regression coefficients that are generated from two different regressions, estimated on two different samples?

Is it possible to analyze survey data with two or more levels of clustering with the svy commands?

How can I calculate moving averages for panel data?

Does Stata support any multiple comparison tests following two-way ANOVA?

How do I get the correct variance–covariance matrix from the bs routine?

How can I estimate stepwise Cox models?

How can I estimate a fixed-effects regression with instrumental variables?

Why were the timings in the American Statistician (August 1997) review of the svy commands so slow?

How do I estimate a Cox model with a continuously time-varying parameter?

What are completely determined panels?

What is the difference between biprobit/heckprob and the STB commands?

Where are the Wald tests for zinb that appear in the manual?

Why do Stata's cc and cci commands report different confidence intervals than Epi Info?

How can I get one-tailed probabilities for the Student's t distribution?

How can I simulate random multivariate normal observations from a given correlation matrix?

Why does Weibull with entry and exit times produce different results from Weibull with duration?

How does Stata's xtgee handle singletons with exchangeable correlation?

I am running clogit and get the message "Note: multiple positive outcomes within groups encountered." Is this something I should worry about or is this a normal message?

Can Stata's ml routine converge and produce answers that look good even when it shouldn't?

Why don't the old huber results match the new robust versions?

How can I get predicted probabilities for different x values after probit?

How can I get predicted probabilities after svylogit, svyprobt, svymlog, svyolog, or svyoprob?

Why does the goodness-of-fit chi-squared test reported by poisson change when the counts and exposures are grouped differently?

What is the pseudo R² in the weibull output?

How can I get the Mills' ratios for my heckman model?

How do I test endogeneity?
How do I perform a Durbin–Wu–Hausman test?

