General statistics
Here are the details:
-
You can now save estimation results to disk. You type
estimates save filename
to save results and
estimates use filename
to reload them. In fact, the entire
estimates command has been reworked. The new command
estimates notes allows you to add notes to estimation
results just as you add them to datasets. The new command
estimates esample allows you to restore
e(sample) after reloading estimates, should that be
necessary (usually it is not). The maximum number of estimation results
that can be held in memory (as opposed to saved on disk) has been increased
from 20 to 300. See [R]
estimates.
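As a rough illustration of the save-and-reload workflow described above, the following do-file sketch uses the auto dataset shipped with Stata. The filename mpgmodel and the note text are hypothetical, chosen only for this example.

```stata
* Fit a model and attach a note to the estimation results
sysuse auto, clear
regress mpg weight foreign
estimates notes: baseline mpg model (illustrative note)

* Save the results to disk; "mpgmodel" is a hypothetical filename
estimates save mpgmodel, replace

* Later, possibly in a new session: reload and replay the results
estimates use mpgmodel
estimates notes
regress

* e(sample) is not restored on reload; reset it only if a later command needs it
sysuse auto, clear
estimates esample: mpg weight foreign
```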
-
Stata now has exact logistic and exact Poisson regression. Rather than
having their inference based on asymptotic normality, exact estimators
enumerate the conditional distribution of the sufficient statistics and
then base inference upon that distribution. In small samples, exact
methods have better coverage than asymptotic methods, and exact methods
are the only way to obtain point estimates, tests, and confidence
intervals from covariates that perfectly predict the observed outcome.
Postestimation command estat se reports odds
ratios and their asymptotic standard errors.
estat predict, available only after
exlogistic,
computes predicted probabilities, asymptotic standard errors, and exact
confidence intervals for single cases.
See [R]
exlogistic and [R]
expoisson.
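To make the perfect-prediction case concrete, here is a minimal sketch on a tiny made-up dataset in which x = 1 always coincides with y = 1. The data are invented for illustration, and no particular output is claimed. An asymptotic logit fit would have trouble with x here, whereas exlogistic should still report an estimate and exact inference for it.

```stata
* Made-up data: every observation with x == 1 has y == 1
clear
input y x
0 0
0 0
0 0
1 0
1 1
1 1
1 1
end

* Exact logistic regression based on the conditional distribution
exlogistic y x
```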
-
New estimation command
asclogit performs alternative-specific
conditional logistic regression, which includes McFadden’s choice
model. Postestimation command
estat alternatives reports alternative-specific
summary statistics.
estat mfx reports marginal effects of regressors
on probabilities of each alternative. See [R]
asclogit and [R]
asclogit postestimation.
-
New estimation command asroprobit performs
alternative-specific rank-ordered probit regression.
asroprobit is related to rank-ordered logistic
regression
(rologit) but allows modeling
alternative-specific effects and modeling the covariance structure of the
alternatives. Postestimation command
estat alternatives provides summary statistics
about the alternatives in the estimation sample.
estat covariance displays the
variance–covariance matrix of the alternatives.
estat correlation displays the
correlation matrix of the alternatives.
estat mfx computes the marginal effects of
regressors on the probability of the alternatives. See [R]
asroprobit and [R]
asroprobit postestimation.
-
New estimation command
ivregress performs single-equation
instrumental-variables regression by two-stage least squares,
limited-information maximum likelihood, or generalized method of moments.
Robust and HAC covariance matrices may be requested.
Postestimation command
estat firststage provides various
descriptive statistics and tests of instrument relevance.
estat overid tests overidentifying restrictions.
ivregress replaces the previous
ivreg command. See [R]
ivregress and [R]
ivregress postestimation.
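The sketch below shows the general shape of an ivregress session with the auto dataset. The choice of endogenous regressor and instruments is purely syntactic and is an assumption made for illustration, not a credible identification strategy.

```stata
sysuse auto, clear

* Two-stage least squares with a robust covariance matrix;
* mpg is treated as endogenous (illustrative assumption only)
ivregress 2sls price weight (mpg = displacement gear_ratio), vce(robust)

* Instrument relevance and overidentifying restrictions
estat firststage
estat overid

* The same model by LIML and by GMM with a robust weighting matrix
ivregress liml price weight (mpg = displacement gear_ratio)
ivregress gmm price weight (mpg = displacement gear_ratio), wmatrix(robust)
```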
-
New estimation command nlsur fits a system of
nonlinear equations by feasible generalized least squares, allowing for
covariances among the equations; see [R]
nlsur.
-
Existing estimation command nlogit has been rewritten;
it has new, better syntax and runs faster when there are more than two
levels. The old syntax is available under version control.
nlogit now optionally fits the
random utilities maximization (RUM) model as well as the
nonnormalized model that was available previously. The new
nlogit now allows unbalanced groups and allows
groups to have different sets of alternatives.
nlogit now excludes entire choice sets (cases)
if any alternative (observation) has a missing value; use new option
altwise to exclude just the alternatives
(observations) with missing values. Finally,
vce(robust) is allowed regardless of the number
of nesting levels. See [R]
nlogit.
-
Existing estimation command asmprobit has the
following enhancements:
-
The new default parameterization estimates the covariance of the
alternatives differenced from the base alternative, making the
estimates invariant to the choice of base. New option
structural specifies that the previous
structural (nondifferenced) covariance parameterization be used.
-
asmprobit now permits estimation of the
constant-only model.
-
asmprobit now excludes entire choice sets
(cases) if any alternative (observation) has a missing value; use new
option altwise to exclude just the
alternatives (observations) with missing values.
-
New postestimation command estat mfx
computes marginal effects after asmprobit.
See [R]
asmprobit and [R]
asmprobit postestimation.
-
Existing estimation command clogit now accepts
pweights and may be used with the
svy: prefix.
Also, clogit used to be willing to produce
cluster-robust VCEs when the groups were not nested within the
clusters. Sometimes, this VCE was consistent, and other times it was not.
You must now specify the new nonest option to
obtain a cluster-robust VCE when the groups are not nested within
clusters.
predict after clogit
now accepts options that calculate the Δβ influence statistic,
the Δχ² lack-of-fit statistic, the Hosmer and
Lemeshow leverage, the Pearson residuals, and the standardized Pearson
residuals.
See [R]
clogit and [R]
clogit postestimation.
-
Existing estimation command
cloglog now accepts
pweights, may
now be used with the svy: prefix, and has new option
eform that requests that exponentiated
coefficients be reported; see [R]
cloglog.
-
Existing estimation command
cnreg now accepts
pweights, may be used with the
svy: prefix, and is now noticeably faster (up to five
times faster) when used within loops, such as by
statsby. See [R]
cnreg.
-
Existing estimation commands
cnsreg and
tobit now accept
pweights, may be used with the
svy: prefix, and are now noticeably faster (up
to five times faster) when used within loops, such as by
statsby. Also, cnsreg
now has new advanced option mse1 that sets the mean
squared error to 1. See [R]
cnsreg and [R]
tobit.
-
Existing estimation command
regress is now noticeably faster (up to
five times faster) when used within loops, such as by
statsby. Also,
-
Postestimation command
estat hettest has new option
iid that specifies that an alternative
version of the score test be performed that does not require the
normality assumption. New option
fstat specifies that an alternative
F test be performed that also does not
require the normality assumption.
-
Existing postestimation command
estat vif has new option
uncentered that specifies that uncentered
variance inflation factors be computed.
See [R]
regress postestimation.
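A short sketch of the new estat hettest and estat vif options after regress, using the auto dataset. The model itself is arbitrary and serves only to show the syntax.

```stata
sysuse auto, clear
regress price weight mpg foreign

* Default score test, which relies on normality of the errors
estat hettest

* Alternative versions that drop the normality assumption
estat hettest, iid
estat hettest, fstat

* Centered (default) and uncentered variance inflation factors
estat vif
estat vif, uncentered
```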
-
Existing estimation commands
logit,
mlogit,
ologit,
oprobit, and
probit are now noticeably faster (up to five
times faster) when used within loops, such as by
statsby.
-
For existing estimation command
probit,
predict now allows the
deviance option; see [R]
probit postestimation.
-
Existing estimation command nl has the following
enhancements:
-
Option vce(vcetype)
is now allowed, with supported
vcetypes that include types derived from
asymptotic theory, that are robust to some kinds of misspecification,
that allow for intragroup correlation, and that use bootstrap or
jackknife methods. Also, three heteroskedastic- and
autocorrelation-consistent variance estimators are available.
-
nl no longer reports an overall model
F test because the test that all parameters
other than the constant are jointly zero may not be appropriate in
arbitrary nonlinear models.
-
The coefficient table now reports each parameter as its own equation,
analogous to how
ml reports single-parameter equations.
-
predict after
nl has new options that allow you to obtain
the probability that the dependent variable lies within a given
interval, the expected value of the dependent variable conditional on
its being censored, and the expected value of the dependent variable
conditional on its being truncated. These predictions assume that the
error term is normally distributed.
-
mfx can be used after
nl to obtain marginal effects.
-
lrtest can be used after
nl to perform likelihood-ratio tests.
See [R]
nl and [R]
nl postestimation.
-
Existing estimation command
mprobit now allows
pweights, may now be used with the
svy: prefix, and has new option
probitparam that specifies that the probit
variance parameterization, which fixes the variance of the differenced
latent errors between the scale and the base alternatives to one, be used.
See [R]
mprobit.
-
Existing estimation command rologit now allows
vce(bootstrap) and
vce(jackknife). See [R]
rologit.
-
Existing estimation command truncreg now allows
pweights and now works with the
svy: prefix. See [SVY]
svy estimation.
-
After existing estimation command ivprobit,
postestimation commands estat classification,
lroc, and lsens are
now available. Also, in ivprobit, the order of
the ancillary parameters in the output has been changed to reflect the
order in e(b). See [R]
ivprobit and [R]
ivprobit postestimation.
-
All estimation commands that allowed options
robust and
cluster() now allow option
vce(vcetype).
vce() specifies how the variance–covariance
matrix of the estimators (and hence the standard errors) is to be calculated.
This syntax was introduced in Stata 9, with options such as
vce(bootstrap),
vce(jackknife), and
vce(oim).
In Stata 10, option vce() is extended to
encompass the robust (and optionally clustered) variance calculation.
Where you previously typed
. estimation-command ..., robust
you are now to type
. estimation-command ..., vce(robust)
and where you previously typed
. estimation-command ..., robust cluster(clustervar)
with or without the robust, you are now to type
. estimation-command ..., vce(cluster clustervar)
You can still type the old syntax, but it is undocumented. The new syntax
emphasizes that the robust and cluster calculation affects standard
errors, not coefficients. See [R]
vce_option.
In accordance with this change, estimation commands now have a term for
their default variance calculation. Thus, you will see things like
vce(ols) and
vce(gnr). Here is what they all mean:
-
vce(ols). The variance estimator for
ordinary least squares; an
$s^2(X'X)^{-1}$-type calculation.
-
vce(oim). The observed information matrix based on the
likelihood function; a
$(-H)^{-1}$-type
calculation, where H is the Hessian matrix.
-
vce(conventional). A generic term to
identify the conventional variance estimator associated with the model.
For instance, in the Heckman two-step estimator,
vce(conventional) means the Heckman-derived
variance matrix from an augmented regression. In two different
contexts, vce(conventional) does not
necessarily mean the same calculation.
-
vce(analytic). The variance estimator
derived from first principles of statistics for means, proportions,
and totals.
-
vce(gnr). The variance matrix based on an
auxiliary regression, which is analogous to
$s^2(X'X)^{-1}$
generalized to nonlinear regression. gnr
stands for Gauss–Newton regression.
-
vce(linearized). The variance matrix
calculated by a first-order Taylor approximation of the statistic,
otherwise known as the Taylor linearized variance estimator, the
sandwich estimator, and the White estimator. This is identical to
vce(robust) in other contexts.
The above are used for defaults. vce() may also be one of the following:
-
vce(robust). The variance matrix calculated by the
sandwich estimator of variance, a $VDV$-type calculation,
where $V$ is the conventional variance matrix and
$D$ is the outer product of the gradients,
$\sum_i g_i g_i'$.
-
vce(cluster varname).
The cluster-based version of vce(robust)
where sums are performed within the groups formed by
varname, which is equivalent to assuming that
the independence is between groups only, not between observations.
-
vce(hc2) and
vce(hc3). Calculated similarly to
vce(robust) except that different scores are
used in place of the gradient vectors $g_i$.
-
vce(opg). The variance matrix calculated by
the outer product of the gradients; a
$(\sum_i g_i g_i')^{-1}$
calculation.
-
vce(jackknife). The variance matrix
calculated by the jackknife, including delete one, delete
n, and the cluster-based jackknife.
-
vce(bootstrap). The variance matrix
calculated by bootstrap resampling.
You do not need to memorize the above; the documentation for the
individual commands, and their corresponding dialog boxes, make
clear what is the default and what is available.
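The following sketch runs the same regression under several vce() settings so that the standard errors can be compared while the coefficients stay the same. The grouping variable grp is hypothetical and is generated here only to have something to cluster on.

```stata
sysuse auto, clear

* Hypothetical cluster identifier, created for illustration only
generate byte grp = mod(_n, 10)

* Default variance estimator for regress: vce(ols)
regress mpg weight foreign

* New spelling of the old option robust
regress mpg weight foreign, vce(robust)

* New spelling of the old option cluster(grp)
regress mpg weight foreign, vce(cluster grp)

* Alternative heteroskedasticity-robust estimator
regress mpg weight foreign, vce(hc3)

* Resampling-based estimators introduced with this syntax in Stata 9
regress mpg weight foreign, vce(jackknife)
```

Because grp has no missing values, every fit uses the same estimation sample, so only the reported standard errors should differ across the runs.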
-
Estimation commands specified with option
vce(bootstrap) or
vce(jackknife) now report a note when a variable is dropped
because of collinearity.
-
The new option
collinear, which has been added to many estimation
commands, specifies that the estimation command not remove collinear
variables. Typically, you do not want to specify this option. It is for
use when you specify constraints on the coefficients
such that, even though the variables are collinear, the model is fully
identified. See [R]
estimation options.
-
Estimation commands having a model Wald test composed of more than just
the first equation now save the number of equations in the model Wald
test in e(k_eq_model).
-
All estimation commands now save macro
e(cmdline) containing the command line as originally typed.
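A minimal sketch of reading the new saved result back after estimation. The model is arbitrary.

```stata
sysuse auto, clear
regress mpg weight foreign, vce(robust)

* Show the command line as it was typed
display "`e(cmdline)'"

* List all saved results, including e(cmdline)
ereturn list
```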
-
Concerning existing estimation command ml,
-
ml now saves the number of equations used to
compute the model Wald test in
e(k_eq_model), even when option
lf0() is specified.
-
ml score has new option
missing that specifies that observations
containing variables with missing values not be eliminated from the
estimation sample.
-
ml display has new option
showeqns that requests that equation names
be displayed in the coefficient table.
See [R] ml.
-
New command
lpoly performs a kernel-weighted local polynomial
regression and displays a graph of the smoothed values with optional
confidence bands; see [R]
lpoly.
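As an illustration, the sketch below smooths mpg on weight in the auto dataset. The degree and kernel are arbitrary choices made for the example, and xgrid and mpghat are hypothetical variable names.

```stata
sysuse auto, clear

* Local linear smooth with confidence bands drawn on the graph
lpoly mpg weight, degree(1) kernel(epanechnikov) ci

* Suppress the graph and keep the smoothing grid and smoothed values instead
lpoly mpg weight, degree(1) nograph generate(xgrid mpghat)
```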
-
New prefix command nestreg: reports comparison
tests of nested models; see [R]
nestreg.
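A brief sketch of the prefix syntax, in which each parenthesized varlist defines a block of regressors added in turn. The blocks chosen here are arbitrary.

```stata
sysuse auto, clear

* Block 1 adds weight; block 2 adds mpg and foreign
nestreg: regress price (weight) (mpg foreign)
```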
-
Existing commands
fracpoly,
fracgen, and
mfp have new
features:
-
fracpoly and
mfp now support
cnreg,
mlogit,
nbreg,
ologit, and
oprobit.
-
fracpoly and
mfp have new option
all that specifies that out-of-sample
observations be included in the generated variables.
-
fracpoly, compare now reports a closed-test
comparison between fractional polynomial models by using deviance
differences rather than reporting the gain; see [R]
fracpoly.
-
fracgen has new option
restrict() that computes adjustments and
scaling on a specified subsample.
See [R]
fracpoly and [R]
mfp.
-
For existing postestimation command
hausman, options
sigmaless and
sigmamore may now be used after
xtreg. These options improve results when
comparing fixed- and random-effects regressions based on small to moderate
samples because they ensure that the differenced covariance matrix will be
positive definite. See [R]
hausman.
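The comparison described above might look like the following sketch. It assumes the nlswork example dataset can be downloaded with webuse, and the regressors are chosen only for illustration.

```stata
* Requires internet access to fetch the example dataset
webuse nlswork, clear
xtset idcode

* Consistent estimator: fixed effects
xtreg ln_wage age tenure, fe
estimates store fe

* Efficient estimator under the null hypothesis: random effects
xtreg ln_wage age tenure, re

* Base both covariance matrices on the disturbance variance
* estimated from the efficient (random-effects) model
hausman fe ., sigmamore
```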
-
Existing postestimation command testnl now
allows expressions that are bound in parentheses or brackets to have
commas. For example, testnl _b[x] = M[1,3] is
now allowed. See [R]
testnl.
-
Existing postestimation command nlcom has a new
option noheader that suppresses the output
header; see [R]
nlcom.
-
Existing command statsby now works with more
commands, including postestimation commands.
statsby also has new option
forcedrop for use with commands that do not allow
if or in.
forcedrop specifies that observations outside the
by() group be temporarily dropped before the
command is called. See [D]
statsby.
-
Existing command mkspline will now create
restricted cubic splines as well as linear splines. New option
displayknots will display the location of the
knots. See [R]
mkspline.
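A small sketch of both spline types with the auto dataset. The stub name wsp, the variable names wlin1 and wlin2, the number of knots, and the knot location are all arbitrary choices for illustration.

```stata
sysuse auto, clear

* Restricted cubic spline in weight with four knots; show where they fall
mkspline wsp = weight, cubic nknots(4) displayknots
regress mpg wsp*

* Linear spline with a single knot at 3000
mkspline wlin1 3000 wlin2 = weight
regress mpg wlin1 wlin2
```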
-
In existing command kdensity,
kernel(kernelname)
is now the preferred way to specify the kernel, but the previous method of
simply specifying kernelname still works. See [R]
kdensity.
-
Existing command ktau’s computations are
now faster; see [R]
spearman.
-
In existing command ladder, the names of the
transformations in the output have been changed to match those used by
gladder and qladder.
Also, the returned results
r(raw) and r(P_raw)
have been renamed to r(ident) and
r(P_ident), respectively. See [R]
ladder.
-
Existing command ranksum now allows the
groupvar in option
by(groupvar) to be a string; see [R]
ranksum.
-
Existing command tabulate, exact now allows
exact computations on larger tables. Also, new option
nolog suppresses the enumeration log. See [R]
tabulate
twoway.
-
Existing command tetrachoric’s default
algorithm for computing tetrachoric correlations has been changed from the
Edwards and Edwards estimator to a maximum likelihood estimator. Also,
standard errors and two-sided significance tests are produced. The
Edwards and Edwards estimator is still available by specifying the new
edwards option. A new
zeroadjust option requests that frequencies be
adjusted when one cell has a zero count. See [R]
tetrachoric.
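The sketch below contrasts the new default with the earlier estimator. The binary variable highmpg and its cutoff are hypothetical and are created only to have two dichotomous variables to correlate.

```stata
sysuse auto, clear

* Hypothetical binary indicator, for illustration only
generate byte highmpg = mpg > 20

* New default: maximum likelihood estimate
tetrachoric foreign highmpg

* Previous default: Edwards and Edwards estimator
tetrachoric foreign highmpg, edwards

* Adjust frequencies if one cell of the 2 x 2 table has a zero count
tetrachoric foreign highmpg, zeroadjust
```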