Re: st: Questions related to -predict-, -adjust-, and predictive margins
From: "Austin Nichols" <[email protected]>
To: [email protected]
Subject: Re: st: Questions related to -predict-, -adjust-, and predictive margins
Date: Fri, 26 Sep 2008 07:58:25 -0400

Steven, Michael, et al.--
The "predicted marginals" you describe are what most economists would
call marginal effects (as opposed to the approximation offered by
-mfx-); to get any variety you like, you can wrap calls to -predict-
inside a -program- and -bootstrap- the program (resampling clusters if
that is more appropriate than resampling observations) like so:
http://www.stata.com/statalist/archive/2008-03/msg00667.html
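
A minimal sketch of that approach, using the auto data from Michael's
example below. This is not the code from the linked message: the
program name pmargin and the returned names m0, m1, and diff are made
up for illustration, and reps(200) and the seed are arbitrary choices.
To resample clusters instead of observations, add the cluster() option
to -bootstrap-.

```stata
sysuse auto, clear
generate byte himpg = mpg > 25

capture program drop pmargin
program define pmargin, rclass
    // hypothetical helper: refit the model on the current (re)sample
    quietly logit foreign himpg weight
    tempvar orig p0 p1
    tempname m0 m1
    // save observed himpg so it can be restored afterwards
    quietly generate byte `orig' = himpg
    // predicted probabilities with every car set to himpg = 0, then 1
    quietly replace himpg = 0
    quietly predict double `p0', pr
    quietly replace himpg = 1
    quietly predict double `p1', pr
    quietly replace himpg = `orig'
    // average the predictions over the sample
    summarize `p0', meanonly
    scalar `m0' = r(mean)
    summarize `p1', meanonly
    scalar `m1' = r(mean)
    return scalar m0 = `m0'
    return scalar m1 = `m1'
    return scalar diff = `m0' - `m1'
end

// bootstrap SEs for the two average predictions and their difference
bootstrap m0=r(m0) m1=r(m1) diff=r(diff), reps(200) seed(12345): pmargin
```

Because the model is refit inside the program on every replication,
the bootstrap SEs reflect the uncertainty in the estimated
coefficients as well as in the sample averages.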
Note that when E(y|X) = F(Xb), where F() is nonlinear (e.g. the
cumulative normal for probit, or the inverse logit) with derivative
f(), the quantity
   D E(y|X) / D x
(where D is capital delta, meaning "change in", and we consider a
change in x of one unit) is only approximately equal to f(Xb)b for
each observation. The mean of D E(y|X) / D x over observations is
approximately equal to the mean of f(Xb)b, i.e. b times the mean of
f(Xb), but these are not equal to f(E(X)b)b, which is what -mfx-
reports. I.e. you cannot pass the E operator inside a nonlinear f.
That -mfx- approach is like the marginal effect for a single
(imaginary) observation which is the "typical" individual in the data,
but no obs in the data may have a pattern of data anything like the
mean of X (think of all columns of X having extremely bimodal
distributions), so the mean of individual marginal effects seems more
intuitively appealing.
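
To see the two quantities side by side, here is a rough sketch on the
auto data from Michael's example, for the continuous covariate weight.
The variable and scalar names (xb, me_i, mh, mw, xbbar) are my own,
and for the logit f(Xb) is written as p(1-p).

```stata
sysuse auto, clear
generate byte himpg = mpg > 25
quietly logit foreign himpg weight

// f(Xb)b for each observation, then its mean over the sample
predict double xb, xb
generate double me_i = _b[weight] * invlogit(xb) * (1 - invlogit(xb))
summarize me_i, meanonly
display "mean of f(Xb)b = " r(mean)

// f(E(X)b)b: the same formula evaluated once, at the covariate means
quietly summarize himpg, meanonly
scalar mh = r(mean)
quietly summarize weight, meanonly
scalar mw = r(mean)
scalar xbbar = _b[_cons] + _b[himpg]*scalar(mh) + _b[weight]*scalar(mw)
display "f(E(X)b)b      = " ///
    _b[weight] * invlogit(scalar(xbbar)) * (1 - invlogit(scalar(xbbar)))
```

The second number describes a car with himpg equal to its sample
mean, a fraction strictly between 0 and 1 that no car in the data has.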
See also http://terpconnect.umd.edu/~gelbach/ado/ and the file
margfx.ado plus discussion in
http://terpconnect.umd.edu/~gelbach/ado/margfx.pdf
to see that Gelbach uses the delta method for SEs, not the bootstrap
as suggested above.
On Thu, Sep 25, 2008 at 7:30 PM, Steven Samuels
<[email protected]> wrote:
> Michael would like a standard error for the weighted average of
> P(foreign | himpg=0, Z) - P(foreign | himpg=1, Z)
> with other covariates at their original values Z. In SUDAAN parlance, the
> weighted average of the individual estimated probabilities is a "predicted
> marginal" and the difference is a contrast in the predicted marginals.
> (SUDAAN 8.0 Manual, p. 266).
>
> SUDAAN can compute standard errors for the predicted marginals and their
> contrasts. Stata can compute the predicted marginals and contrasts, but not
> their standard errors.
>
> To compute the predicted marginals and their contrasts in Stata, run -svy:
> logit- . Then compute the individual predicted values: -adjust- will do it
> easily with the -pr- and -gen- options. Once individual predictions are at
> hand, -generate- the difference between any two. A call to -svy: mean- will
> compute the average of the predicted values (i.e. the "predicted marginals")
> and of the differences. However, the standard error produced by -svy: mean-
> will not account for uncertainty in the estimated coefficients, and so will
> be too small.
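
The recipe in the paragraph above might look roughly like the sketch
below. Two assumptions to flag: -svyset _n- is only a placeholder
design so that the example runs on the auto data (substitute the real
PSU, strata, and weights), and -predict- is swapped in for -adjust-
to generate the individual predictions. The names himpg_obs, p0, p1,
and pdiff are illustrative.

```stata
sysuse auto, clear
generate byte himpg = mpg > 25
svyset _n                      // placeholder design, not a real survey
svy: logit foreign himpg weight

// individual predictions with himpg forced to 0 and then to 1,
// leaving weight at its observed values
generate byte himpg_obs = himpg
quietly replace himpg = 0
predict double p0, pr
quietly replace himpg = 1
predict double p1, pr
quietly replace himpg = himpg_obs

// contrast for each observation, then design-based averages
generate double pdiff = p0 - p1
svy: mean p0 p1 pdiff
```

As the quoted text says, the standard errors reported by -svy: mean-
here treat p0, p1, and pdiff as fixed data, so they are too small.
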
>
> Despite this, it may be useful, and perhaps something is to be learned
> graphing the distribution of the differences for various groups, with
> -dotplot-.
>
> Michael can also get an idea of the magnitude of error in individual
> predictions by computing confidence intervals for them; he can do this by
> running -predict- after his -svy: logit-. If he generates the linear
> predictor -xb- and its standard error -stdp-, he can compute a CI for the
> linear predictor, then endpoints for the predicted probability itself. He
> could then plot a histogram of the length of these intervals. -predictnl-
> run after -svy: logit- can also directly compute the difference in
> probabilities and will also produce a standard error for these differences.
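
A sketch of both suggestions in that paragraph. Plain -logit- on the
auto data stands in for -svy: logit- to keep the example
self-contained, the normal critical value is my assumption for the
interval, and the names xb, se_xb, lb, ub, ci_len, pdiff, and
pdiff_se are illustrative.

```stata
sysuse auto, clear
generate byte himpg = mpg > 25
quietly logit foreign himpg weight

// CI for the linear predictor, mapped to the probability scale
predict double xb, xb
predict double se_xb, stdp
generate double lb = invlogit(xb - invnormal(0.975) * se_xb)
generate double ub = invlogit(xb + invnormal(0.975) * se_xb)
generate double ci_len = ub - lb
histogram ci_len

// per-observation difference in probabilities (himpg = 0 minus
// himpg = 1) with a delta-method standard error
predictnl double pdiff =                                        ///
    invlogit(_b[_cons] + _b[weight]*weight)                     ///
    - invlogit(_b[_cons] + _b[himpg] + _b[weight]*weight),      ///
    se(pdiff_se)
summarize pdiff pdiff_se
```

Note that pdiff_se is a standard error for each observation's
difference, not for the average difference over the sample.
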
>
>
>
> -Steve
>
>
>
> On Sep 24, 2008, at 3:53 PM, Michael I. Lichter wrote:
>
>> Question 1: How do you calculate SEs for predicted probabilities for data
>> that require weights or are from a complex sample design? I've seen the FAQ
>> about how to do this in general, but I suspect that the FAQ's advice is not
>> correct for weighted data/data from complex samples.
>>
>> Question 2: -adjust, pr ci- produces confidence intervals for
>> proportions. Is it not the case that SE = (UB - LB)/(2 * 1.96) given a 95%
>> confidence interval (assuming that weights/design are not a problem)?
>>
>> Question 3: I want to calculate predictive margins (predictions where
>> every element is treated as if it belonged to a given group, but otherwise
>> the elements' own values are used in the prediction), AND I want to be able
>> to test for equality of predicted proportions. From what I glean from a
>> recent article in NEJM, SUDAAN can do this, but I don't know how.
>>
>> Here is an example that goes partway there:
>>
>> . sysuse auto
>> . gen himpg = mpg > 25
>>
>> . logit foreign himpg weight
>>
>> ------------------------------------------------------------------------------
>>      foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>        himpg |  -2.079449    .998357    -2.08   0.037    -4.036193   -.1227054
>>       weight |  -.0037159   .0009375    -3.96   0.000    -.0055534   -.0018785
>>        _cons |   9.795139   2.632037     3.72   0.000     4.636442    14.95384
>> ------------------------------------------------------------------------------
>>
>> . adjust himpg=0, pr ci
>>
>> --------------------------------------------------------------------------------
>> Dependent variable: foreign Command: logit
>> Variable left as is: weight
>> Covariate set to value: himpg = 0
>>
>> --------------------------------------------------------------------------------
>> ----------------------------------------------
>> All | pr lb ub
>> ----------+-----------------------------------
>> | .193884 [.085888 .381067]
>> ----------------------------------------------
>> Key: pr = Probability
>> [lb , ub] = [95% Confidence Interval]
>>
>> . adjust himpg=1, pr ci
>>
>>
>> --------------------------------------------------------------------------------
>> Dependent variable: foreign Command: logit
>> Variable left as is: weight
>> Covariate set to value: himpg = 1
>>
>> --------------------------------------------------------------------------------
>> ----------------------------------------------
>> All | pr lb ub
>> ----------+-----------------------------------
>> | .029187 [.003519 .203809]
>> ----------------------------------------------
>> Key: pr = Probability
>> [lb , ub] = [95% Confidence Interval]
>>
>>
>> What can I say about the relationship between the predictions (aside from
>> the obvious facts that they seem to be very different but their CIs are wide
>> and overlap)?
>>
>> All | pr lb ub
>> ----------+-----------------------------------
>> | .193884 [.085888 .381067]
>> | .029187 [.003519 .203809]
>> ----------------------------------------------
>> Key: pr = Probability
>> [lb , ub] = [95% Confidence Interval]
>> Thanks.
>>
>> Michael