I was writing a reply to George Hoffman's question when Vince Wiggins's
reply popped into my mailbox. I have a question for Vince. First, here is
what I was going to suggest to George.
bysort foreign: egen mean = mean(weight)
bysort foreign: egen sd = sd(weight)
bysort foreign: gen ub = mean + invttail(_N-1,.025)*(sqrt((sd^2)/_N))
bysort foreign: gen lb = mean - invttail(_N-1,.025)*(sqrt((sd^2)/_N))
twoway (rcap lb ub foreign) (scatter mean foreign)
This gives the same results as -ci-. Specifically, it gives a 95% CI with
the t critical value based on _N-1 degrees of freedom within strata (of
foreign in this case). It is -ci-'s results that I gather Nick Cox is using
to produce his new -ciplot- (pardon me, Nick, if I'm misrepresenting you).
Then, I read your reply suggesting the use of -predictnl-, which is
intriguing. Vince, if I'm understanding your post correctly, I could obtain
a 95% CI for the mean of weight by foreign like so:
* lb and ub already exist from the commands above, so drop them first
capture drop lb ub
regress weight foreign
predictnl yhat=predict(), ci(lb ub)
When I do so, I get the following upper and lower bounds:
. bysort foreign: sum ub lb
_______________________________________________________________________________
-> foreign = Domestic
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          ub |        52    3491.338            0  3491.338   3491.338
          lb |        52    3142.893            0  3142.893   3142.893
_______________________________________________________________________________
-> foreign = Foreign
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          ub |        22     2583.76            0   2583.76    2583.76
          lb |        22    2048.058            0  2048.058   2048.058
These differ from what -ci- produces:
. ci weight, by(foreign)
_______________________________________________________________________________
-> foreign = Domestic
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
      weight |         52    3317.115      96.4296       3123.525    3510.706
_______________________________________________________________________________
-> foreign = Foreign
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
      weight |         22    2315.909     92.31665       2123.926    2507.892
Now, I gather that the difference is a result of this note that I
received after running -predictnl-:
. predictnl yhat=predict(), ci(lb ub)
note: Confidence intervals calculated using t(72) critical values.
So, here is the dumb question. For what George is looking for (and many
others, I'm sure), should a person be using t critical values based on the
total sample (_N-2=72 degrees of freedom), or based on the sample within
strata (_N-1=21 and _N-1=51 degrees of freedom)?
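For what it's worth, a rough check on the figures above suggests the
critical value may not be the whole story. Dividing each half-width by its
t critical value backs out the standard error behind each interval. I would
expect the -predictnl- ones to differ from -ci-'s because -regress- pools
the residual variance across both groups, but that is my reading rather
than something either command reports. The last two lines are only a
sketch of fitting one stratum on its own; yhat0, lb0, and ub0 are made-up
names, and it assumes the auto data are in memory.

```stata
* standard errors implied by the -predictnl- bounds: half-width / t(72)
display (3491.338 - 3142.893) / (2*invttail(72, .025))
display (2583.76 - 2048.058) / (2*invttail(72, .025))
* standard errors implied by the -ci- bounds: half-width / t(_N-1)
display (3510.706 - 3123.525) / (2*invttail(51, .025))
display (2507.892 - 2123.926) / (2*invttail(21, .025))
* sketch: constant-only fit within one stratum (Domestic), so both the
* variance and the degrees of freedom come from that stratum alone
regress weight if foreign == 0
predictnl yhat0 = predict() if foreign == 0, ci(lb0 ub0)
```

If the pooled variance is what drives the gap, the first two figures should
come out different from the 96.4296 and 92.31665 that -ci- reports, while
the next two should reproduce them.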
Thanks (and sorry for the long posting),
Lee
Lee Sieswerda, Epidemiologist
Thunder Bay District Health Unit
999 Balmoral Street
Thunder Bay, Ontario
Canada P7B 6E7
Tel: +1 (807) 625-5957
Fax: +1 (807) 623-2369
[email protected]
www.tbdhu.com
> -----Original Message-----
> From: [email protected] [SMTP:[email protected]]
> Sent: Monday, February 10, 2003 5:55 PM
> To: [email protected]
> Subject: Re: st: graph mean and sd over/by category
>
> Among other things, George Hoffman <[email protected]> asks,
>
> > [...] fitted curves under scatter plots look beautiful - can the
> > regression coefficients from fplotci or qplotci be captured somehow,
> > as poor-man's curve fit?
>
> I think George is referring to the -fpfitci- and -qfitci- plot types of
> -graph twoway-. If so, he can readily perform the regressions that
> produced the graphs.
>
> -qfitci- just performs a quadratic regression. If we use the auto data,
> -sysuse auto-, the lines for the graph command,
>
> . twoway qfitci mpg weight
>
> are the predictions of the quadratic fit,
>
> . gen weight2 = weight^2
> . regress mpg weight weight2
>
> The coefficients can be seen in the output of -regress-, or manipulated
> in the usual way through the saved results.
>
> If George wants to add the predictions and their CIs to his dataset, he
> can type,
>
> . predictnl mpg_hat2 = predict() , ci(ci_low ci_high)
>
>
> This is a very simple application of -predictnl-; Bobby Gutierrez
> <[email protected]> said more in a prior post, but it lets us get both
> the predictions and their CIs with one command.
>
> We could then get a graph similar to our earlier -twoway qfitci-, by
> typing,
>
> . twoway rarea ci_low ci_high weight, sort || line mpg_hat2 weight, sort
>
> which we will immediately think looks ugly and decide to relabel the CI
> in the legend, option -legend(label())-, and change the fill color of
> the CI to be the standard for our scheme, option -p(ci)-.
>
> . twoway rarea ci_low ci_high weight , sort p(ci) || line mpg_hat2 weight , sort legend(label(1 "CI"))
>
>
> The -fpfitci- plot type just uses -fracpoly- as the engine to produce the
> fits, much like -regress- is used for the quadratic fit. For our example,
> the corresponding -fracpoly- estimation command is,
>
> . fracpoly regress mpg weight
>
> and we can repeat the rest of the story, or just use -fracplot-, to plot
> the fit and CI.
>
>
> -- Vince
> [email protected]
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/