Dear Joao, Maarten, all,
Thank you for your help. It seems I am getting the same values regardless of whether I set the other variables constant or not, and this seems odd. Here's what I did:
*//GETTING PREDICTED PROBABILITIES OF BMICAT FROM THE FULL MODEL*//
do "C:\DOCUME~1\MONAMO~1\LOCALS~1\Temp\STD06000000.tmp"
do "C:\DOCUME~1\MONAMO~1\LOCALS~1\Temp\STD06000000.tmp" (these are do-files for my multinomial model)
predict pbmi2 pbmi3 pbmi4, pr
sum pbmi2
sum pbmi3
sum pbmi4
table ED2, c(m pbmi2 m pbmi3 m pbmi4)
table WB_pov, c(m pbmi2 m pbmi3 m pbmi4)
table ASSET_INDEX, c(m pbmi2 m pbmi3 m pbmi4)
table PCAwealthindex, c(m pbmi2 m pbmi3 m pbmi4)
describe pbmi2
describe pbmi3
describe pbmi4
*//GETTING PREDICTED PROBABILITIES OF BMICAT KEEPING OTHER VARIABLES CONSTANT (SET AT LOWEST RISK GROUP FOR ALL)*//
preserve
replace WB_pov=4
replace ASSET_INDEX=1
replace PCAwealthindex=1
replace AGECAT4=1
replace FATHERED=1
replace GENHEALTH_PAST=1
predict pbmiset2 pbmiset3 pbmiset4, p
table ED2, c(m pbmiset2 m pbmiset3 m pbmiset4)
table WB_pov, c(m pbmi2 m pbmi3 m pbmi4)
table ASSET_INDEX, c(m pbmi2 m pbmi3 m pbmi4)
table PCAwealthindex, c(m pbmi2 m pbmi3 m pbmi4)
restore
I have 2 follow-up questions:
1) Does it make sense that I would get the same predicted probabilities whether or not I fixed the other variables in the model?
2) Do you know how I can get 95% CIs for these means? (I did not see that among the options in the Stata help.)
A million thanks,
Mona
>>> "Joao Ricardo F. Lima" <[email protected]> 11/17/2008 6:14 AM >>>
Dear Mona, Maarten and Statalisters,
Reading Maarten's answer, I would like to ask whether this procedure is correct:
******
" // creating predictions while keeping other variables constant
// predicted probabilities of urban women of average age
preserve
sum age if e(sample), meanonly
replace age = r(mean)
replace female = 1
replace rural = 0
predict pra*, pr
table race , c(m pra1 m pra2 m pra3 m pra4 m pra5)
restore"
***************
because the value of r(mean) (the sample mean) differs from the svy: mean estimate (the population mean):
webuse nhanes2f, clear
svyset psuid [pweight=finalwgt], strata(stratid)
. svy: mean age
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =      31          Number of obs    =     10337
Number of PSUs   =      62          Population size  = 117023659
                                    Design df        =        31

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
         age |   42.23732   .3034412      41.61844    42.85619
--------------------------------------------------------------

. sum age if e(sample)

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         age |     10337     47.5637    17.21678         20         74
If I am using svy: mlogit, shouldn't the mean used be the population mean?
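One way the population mean could be used here is sketched below. It is only a sketch under assumptions: the names mlog, insample and popmean are made up for illustration, and female is entered in the model in place of sex so that fixing female actually changes the predictions. Because -svy: mean- replaces the stored estimation results, the -mlogit- results are saved first and brought back before -predict-.

```stata
webuse nhanes2f, clear
svyset psuid [pweight=finalwgt], strata(stratid)
svy: mlogit health rural black orace female age

// save the mlogit results and mark the estimation sample
// (mlog and insample are hypothetical names)
estimates store mlog
gen byte insample = e(sample)

preserve
// design-based (population) mean of age within the estimation sample
svy, subpop(insample): mean age
scalar popmean = _b[age]

// -svy: mean- overwrote e(), so bring the mlogit results back
estimates restore mlog

// evaluate the predictions for urban women at the population mean age
replace age = popmean
replace female = 1
replace rural = 0
predict pra*, pr
table race, c(m pra1 m pra2 m pra3 m pra4 m pra5)
restore
```

Either mean only fixes the point at which the probabilities are evaluated; which of the two is the better reference value is a separate question.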
Thanks a lot,
Best Regards,
Joao Lima
2008/11/16 Maarten buis <[email protected]>:
> --- Mona Mowafi <[email protected]> wrote:
>> I am seeking to attain predicted probabilities of my outcome (BMI
>> cats - normal, overweight, obese) for four main independent
>> variables. I am not sure how to do it, but here is what I have
>> tried:
>>
>> svyset [pweight=femaleweight], strata(order) psu(place)
>>
>> xi: svymlogit BMICAT i.AGECAT4 i.ED2 i.WB_pov i.ASSET_INDEX
>> i.PCAwealthindex i.FATHERED i.GENHEALTH_PAST, basecategory(2) nolog
>> svymlogit, rrr
>>
>> predict p1 p2 p3
>> sort ED2
>> by ED2: sum p1
>> by ED2: sum p2
>> by ED2: sum p3
>>
>> Here are my main questions:
>>
>> 1) With this syntax, does p1 refer to my reference outcome = normal
>> weight, p2 = overweight, and p3 = obese? I want to make sure that I am
>> interpreting p1, p2, and p3 properly.
>
> You can see what category the variables refer to by looking at the
> labels that -predict- has attached to them. You can see those by typing
> -desc p*- (which will describe all variables whose name start with p,
> if there are too many of those type -desc p1 p2 p3-).
>
>> 2) If I sort and sum by p1, p2, and p3 - is this giving me the mean
>> predicted probability of each of my three outcomes for all
>> individuals in each of those three sub-categories (of education, for
>> example, as seen above)? That is what I'm trying to do.
>
> Yes, but there is a subtle issue here: the differences between the
> educational categories may be due to the effect of education but can
> also be due to differences between the educational categories in the
> distribution of the other explanatory variables. For instance the lower
> educational categories will consist of individuals from a lower social
> background, and these tend to have a higher BMI. You
> can keep the other variables constant by first replacing the other
> variables by some number, e.g. the mean, then predicting, and then
> making the tables.
>
> Both methods are illustrated below (I used -table- in these examples as
> it creates more compact tables, but -by ...: sum...- will work too;
> another alternative would be -tabstat-).
>
> *---------------------- begin example ---------------------
> webuse nhanes2f, clear
> svyset psuid [pweight=finalwgt], strata(stratid)
> tab health
> svy: mlogit health rural black orace female age
>
> // create predictions without keeping other variables constant
> predict pr*, pr
>
> // the labels show which variable belongs to which category
> desc pr*
>
> // comparing the average predicted probabilities with the observed
> // percentages
> sum pr*
> tab health
>
> table race , c(m pr1 m pr2 m pr3 m pr4 m pr5)
>
>
> // creating predictions while keeping other variables constant
> // predicted probabilities of urban women of average age
> preserve
> sum age if e(sample), meanonly
> replace age = r(mean)
> replace female = 1
> replace rural = 0
>
> predict pra*, pr
> table race , c(m pra1 m pra2 m pra3 m pra4 m pra5)
>
> restore
> *--------------------- end example -------------------
> (For more on how to use examples I sent to the Statalist, see
> http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )
>
> Hope this helps,
> Maarten
>
> -----------------------------------------
> Maarten L. Buis
> Department of Social Research Methodology
> Vrije Universiteit Amsterdam
> Boelelaan 1081
> 1081 HV Amsterdam
> The Netherlands
>
> visiting address:
> Buitenveldertselaan 3 (Metropolitan), room N515
>
> +31 20 5986715
>
> http://home.fsw.vu.nl/m.buis/
> -----------------------------------------
>
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
--
-------------------------------
Joao Ricardo Lima
Professor
UFPB-CCA-DCFS
+553138923914
-------------------------------
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/