Suzy:
No problem, but if you find my reply puzzling then chances are that someone else on Statalist
might find it puzzling too, so I have also sent my reply (and your full question underneath) to
Statalist.
The variable p is the probability of missingness, so the mean of p should be about .1 if you want
approximately 10% missingness. Your mean is .99, so almost everyone will be made missing. -invlogit-
transforms a linear function of "explanatory variables" (in your case .1*age) to lie between zero
and one according to 1/(1+exp(-xb)), so the values you plug in (in your case .1 for age and 0 for
the constant) are "logistic regression coefficients". I would play around with values of the
constant until you get a mean p of about .1 (the more negative the constant, the lower the
probability). For instance, look at the mean of p if you do -gen p = invlogit(-10 + .1*age)-.
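
A minimal sketch of that search, assuming the original dataset (with age) is in memory. The
scratch variable name ptry and the grid of candidate constants are placeholders of my own
choosing, not anything special:

```stata
* Starting guess: logit(.1) is about -2.2 and .1 * mean age (52.06) is about 5.2,
* so a constant near -2.2 - 5.2 = -7.4 gives a person of average age p = .1.
display logit(.1) - .1*52.06

* Try a grid of constants around that guess and report the mean of p for each.
* ptry is a scratch variable (hypothetical name) that is dropped after each pass.
foreach c of numlist -9 -8.5 -8 -7.5 -7 -6.5 -6 {
    quietly gen double ptry = invlogit(`c' + .1*age)
    quietly sum ptry
    display "constant = " %5.1f `c' "   mean p = " %6.4f r(mean)
    drop ptry
}
```

Because -invlogit- is nonlinear, the mean of p is not the same as p evaluated at the mean age,
so it is safer to pick the constant from the displayed means than to trust the starting guess.
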
Afterwards I would check whether there is enough variation in the values of p. If p is
approximately constant, then the influence of age on the probability of missingness is probably not
strong enough to show up in your simulations, and you should increase the coefficient of age. This
might then mess up the mean probability of missingness a bit, so it would be good to check
that the mean probability of missingness is still close to .1.
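
Put together, the check and the final step could look something like the sketch below. The
constant -7.5 is only a placeholder; substitute whatever value gave you a mean p near .1. The seed
is arbitrary, and I use -uniform()- as in your code (newer versions of Stata call it -runiform()-).
Start from the original data, in which bmi has not yet been set to missing.

```stata
* Arbitrary seed, so that the same observations are made missing on every run
set seed 12345

* Remove the p from the earlier attempt, if it exists
capture drop p

* -7.5 is a placeholder constant; .1 is the coefficient of age
gen p = invlogit(-7.5 + .1*age)

* The mean should be near .1; the percentiles show how much p varies with age
sum p, detail

* Set bmi to missing with probability p
replace bmi = . if uniform() < p

* Number of observations that were made missing
count if missing(bmi)
```

With only 332 observations the realized share of missing values will bounce around the mean of p
from one seed to the next, so a share of somewhat more or less than 10% is to be expected.
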
HTH,
Maarten
--- Suzy <scott_788@wowway.com> wrote:
> Dear Maarten:
>
> Hope you don't mind the direct e-mail. I tried your code based on my
> dataset and what I thought I should do and all of my BMI observations
> went missing rather than say 5-10%. I have obviously done something
> wrong with it. I'm hoping you can help. I would like about 10% of the
> BMI variable to be missing. I want the missingness to be associated with
> older age, but not dependent on the value of BMI - thus hopefully
> satisfying the MAR assumption.
>
> I've included the summary stats of the variables, the code you provided
> (I modified it somewhat) and the result...
> can you see what I did wrong??
>
> summarize
>
>     Variable |       Obs        Mean   Std. Dev.        Min        Max
> -------------+--------------------------------------------------------
>          sex |       332    .4849398    .5005275          0          1
>         race |       332    .3253012    .4691944          0          1
>          age |       332    52.06024     12.6857         28         82
>         fhdm |       332    .3373494    .4735189          0          1
>          bmi |       332    30.98795     6.18837         18         48
> -------------+--------------------------------------------------------
>        dmcat |       332    .2771084    .4482461          0          1
>
> . gen p = invlogit(.1*age)
>
> . sum p
>
>     Variable |       Obs        Mean   Std. Dev.        Min        Max
> -------------+--------------------------------------------------------
>            p |       332    .9894261    .0121324   .9426758   .9997254
>
>
> . replace bmi = . if uniform() < p
> (332 real changes made, 332 to missing)
>
> . summarize
>
>     Variable |       Obs        Mean   Std. Dev.        Min        Max
> -------------+--------------------------------------------------------
>          sex |       332    .4849398    .5005275          0          1
>         race |       332    .3253012    .4691944          0          1
>          age |       332    52.06024     12.6857         28         82
>         fhdm |       332    .3373494    .4735189          0          1
>          bmi |         0
> -------------+--------------------------------------------------------
>        dmcat |       332    .2771084    .4482461          0          1
>            p |       332    .9894261    .0121324   .9426758   .9997254
>
>
>
>
-----------------------------------------
between 1/2/2006 and 31/3/2006 I will be
visiting the UCLA, during this time the
best way to reach me is by email
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands
visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z214
+31 20 5986715
http://home.fsw.vu.nl/m.buis/
-----------------------------------------