Re: st: Re: simple way to create missing data that is "missing atrandom" from a small datset
From: Suzy <[email protected]>
To: [email protected]
Subject: Re: st: Re: simple way to create missing data that is "missing atrandom" from a small datset
Date: Fri, 24 Feb 2006 19:27:20 -0500
Thank you, Maarten. What I also did is dichotomize bmi missingness
(generated newvar bmicat = 1 if missing; 0 otherwise). I then ran
logistic regressions with bmicat as the binary response variable,
first univariately (age alone, sex alone, race alone, etc.) and then with
the full model. In each case, the odds of BMI missingness were
significantly associated with age, but not with any other variable. Age
remained associated with bmicat in the full model after accounting for
the other variables. I heard that this is an approach that can be used
to assess MCAR vs. MAR. Do you agree?
tab bmicat

      bmimi |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        305       91.87       91.87
          1 |         27        8.13      100.00
------------+-----------------------------------
      Total |        332      100.00

logistic bmicat age sex fhdm dmcat race

Logistic regression                               Number of obs   =        332
                                                  LR chi2(5)      =      37.96
                                                  Prob > chi2     =     0.0000
Log likelihood = -74.639705                       Pseudo R2       =     0.2028

------------------------------------------------------------------------------
      bmicat | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.121633   .0268219     4.80   0.000     1.070276    1.175454
         sex |   .9201524   .4542155    -0.17   0.866     .3496878    2.421247
        fhdm |   1.060376   .5558315     0.11   0.911     .3795549    2.962413
       dmcat |   .7724482   .4741646    -0.42   0.674     .2319329    2.572625
        race |   1.340202    .791231     0.50   0.620     .4213434    4.262891
------------------------------------------------------------------------------
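
The steps described above can be sketched in Stata roughly as follows. This is only an illustration: it assumes the BMI variable is named bmi (as in Maarten's message) and that the five covariates are the ones in the logistic command shown above.

```stata
* Flag observations with missing BMI (1 = missing, 0 = observed)
generate byte bmicat = missing(bmi)
tabulate bmicat

* One predictor at a time, then the full model
foreach v of varlist age sex fhdm dmcat race {
    logistic bmicat `v'
}
logistic bmicat age sex fhdm dmcat race
```

A predictor that is clearly associated with bmicat (here, age) is evidence against MCAR. A check like this cannot by itself show whether missingness also depends on the unobserved BMI values.
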
Maarten Buis wrote:
Suzy:
You wanted to create missingness according to a MAR process; in your case the probability of
missingness in the variable bmi should depend on the variable age. So we created the probability
of missingness for each observation. The youngest person in your dataset has a probability of
missingness of invlogit(-8 + .1*28) = .0054863 (type -di invlogit(-8 + .1*28)-) and the oldest
person has a probability of missingness of invlogit(-8 + .1*82) = .549834. If the probability of
missingness were constant (or random and unrelated to any of the other variables), then the
missingness mechanism would be missing completely at random (MCAR).
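
A minimal sketch of how such a missingness process could be generated, assuming the variables are named bmi and age. The seed, the new variable names (p, bmi_mar, bmi_mcar), and the constant .08 used for the MCAR contrast are illustrative assumptions, not part of the original exchange. The -runiform()- function is the current name of the uniform random-number generator; older Stata releases spell it -uniform()-.

```stata
* Assumed seed, only so that the random draws are reproducible
set seed 12345

* Probability of missingness rises with age (coefficients from the text above)
generate double p = invlogit(-8 + .1*age)
summarize p

* Copy bmi and set it to missing with probability p: MAR given age
generate bmi_mar = bmi
replace bmi_mar = . if runiform() < p

* For contrast: a constant probability gives MCAR (.08 is an assumed value)
generate bmi_mcar = bmi
replace bmi_mcar = . if runiform() < .08
```

Running -summarize p- shows the range of the generated probabilities, which by the calculation above should run from about .005 for a 28-year-old to about .55 for an 82-year-old.
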
HTH,
Maarten
--- Suzy <[email protected]> wrote:
I'm not sure what the implications are of the std dev and the
max values of p (.549).
-----------------------------------------
between 1/2/2006 and 31/3/2006 I will be
visiting the UCLA, during this time the
best way to reach me is by email
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands
visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z214
+31 20 5986715
http://home.fsw.vu.nl/m.buis/
-----------------------------------------
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/